
Is there a way to reindex error records to Elasticsearch?

asked 2019-07-16 11:12:53 -0500

Gk

Hi guys, can someone please help me with the issue below?

I am continuously indexing data from MySQL into Elasticsearch. During this process, when the number of queued tasks exceeds the capacity of the queue, a few records fail with a stage error and the log message below.

ELASTICSEARCH_16 - Could not index record ::rowCount:9': rejected execution of org.elasticsearch.transport.TransportService$7@e9a1f2c on EsThreadPoolExecutor[name = vm0pnelstsa0002/bulk, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@37a3005f[Running, pool size = 12, active threads = 12, queued tasks = 1000, completed tasks = 1861602043]]

Increasing the queue capacity may delay the problem, but it is probably not a permanent solution. So I am thinking of something like continuously checking for records in the error stage and reindexing them. Is there a way to do this in StreamSets, or is there an alternative solution?

Thank you.


1 Answer


answered 2019-07-18 15:38:18 -0500

jeff

You have two options. First, you can increase the size of your Elasticsearch cluster's bulk operation queue, so that it's large enough to handle your expected load. This is an ES level setting, completely outside the scope of Data Collector. See here for more information.
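When deciding how large the queue needs to be, it can help to read the pool state that the rejection message already reports. The following is a minimal Python sketch, not part of Data Collector or Elasticsearch: the helper names `parse_pool_stats` and `is_saturated` are made up for illustration, and the message text is copied from the error quoted in the question.

```python
import re

# Pool details copied from the rejection message in the question.
MESSAGE = (
    "EsThreadPoolExecutor[name = vm0pnelstsa0002/bulk, queue capacity = 1000, "
    "org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@37a3005f"
    "[Running, pool size = 12, active threads = 12, queued tasks = 1000, "
    "completed tasks = 1861602043]]"
)

# Field names as they appear in the message text.
FIELDS = ("queue capacity", "pool size", "active threads", "queued tasks")


def parse_pool_stats(message):
    """Return the numeric thread pool fields found in a rejection message."""
    stats = {}
    for field in FIELDS:
        match = re.search(re.escape(field) + r" = (\d+)", message)
        if match:
            stats[field] = int(match.group(1))
    return stats


def is_saturated(stats):
    """True when every worker thread is busy and the queue is full."""
    return (
        stats["active threads"] >= stats["pool size"]
        and stats["queued tasks"] >= stats["queue capacity"]
    )


stats = parse_pool_stats(MESSAGE)
print(stats, is_saturated(stats))
```

In the message above, all 12 threads are active and the queue holds 1000 of 1000 tasks, so the pool is fully saturated at the moment of rejection.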

Secondly, you could implement a buffer (preferably using Kafka, which is well suited for cases like this). Your current pipeline origin will send records to Kafka in one pipeline. Then, another pipeline will read from Kafka and then send to your current Elasticsearch destination. This will allow you to more effectively throttle the pipeline batches/rate while still not losing any updates.
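To illustrate the idea behind the buffer, here is a rough Python sketch of what the second (consuming) side does conceptually: it takes records from a buffer, sends them in batches, and puts rejected records back instead of dropping them. This is not StreamSets or Kafka code. An in-memory `deque` stands in for the Kafka topic, and `send_batch` is a hypothetical callable that returns the records the destination rejected.

```python
import time
from collections import deque


def drain_with_backoff(records, send_batch, batch_size=100, max_retries=5,
                       base_delay=0.5, sleep=time.sleep):
    """Send records in batches, re-queueing any that the destination rejects.

    send_batch is a hypothetical callable: it takes a list of records and
    returns the sublist that was rejected (an empty list when all succeeded).
    Gives up after more than max_retries consecutive batches with rejections
    and returns the records that are still unsent.
    """
    pending = deque(records)  # stand-in for a durable buffer such as Kafka
    attempt = 0
    while pending and attempt <= max_retries:
        size = min(batch_size, len(pending))
        batch = [pending.popleft() for _ in range(size)]
        rejected = send_batch(batch)
        if rejected:
            # Put rejected records back at the front, in their original order,
            # and wait longer after each consecutive rejection.
            pending.extendleft(reversed(rejected))
            sleep(base_delay * (2 ** attempt))
            attempt += 1
        else:
            attempt = 0  # the destination accepted the batch, reset the backoff
    return list(pending)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_send(batch):
        # Simulated destination: rejects the first two batches, then accepts.
        state["calls"] += 1
        return list(batch) if state["calls"] <= 2 else []

    unsent = drain_with_backoff(list(range(5)), flaky_send, batch_size=2,
                                sleep=lambda seconds: None)
    print("unsent records:", unsent)
```

The point of the design is that a rejection slows the sender down rather than turning into an error record, which is the throttling effect the Kafka-backed setup is meant to give you.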



