
Issue with Kafka Consumer output batch size

asked 2019-06-12 18:25:41 -0500

Deepak Tiwari

updated 2019-06-12 19:24:07 -0500

metadaddy

I am facing an issue with the Kafka Consumer output batch size.

No matter what values I set for the following Kafka Consumer properties:

  • Max Batch Size (records)
  • Batch Wait Time (ms)

I always get a batch size of 67.

production.maxBatchSize is set to its default (1000).

How can I change my output batch size? I want to run my pipeline with a smaller batch size so that, as soon as a record is processed, it moves on to the next stage instead of waiting for the whole batch to be processed.


1 Answer


answered 2019-06-12 19:33:31 -0500

metadaddy

It sounds like data is being written to Kafka with 67 records per message. No matter how you set Max Batch Size, Data Collector cannot part-process a Kafka message. There's no way to tell Kafka 'I'm part way through processing this message', and, by design, Data Collector does not split Kafka messages into separate batches, since that would introduce a risk of data loss. You should look at the producer writing messages to Kafka, and see if it can write fewer records per message.
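To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is a simplified model, not Data Collector's actual implementation: the names `pack_messages` and `build_batches` are made up for illustration, Batch Wait Time is not modelled, and the only rule it encodes is that a message is never split across batches.

```python
from typing import Iterable, Iterator, List

Record = str
Message = List[Record]  # one Kafka message carrying several records


def pack_messages(records: List[Record], records_per_message: int) -> List[Message]:
    """Model a producer that packs a fixed number of records into each message."""
    return [records[i:i + records_per_message]
            for i in range(0, len(records), records_per_message)]


def build_batches(messages: Iterable[Message], max_batch_size: int) -> Iterator[List[Record]]:
    """Group whole messages into batches; a message is never split.

    A batch is closed once it holds at least max_batch_size records, so a
    single large message can exceed the limit on its own.
    """
    batch: List[Record] = []
    for message in messages:
        batch.extend(message)
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


records = [f"r{i}" for i in range(134)]

# 67 records per message: a limit of 10 cannot be honoured.
print([len(b) for b in build_batches(pack_messages(records, 67), 10)])  # [67, 67]

# 1 record per message: the limit now takes effect.
sizes = [len(b) for b in build_batches(pack_messages(records, 1), 10)]
print(sizes[:3], sizes[-1])  # [10, 10, 10] 4
```

In this model the smallest possible batch is one whole message, which is why lowering the limit has no visible effect until the producer writes fewer records per message.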


Comments

Thanks, it worked. Now that I am writing a single record per message to Kafka, I can manage the batch size for my pipeline using Max Batch Size.

Deepak Tiwari ( 2019-06-12 22:29:14 -0500 )