
Kafka origin is lagging behind in reading messages

asked 2017-12-22 08:45:28 -0500

Roh

To give a little bit of background on my pipeline, here is the link to my previous question.

My Kafka origin is running about one day behind: messages do not reach the pipeline as promptly as they show up in the Kafka consumer I run from the command line. Most of my settings are the defaults.

My message throughput in Kafka is around 7K per second, while the throughput in my StreamSets pipeline is 5K per second. My batch size is currently 1000, which is the default. Will increasing the batch size help? Which configuration options should I consider to increase the speed? Side note: my pipeline is running in standalone mode.
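As a rough illustration of why the lag keeps growing at these rates, here is a minimal sketch of the arithmetic. It assumes the 7K and 5K per second figures above stay constant, which they probably do not in practice; the function name `backlog_growth` is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def backlog_growth(produce_rate, consume_rate, seconds):
    """Messages added to the unread backlog over `seconds`.

    Rates are in messages per second. The result is never negative:
    a consumer that is faster than the producer adds no backlog.
    """
    return max(produce_rate - consume_rate, 0) * seconds

# Rates taken from the question: ~7K/s produced, ~5K/s consumed.
growth_per_day = backlog_growth(7_000, 5_000, SECONDS_PER_DAY)
print(growth_per_day)  # 172800000
```

So as long as the pipeline consumes more slowly than the topic is written to, the backlog can only grow; the pipeline has to exceed 7K per second before it starts catching up.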


Comments

@metadaddy tagging you to get your attention. Thanks in advance

Roh (2017-12-22 08:46:12 -0500)

1 Answer


answered 2018-01-03 14:29:04 -0500

jeff

Check the pipeline metrics to get a sense of where the time is being spent. The bottleneck could be in the origin, in any of the processor stages, or in the destination. See if anything jumps out there (for example, scripting processors are often particularly slow).

Beyond that, your best bet for increasing the number of Kafka messages you can process per unit time is to increase the number of consumers. You can create another pipeline running on a different SDC instance that consumes from the same topic with the same consumer group. If you are using SDC version 2.7.2.0 or later, you can use the Kafka Multitopic Consumer origin instead, which lets you specify multiple consumer threads, even for a single topic with multiple partitions. That avoids the need to run additional pipelines to scale.
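To make the scaling idea concrete, here is a small sketch of how consumers in one group share a topic's partitions. Within a consumer group, Kafka gives each partition to at most one consumer, so consumers (or threads) beyond the partition count sit idle. The helper names and the simple round-robin assignment are assumptions for illustration only, not Kafka's actual assignor or anything in SDC, and the estimate assumes throughput scales linearly with the number of consumers.

```python
import math

def consumers_needed(produce_rate, per_consumer_rate):
    """Smallest consumer count whose combined rate matches the producer.

    Assumes linear scaling, which real pipelines only approximate.
    """
    return math.ceil(produce_rate / per_consumer_rate)

def assign_partitions(num_partitions, num_consumers):
    """Simplified round-robin assignment of partitions to consumers.

    Each partition goes to exactly one consumer in the group.
    """
    assignment = {c: [] for c in range(num_consumers)}
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment

# Rates from the question: ~7K/s produced, ~5K/s per pipeline.
print(consumers_needed(7_000, 5_000))  # 2

# Four partitions split across two consumers.
print(assign_partitions(4, 2))  # {0: [0, 2], 1: [1, 3]}

# Three consumers on a two-partition topic: the third gets nothing.
print(assign_partitions(2, 3))  # {0: [0], 1: [1], 2: []}
```

The last case is why it is worth checking the topic's partition count first: adding consumers or threads only helps up to the number of partitions.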
