
Where does Data Collector maintain offsets?

asked 2018-12-03 06:49:34 -0600

anonymous user


updated 2018-12-14 11:14:14 -0600

metadaddy

In case of a system failure on a single Data Collector node, or, in a cluster, if the entire cluster goes down, does StreamSets maintain offsets in a file or in some other storage?


1 Answer


answered 2018-12-03 17:25:51 -0600

iamontheinet


If you are using a standalone Data Collector, the offsets are stored in the data directory on disk (SDC_DATA), as described in the documentation. If you are using Control Hub with jobs, the offsets are stored in Control Hub's internal database. For pipelines running in cluster streaming mode on either Mesos or YARN, the offsets can be stored on HDFS or AWS S3, as described here.
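
For the standalone case, here is a rough Python sketch for inspecting what is saved on disk. It assumes the offsets are kept as JSON files named offset.json somewhere under the SDC_DATA directory; that file name and the directory layout are assumptions on my part, not something stated above, so verify them against your own installation. find_offset_files is a hypothetical helper, not part of any StreamSets tooling.

```python
import json
from pathlib import Path


def find_offset_files(data_dir):
    """Return {path: parsed JSON} for every offset.json found under data_dir.

    Assumption: offsets are persisted as JSON files named "offset.json".
    Files that cannot be read or parsed are reported with an "error" key
    instead of raising, so one bad file does not hide the others.
    """
    results = {}
    for path in sorted(Path(data_dir).rglob("offset.json")):
        try:
            with open(path, encoding="utf-8") as fh:
                results[str(path)] = json.load(fh)
        except (OSError, json.JSONDecodeError) as exc:
            results[str(path)] = {"error": str(exc)}
    return results
```

To use it, pass the value of your SDC_DATA environment variable, for example find_offset_files(os.environ["SDC_DATA"]), and print the returned dictionary. The sketch only reads files; it never modifies them.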

Cheers, Dash

