Where does Data Collector maintain offsets?

asked 2018-12-03 06:49:34 -0500

anonymous user

updated 2018-12-14 11:14:14 -0500

metadaddy

If a single Data Collector node fails, or if my entire cluster goes down, does StreamSets maintain offsets in a file or in some other storage?

1 Answer

answered 2018-12-03 17:25:51 -0500

iamontheinet

Hi!

If you are running a standalone Data Collector, offsets are stored on disk in the data directory (SDC_DATA), as described in the documentation. If you are running jobs through Control Hub, offsets are stored in Control Hub's internal database. For pipelines running in cluster streaming mode on Mesos or YARN, offsets can be stored on HDFS or AWS S3, as described in the documentation.
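
For the standalone case, here is a rough sketch of how you could inspect what is on disk. It assumes that offsets are kept as offset.json files somewhere under a runInfo subdirectory of SDC_DATA; that layout, the fallback path /var/lib/sdc, and the helper names find_offset_files and read_offsets are assumptions for illustration, so check them against your own installation. The script only reads files and makes no claim about the keys inside them.

```python
import json
import os
from pathlib import Path


def find_offset_files(data_dir):
    """Return every offset.json found under <data_dir>/runInfo, sorted by path."""
    # Assumed layout: one offset.json per pipeline somewhere below runInfo.
    run_info = Path(data_dir) / "runInfo"
    if not run_info.is_dir():
        return []
    return sorted(run_info.rglob("offset.json"))


def read_offsets(data_dir):
    """Map each offset file path (as a string) to its parsed JSON content."""
    result = {}
    for path in find_offset_files(data_dir):
        with open(path, encoding="utf-8") as fh:
            result[str(path)] = json.load(fh)
    return result


if __name__ == "__main__":
    # SDC_DATA is the variable named in the answer; the fallback is hypothetical.
    data_dir = os.environ.get("SDC_DATA", "/var/lib/sdc")
    for path, content in read_offsets(data_dir).items():
        print(path)
        print(json.dumps(content, indent=2))
```

Backing up that directory along with the rest of SDC_DATA is what lets a standalone Data Collector pick up from the stored offsets after a restart.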

Cheers, Dash
