Where does StreamSets maintain offsets?

asked 2018-12-03 06:49:34 -0600

anonymous user

In case of a system failure with a single Data Collector node, or if my entire cluster goes down, does StreamSets maintain offsets in a file or in some other storage?

1 Answer

answered 2018-12-03 17:25:51 -0600

iamontheinet

Hi!

If you are using a standalone Data Collector, the offsets are stored on disk in the data directory (SDC_DATA), as described in the documentation. If you are using Control Hub with jobs, the offsets are stored in Control Hub's internal database. For pipelines running in cluster streaming mode on either Mesos or YARN, the offsets can be stored on HDFS or AWS S3, as described in the documentation.
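
For the standalone case, a quick way to check what is on disk is to look for offset files under the data directory. The sketch below is only an illustration, not something from the documentation: it assumes offsets live at `runInfo/<pipeline id>/0/offset.json` under SDC_DATA, which may differ between Data Collector versions, and the helper name `find_offset_files` and the fallback path are hypothetical.

```python
import json
import os
from pathlib import Path


def find_offset_files(data_dir):
    """Return {pipeline id: parsed offset JSON} for each offset.json found.

    Assumes the layout <data_dir>/runInfo/<pipeline id>/0/offset.json.
    This layout is an assumption and may vary by Data Collector version.
    """
    offsets = {}
    run_info = Path(data_dir) / "runInfo"
    if not run_info.is_dir():
        # Nothing to report if the assumed directory does not exist.
        return offsets
    for offset_file in sorted(run_info.glob("*/*/offset.json")):
        # The pipeline id is the directory two levels above offset.json.
        pipeline_id = offset_file.parent.parent.name
        with offset_file.open(encoding="utf-8") as fh:
            offsets[pipeline_id] = json.load(fh)
    return offsets


if __name__ == "__main__":
    # SDC_DATA is the variable named in the answer; the fallback is a placeholder.
    data_dir = os.environ.get("SDC_DATA", "/var/lib/sdc")
    for pipeline_id, offset in find_offset_files(data_dir).items():
        print(pipeline_id, offset)
```

If the files are present after a restart, the pipeline can resume from the stored offset rather than starting over.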

Cheers, Dash
