How does offset work in StreamSets Data Collector?

asked 2018-05-27 23:37:49 -0500

Trans

updated 2018-05-28 09:34:45 -0500

metadaddy

Suppose I have a pipeline with Directory as the origin, some processors in between, and JDBC as the destination. If the pipeline crashes at some point in a processor, will SDC still be able to process the failed data, even though the origin has already read it?

1 Answer

answered 2018-05-28 09:36:38 -0500

metadaddy

Yes. In the default 'at least once' mode, Data Collector saves the offset only after the data has been successfully sent to the destination. In the case of the Directory origin, the offset is the name of the file being processed and the byte offset into that file. When the pipeline restarts, the origin reads the last saved offset and resumes processing from there, so any records that were read but not yet written to the destination when the pipeline crashed are read and processed again.
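To make the commit ordering concrete, here is a minimal Python sketch of the "save the offset after the write" idea. It is not Data Collector code and does not use any StreamSets API: `run_pipeline`, `Crash`, the `offsets` dict, and the in-memory `destination` list are all hypothetical stand-ins, and the real Directory origin also tracks the file name, which this sketch leaves out.

```python
import io
from typing import Optional


class Crash(Exception):
    """Simulated failure inside a processor stage (hypothetical)."""


def run_pipeline(data: bytes, offsets: dict, destination: list,
                 batch_size: int = 2, crash_on: Optional[bytes] = None) -> None:
    """Read newline-delimited records starting at the last saved byte offset."""
    stream = io.BytesIO(data)
    stream.seek(offsets.get("byte_offset", 0))
    while True:
        # Origin: read one batch of records.
        batch = []
        for _ in range(batch_size):
            line = stream.readline()
            if not line:
                break
            batch.append(line.rstrip(b"\n"))
        if not batch:
            return
        # Processor: may fail partway through the batch.
        processed = []
        for record in batch:
            if record == crash_on:
                raise Crash(record)
            processed.append(record.upper())
        # Destination: write the whole batch.
        destination.extend(processed)
        # Save the offset only after the write succeeded.
        offsets["byte_offset"] = stream.tell()


data = b"a\nb\nc\nd\n"
offsets = {}
destination = []

# First run: the processor crashes on record "c" in the second batch.
try:
    run_pipeline(data, offsets, destination, crash_on=b"c")
except Crash:
    pass
offset_after_crash = offsets["byte_offset"]  # 4: only the first batch was committed

# Restart: reading resumes at byte 4, so "c" and "d" are read and processed again.
run_pipeline(data, offsets, destination)
print(destination)  # [b'A', b'B', b'C', b'D']
```

Because the offset is saved after the write, a crash that lands between the write and the save would cause that batch to be written again on restart. That is why this mode is called 'at least once' rather than 'exactly once'.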

