How to ensure error records are sent to a specific location?

asked 2018-02-08 02:58:39 -0600

Vivian Y

StreamSets has a configuration option that lets us choose what happens to error records: discard them, write them to a folder, send them to Kafka, send them to another pipeline, and so on.

I have a few questions to confirm which types of errors cause the whole pipeline to go into a retry loop, and which only stop at the specific processor and raise a small red flag to indicate that an error happened there.

For my use case, I am trying to push any error records to a separate location so that they do not affect the running pipeline or the subsequent data. Sometimes when I hit an error, the whole pipeline goes into "Retry" mode and keeps retrying the same data over and over, which prevents the subsequent data from being processed. I then have to stop the pipeline manually, remove that particular data set, and restart the pipeline.

Is there any configuration I can set so that, if a particular line of data causes an error in any of my processors, it is forced to the error records (written to a local folder, sent to Kafka, or sent to another pipeline), while the pipeline continues to process the other data that has no problems?
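
To make the behavior I am after concrete, here is a minimal sketch in plain Python. It is not StreamSets code and does not use any StreamSets API; `process_batch`, `parse_amount`, and the record layout are hypothetical names chosen only for illustration. The idea is that a failure on one record diverts that record, together with the error message, to a separate error list, and the remaining records in the batch are still processed.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]


def process_batch(
    records: Iterable[Record],
    transform: Callable[[Record], Record],
) -> Tuple[List[Record], List[Dict[str, object]]]:
    """Apply transform to each record; divert failures instead of aborting."""
    good: List[Record] = []
    errors: List[Dict[str, object]] = []
    for record in records:
        try:
            good.append(transform(record))
        except Exception as exc:
            # Keep the failing record and the reason, then move on
            # to the next record rather than retrying the whole batch.
            errors.append({"record": record, "error": str(exc)})
    return good, errors


def parse_amount(record: Record) -> Record:
    # Hypothetical processor step: raises ValueError on non-numeric input.
    return {**record, "amount": str(int(record["amount"]) * 2)}


if __name__ == "__main__":
    batch = [
        {"id": "1", "amount": "10"},
        {"id": "2", "amount": "oops"},  # this record should go to the error list
        {"id": "3", "amount": "7"},
    ]
    good, errors = process_batch(batch, parse_amount)
    print("processed:", good)
    print("errors:", errors)
```

In this sketch, records 1 and 3 end up in the processed list and record 2 ends up in the error list. That is the outcome I would like from the pipeline, with the error list replaced by whatever error destination is configured.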
