Is there a way to trigger the start of a batch cluster pipeline after another finishes?

asked 2018-01-18

lampshadesdrifter

updated 2018-01-18

metadaddy

As the title says, I would like to trigger a pipeline after another one finishes. I have an initial data ingestion pipeline that uses batch cluster mode to move data from one MapR-FS location to another (it gets started periodically by a script that uses the StreamSets CLI). I then want a second pipeline to move the data from the first pipeline's destination to another location. I'm trying to avoid the YARN streaming mode if possible, since I'm assuming that batch processing requires fewer resources over time and is faster. Is there a way to do this?

answered 2018-01-19

rupal

You can also use Notifications. When the pipeline goes to a FINISHED state, you can trigger a REST API call for the next pipeline to start.
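To make the REST call concrete, here is a minimal Python sketch of the request such a notification would need to issue. It assumes the Data Collector REST endpoint `/rest/v1/pipeline/{pipelineId}/start` and the `X-Requested-By` header; please verify both against the REST API documentation of your Data Collector version. The base URL, the pipeline ID, and the helper names are hypothetical, and authentication is left out.

```python
import urllib.parse
import urllib.request

# Assumed values for illustration only; replace with your own.
SDC_URL = "http://localhost:18630"      # hypothetical Data Collector base URL
NEXT_PIPELINE_ID = "secondPipelineId"   # hypothetical ID of the pipeline to start


def should_start_next(state):
    """Return True only when the first pipeline reports the FINISHED state."""
    return state == "FINISHED"


def build_start_request(base_url, pipeline_id):
    """Build (but do not send) a POST request that asks SDC to start a pipeline.

    The endpoint path and the X-Requested-By header are assumptions about the
    Data Collector REST API; credentials are deliberately omitted.
    """
    safe_id = urllib.parse.quote(pipeline_id, safe="")
    url = f"{base_url.rstrip('/')}/rest/v1/pipeline/{safe_id}/start"
    return urllib.request.Request(
        url,
        data=b"",                          # empty body, forces a POST
        method="POST",
        headers={"X-Requested-By": "sdc"},
    )


def start_pipeline(base_url, pipeline_id, timeout=30):
    """Send the start request and return the HTTP status code."""
    request = build_start_request(base_url, pipeline_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

If the notification is set up as a webhook, the same URL, method, and header would go into the webhook configuration instead. From an external script, you would call `start_pipeline(SDC_URL, NEXT_PIPELINE_ID)` once `should_start_next(...)` returns true for the first pipeline's state.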

Do you have any plans to extend the functionality so that different states can be sent to different notifications? For example: send an email notification to the dev-ops team on error, and send a notification to management on successful completion?

Budati ( 2018-07-05 )

answered 2018-01-18

metadaddy

updated 2018-01-19

Unfortunately, pipeline events are not supported for cluster pipelines.

However, if this were a standalone pipeline, you would be able to use the Shell Executor to run the CLI for your second pipeline, setting it to trigger on the Pipeline Stop event.

I may be using the feature incorrectly, but for a pipeline in batch cluster mode like mine, setting a stop event to anything other than Discard (e.g. Write to Another Pipeline) raises a validation error: "Pipeline lifecycle events are not supported in mode: BATCH CLUSTER".

lampshadesdrifter ( 2018-01-19 )

You're right - I didn't realize that until just now. I corrected my answer.

metadaddy ( 2018-01-19 )
