How should ops upgrade a data pipeline running in production?

asked 2018-06-06

casel.chen

updated 2018-06-06

I want to know the best practice for having ops upgrade a data pipeline that is running in production. The developer can export the updated pipeline configuration as a JSON file, hand it to ops, and ask them to deploy it to production.

Ops then needs to:

  1. stop and remove the old pipeline
  2. import the new pipeline and change its parameters to suit the production environment
  3. check and run the new pipeline

Am I right?

My questions are:

  • How does the new pipeline resume processing from the offset where the old pipeline left off?
  • Should the new pipeline keep the same pipeline UUID as the old one?
  • What if the SDC platform itself needs to be upgraded?

Answer

answered 2018-06-06

metadaddy

updated 2018-06-06

The short answer to all of this is to use StreamSets Control Hub. Control Hub supports team development with

This is tricky to do manually in Data Collector. Importing a pipeline will assign it a new pipeline ID, so it will no longer be associated with the old offset. It would probably be better to stop the old pipeline, drop the new pipeline.json file into the data/pipelines/<pipelineId> directory, edit the pipeline ID to match the old one, then check and start the new pipeline. To be honest, you're on your own here.
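For what it's worth, the manual route described above could be scripted roughly as follows. This is only a sketch under assumptions, not a supported procedure: it assumes the pipeline ID is stored under JSON keys named `pipelineId`, that the file you were handed has the same layout as the `pipeline.json` already on disk (compare the two before trusting it), and the function names `set_pipeline_id` and `install_pipeline` are made up for illustration. Run it only after the old pipeline has been stopped.

```python
import json
import shutil
from pathlib import Path


def set_pipeline_id(node, pipeline_id):
    """Recursively overwrite every "pipelineId" key (assumed key name)."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "pipelineId":
                node[key] = pipeline_id
            else:
                set_pipeline_id(value, pipeline_id)
    elif isinstance(node, list):
        for item in node:
            set_pipeline_id(item, pipeline_id)


def install_pipeline(new_json, data_dir, old_pipeline_id):
    """Copy the new definition over the old one, keeping the old pipeline ID."""
    target_dir = Path(data_dir) / "pipelines" / old_pipeline_id
    if not target_dir.is_dir():
        raise FileNotFoundError(f"no existing pipeline directory: {target_dir}")

    target = target_dir / "pipeline.json"
    if target.exists():
        # Keep a backup so the old definition can be restored.
        shutil.copy2(target, target_dir / "pipeline.json.bak")

    config = json.loads(Path(new_json).read_text(encoding="utf-8"))
    set_pipeline_id(config, old_pipeline_id)
    target.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return target
```

The script only writes `pipeline.json` and a backup copy, so any other files in the old pipeline's directory stay where they are. You still need to check and start the pipeline yourself afterwards.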

As far as upgrading Data Collector goes, you can open older pipelines in newer versions of Data Collector, but not vice versa. You have to be careful that your pipeline developers aren't using a newer version of Data Collector than ops.
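If you want to guard against that in a handoff script, a check along these lines might help. It is a minimal sketch: it assumes you already know both version strings (how you obtain them is up to you) and that they are plain dotted numbers such as 3.3.0. The helper names `parse_version` and `can_open` are hypothetical, not part of Data Collector.

```python
def parse_version(version, width=4):
    """Turn a dotted numeric version like "3.3.0" into a padded tuple of ints."""
    parts = [int(part) for part in version.split(".")]
    # Pad with zeros so "3.3" and "3.3.0" compare as equal.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def can_open(pipeline_version, collector_version):
    """True if the pipeline was built on the same or an older version."""
    return parse_version(pipeline_version) <= parse_version(collector_version)


if __name__ == "__main__":
    # Developer built on 3.3.0, ops runs 3.2.0: the pipeline is too new.
    print(can_open("3.3.0", "3.2.0"))
    # Developer built on 3.2.0, ops runs 3.3.0: fine.
    print(can_open("3.2.0", "3.3.0"))
```

Comparing integer tuples rather than raw strings matters once a version component reaches two digits, since "3.10.0" sorts before "3.9.0" as text.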
