Reuse the Pipeline for different Parameters

asked 2018-01-09

jay1988

updated 2018-01-10

metadaddy

I have a single pipeline with a JDBC Query Consumer and an HDFS target location. We need to reuse the same pipeline to load different tables into different target locations, chosen by table name.

The JDBC Multitable Consumer can't be used here, because each load/table needs its own load ID created and written to the load tracking table. Is it possible to reuse an existing pipeline through parameterization in StreamSets?

answered 2018-01-10

metadaddy

updated 2018-01-10

Yes - you should use Runtime Parameters for this. Define one or more parameters in the Parameters tab of pipeline configuration, and refer to them in the JDBC stage configuration with ${PARAM_NAME}.

Now you can start the pipeline with parameter values via the UI, CLI or REST API.
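For the REST route, here is a minimal Python sketch of what the start call might look like. It assumes the endpoint is `/rest/v1/pipeline/{pipelineId}/start`, that the JSON body is a flat map of parameter names to values, and that SDC expects an `X-Requested-By` header; check all three against the REST API documentation for your SDC version. The parameter names `TABLE_NAME` and `LOAD_ID`, the pipeline ID, and the function name are hypothetical. The function only builds the request; authentication and sending are left to the caller.

```python
import json
import urllib.request
from urllib.parse import quote


def build_start_request(base_url, pipeline_id, params):
    """Build (but do not send) a POST request that starts a pipeline
    with the given runtime parameters.

    Assumed, not confirmed by this thread: the endpoint path, the
    X-Requested-By header, and the flat JSON body of parameter values.
    """
    url = (
        f"{base_url.rstrip('/')}/rest/v1/pipeline/"
        f"{quote(pipeline_id, safe='')}/start"
    )
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Requested-By": "sdc",
        },
    )


# Hypothetical pipeline ID and parameter names, for illustration only.
request = build_start_request(
    "http://localhost:18630",
    "LoadTable01",
    {"TABLE_NAME": "customers", "LOAD_ID": "42"},
)
```

To actually start the pipeline, you would add credentials for your SDC instance and pass `request` to `urllib.request.urlopen`.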

Note that if you wanted to run the same pipeline for different sets of parameters, you would have to run the pipeline on multiple SDC instances, or go through a start/stop pipeline cycle for each set of parameters. You cannot run multiple instances of the same pipeline with different parameters on a single instance of SDC (SDC-6448 tracks this).
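On a single SDC instance, the start/stop cycle described above amounts to running the parameter sets one after another. A rough Python sketch of that loop, as a scheduler job might drive it, follows. The two callables are placeholders you would implement yourself (for example with the REST API or CLI); their names and the parameter names `TABLE_NAME` and `LOAD_ID` are hypothetical.

```python
def run_sequentially(param_sets, start, wait_until_finished):
    """Run one pipeline once per parameter set, never overlapping runs.

    start(params)          -- placeholder: starts the pipeline with these
                              runtime parameters
    wait_until_finished()  -- placeholder: blocks until the pipeline has
                              finished or been stopped
    Returns the number of completed runs.
    """
    completed = 0
    for params in param_sets:
        start(params)
        # The next run must not begin until this one is done, because a
        # single SDC instance runs only one instance of a given pipeline.
        wait_until_finished()
        completed += 1
    return completed


# Illustration with stand-in callables that only record what happened.
events = []
runs = run_sequentially(
    [
        {"TABLE_NAME": "customers", "LOAD_ID": "1"},
        {"TABLE_NAME": "orders", "LOAD_ID": "2"},
    ],
    start=lambda p: events.append(("start", p["TABLE_NAME"])),
    wait_until_finished=lambda: events.append(("done",)),
)
```

Passing the start and wait steps in as functions keeps the loop independent of how you talk to SDC, so the same structure works whether the scheduler calls the CLI or the REST API.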

Thanks for the answer. If I'm using a scheduler like Control-M, can I call the same pipeline multiple times and update the runtime parameters each time? Assume I have 10 tables to load with different load IDs. The same pipeline will be called from Control-M, but the runtime parameters have to change for each table.

jay1988 ( 2018-01-10 )

Updated my answer (above).

metadaddy ( 2018-01-10 )

Hi, could you provide an example? There is no CDC origin for Teradata (TD); I can only write a SELECT query. I want to select records from TD based on CDC and send them through a pipeline to MemSQL, but I can't work out how to do it. The pipeline I created did not succeed. I want to get the max date from MemSQL and use it as the offset for the TD query.

@Arbind ( 2019-11-21 )
