Reuse the Pipeline for different Parameters

asked 2018-01-09 13:55:18 -0500

jay1988

updated 2018-01-10 17:22:28 -0500

metadaddy

I have a single pipeline with a JDBC Query Consumer origin and an HDFS target location. Our requirements call for reusing the same pipeline to load different tables, writing each one to a different target location based on the table name.

The JDBC Multitable Consumer can't be used here, because each load/table needs its own load ID created and written to the load tracking table. Is it possible to reuse an existing pipeline through parameterization in StreamSets?

1 Answer

answered 2018-01-10 17:18:38 -0500

metadaddy

updated 2018-01-10 19:27:36 -0500

Yes - you should use Runtime Parameters for this. Define one or more parameters in the Parameters tab of the pipeline configuration, and refer to them in the JDBC stage configuration as ${PARAM_NAME}.

Now you can start the pipeline with parameter values via the UI, CLI or REST API.
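For the REST API route, here is a minimal sketch of building the start request. It assumes the usual Data Collector endpoint `POST /rest/v1/pipeline/<pipelineId>/start` with the runtime parameters as a JSON body and an `X-Requested-By` header; the base URL, pipeline ID and parameter names (`TABLE_NAME`, `TARGET_DIR`, `LOAD_ID`) are made up for illustration, and authentication is left out, so check all of this against your SDC version.

```python
import json
import urllib.request


def build_start_request(base_url, pipeline_id, params):
    """Build (but do not send) a request that starts a pipeline with runtime parameters.

    Assumption: SDC accepts the parameters as a flat JSON object in the body
    of POST /rest/v1/pipeline/<pipelineId>/start.
    """
    url = f"{base_url.rstrip('/')}/rest/v1/pipeline/{pipeline_id}/start"
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # SDC rejects state-changing REST calls that lack this header.
            "X-Requested-By": "sdc",
        },
    )


# Hypothetical values: replace with your own host, pipeline ID and parameters.
request = build_start_request(
    "http://localhost:18630",
    "MyPipelineId",
    {"TABLE_NAME": "customers", "TARGET_DIR": "/data/customers", "LOAD_ID": "42"},
)

# Sending is left commented out because it needs a live SDC and credentials:
# urllib.request.urlopen(request)
```

The keys in the JSON body have to match the parameter names defined in the Parameters tab.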

Note that if you wanted to run the same pipeline for different sets of parameters, you would have to run the pipeline on multiple SDC instances, or go through a start/stop pipeline cycle for each set of parameters. You cannot run multiple instances of the same pipeline with different parameters on a single instance of SDC (SDC-6448 tracks this).
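The start/stop cycle on a single SDC instance could be driven by a small loop like the sketch below. It is only an outline under some assumptions: `start` and `get_status` are hypothetical callables that you supply (for example thin wrappers around the REST API or CLI), the pipeline stops by itself and reports `FINISHED` once a table is loaded, and the failure status names are what I would expect SDC to report rather than a verified list.

```python
import time

# Assumed status names; verify them against what your SDC instance reports.
FAILED_STATES = {"RUN_ERROR", "START_ERROR", "STOPPED"}


def run_sequentially(param_sets, start, get_status, poll_seconds=5.0, sleep=time.sleep):
    """Run one pipeline once per parameter set, strictly one after another.

    start(params) is expected to start the pipeline with those runtime
    parameters and return only once the new run has replaced the previous
    run's status; get_status() returns the current pipeline status string.
    Both are hypothetical helpers provided by the caller.
    """
    completed = []
    for params in param_sets:
        start(params)
        while True:
            status = get_status()
            if status == "FINISHED":
                break
            if status in FAILED_STATES:
                raise RuntimeError(f"pipeline ended in {status} for {params}")
            sleep(poll_seconds)  # still starting or running, so wait and poll again
        completed.append(params)
    return completed
```

A scheduler job could then call `run_sequentially` with one dictionary per table, each carrying its own table name, target location and load ID.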



Thanks for the answer. If I'm using a scheduler like Control-M, can I run the same pipeline multiple times and update the runtime parameters each time? Assume I have 10 tables to load, each with a different load ID. The same pipeline will be called by Control-M, but the runtime parameters have to change for each table.

jay1988 (2018-01-10 18:23:35 -0500)

Updated my answer (above).

metadaddy (2018-01-10 19:27:55 -0500)

Hi, could you provide an example? There is no CDC origin for Teradata (TD), so I can only write a select query. I want to select records from TD in a CDC-like way and pipeline them to MemSQL, but I can't work out how to do it. The pipeline I created did not succeed. I want to get the max date from MemSQL and use it as the offset for the TD query.

@Arbind (2019-11-21 03:59:56 -0500)