Scheduling the JDBC consumer job

asked 2017-11-24 09:32:08 -0500

Roh

I need to schedule the JDBC consumer job to run every morning at 5 am. As far as I know, I can make the job run at 5 am by starting it at 5 am and setting the query interval to 24 hours.

But I need the first run to start at 5 am without starting it manually (I'm too lazy to wake up at 5 am :P). Is there a way to achieve this?

answered 2018-11-01 10:34:20 -0500

iamontheinet

You can't start multiple pipelines using a single CLI command. Use curl instead, like so:

curl -u admin:admin -X POST http://localhost:18630/rest/v1/pipelines/start -d '["PIPELINE_ID_1","PIPELINE_ID_2"]' -H "X-Requested-By:sdc" -H "Content-Type: application/json"

Cheers, Dash

Thanks a ton. This works like a charm on my machine, but when I try to run it from Airflow using the BashOperator, it gives a syntax error: bash_command= curl -u .......

brawal ( 2018-11-02 04:37:10 -0500 )
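
One way to keep the quoting manageable when another tool (cron, Oozie, Airflow) invokes the command is to put the curl call in a small script and pass only the pipeline IDs as arguments. The following is a minimal sketch, not a tested fix for the Airflow error above. The function names and the SDC_URL, SDC_USER and SDC_PASSWORD variables are hypothetical; the defaults are taken from the curl command in this answer.

```shell
#!/bin/sh
# Sketch: start several pipelines with one REST call.
# SDC_URL, SDC_USER and SDC_PASSWORD are assumed variable names, not
# settings defined by StreamSets; the defaults mirror the curl example above.

# Print a JSON array such as ["id1","id2"] built from the arguments.
# Assumes the pipeline IDs contain no double quotes or backslashes.
json_id_list() {
    out=""
    for id in "$@"; do
        out="${out:+$out,}\"$id\""
    done
    printf '[%s]\n' "$out"
}

# POST the ID list to the multi-pipeline start endpoint.
start_pipelines() {
    curl -u "${SDC_USER:-admin}:${SDC_PASSWORD:-admin}" -X POST \
        "${SDC_URL:-http://localhost:18630}/rest/v1/pipelines/start" \
        -H "X-Requested-By:sdc" -H "Content-Type: application/json" \
        -d "$(json_id_list "$@")"
}
```

After sourcing the file, start_pipelines PIPELINE_ID_1 PIPELINE_ID_2 sends the same request as the one-line curl command, so the calling tool only has to pass plain arguments.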

answered 2017-12-14 13:15:20 -0500

bob

updated 2017-12-15 17:32:50 -0500

metadaddy

Pipelines can be scheduled via cron or another utility, using the Data Collector's CLI to start and stop them.

The documentation for Data Collector CLI is here:

As a reminder, the columns in a crontab entry are:

  • Minute: 0-59
  • Hour: 0-23
  • Day of month: 1-31
  • Month of year: 1-12
  • Day of week: 0-6 (0 is Sunday)
  • The command to execute

To run a pipeline on weekdays, at 1:00 am, your crontab entry might look like this: 

00 01 * * 1-5 bin/streamsets cli -U http://localhost:18630 manager start -n MyPipelinejf45e1f1-dfc1-402c-8587-918bc6e831db

This starts the pipeline at 1:00 am, Monday through Friday. Replace 1-5 with * to include weekends.

Depending on your environment, you will likely need to adjust the path above.  Perhaps writing a wrapper script that correctly sets the shell's environment and can start an arbitrary pipeline will make it more manageable.
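
A wrapper along those lines might look like the sketch below. It is only an illustration: the SDC_HOME and SDC_URL variable names, the /opt/streamsets-datacollector default path, and the DRY_RUN switch are assumptions, not anything defined by StreamSets. The CLI invocation itself is the one from the crontab entry above.

```shell
#!/bin/sh
# Hypothetical wrapper: start one Data Collector pipeline from cron.
# SDC_HOME and SDC_URL are assumed names; adjust the defaults to your install.
SDC_HOME="${SDC_HOME:-/opt/streamsets-datacollector}"
SDC_URL="${SDC_URL:-http://localhost:18630}"

start_pipeline() {
    # $1 is the pipeline ID, i.e. the value passed to -n.
    if [ -z "$1" ]; then
        echo "usage: start_pipeline PIPELINE_ID" >&2
        return 2
    fi
    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$SDC_HOME/bin/streamsets cli -U $SDC_URL manager start -n $1"
        return 0
    fi
    "$SDC_HOME/bin/streamsets" cli -U "$SDC_URL" manager start -n "$1"
}

# Run only when a pipeline ID is passed on the command line.
if [ "$#" -gt 0 ]; then
    start_pipeline "$1"
fi
```

Saved as, say, /path/to/start_pipeline.sh (a placeholder path) and made executable, it could cover the 5 am daily case from the question with a crontab entry such as:

    0 5 * * * /path/to/start_pipeline.sh PIPELINE_ID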

Thanks @bob, I was able to do it with the curl command, and I scheduled it through Oozie. It's working like a charm. Please add your points to the answer if anything more would be helpful.

Roh ( 2017-12-19 10:21:17 -0500 )

answered 2017-12-19 10:18:12 -0500

Roh

updated 2017-12-19 11:17:40 -0500

metadaddy

We can start the pipeline using curl commands. As always, we can schedule the curl job from cron, Oozie, or any other workflow tool.

Below is a simple script that issues the curl command to start the StreamSets pipeline. Hope it helps someone!

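#!/bin/sh
# Shell script to start a pipeline using curl
# The variable values were not included in the post, so this version
# assumes each one is set in the environment and stops if it is missing.

# Hostname of StreamSets
SSHost="${SSHost:?set SSHost}"
# StreamSets UI port
SSPort="${SSPort:?set SSPort}"
# StreamSets pipeline ID
SSPipeID="${SSPipeID:?set SSPipeID}"
# StreamSets login user
SSUser="${SSUser:?set SSUser}"
# StreamSets password
SSPW="${SSPW:?set SSPW}"

echo "SS Host            : ${SSHost}"
echo "SS Port            : ${SSPort}"
echo "SS Pipe ID         : ${SSPipeID}"
echo "SS User            : ${SSUser}"

# Use curl to start the StreamSets pipeline
curl -H "X-Requested-By:sdc" -X POST \
    "http://${SSHost}:${SSPort}/rest/v1/pipeline/${SSPipeID}/start" -u "${SSUser}:${SSPW}"

Would be great if you could paste in the text rather than a screenshot. It would greatly help anyone else wanting to do this :-)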

metadaddy ( 2017-12-19 10:55:10 -0500 )

@metadaddy Done :)

Roh ( 2017-12-19 11:04:46 -0500 )

@Roh Beautiful. BTW - check your email :-)

metadaddy ( 2017-12-19 11:16:28 -0500 )

answered 2017-11-27 12:08:51 -0500

metadaddy

There is no built-in scheduler in SDC, but you could use cron and the StreamSets CLI to start the pipeline.

Yes, that’s what I found out in my research, thanks!

Roh ( 2017-11-30 05:33:19 -0500 )

@metadaddy I've answered my own question. Please add your points if anything more would be helpful.

Roh ( 2017-12-19 10:19:39 -0500 )

answered 2018-11-01 07:53:09 -0500

brawal

How can we trigger multiple pipeline IDs using the same API?

