Scheduling the JDBC consumer job

asked 2017-11-24 09:32:08 -0500

Roh

I need to schedule the JDBC consumer job to run every morning at 5 am. As far as I know, I can make the job run at 5 am by starting it at 5 am and setting the query interval to 24 hours.

But I need the first run to start at 5 am without starting it manually (I'm too lazy to wake up at 5 am :P). Is there a way to achieve this?


3 Answers

answered 2017-12-19 10:18:12 -0500

Roh

updated 2017-12-19 11:17:40 -0500

metadaddy

We can start the pipeline with a curl command. As always, the curl job can be scheduled from cron, Oozie, or any other workflow tool.

Below is a simple script that issues the curl command to start the StreamSets pipeline. Hope it helps someone!

#!/bin/bash
# Shell script to start a StreamSets pipeline using curl
# Arguments: host, port, pipeline ID, user, password

# Hostname of StreamSets
SSHost=${1}

# StreamSets UI port
SSPort=${2}

# StreamSets pipeline ID
SSPipeID=${3}

# StreamSets login user
SSUser=${4}

# StreamSets password
SSPW=${5}

echo "SS Host            : ${SSHost}"
echo "SS Port            : ${SSPort}"
echo "SS Pipe ID         : ${SSPipeID}"
echo "SS User            : ${SSUser}"

# Use curl to start the StreamSets pipeline
curl -H "X-Requested-By:sdc" -X POST \
    -u "${SSUser}:${SSPW}" \
    "http://${SSHost}:${SSPort}/rest/v1/pipeline/${SSPipeID}/start"
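
To run a start script like the one above every morning at 5 am, as the question asks, it can be registered in cron. The sketch below only builds the crontab line so it can be reviewed before installing it. The function name, the script path, and the log file location are all hypothetical placeholders, not part of StreamSets.

```shell
#!/bin/sh
# Sketch: build a crontab line that runs a start script daily at 05:00.
# LOG_FILE is an assumed location; change it to suit your system.
LOG_FILE="${LOG_FILE:-/tmp/start_pipeline.log}"

# daily_5am_entry <script-path> [script arguments...]
# Prints one crontab line; it does not install anything.
daily_5am_entry() {
    entry_script="$1"   # absolute path to the start script (hypothetical)
    shift
    # Crontab fields: minute hour day-of-month month day-of-week command
    printf '0 5 * * * %s %s >> %s 2>&1\n' "$entry_script" "$*" "$LOG_FILE"
}
```

Calling daily_5am_entry with the script path followed by the host, port, pipeline ID, user, and password prints a line that can be pasted into the editor opened by crontab -e. Use absolute paths, since cron runs with a minimal environment.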

Comments

Would be great if you could paste in the text rather than a screenshot - would greatly help anyone else wanting to do this :-)

metadaddy ( 2017-12-19 10:55:10 -0500 )

@metadaddy Done :)

Roh ( 2017-12-19 11:04:46 -0500 )

@Roh Beautiful. BTW - check your email :-)

metadaddy ( 2017-12-19 11:16:28 -0500 )

answered 2017-12-14 13:15:20 -0500

bob

updated 2017-12-15 17:32:50 -0500

metadaddy

Pipelines can be scheduled via cron or another utility that uses the Data Collector CLI to start and stop them.

The documentation for Data Collector CLI is here:  https://streamsets.com/documentation/...

As a reminder, the columns in a crontab entry are:

  • Minute: 0-59
  • Hour: 0-23
  • Day of month: 1-31
  • Month of year: 1-12
  • Day of week: 0-6 (0 is Sunday)
  • The command to execute

To run a pipeline on weekdays, at 1:00 am, your crontab entry might look like this: 

00 01 * * 1-5 bin/streamsets cli -U http://localhost:18630 manager start -n MyPipelinejf45e1f1-dfc1-402c-8587-918bc6e831db

This entry starts the pipeline at 1:00 am, Monday through Friday. Replace 1-5 with * to include weekends.

Depending on your environment, you will likely need to adjust the path above. A wrapper script that sets the shell's environment correctly and can start an arbitrary pipeline may make this more manageable.
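
A minimal sketch of such a wrapper follows. It reuses the CLI invocation from the crontab entry above; the SDC_HOME default path, the DRY_RUN switch, and the function names are assumptions made for illustration, so adjust them to your installation.

```shell
#!/bin/sh
# Wrapper sketch: set the environment that cron lacks, then start a pipeline.
# The defaults below are assumed values, not StreamSets-provided settings.
SDC_HOME="${SDC_HOME:-/opt/streamsets-datacollector}"
SDC_URL="${SDC_URL:-http://localhost:18630}"

# Print the CLI command that would start the given pipeline.
build_start_command() {
    printf '%s/bin/streamsets cli -U %s manager start -n %s\n' \
        "$SDC_HOME" "$SDC_URL" "$1"
}

# start_pipeline <pipeline-name>
# Starts the pipeline, or only prints the command when DRY_RUN=1.
start_pipeline() {
    if [ -z "$1" ]; then
        echo "usage: start_pipeline <pipeline-name>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        build_start_command "$1"
        return 0
    fi
    "$SDC_HOME/bin/streamsets" cli -U "$SDC_URL" manager start -n "$1"
}
```

Setting DRY_RUN=1 before calling start_pipeline prints the command instead of running it, which is a convenient way to check the resolved path before putting the wrapper into a crontab entry.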


Comments

Thanks @bob. I was able to do it with the curl command, scheduled through Oozie, and it's working like a charm. Please add your points to the answer if anything more would be helpful.

Roh ( 2017-12-19 10:21:17 -0500 )

answered 2017-11-27 12:08:51 -0500

metadaddy

There is no built-in scheduler in SDC, but you could use cron and the StreamSets CLI to start the pipeline.


Comments

Yes, that’s what I found out in my research, thanks !

Roh ( 2017-11-30 05:33:19 -0500 )

@metadaddy I've answered my own question. Please add your points if anything more would be helpful.

Roh ( 2017-12-19 10:19:39 -0500 )

Stats

Asked: 2017-11-24 09:32:08 -0500

Seen: 1,025 times

Last updated: Dec 19 '17