JDBC Query Consumer - Pipeline Finisher Not Working

asked 2020-05-17 06:36:14 -0600

Souvik

updated 2020-05-17 06:58:06 -0600

We are using the JDBC Query Consumer origin in incremental mode to pull data from an Oracle source. We have configured the pipeline to run as a batch job (a Pipeline Finisher with the precondition ${record:eventType() == 'no-more-data'}) and scheduled it with Control-M. Our query interval is ${10 * SECONDS}. This worked as expected in the lower environments (DEV/TEST).

In production, however, the pipeline consumes data continuously and never stops. After analyzing the production source data, we noticed that records are being updated every second or even more often.

We need your help to fix this issue.

Comments

Hi Souvik, I would be more interested in knowing your functional requirement. If the data is indeed updating every 1 second or less, what is your criterion for extracting data?

uzumaki ( 2020-05-17 22:29:24 -0600 )

Our requirement is to create a batch job rather than a streaming job, and to stop the job after one iteration. The version we are using is StreamSets Control Hub 3.14.0. The source SQL query is: SELECT * FROM TAB_NAME WHERE OFFSET_DATE > TO_DATE(SUBSTR('${OFFSET}',1,19),'YYYY-MM-DD HH24:MI:SS')

Souvik ( 2020-05-18 01:30:11 -0600 )

Any update, @uzumaki & @iamontheinet?

Souvik ( 2020-05-19 10:29:11 -0600 )

It's unclear to me why you would have a requirement to stop the pipeline, but I don't think there is any way to achieve this in the current system. Offsets are committed after every successful batch completion, so your query is pretty much exactly what keeps running. Nothing is "lost".

jeff ( 2020-05-19 17:53:22 -0600 )

I agree with @jeff here. StreamSets is working as it should. If the source data never stops updating, the pipeline never knows when to stop processing.

uzumaki ( 2020-05-19 23:22:42 -0600 )
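
The behaviour described in this thread can be illustrated with a small simulation. This is only a simplified model, not StreamSets code: it assumes the stop signal (the 'no-more-data' case) arises only when an incremental query returns no new rows, as the comments above suggest. The function name, its parameters, and the integer-second timestamps are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimulationResult:
    queries_run: int   # how many times the incremental query was executed
    rows_read: int     # total rows returned across all queries
    finished: bool     # True if some query returned no rows (the stop signal)


def simulate_incremental_reads(
    update_interval_s: int,
    query_interval_s: int,
    updates_stop_at_s: Optional[int],
    max_queries: int,
) -> SimulationResult:
    """Toy model of polling `WHERE offset_col > last_offset` on a timer.

    The source receives one new row every `update_interval_s` seconds,
    stamped with the second it arrived, until `updates_stop_at_s`
    (None means the source never goes quiet). The query runs every
    `query_interval_s` seconds and remembers the newest timestamp seen.
    """
    last_offset = 0
    rows_read = 0
    for n in range(1, max_queries + 1):
        now = n * query_interval_s
        # Rows can only exist up to the moment the source stopped updating.
        horizon = now if updates_stop_at_s is None else min(now, updates_stop_at_s)
        # Timestamp of the newest row that exists at query time.
        newest_row = (horizon // update_interval_s) * update_interval_s
        # Rows with a timestamp strictly greater than the stored offset.
        new_rows = (newest_row - last_offset) // update_interval_s
        if new_rows == 0:
            # Empty result: in this model, the only point where the job can stop.
            return SimulationResult(n, rows_read, True)
        rows_read += new_rows
        last_offset = newest_row
    # Gave up after max_queries without ever seeing an empty result.
    return SimulationResult(max_queries, rows_read, False)


if __name__ == "__main__":
    # DEV/TEST-like source: updates stop after 25 seconds.
    print(simulate_incremental_reads(1, 10, updates_stop_at_s=25, max_queries=100))
    # Production-like source: a new row every second, indefinitely.
    print(simulate_incremental_reads(1, 10, updates_stop_at_s=None, max_queries=100))
```

Under these assumptions, the quiet source eventually produces an empty query and the run ends, while the source that updates every second always has rows newer than the stored offset when the 10-second query fires, so the empty result never occurs.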