Streamsets - Oracle/Scheduler Questions

asked 2019-09-15 18:01:44 -0500

Naveen Y

I have a use case that I'm trying to implement with StreamSets. I need to know whether it is possible and, if so, how it can be done.

  1. My source is an Oracle partitioned table. I need to read the data from the latest partition and publish it as a CSV file on Google Cloud Storage. The partition names follow a naming standard, and I have a query that returns the latest partition name. I need to take that name and use dynamic SQL to query the latest partition, then write the result to Google Cloud Storage. I explored the JDBC Query Consumer, but it doesn't allow dynamic SQL. Does StreamSets allow passing a table or partition name at run time? How can I pass the partition name from the first query to the subsequent query that reads the data?

  2. Does the StreamSets scheduler allow event-based triggering, for example kicking off a job when a trigger file is dropped?

  3. How can I run multiple pipelines in a specific sequence? How can a dependency be established between pipelines?
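
To make the first item concrete, here is a rough sketch in plain Python of the two-step logic I have in mind. It is not StreamSets configuration. The table name `SALES`, the partition name `P_20190915`, and both helper functions are hypothetical and only used for illustration. Because a partition name is an identifier and cannot be passed as a bind variable, the sketch validates it before placing it into the SQL text.

```python
import csv
import io
import re

# Oracle unquoted identifiers start with a letter, followed by letters,
# digits, _, $ or #. Anything else is rejected before it reaches the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def build_partition_query(table: str, partition: str) -> str:
    """Build a query that reads a single named partition (hypothetical helper)."""
    for name in (table, partition):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    # Oracle's partition-extended table name syntax.
    return f"SELECT * FROM {table} PARTITION ({partition})"


def rows_to_csv(header, rows) -> str:
    """Serialize fetched rows to CSV text, ready to upload as one object."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # In the real flow the partition name would come from the first query.
    print(build_partition_query("SALES", "P_20190915"))
    print(rows_to_csv(["id", "name"], [[1, "a"], [2, "b,c"]]), end="")
```

What I'm looking for is the StreamSets equivalent of feeding the result of the first query into `build_partition_query` at run time.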
