How to store status in an ETL pipeline?
I can't find a good example of an ETL pipeline.
There are two databases: one holding statuses, and a second data storage database holding the data. The ETL steps are:
1. Get entries from the statuses table with status (state/flag) 'UNPROCESSED' (JDBC Query Consumer).
2. Execute some queries against the data storage database to reprocess the data (JDBC Query).
3. Mark the entry as 'FINISHED'.
There is nothing like a 'JDBC Query processor'. How should I update the entry status as soon as a query has finished processing data in the data storage database?
Sincerely, Sergey.
What exactly are you doing in step 2 -- JDBC Query?
Have you looked at the JDBC Lookup processor? https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Processors/JDBCLookup.html#concept_ysc_ccy_hw
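To make the three steps from the question concrete, here is a minimal sketch of the status-tracking pattern in plain Python. It is not StreamSets configuration and does not use any StreamSets API: sqlite3 in-memory databases stand in for the two JDBC databases, and the table names (statuses, data), column names (id, status, entry_id, value), and the functions run_pipeline and reprocess_entry are all hypothetical names chosen for illustration. The point it tries to show is the ordering: the status is updated only after the data-side work has been committed.

```python
import sqlite3


def run_pipeline(status_db, data_db, reprocess):
    """Process every UNPROCESSED entry and mark it FINISHED on success."""
    # Step 1: read the pending entries from the status database.
    pending = status_db.execute(
        "SELECT id FROM statuses WHERE status = 'UNPROCESSED' ORDER BY id"
    ).fetchall()

    finished = []
    for (entry_id,) in pending:
        try:
            # Step 2: run the reprocessing queries against the data database.
            reprocess(data_db, entry_id)
            data_db.commit()
        except sqlite3.Error:
            # Leave the entry UNPROCESSED so a later run can retry it.
            data_db.rollback()
            continue

        # Step 3: flip the status only after the data work is committed.
        status_db.execute(
            "UPDATE statuses SET status = 'FINISHED' WHERE id = ?", (entry_id,)
        )
        status_db.commit()
        finished.append(entry_id)
    return finished


def reprocess_entry(data_db, entry_id):
    # Hypothetical reprocessing query: double the stored value.
    data_db.execute(
        "UPDATE data SET value = value * 2 WHERE entry_id = ?", (entry_id,)
    )


# Hypothetical status database (assumed schema).
status_db = sqlite3.connect(":memory:")
status_db.execute("CREATE TABLE statuses (id INTEGER PRIMARY KEY, status TEXT)")
status_db.executemany(
    "INSERT INTO statuses VALUES (?, ?)",
    [(1, "UNPROCESSED"), (2, "FINISHED"), (3, "UNPROCESSED")],
)
status_db.commit()

# Hypothetical data storage database (assumed schema).
data_db = sqlite3.connect(":memory:")
data_db.execute("CREATE TABLE data (entry_id INTEGER, value INTEGER)")
data_db.executemany("INSERT INTO data VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
data_db.commit()

finished = run_pipeline(status_db, data_db, reprocess_entry)
print(finished)
```

Because the two databases are separate, the two commits are not atomic. If the process dies between step 2 and step 3, the entry stays 'UNPROCESSED' and is reprocessed on the next run, so under this sketch the reprocessing queries should be safe to repeat.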