Batch Processing

asked 2019-04-11 02:35:49 -0500

ankitbeohar90

Hi Guys,

How can I fetch data from a database (MySQL, Oracle, etc.) in batches? My use case is to fetch new data every 5 minutes. I also want the processing to run in parallel/distributed: read the data in one go, then process it in parallel. For example, read 1000 records every 5 minutes, split them into chunks of 200, process each chunk (cleaning, applying business rules, etc.) in parallel, and then load the results into another database.

I want this because the whole pipeline has to finish within 5 minutes, so that the next batch starts without any discrepancy. It should also keep running seamlessly when the load is higher, such as on weekends or during the festival season. Reading the data is fine; the processing is the part I need help with.
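To make the idea concrete, here is a rough single-machine sketch of the split-and-process step I have in mind. It is only an illustration under assumptions: the records are an in-memory list standing in for the rows already read from the source database, and `split_into_chunks`, `process_chunk`, and `run_batch` are hypothetical names of my own, not part of any framework. The 5-minute scheduling and the load into the target database are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List

Record = Dict[str, object]


def split_into_chunks(records: List[Record], chunk_size: int) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most chunk_size records."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


def process_chunk(chunk: List[Record]) -> List[Record]:
    """Placeholder for cleaning and business rules (hypothetical logic).

    Here it only trims and upper-cases the "name" field.
    """
    return [{**rec, "name": str(rec["name"]).strip().upper()} for rec in chunk]


def run_batch(records: List[Record], chunk_size: int = 200, workers: int = 5) -> List[Record]:
    """Split one batch into chunks and process the chunks concurrently."""
    chunks = list(split_into_chunks(records, chunk_size))
    # Threads keep the sketch simple; for CPU-heavy rules a process pool
    # or a distributed engine would be the more likely choice.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        processed = pool.map(process_chunk, chunks)
        return [rec for chunk in processed for rec in chunk]


if __name__ == "__main__":
    # Stand-in for the 1000 records fetched in one 5-minute window.
    batch = [{"id": i, "name": f" user{i} "} for i in range(1000)]
    result = run_batch(batch)
    print(len(result), "records processed")
```

With 1000 records and a chunk size of 200 this gives 5 chunks, one per worker. What I am unsure about is how to do the same thing reliably across several machines and within the 5-minute window.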

Can anyone help me with how to do this?

