Delivery guarantee, each one every time

asked 2018-04-24 06:27:11 -0500

WendyLG

I have only just started reading the overview documentation and have spotted what seems to be a huge hole. If this is covered later, please just say so and leave me to get there.

I see there are two options:

At least once: if there is an error, the whole batch is reworked, so some duplication may happen.

At most once: on an error, the whole batch is dropped and the next batch is started, so some records may never get processed.

I imagine the most common requirement is for each record to be processed exactly once. So does "at most once" write out the records that never got processed so they can be reworked? And does "at least once" mark the records that are being processed twice?

I can see how in a pipeline it might be easier to check for duplicates, but how are you meant to notice entire records just going missing?
