Deduplicate all duplicate records in a stream

asked 2019-04-05 15:53:29 -0500

tlochner95

updated 2019-04-08 10:40:09 -0500


I'm trying to figure out how I can implement something similar to this loop. Apparently I do not have enough points to upload an image, so I will try to describe my stream instead:

UPDATE (thanks for upvoting this so I can upload the picture!): [image]

For testing purposes, the origin in my stream is a Dev Raw Data Source, which feeds a couple of transformation stages (processors); the records then arrive at a Record Deduplicator stage. The "unique" records continue to a Salesforce destination, which works fine. The duplicate records, however, should continue to a second Record Deduplicator, which would send its unique records to another Salesforce destination (a copy of the first Salesforce destination stage). The part that apparently is not allowed (according to this post) is that I would like the second Record Deduplicator to send its duplicates back to the first Record Deduplicator, forming a loop that deduplicates all duplicate records.

I would like to "deduplicate" all duplicate records before sending them to my Salesforce destination.

So, essentially, say I have three duplicate records (rec1, rec2, and rec3). I would like to loop through the records and deduplicate all of them. The first iteration of the loop would separate "rec1" from "rec2" and "rec3", and "rec1" would continue to my Salesforce destination. The second iteration would separate "rec2" from "rec3" and send "rec2" to my Salesforce destination. Finally, "rec3" would continue to the Salesforce destination by itself (without any other duplicates accompanying it).
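To make the intended behavior concrete, here is a minimal sketch in plain Python. It is not StreamSets stage code and does not use any Data Collector API; the function name `split_into_passes`, the record layout, and the `email` field used to detect duplicates are all assumptions made up for illustration. It shows that the loop is equivalent to numbering each record by how many earlier records share its key, so the n-th duplicate lands in the n-th pass.

```python
from collections import defaultdict


def split_into_passes(records, key):
    """Group records into passes so that no pass contains two records
    with the same key. The n-th record seen for a key goes to pass n.
    (Hypothetical helper, not part of any StreamSets API.)"""
    seen = defaultdict(int)  # key value -> how many records seen so far
    passes = []              # passes[n] holds the records for iteration n
    for record in records:
        k = key(record)
        n = seen[k]
        seen[k] += 1
        if n == len(passes):
            passes.append([])  # first time any key reaches this depth
        passes[n].append(record)
    return passes


# Hypothetical sample data: rec1..rec3 are duplicates of each other by email.
records = [
    {"id": "rec1", "email": "a@example.com"},
    {"id": "rec2", "email": "a@example.com"},
    {"id": "rec3", "email": "a@example.com"},
    {"id": "rec4", "email": "b@example.com"},
]

passes = split_into_passes(records, key=lambda r: r["email"])
for i, batch in enumerate(passes, start=1):
    print(i, [r["id"] for r in batch])
```

In this sketch, rec1 goes out in the first pass, rec2 in the second, and rec3 in the third, which matches the three loop iterations described above without needing a cycle in the stream.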

If anyone has any ideas/insight on how I could achieve this, that would be greatly appreciated. Thank you!




Upvoted your question, so you should be able to edit it and add a pic!

metadaddy (2019-04-05 18:11:52 -0500)

Thank you for that! It has been updated now!

tlochner95 (2019-04-08 10:41:01 -0500)

Does anyone have any solutions for this?

tlochner95 (2019-05-14 13:03:23 -0500)