Unique Record ID or Index from Processor Output

I have searched the StreamSets documentation for an answer to this, but have had no luck.

I have an example processor that outputs 100 records with similar data (in my case, the records have no unique data or fields to filter by), but I only need the first 50 records. Would it be possible to write a conditional statement that routes the first 50 records to one output stream and sends the rest to another (trash) stream? Or is there another processor that can perform this action?
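To make the intent concrete, here is a rough sketch of the logic I am after, written in plain Python rather than against any StreamSets API. The function name `split_first_n` and the sample records are made up for illustration; the only assumption is that records arrive in some order and can be counted as they pass through.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_first_n(records: Iterable[T], n: int) -> Tuple[List[T], List[T]]:
    """Route the first n records to `keep` and everything after to `discard`.

    Hypothetical helper: it relies only on arrival order, not on any
    field inside the record.
    """
    keep: List[T] = []
    discard: List[T] = []
    for index, record in enumerate(records):
        # The running index is the "row number" the records lack.
        if index < n:
            keep.append(record)
        else:
            discard.append(record)
    return keep, discard


# 100 records with identical content, as in my case.
records = [{"value": "same"} for _ in range(100)]
keep, discard = split_first_n(records, 50)
print(len(keep), len(discard))
```

In other words, I need something equivalent to the running `index` above that a pipeline condition could test.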

Another workaround that would suit me is having my processors read records in batches of, say, 50 records at a time.

To summarize: when a processor (or origin) outputs records, does each record have a unique ID or row number associated with it that I can use to add more logic to my processing?