How to Capture a column value from 1st record & last record?

asked 2020-07-22

Carol

I have a pipeline whose origin is SFTP/FTP/FTPS Client. The pipeline works fine reading in one or more files and processing them. Here's the question: I need to capture one column of information (from the file) on the very FIRST record in the pipeline run, as well as on the very LAST record in the pipeline run. There could be thousands or millions of records in these files. The volume varies each time.

I know that I can check the event records, and on the 'finished-file' event record I have a few pieces of information available to me, such as Record Count and FileName. I am using those pieces of information, but I need more.

How do I capture a value from the first record of the file and from the last record of the file? Any ideas or suggestions?

1 Answer

answered 2020-07-23

amnopro

You can use a "JavaScript Evaluator" processor and write just the first and last records as the output of the processor. In the script section of the JavaScript Evaluator, place the following code: sdc.output.write(records[0]); sdc.output.write(records[records.length - 1]); StreamSets gives examples of how to use the JavaScript Evaluator as comments in the default script itself, so you can add the processor and start working with it straight away. You can also use the Groovy Evaluator or Spark Evaluator in a similar fashion.
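One thing to watch: the evaluator script runs once per batch, so records[0] and the last element are the first and last records of each batch, not of the whole run. Here is a rough sketch of one way to carry the values across batches, assuming the evaluator's `sdc.state` object persists between batches and that your version uses the `sdc.*` bindings (older versions expose plain `records`, `state`, and `output`). The function name `trackFirstAndLast`, the state keys, the column name `id`, and the mock object are all made up for illustration; in the real processor you would call the function with the actual `sdc` binding and drop the mock.

```javascript
// Sketch only: trackFirstAndLast, the state keys, and the 'id' column are hypothetical.
function trackFirstAndLast(sdc, column) {
  var records = sdc.records;
  if (records.length === 0) {
    return; // empty batch: nothing to capture
  }
  // Remember the first value only once, on the first non-empty batch.
  if (!sdc.state['seenFirst']) {
    sdc.state['firstValue'] = records[0].value[column];
    sdc.state['seenFirst'] = true;
  }
  // Overwrite on every batch; after the final batch this holds the last record's value.
  sdc.state['lastValue'] = records[records.length - 1].value[column];

  // Pass every record through unchanged.
  for (var i = 0; i < records.length; i++) {
    sdc.output.write(records[i]);
  }
}

// Stand-in for the evaluator's bindings so the sketch runs outside Data Collector.
var mockSdc = {
  records: [],
  state: {},
  output: { written: [], write: function (r) { this.written.push(r); } }
};

// Simulate two batches of one file.
mockSdc.records = [{ value: { id: 'A1' } }, { value: { id: 'A2' } }];
trackFirstAndLast(mockSdc, 'id');
mockSdc.records = [{ value: { id: 'A3' } }];
trackFirstAndLast(mockSdc, 'id');
// mockSdc.state.firstValue is now 'A1' and mockSdc.state.lastValue is 'A3'.
```

Unlike the two-line version, this passes all records downstream and only remembers the two values. If you need first/last per file rather than per run, you would also have to reset the state when the file changes, which this sketch does not do.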

Hope this helps! :-)

Thanks for the suggestion. I've not used 'Javascript evaluator' before but I will check it out.

Carol ( 2020-07-23 )