
How to Capture a column value from 1st record & last record?

asked 2020-07-22 12:16:16 -0500

Carol

I have a pipeline whose origin is SFTP/FTP/FTPS Client. The pipeline works fine reading in one or more files and processing them. Here's the question: I need to capture one column of information (from the file) on the very FIRST record in the pipeline run, as well as on the very LAST record in the pipeline run. There could be thousands or millions of records in these files. The volume varies each time.

I know that I can check the event records, and on the 'finished-file' event record I have a few pieces of information available to me, such as Record Count and FileName. I am using those pieces of information, but I need more.

How do I capture a value from the first record of the file and from the last record of the file? Any ideas or suggestions?


1 Answer


answered 2020-07-23 03:40:58 -0500

amnopro

You can use a JavaScript Evaluator processor and write the first and last records as the output of the processor. In the script section of the JavaScript Evaluator, place the following code: sdc.output.write(records[0]); sdc.output.write(records[records.length - 1]); StreamSets gives proper examples of how to use the JavaScript Evaluator as comments in the default script itself, so you can add the processor and start working with it straight away. You can also use the Groovy Evaluator or the Spark Evaluator in a similar fashion.
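One thing to watch for: assuming the evaluator script is invoked once per batch rather than once per pipeline run, records[0] and the last array element are the first and last records of that batch, not of the whole file. A rough sketch of how the value could be carried across batches is below. It is plain JavaScript that runs outside Data Collector; makeTracker, the mock record shape, and the 'id' field name are hypothetical stand-ins for illustration, not StreamSets API calls, and inside a real pipeline the evaluator's own state object would play the role of the local state variable.

```javascript
// Sketch only: makeTracker and the mock records are hypothetical,
// not part of the StreamSets API.
function makeTracker(fieldName) {
  // Stands in for whatever cross-batch state the evaluator provides.
  const state = { first: undefined, last: undefined, seen: 0 };

  return {
    // Call once per batch, mirroring a script that runs once per batch.
    processBatch(records) {
      if (records.length === 0) return; // empty batch: nothing to capture
      if (state.seen === 0) {
        // Only the very first non-empty batch supplies the "first" value.
        state.first = records[0].value[fieldName];
      }
      // Every batch overwrites "last", so the final batch wins.
      state.last = records[records.length - 1].value[fieldName];
      state.seen += records.length;
    },
    result() {
      return { first: state.first, last: state.last, count: state.seen };
    },
  };
}

// Simulate one file arriving as three batches of mock records.
const tracker = makeTracker('id');
const batches = [
  [{ value: { id: 'A1' } }, { value: { id: 'A2' } }],
  [{ value: { id: 'A3' } }],
  [{ value: { id: 'A4' } }, { value: { id: 'A5' } }],
];
batches.forEach((batch) => tracker.processBatch(batch));
const summary = tracker.result();
console.log(summary);
```

If the pipeline reads several files in one run, the tracked state would presumably need to be reset or keyed by file name, depending on whether "first" and "last" mean per file or per run.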

Hope this helps! :-)



Thanks for the suggestion. I've not used 'Javascript evaluator' before but I will check it out.

Carol (2020-07-23 07:17:38 -0500)
