delete records

asked 2019-06-27 07:11:38 -0500

Vss@2019

updated 2019-06-28 07:44:36 -0500

How can I delete n lines from the top (here 2) and from the bottom (here 1) of the files Sample.txt and Sample1.txt using Jython?

Sample.txt

1,a
2,b
3,c
4,d
5,e
6,f

Sample1.txt

9,g
10,h
13,i
12,j
2,k

Expected output: Sample.txt should contain

3,c
4,d
5,e

Sample1.txt should contain

13,i
12,j

1 Answer

answered 2019-06-27 11:27:18 -0500

jeff

Your Jython evaluator can operate record by record or batch by batch. It sounds like you will want batch mode. In that mode, the script can access the records by index from a list that is made available to it (see the code snippet that is pasted in automatically when you create the stage). You simply do not copy certain records to the output stream, based on their index in that list, which is functionally the same as deleting them. This has some caveats, though. It assumes that your batches consistently have the same structure and size (e.g., the same number of lines and the same rules about which ones should be deleted). You will need to ensure your origin consistently produces the complete set of lines for a file in a single batch.
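As a rough illustration of the index-based approach, here is a minimal sketch. The helper `trim_batch` and its parameters are hypothetical names, not part of Data Collector. The names `records` and `output.write` in the trailing comment are assumed to be the bindings supplied by the stage's pasted-in template, so check them against the snippet your version generates. The sketch only gives the right result if one batch holds exactly one complete file.

```python
def trim_batch(records, skip_top=2, skip_bottom=1):
    """Return the records that remain after dropping skip_top records
    from the start and skip_bottom records from the end of the batch."""
    if skip_top < 0 or skip_bottom < 0:
        raise ValueError("skip counts must be non-negative")
    records = list(records)  # copy so that slicing behaves the same for any sequence type
    end = len(records) - skip_bottom
    # If the batch is no longer than the number of lines to drop, nothing survives.
    if end <= skip_top:
        return []
    return records[skip_top:end]


# Assumed usage inside the evaluator in batch mode (names taken from the
# stage template, not verified here):
#
# for record in trim_batch(records):
#     output.write(record)
```

With the sample data from the question, the six lines of Sample.txt are reduced to `3,c`, `4,d`, `5,e`, and the five lines of Sample1.txt to `13,i`, `12,j`.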


Comments

It is more or less a complete batch, but with different filenames. How can we achieve this? It should not cause performance problems when the files are megabytes or gigabytes in size.

Vss@2019 (2019-06-27 11:49:38 -0500)

Can you share code showing how to achieve it?

Vss@2019 (2019-06-27 12:25:10 -0500)

The big problem here is recognizing the last line in the file. I can't think of a way to reliably do this in Data Collector.

metadaddy (2019-07-02 22:57:09 -0500)

About the only way I see that working is to read the full file as a single record at the origin (using either the whole file format or the text format with a null-character line delimiter), then parse it separately, so that you have the full contents in a single batch. That is risky, though, because of memory use.

jeff (2019-07-03 11:16:18 -0500)
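For the memory concern, one possible workaround is to trim the file in plain Python (or Jython) outside the pipeline, streaming it line by line. The sketch below is an assumption-laden illustration, not a Data Collector feature: `trim_lines` is a hypothetical helper. It skips the first lines, then holds back a small buffer so that the last lines are never emitted, which means only `skip_bottom` lines are kept in memory regardless of file size. It does not solve the problem of recognizing the end of a file inside a pipeline.

```python
from collections import deque
from itertools import islice


def trim_lines(lines, skip_top=2, skip_bottom=1):
    """Yield lines from an iterable, dropping the first skip_top and the
    last skip_bottom, while buffering at most skip_bottom lines."""
    remaining = islice(iter(lines), skip_top, None)  # discard the leading lines
    if skip_bottom <= 0:
        for line in remaining:
            yield line
        return
    held = deque()
    for line in remaining:
        held.append(line)
        # Emit a line only once skip_bottom newer lines have been seen after it.
        if len(held) > skip_bottom:
            yield held.popleft()
    # Whatever is still held at the end is the tail that should be dropped.
```

Because a file object is itself an iterable of lines, something like `for line in trim_lines(open(path)): ...` would process a large file without loading it whole, where `path` stands for whichever file you are trimming.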

Thanks, Meta and Jeff, for your input.

Vss@2019 (2019-07-05 01:32:25 -0500)
