How to merge CSV files from a given folder?

asked 2018-02-14 22:42:33 -0500

I am very new to StreamSets. I need to merge data from two CSV files, but I am not sure whether there is a processor for merging files. I am also trying to start with a Directory origin, but I am not sure how to read two files from a directory. Any help is appreciated.

What do you mean by 'merge'? Combine on a common field, or simply append one to the other?

metadaddy ( 2018-02-16 19:36:25 -0500 )

I mean combine on a common field, like a database join between two tables.

Ashok ( 2018-02-16 21:31:30 -0500 )

This isn't possible in Data Collector - you would have to use something like Spark for this.

metadaddy ( 2018-06-27 18:08:46 -0500 )
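For readers who want to see what the requested join looks like outside Data Collector, here is a minimal sketch in plain Python using only the standard library. It is not a StreamSets feature and not the Spark approach suggested above; the function name `join_csv`, the column names, and the sample data are all hypothetical and chosen only for illustration.

```python
import csv
import io


def join_csv(left, right, key):
    """Inner-join two CSV streams on a shared column, like a database join."""
    # Index the right-hand rows by the join key (a key may repeat).
    right_rows = {}
    for row in csv.DictReader(right):
        right_rows.setdefault(row[key], []).append(row)

    # Emit one merged row per matching pair; unmatched rows are dropped.
    joined = []
    for row in csv.DictReader(left):
        for match in right_rows.get(row[key], []):
            # On a column name clash, the left-hand value wins.
            joined.append({**match, **row})
    return joined


# Hypothetical sample data standing in for the two CSV files.
customers = "id,name\n1,Ann\n2,Bob\n"
orders = "id,total\n1,9.50\n3,4.00\n"

result = join_csv(io.StringIO(customers), io.StringIO(orders), "id")
print(result)
```

With real files, pass objects opened with `open(path, newline="")` instead of the `io.StringIO` wrappers. The sketch holds one whole file in memory, so for large inputs a tool such as Spark, as suggested above, is the better fit.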

1 Answer

answered 2018-02-20 03:19:40 -0500

As for reading two files from a directory: yes, it is possible to read two or more files from a directory. The documentation says:

Use a file name pattern to define the files to process. You can use either a glob pattern or a regular expression to define the file name pattern.

The origin processes files based on the file name pattern mode, file name pattern, and specified directory. For example, if you specify a /logs/weblog/ directory, glob mode, and *.json as the file name pattern, the Directory origin processes all files with the "json" extension in the /logs/weblog/ directory.
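To get a feel for how a glob file name pattern selects files, the sketch below uses Python's standard `fnmatch` module. This is only an approximation for illustration: the Directory origin's own glob matching may differ in details from Python's, and the helper name `matching_files` and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_files(names, pattern):
    """Return the file names that match a glob pattern, sorted by name."""
    # fnmatchcase compares case-sensitively on every platform.
    return sorted(n for n in names if fnmatchcase(n, pattern))


# Hypothetical directory listing.
names = ["access.json", "errors.json", "notes.txt", "data.csv"]

json_files = matching_files(names, "*.json")
csv_files = matching_files(names, "*.csv")
print(json_files)
print(csv_files)
```

For the original question, a pattern such as `*.csv` would pick up both CSV files in the directory, provided they share that extension.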


