How to fetch more than one type of file from a directory in StreamSets?

asked 2019-05-27

Jeyakumar

I have a folder that contains different types of files, but I would like to fetch only the .txt and .pdf files. How can I achieve this in a pipeline?

I tried setting the "File name pattern" parameter to .txt,.pdf and the "data format" to Whole File. Validation was successful, but when I ran the pipeline it didn't produce any output.

Please point me toward a fix for this issue. Thanks in advance.

1 Answer

answered 2019-05-27

Maithri

updated 2019-06-02

Set the file name pattern mode to Regular Expression, then enter a regular expression that matches the file names you want in the file name pattern field.

The pattern looks like:
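
The exact pattern from this answer is not preserved in the thread. As a rough sketch only, assuming a regular expression along the lines of `.*\.(txt|pdf)` (a hypothetical pattern, not necessarily the one originally posted), the Python snippet below shows which file names such an expression would accept. Data Collector is a Java application, so its regex engine may differ from Python's in edge cases, but a pattern this simple should behave the same way. The helper `select_files` and the sample file names are made up for illustration.

```python
import re
from fnmatch import fnmatch

# Hypothetical pattern (an assumption, not taken from the answer):
# any file name that ends in .txt or .pdf.
PATTERN = re.compile(r".*\.(txt|pdf)")


def select_files(names):
    """Return the names that match the whole pattern, in input order."""
    return [name for name in names if PATTERN.fullmatch(name)]


# Made-up sample directory listing.
names = ["report.txt", "manual.pdf", "data.csv", "notes.txt.bak"]

print(select_files(names))

# For comparison: the value tried in the question, ".txt,.pdf", read as one
# glob-style pattern with no wildcards, matches none of the sample names.
print([name for name in names if fnmatch(name, ".txt,.pdf")])
```

The second check illustrates one plausible reason the original attempt produced no output: a comma-separated value is likely treated as a single pattern rather than as a list of extensions.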

It worked, Maithri, thank you.

Jeyakumar ( 2019-05-28 )

So you can use Glob patterns as mentioned above to pull multiple file formats. Note, however, that unless you are using the Whole File data format, mixing file types is not going to be helpful. If you select a Data Format such as Delimited or JSON and then try to read a PDF, you will get parse exceptions.

ak47 ( 2019-05-28 )
