How to fetch more than one type of files from a directory in StreamSets?

asked 2019-05-27 01:28:05 -0500

Jeyakumar

I have a folder that contains several types of files, but I want to fetch only the .txt and .pdf files. How can I achieve this in a pipeline?

I set the "File Name Pattern" parameter to .txt,.pdf and the "Data Format" to Whole File. Validation succeeded, but when I ran the pipeline it did not produce any output.

Please help me fix this issue. Thanks in advance.


1 Answer

answered 2019-05-27 23:54:21 -0500

Maithri

updated 2019-06-02 23:55:52 -0500

Set "File Name Pattern Mode" to Regular Expression and enter the pattern to match in "File Name Pattern".

The pattern looks like this:

    ^.*\.(pdf|txt)$
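
To see which names the pattern accepts, here is a minimal sketch in Python rather than in StreamSets itself. The file names are hypothetical, and the comparison assumes that in Glob mode the value .txt,.pdf from the question is treated as one literal pattern, which would explain why the pipeline produced no output.

```python
import fnmatch
import re

# Hypothetical file names, for illustration only.
names = ["report.pdf", "notes.txt", "data.csv", "archive.txt.bak", "pdf"]

# The regular expression from the answer: any name ending in .pdf or .txt.
pattern = re.compile(r"^.*\.(pdf|txt)$")
regex_matches = [n for n in names if pattern.match(n)]

# The value from the question, treated as a single glob (assumption).
# It contains no wildcard, so it can only match a file literally
# named ".txt,.pdf".
glob_matches = [n for n in names if fnmatch.fnmatchcase(n, ".txt,.pdf")]

print(regex_matches)  # ['report.pdf', 'notes.txt']
print(glob_matches)   # []
```

Note that the pattern is case-sensitive as written, so a name such as REPORT.PDF would not match.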

Comments

It worked, Maithri. Thank you.

Jeyakumar (2019-05-28 01:03:43 -0500)

So you can use a regular expression pattern as mentioned above to pull multiple file types. Note, however, that unless you are using the Whole File data format, mixing file types is not going to be helpful. If you select Delimited or JSON as the Data Format and then try to read a PDF, you will get parse exceptions.

ak47 (2019-05-28 01:31:09 -0500)
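
The parse failure described in the comment above can be illustrated outside StreamSets with a rough Python analogy. This is only a sketch: the string stands in for the first bytes of a hypothetical PDF, and Python's json module stands in for a JSON data format parser. It does not reproduce the actual StreamSets error.

```python
import json

# Hypothetical start of a PDF file, decoded as text (illustration only).
pdf_head = "%PDF-1.4\n1 0 obj\n"

try:
    json.loads(pdf_head)
    parsed = True
except json.JSONDecodeError as err:
    # PDF content is not valid JSON, so the parser rejects it.
    parsed = False
    print("parse error:", err.msg)
```

With the Whole File data format the file content is not parsed, which is why mixing .txt and .pdf works in that case.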