How can I pull data from only specified subdirectories in a directory structure?

asked 2019-05-13 01:24:12 -0500

Jeyakumar

updated 2019-05-13 09:21:49 -0500

metadaddy

Hello,

I am new to StreamSets, and one of my requirements is to pull data from only specified subdirectories. How can this be achieved in StreamSets?

Origin directory (example):
    SampleData - Parent directory
        Data1 - Sub directory
            Test1.txt
            Test2.txt
        Data2 - Sub directory
            Test3.txt
        Data3 - Sub directory
        Data4 - Sub directory
        Data5 - Sub directory

Given the structure above, we need to create the same structure in HDFS, but only the subdirectories Data1 and Data2 should be pulled. The output should look like this:

Destination directory:
    SampleData - Parent directory
        Data1 - Sub directory
            Test1.txt
            Test2.txt
        Data2 - Sub directory
            Test3.txt
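
To make the requested behaviour concrete, here is a minimal Python sketch of the selection logic. It is not StreamSets configuration: it uses the local filesystem as a stand-in for HDFS, and the function name `copy_selected_subdirs` and its parameters are hypothetical names chosen for illustration.

```python
import shutil
from pathlib import Path


def copy_selected_subdirs(source_root, dest_root, wanted):
    """Copy only the named subdirectories of source_root into dest_root.

    The relative layout is preserved. Returns the copied file paths,
    relative to dest_root, in sorted order.
    """
    source_root = Path(source_root)
    dest_root = Path(dest_root)
    copied = []
    for name in wanted:
        src_dir = source_root / name
        if not src_dir.is_dir():
            continue  # skip requested names that do not exist in the source
        for src_file in src_dir.rglob("*"):
            if not src_file.is_file():
                continue
            rel = src_file.relative_to(source_root)
            target = dest_root / rel
            # Recreate the subdirectory structure under the destination.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)
            copied.append(rel.as_posix())
    return sorted(copied)
```

With the example above, calling `copy_selected_subdirs("SampleData", "out/SampleData", ["Data1", "Data2"])` would copy Test1.txt, Test2.txt and Test3.txt and leave Data3, Data4 and Data5 behind (the `out/SampleData` destination path is an assumed example).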

Thanks in advance,
Jeyakumar
