How to use Apache POI in custom processors to manipulate Excel files from directory origin?

asked 2018-08-31 03:33:56 -0600

updated 2018-08-31 09:30:39 -0600

Hi, I want to read Excel files from a directory, manipulate them using Apache POI, and convert them into records. How can I achieve this?

Thanks, Ruchita

1 Answer

answered 2018-08-31 09:30:26 -0600

Starting with StreamSets Data Collector 3.4.0, the Directory origin can read Excel files. The origin parses the data into records, which can then be manipulated in the pipeline and written out by a destination in any supported data format, such as Avro, Delimited, or JSON. Note that, at present, Excel is not supported as an output data format.

I tried that option with a single sheet, but it threw a NumberFormatException because the data were of different types, such as Currency and Percentage. I also don't see any option to select a particular sheet from the Excel file.

Ruchita ( 2018-09-03 01:30:09 -0600 )
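
One possible workaround for the NumberFormatException described above, assuming the currency and percentage cells can be obtained as display text (for example "$1,234.50" or "45%"), is to normalize that text before converting it to a number. The sketch below uses only the Java standard library. The class name CellText and the method toNumber are hypothetical names chosen for illustration and are not part of StreamSets Data Collector or Apache POI. It also assumes US-style formatting (comma as grouping separator, dot as decimal point), and it does not claim to reproduce what the Directory origin does internally.

```java
import java.math.BigDecimal;

// Hypothetical helper (not an SDC or POI class): turns the display text
// of a formatted spreadsheet cell into a BigDecimal.
final class CellText {

    // Returns the numeric value of the text, or null if no number can be read.
    // Assumes US-style formatting: ',' groups digits and '.' is the decimal point.
    static BigDecimal toNumber(String text) {
        if (text == null) {
            return null;
        }
        String s = text.trim();
        if (s.isEmpty()) {
            return null;
        }
        boolean percent = s.contains("%");
        // Accounting style: "(12.50)" means -12.50
        boolean negative = s.startsWith("(") && s.endsWith(")");
        // Keep digits, the decimal point, and a minus sign; drop currency
        // symbols, grouping commas, percent signs, and parentheses.
        String cleaned = s.replaceAll("[^0-9.\\-]", "");
        if (cleaned.isEmpty()) {
            return null;
        }
        try {
            BigDecimal value = new BigDecimal(cleaned);
            if (negative) {
                value = value.negate();
            }
            if (percent) {
                value = value.movePointLeft(2); // "45%" becomes 0.45
            }
            return value;
        } catch (NumberFormatException e) {
            // Not a number after cleaning (for example "1.2.3"): report as null
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toNumber("$1,234.50")); // prints 1234.50
        System.out.println(toNumber("45%"));       // prints 0.45
        System.out.println(toNumber("n/a"));       // prints null
    }
}
```

The cleaning step is deliberately lenient, so text that merely contains digits (such as "Item 5") is also read as a number; a stricter check may be preferable for columns that mix labels and values.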

Can we use fileRef to read the Excel file in a custom processor and use POI in that processor?

Ruchita ( 2018-09-03 04:52:11 -0600 )