
How to use Apache POI in custom processors to manipulate Excel files from directory origin?

asked 2018-08-31 03:33:56 -0500

Ruchita

updated 2018-08-31 09:30:39 -0500

metadaddy

Hi, I want to read Excel files from a directory, manipulate them using Apache POI, and convert them into records. How can I achieve this?

Thanks, Ruchita


1 Answer


answered 2018-08-31 09:30:26 -0500

metadaddy

Starting with StreamSets Data Collector 3.4.0, the Directory origin can read Excel files. The origin parses the data into records, which can then be manipulated in the pipeline and written out by a destination in any supported data format (Avro, Delimited, JSON, etc.). Note that, at present, Excel is not supported as an output data format.


Comments

I tried that option with only one sheet, but it threw a NumberFormatException because the data were of different types, such as Currency and Percentage. I also don't see any option to select a particular sheet from the Excel file.

Ruchita (2018-09-03 01:30:09 -0500)
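
If the NumberFormatException described above comes from cell text that still carries currency or percent formatting (an assumption; the thread does not show the stack trace), one possible workaround is to read such fields as strings and normalize them before converting to numbers. The sketch below uses only the Java standard library. `CellTextParser` is a hypothetical helper, not part of StreamSets Data Collector or Apache POI, and it assumes `.` as the decimal separator and `,` as the grouping separator.

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration; not a StreamSets or Apache POI class.
final class CellTextParser {
    private CellTextParser() {}

    /**
     * Converts displayed cell text such as "$1,234.50" or "12.5%" to a BigDecimal.
     * Percentages are returned as fractions (12.5% becomes 0.125).
     * A leading '-' or surrounding parentheses mark a negative value.
     */
    static BigDecimal parse(String text) {
        String s = text.trim();
        boolean percent = s.endsWith("%");
        boolean negative = s.startsWith("-") || (s.startsWith("(") && s.endsWith(")"));

        // Drop currency symbols, grouping commas, signs and spaces;
        // keep only digits and the decimal point.
        String digits = s.replaceAll("[^0-9.]", "");
        if (digits.isEmpty()) {
            throw new NumberFormatException("No digits in: " + text);
        }

        BigDecimal value = new BigDecimal(digits);
        if (percent) {
            value = value.movePointLeft(2); // 12.5 -> 0.125
        }
        return negative ? value.negate() : value;
    }
}
```

For example, `CellTextParser.parse("$1,234.50")` yields a value equal to 1234.50 and `CellTextParser.parse("12.5%")` yields 0.125. Inside a custom processor, a helper like this could be applied to string fields before setting them as decimals on the record.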

Can we use fileRef to read the Excel file in a custom processor and then use POI in that processor?

Ruchita (2018-09-03 04:52:11 -0500)