
HDFS Standalone origin failing with invalid char between encapsulated token and pipe

asked 2018-07-03 11:47:38 -0500

anonymous user


updated 2018-07-03 14:34:23 -0500

metadaddy

The HDFS Standalone origin is failing with the error "invalid char between encapsulated token and pipe". Here are more details:

Source: pipe-delimited text file


  • Delimiter Character |
  • Escape Character \
  • Quote Character "

Test data:

1|Test data line 1
2|"Test" data line 2.

The file arrives exactly as shown above. How can I process it?



What else do you have in the pipeline -- processor(s), destination? And when do you get/see the error?

iamontheinet (2018-07-03 12:08:06 -0500)

1 Answer


answered 2018-07-03 14:33:58 -0500

metadaddy

That is not valid delimited data:

1|Test data line 1
2|"Test" data line 2

If you quote a field, the quotes have to enclose the entire field, like this:

1|Test data line 1
2|"Test data line 2"
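
To see the difference between the two inputs outside of Data Collector, here is a minimal sketch using Python's standard csv module in strict mode. This is only an analogy: Python's parser is not the one Data Collector uses, its error message is different, and the helper name parse_strict is made up for illustration.

```python
import csv

def parse_strict(lines):
    """Parse pipe-delimited lines, rejecting malformed quoting (hypothetical helper)."""
    reader = csv.reader(lines, delimiter="|", quotechar='"', strict=True)
    return list(reader)

# The quote closes before the field ends, so a strict parser rejects the line.
bad = ['1|Test data line 1', '2|"Test" data line 2']
# The quotes enclose the entire field, so the line parses.
good = ['1|Test data line 1', '2|"Test data line 2"']

try:
    parse_strict(bad)
except csv.Error as err:
    print("rejected:", err)

print(parse_strict(good))
```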

If you want to read in your data regardless, you can set the quote character to a value that doesn't appear in the input data, for example \u0000. The data will then be read in, and the quote characters will appear in the field value in Data Collector, i.e. "Test" data line 2.
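
The effect of taking the quote character out of play can be sketched with Python's csv module. This is an analogy, not Data Collector's own parser: it assumes that disabling quoting with csv.QUOTE_NONE behaves like choosing a quote character that never occurs in the data, and parse_without_quoting is a hypothetical helper name.

```python
import csv

def parse_without_quoting(lines):
    """Split on the pipe only; quote characters are treated as ordinary data."""
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    return list(reader)

rows = parse_without_quoting(['1|Test data line 1', '2|"Test" data line 2.'])
for row in rows:
    print(row)
```

The second row keeps its quotes in the field value, which matches the behaviour described above.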

You should only need quotes if the data includes the delimiter character or a newline, for example:

1|"Test data | line 1"
2|"Test data
line 2"
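
As a rough illustration of why quoting is needed in these two cases, the following sketch parses that sample with Python's csv module (again an assumption-laden stand-in for Data Collector's parser). The quoted pipe and the quoted newline both end up inside a single field instead of splitting it.

```python
import csv
import io

# Same sample as above: one field contains the delimiter, one contains a newline.
sample = '1|"Test data | line 1"\n2|"Test data\nline 2"\n'

reader = csv.reader(io.StringIO(sample), delimiter="|", quotechar='"')
records = list(reader)

for record in records:
    print(record)
```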


Thank you! This worked!!!!!

Robot (2018-07-08 20:37:47 -0500)