HDFS Standalone origin failing with invalid char between encapsulated token and pipe

asked 2018-07-03 11:47:38 -0600

anonymous user

updated 2018-07-03 14:34:23 -0600

metadaddy

The HDFS Standalone origin is failing with the error "invalid char between encapsulated token and pipe". Here are more details:

Source: a pipe-delimited text file

Configuration:

  • Delimiter Character |
  • Escape Character \
  • Quote Character "

Test data -

1|Test data line 1
2|"Test" data line 2.

The file arrives exactly as shown above. How can I process it?

Comments

What else do you have in the pipeline -- processor(s), destination? And when do you get/see the error?

iamontheinet (2018-07-03 12:08:06 -0600)

1 Answer

answered 2018-07-03 14:33:58 -0600

metadaddy

That is not valid delimited data:

1|Test data line 1
2|"Test" data line 2

If you quote a field, the quotes have to enclose the entire field, like this:

1|Test data line 1
2|"Test data line 2"
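
To check a file for this problem before it reaches the pipeline, the quoting rule above can be sketched with Python's standard `csv` module in strict mode. This is only an illustration: Python's parser is not the one Data Collector uses, and the helper names `parse_strict` and `is_valid` are hypothetical.

```python
import csv


def parse_strict(line, delimiter="|", quotechar='"'):
    """Parse one delimited line; strict mode rejects text after a closing quote."""
    reader = csv.reader([line], delimiter=delimiter, quotechar=quotechar, strict=True)
    return next(reader)


def is_valid(line):
    """Return True if the line parses under the strict quoting rule."""
    try:
        parse_strict(line)
        return True
    except csv.Error:
        return False


if __name__ == "__main__":
    for sample in ['1|Test data line 1', '2|"Test" data line 2', '2|"Test data line 2"']:
        print(sample, "->", "valid" if is_valid(sample) else "invalid")
```

In this sketch the second sample is rejected because characters follow the closing quote before the next delimiter, which is the same kind of problem the origin reports.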

If you want to read in your data regardless, you can set the quote character to some value that doesn't appear in the input data, for example, \u0000. If you do this, that data will be read in, and the quote characters will appear in the field value in Data Collector, i.e. "Test" data line 2.
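
The effect of that workaround can be sketched outside Data Collector with Python's `csv` module. This is an analogy, not Data Collector's own parser; the constant `UNUSED_QUOTE`, the choice of `\x01` as the unused character, and the function name are assumptions made for the illustration.

```python
import csv

# Assumption: this control character never appears in the input data.
UNUSED_QUOTE = "\x01"


def parse_with_unused_quote(line, delimiter="|"):
    """Parse a line with quoting effectively disabled, so '"' is ordinary data."""
    reader = csv.reader([line], delimiter=delimiter, quotechar=UNUSED_QUOTE)
    return next(reader)


if __name__ == "__main__":
    # The double quotes stay in the field value.
    print(parse_with_unused_quote('2|"Test" data line 2'))
```

Because the double quote is no longer the quote character, it is kept as part of the field value instead of triggering a parse error.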

You should only use quotes if the data includes the delimiter character, or a newline - e.g.

1|"Test data | line 1"
2|"Test data
line 2"
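
As a rough illustration of why quoting is needed in those two cases, here is how Python's `csv` module reads the sample above. Again, this is a sketch with a different parser than Data Collector's, and `read_records` is a hypothetical helper.

```python
import csv
import io


def read_records(text, delimiter="|", quotechar='"'):
    """Read delimited text; quoted fields may hold the delimiter or a newline."""
    # newline="" leaves line endings untouched so the csv module handles them.
    stream = io.StringIO(text, newline="")
    return list(csv.reader(stream, delimiter=delimiter, quotechar=quotechar))


if __name__ == "__main__":
    sample = '1|"Test data | line 1"\n2|"Test data\nline 2"\n'
    for record in read_records(sample):
        print(record)
```

Each input record still yields two fields: the quoted pipe and the quoted newline are kept inside the second field rather than splitting it.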

Comments

Thank you! This worked!!!!!

Robot (2018-07-08 20:37:47 -0600)