How can I check for ascii '0' on input stream and send to error?

asked 2019-03-04 15:19:41 -0500

freddy

Hi...

I have an Apache Kafka topic as an origin, providing me some text. This is basic text, with no real format. I want to write it out to an HDFS destination. This simple case works very well.

Occasionally, however, the input contains ascii values I don't want... specifically ascii "0". I want to write these records out to error... but cannot figure out how to check for it.

My latest attempt is to try the JavaScript Evaluator with something like this:

if (records[i].value['text'].contains(0)) {
   throw records[i]
 }

This does not work... I have tried different things here, like "null", "0x00", "\u0000". None of these things seem to work. I have also tried similar things with the Stream Selector. Still having trouble.

Any suggestions on how to best do this?

Thank you

Freddy


Comments

Can you provide sample input record that contains ascii "0"?

iamontheinet (2019-03-05 00:39:58 -0500)

Well... not really. Ascii 0 is non-printable. Maybe I confused things by putting quotes around the 0. I do not mean the ascii equivalent of the number 0, which is 48. I mean an ascii value of 0, or null. Viewed with a binary editor, this would show as "00", whereas the number 0 would show as 48 (or 30 in hex).

freddy (2019-03-05 09:55:13 -0500)
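
To make that distinction concrete, here is a small sketch in plain JavaScript (nothing specific to the pipeline; the variable names are only for illustration). It shows that the null character has code 0 while the digit 0 has code 48, and that a null can be written with an escape even though it cannot be typed.

```javascript
// NUL (ASCII 0) versus the printable digit '0' (ASCII 48, hex 30).
var nul = '\u0000';
var zeroDigit = '0';

var nulCode = nul.charCodeAt(0);         // character code of NUL
var zeroCode = zeroDigit.charCodeAt(0);  // character code of the digit '0'
var zeroHex = zeroCode.toString(16);     // the same code written in hex

// A NUL cannot be typed, but it can be written in a string literal
// with the \u0000 escape. This sample mirrors the record described
// in the thread, with the NUL inside the account number.
var sample = '37389\u0000438 Smith John';
var nulPosition = sample.indexOf('\u0000');  // index of the NUL, or -1 if absent
```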

A sample input record would still help in order to suggest a solution.

iamontheinet (2019-03-05 10:37:59 -0500)

How do you suggest I do that? I don't see an ability to upload a file, which is the only way I can think of. Again, that character is not printable, so there is no way for me to type it. It is an ascii 0 (or null).

freddy (2019-03-05 10:51:02 -0500)

A normal record would look like this: "37389438 Smith John". An account number, last name and first name. A few records have a null character in them, like between the '9' and '4' in the account number. The location of the null character is inconsistent, and it won't show here since it is unprintable.

freddy (2019-03-05 11:19:09 -0500)
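
For records like that, one possible approach is to search the text for the '\u0000' escape with indexOf and route any match to error. This is only a sketch, not a confirmed solution. The `records`, `output` and `error` objects below are hypothetical stand-ins so the snippet runs on its own, `containsNul` is an illustrative helper name, and the `text` field name is taken from the question. In a real pipeline the evaluator stage supplies its own objects, and the exact calls may differ by version.

```javascript
// Hypothetical stand-ins for the objects the evaluator stage would supply.
var records = [
  { value: { text: '37389438 Smith John' } },        // clean record
  { value: { text: '37389\u0000438 Smith John' } }   // NUL inside the account number
];
var written = [];
var errors = [];
var output = {
  write: function (record) { written.push(record); }
};
var error = {
  write: function (record, message) {
    errors.push({ record: record, message: message });
  }
};

// True when the text contains a NUL (ASCII 0) character anywhere.
function containsNul(text) {
  return String(text).indexOf('\u0000') !== -1;
}

for (var i = 0; i < records.length; i++) {
  var text = records[i].value['text'];
  if (containsNul(text)) {
    // Route the bad record to error instead of throwing it.
    error.write(records[i], 'Record contains a NUL (ASCII 0) character');
  } else {
    output.write(records[i]);
  }
}
```

With the two stand-in records above, the clean record goes to `written` and the one with the embedded null goes to `errors`. Standard JavaScript strings have `indexOf` (and `includes` in newer engines) but no standard `contains` method, which may be part of why the original attempt did not behave as expected.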