Jython Decode Form Encoded Strings

asked 2019-01-30 19:59:15 -0500

tommy_o

I have data arriving at an HTTP Server origin in application/x-www-form-urlencoded format, but I'm having trouble splitting it into multiple fields.

I was going to use a Jython processor to split the data. The code below works in Python 2 but throws errors in Jython (which I have little exposure to).

from urlparse import parse_qs
import urllib
import json

for record in records:
    try:
        parsed = parse_qs(record)
        output.write(parsed)
    except Exception as e:
        error.write(record, str(e))

If I test this locally with some dummy data:

$ curl -k http://localhost:8000/\?sdcApplicationId\=sample -H 'X-SDC-APPLICATION-ID: sample' -H 'Content-type: application/x-www-form-urlencoded' -d 'Payload=payload&Something=something'

The exception thrown is:

com.streamsets.pipeline.api.base.OnRecordErrorException: SCRIPTING_04 - Script sent record to error: 'com.streamsets.pipeline.stage.processor.scripting.' object has no attribute 'split'
    at com.streamsets.pipeline.stage.processor.scripting.AbstractScriptingProcessor$Err.write(AbstractScriptingProcessor.java:72)
    at sun.reflect.GeneratedMethodAccessor221.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
<truncated>

Before I go deep into this rabbit hole, has anyone done this before?

Comments

Hi! You might be able to use one of the existing processors. Would you mind updating your question to add a sample input record and the desired output? That will help us come up with suggestions.

iamontheinet (2019-01-31 13:47:49 -0500)

1 Answer

answered 2019-01-31 14:02:27 -0500

metadaddy

The immediate problem is that you're calling parse_qs on the record, rather than the string field value, which is probably record.value['text'], depending on the origin configuration. Similarly, you'll have to put the parsed value in the record value. This works for me:

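from urlparse import parse_qs

for record in records:
    try:
        # Parse the string field, not the record object itself
        record.value['parsed'] = parse_qs(record.value['text'])
        # Write the modified record, not the bare parsed dict
        output.write(record)
    except Exception as e:
        # Send the record to error along with the reason
        error.write(record, str(e))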
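
If you want to try the parsing logic outside Data Collector first, here is a minimal standalone sketch using the sample payload from the question. FakeRecord and parse_record are hypothetical names made up for this illustration; FakeRecord only imitates the record.value['text'] shape and is not part of the StreamSets API.

```python
# Standalone sketch for trying the parsing logic outside Data Collector.
try:
    from urlparse import parse_qs        # Python 2 / Jython
except ImportError:
    from urllib.parse import parse_qs    # Python 3


class FakeRecord(object):
    """Hypothetical stand-in for an SDC record: only a dict-like `value`."""
    def __init__(self, text):
        self.value = {'text': text}


def parse_record(record):
    # Parse the string field, not the record object itself.
    parsed = parse_qs(record.value['text'])
    # parse_qs maps each key to a list of values; that shape is kept here.
    record.value['parsed'] = parsed
    return record


record = parse_record(FakeRecord('Payload=payload&Something=something'))
print(record.value['parsed'])
```

Note that parse_qs returns a list for every key, because a key may appear more than once in a query string. If each key is known to occur once, taking the first element of each list gives plain string values.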

However, there is an easier way to split an HTTP query string. Use an Expression Evaluator and the str:splitKV() function. A field expression of ${str:splitKV(record:value('/text'), '&', '=')} will split the KV pairs on the & character, and the keys from the values on =:

[Images: the Expression Evaluator configuration, and the resulting record in preview.]
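
For anyone who wants to see what that two-level split amounts to, here is a rough Python sketch of the behavior described above: split the text on '&', then split each pair on '='. The split_kv function is a hypothetical helper written for this illustration, not the StreamSets implementation, and it assumes no percent-decoding is needed. How str:splitKV itself treats percent-encoded values or pairs without a separator is not covered here.

```python
def split_kv(text, pair_sep='&', kv_sep='='):
    """Split text into pairs on pair_sep, then each pair on the first kv_sep."""
    result = {}
    for pair in text.split(pair_sep):
        if not pair:
            continue  # skip empty segments such as a trailing '&'
        # partition splits on the first separator only, so values may
        # themselves contain kv_sep
        key, _, value = pair.partition(kv_sep)
        result[key] = value
    return result


fields = split_kv('Payload=payload&Something=something')
print(fields)
```

Unlike parse_qs, this sketch keeps one plain string per key, so a repeated key overwrites the earlier value.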

Comments

Perfect, thank you!

tommy_o (2019-01-31 14:55:45 -0500)