
How to create multiple records from lines of text in a field

asked 2018-02-28 15:38:40 -0500

metadaddy

I have a field in an Avro record that contains multiple lines of Apache log data, and I'm trying to split the value into multiple records. I use the CR separator and keep getting java.lang.IllegalArgumentException: The delimiter cannot be a line break in the data parser. What am I doing wrong? Is this even supported?


1 Answer


answered 2018-02-28 15:42:07 -0500

metadaddy

This is a limitation of the underlying parser we use for delimited data: it does not accept a line break as the delimiter, so the data parser is not the right tool for this.

You can do this in the Groovy Evaluator with the following script:

for (record in records) {
  try {
    // Assuming the field in question is '/f'
    def parts = record.value['f'].split('\n')
    0.upto(parts.length - 1) {
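      // Create a new record for each line in the field.
      // The ID must be a double-quoted string: Groovy does not
      // interpolate ${it} inside single quotes.
      def newRecord = sdcFunctions.createRecord("${record.sourceId}:${it}")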
      newRecord.value = ['f' : parts[it]]
      output.write(newRecord)
    }
  } catch (e) {
    // Write a record to the error pipeline
    log.error(e.toString(), e)
    error.write(record, e.toString())
  }
}
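
The splitting and ID-building steps can be tried outside a pipeline before pasting them into the evaluator. The following is only a minimal sketch in plain Groovy: splitLines and childIds are hypothetical helper names (not part of the StreamSets API), and it assumes the field may contain LF, CRLF, or bare CR line endings and that empty lines should be dropped.

```groovy
// Hypothetical helpers for illustration; not part of the StreamSets API.

// Split text on CRLF, LF, or bare CR, dropping empty lines (an assumption).
List<String> splitLines(String text) {
    text.split(/\r\n|\n|\r/).findAll { it } as List<String>
}

// Build one ID per line by appending the line index to the source ID.
// Double quotes are required: Groovy only interpolates ${...} in GStrings.
List<String> childIds(String sourceId, int count) {
    (0..<count).collect { "${sourceId}:${it}".toString() }
}

// Made-up sample value with mixed line endings
def sample = 'GET /a HTTP/1.1\r\nGET /b HTTP/1.1\nGET /c HTTP/1.1'
def lines = splitLines(sample)
def ids = childIds('avro::0', lines.size())
lines.eachWithIndex { line, i -> println "${ids[i]} -> ${line}" }
```

If the field is known to use only '\n', the plain split('\n') in the script above is enough; the regex form just tolerates the CR-terminated case mentioned in the question.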