How do I parse an input file with multi-character delimiters?

asked 2017-09-12 17:09:40 -0500

metadaddy

updated 2017-09-12 17:14:50 -0500

My input data uses the two-character delimiter ||, for example:

a||b||c
1||1||1
2||2||2
3||3||3
4||4||4
5||5||5

The delimited data format only allows single-character delimiters. How do I parse this data?


1 Answer


answered 2017-09-12 17:14:07 -0500

metadaddy

updated 2017-09-20 12:27:58 -0500

LC

You can use the text data format to read the data line by line and then parse each line in a script. Here's a Groovy implementation, which treats the first line of the file (offset 0) as the header row:

// Loop through all the records
for (record in records) {
  try {
    // If this is the first record in the file
    if (record.attributes['offset'] == "0") {
      // Build an array of header names. '|' is a regex metacharacter,
      // so it is escaped; the -1 limit keeps trailing empty fields
      state['header'] = []
      for (l in record.value['text'].split('\\|\\|', -1)) {
        state['header'].add(l)
      }
    } else {
      // Parse the values out of the /text field and
      // add them to the record with the appropriate names
      // (the -1 limit keeps trailing empty fields)
      def i = 0
      for (l in record.value['text'].split('\\|\\|', -1)) {
        record.value[state['header'][i]] = l
        i++
      }
      // Discard the incoming text field
      record.value.remove('text')
      output.write(record)
    }
  } catch (e) {
    log.error(e.toString(), e)
    error.write(record, e.toString())
  }
}
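
To try the splitting logic outside a pipeline, here is a minimal standalone Groovy sketch. It is not part of the original script: the helper names splitLine and toFields are hypothetical, and the field-count check is an added assumption about how mismatched rows should be handled. It uses Pattern.quote rather than hand-escaping the pipes, so the same code should work for any literal delimiter.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: split one line on a literal multi-character delimiter.
// Pattern.quote escapes regex metacharacters such as '|', and the -1 limit
// keeps trailing empty fields instead of silently dropping them.
List<String> splitLine(String line, String delimiter) {
    return line.split(Pattern.quote(delimiter), -1) as List<String>
}

// Hypothetical helper: pair each header name with the value in the same position.
// Rejecting rows with the wrong field count is an assumption made for this sketch.
Map<String, String> toFields(List<String> header, String line, String delimiter) {
    def values = splitLine(line, delimiter)
    if (values.size() != header.size()) {
        throw new IllegalArgumentException(
            "Expected ${header.size()} fields but found ${values.size()}: ${line}")
    }
    return [header, values].transpose().collectEntries { name, value -> [(name): value] }
}

// Sample input in the same shape as the question's data
def lines = ['a||b||c', '1||1||1', '2||2||']
def header = splitLine(lines[0], '||')
def rows = lines.drop(1).collect { toFields(header, it, '||') }

assert header == ['a', 'b', 'c']
assert rows[0] == [a: '1', b: '1', c: '1']
assert rows[1] == [a: '2', b: '2', c: '']
```

The last assertion shows why the -1 limit matters: without it, the trailing empty value in 2||2|| would be dropped and the row would have only two fields.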