How to remove special characters from a field name?

asked 2017-06-27 12:56:50 -0500

rupal

updated 2018-07-31 15:31:58 -0500

jeff

How can a pipeline remove special characters from all field names, or convert them to something else?


1 Answer


answered 2017-06-27 14:08:42 -0500

rupal

updated 2017-06-27 17:49:26 -0500

LC

Try a Field Renamer processor with a regex pattern. That can become cumbersome, though, so a script may be the better option. For example, the Groovy script below, run in the Groovy Evaluator, converts every special character (anything other than an ASCII letter or digit) in each top-level field name to an underscore:

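import java.util.regex.Pattern

// Matches any character that is not an ASCII letter or digit
Pattern p = Pattern.compile("[^A-Za-z0-9]")

for (record in records) {
  try {
    // Collect the changes first instead of modifying the map while iterating over it
    def toRemove = []
    def toAdd = [:]
    record.value.each { key, value ->
      if (p.matcher(key).find()) {
        toRemove.add(key)
        // Reuse the compiled pattern to replace each special character with "_"
        toAdd.put(p.matcher(key).replaceAll("_"), value)
      }
    }

    // Drop the old field names, then add the values back under the sanitized names
    toRemove.each { field -> record.value.remove(field) }
    toAdd.each { key, value -> record.value.put(key, value) }

    output.write(record)
  } catch (e) {
    // Log the failure and send the record to error handling
    log.error(e.toString(), e)
    error.write(record, e.toString())
  }
}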
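
To check what the regex does to a few names before running it in a pipeline, the renaming step can be tried as a plain Groovy script. This is only a minimal sketch: `sanitizeFieldNames` is a hypothetical helper (it is not part of StreamSets), and the sample map is made-up data.

```groovy
import java.util.regex.Pattern

// Hypothetical helper, not a StreamSets API: returns a copy of the map with
// every character other than an ASCII letter or digit in each key replaced by "_".
Map sanitizeFieldNames(Map fields) {
  Pattern special = Pattern.compile("[^A-Za-z0-9]")
  def result = [:]
  fields.each { key, value ->
    result[special.matcher(key.toString()).replaceAll("_")] = value
  }
  return result
}

// Made-up sample data for illustration only
def sample = ["first name": "Ada", "e-mail": "ada@example.com", "id": 1]
def cleaned = sanitizeFieldNames(sample)
assert cleaned == ["first_name": "Ada", "e_mail": "ada@example.com", "id": 1]
```

One thing to watch for: two different names can collapse to the same sanitized name (for example, "a-b" and "a.b" both become "a_b"), in which case the value written last replaces the earlier one.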
