How to remove special characters from a string

asked 2017-06-27 12:56:50 -0600

rupal

updated 2017-06-27 14:04:57 -0600

How can a pipeline remove special characters from an entire string, or convert them to something else?

1 Answer

answered 2017-06-27 14:08:42 -0600

rupal

updated 2017-06-27 17:49:26 -0600

LC

Try a Field Renamer processor with a regex pattern. That can become cumbersome, so a script may be the better option. For example, the Groovy script below, run in the Groovy Evaluator, replaces every special character (anything other than an ASCII letter or digit) in each field name with an underscore:

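import java.util.regex.Pattern

// Matches any character that is not an ASCII letter or digit
Pattern p = Pattern.compile("[^a-zA-Z0-9]")

for (record in records) {
  try {
    // Collect the changes first so the map is not modified while iterating over it
    def toRemove = []
    def toAdd = [:]
    record.value.each { key, value ->
      if (p.matcher(key).find()) {
        toRemove.add(key)
        // Reuse the compiled pattern to build the cleaned field name
        toAdd.put(p.matcher(key).replaceAll("_"), value)
      }
    }

    // Drop the old field names, then add the values back under the cleaned names
    toRemove.each { field -> record.value.remove(field) }
    toAdd.each { key, value -> record.value.put(key, value) }

    output.write(record)
  } catch (e) {
    // Write a record to the error pipeline
    log.error(e.toString(), e)
    error.write(record, e.toString())
  }
}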
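
The script above cleans field names. If the goal is to clean the string values themselves, as the question asks, the same regex can be applied to the values instead. The following is a minimal standalone sketch, not StreamSets-specific code: `sanitize` and `sanitizeValues` are hypothetical helper names, and a plain map stands in for `record.value`.

```groovy
import java.util.regex.Matcher

// Hypothetical helper (not part of StreamSets): replaces every character that is
// not an ASCII letter or digit with the given replacement string.
String sanitize(String text, String replacement = "_") {
  // quoteReplacement keeps '$' and '\' in the replacement from being treated specially
  return text.replaceAll("[^A-Za-z0-9]", Matcher.quoteReplacement(replacement))
}

// Hypothetical helper: applies sanitize() to every String value in a map and
// leaves values of other types (numbers, nested structures) untouched.
Map sanitizeValues(Map fields, String replacement = "_") {
  return fields.collectEntries { key, value ->
    [(key): (value instanceof String ? sanitize(value, replacement) : value)]
  }
}

// A plain map standing in for record.value
def sample = [name: "J@ne D'oe", age: 42]

// Convert special characters to underscores
assert sanitizeValues(sample) == [name: "J_ne_D_oe", age: 42]

// Remove special characters entirely by replacing them with an empty string
assert sanitizeValues(sample, "") == [name: "JneDoe", age: 42]
```

Inside the Groovy Evaluator, the same idea would presumably go in the `for (record in records)` loop, updating each string value in `record.value` before calling `output.write(record)`. This assumes the record's root field is a map, as the script above already does.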