How to remove special characters from a field name?

How can a pipeline remove special characters or convert all special characters to something else in all field names?

1 Answer

Try a Field Renamer processor with a regex pattern. That can become cumbersome, though, so a script may be the better option. For example, the Groovy script below, run in a Groovy Evaluator, replaces every special character in each field name with an underscore:

import java.util.regex.Pattern

// Matches any character that is not an ASCII letter or digit
Pattern p = Pattern.compile("[^a-zA-Z0-9]")
for (record in records) {
  try {
    def toRemove = []
    def toAdd = [:]
    // Collect the renames first so the map is not modified while iterating
    record.value.each { key, value ->
      if (p.matcher(key).find()) {
        toRemove << key
        toAdd.put(p.matcher(key).replaceAll("_"), value)
      }
    }
    toRemove.each { field -> record.value.remove(field) }
    toAdd.each { key, value -> record.value.put(key, value) }

    // Pass the record on to the next stage
    output.write(record)
  } catch (e) {
    // Write a record to the error pipeline
    log.error(e.toString(), e)
    error.write(record, e.toString())
  }
}

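If you want to check the renaming logic outside a pipeline first, here is a minimal sketch that applies the same regex to a plain Groovy map. The helper name `sanitizeKeys` and the sample field names are made up for illustration and are not part of any StreamSets API. It also fails loudly when two field names collapse to the same cleaned name (for example `a-b` and `a b`), a case the script above would resolve silently by letting one value overwrite the other.

```groovy
import java.util.regex.Pattern

// Hypothetical helper for local testing; not a StreamSets function.
// Returns a new map whose keys have every non-alphanumeric character replaced.
Map sanitizeKeys(Map fields, String replacement = "_") {
  Pattern special = Pattern.compile("[^a-zA-Z0-9]")
  def result = [:]
  fields.each { key, value ->
    def cleaned = special.matcher(key.toString()).replaceAll(replacement)
    // Two different source names can map to the same cleaned name
    if (result.containsKey(cleaned)) {
      throw new IllegalStateException("Field name collision on '${cleaned}'")
    }
    result[cleaned] = value
  }
  return result
}

// Sample data (made up): keys with a space and a hyphen
def sample = ["first name": "Ada", "e-mail": "ada@example.com", "id": 1]
def cleaned = sanitizeKeys(sample)
assert cleaned == ["first_name": "Ada", "e_mail": "ada@example.com", "id": 1]
```

Whether a collision should raise an error, keep the first value, or keep the last one depends on your data, so treat the exception here as one possible choice.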