Class RemoveValuesDataProcessor

java.lang.Object
de.cxp.ocs.preprocessor.ConfigureableDataprocessor<PatternConfiguration>
de.cxp.ocs.preprocessor.impl.RemoveValuesDataProcessor
All Implemented Interfaces:
DocumentPreProcessor

public class RemoveValuesDataProcessor extends ConfigureableDataprocessor<PatternConfiguration>
DocumentPreProcessor implementation which removes values from a field's value based on a regular expression. It is auto-configured and can be further configured as described below:
  data-processor-configuration:
    processors:
      - RemoveValuesDataProcessor
    configuration:
      RemoveValuesDataProcessor:
        someFieldName: ".*\\d+.*"
        someFieldName_destination: "someDestinationField"
        # Optional configuration:
        # RegEx used to split the value into chunks, \\s+ if omitted
        someFieldName_wordSplitRegEx: "/"
        # Separator used to join the remaining chunks, a single space " " if omitted
        someFieldName_wordJoinSeparator: "/"
 
This would remove every token that contains a digit from the field named 'someFieldName' and write the result into the field 'someDestinationField'. If no destination is specified, the result is written back to the source field. The implementation splits the value into separate tokens and checks the regular expression against each token. If the regular expression matches a token, that token is removed.
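
The split, match, and join behaviour described above can be illustrated with a minimal standalone sketch. It is not the processor's actual source: the class name RemoveValuesSketch and the method removeMatchingTokens are hypothetical, and it assumes that a token is removed only when the regular expression matches the whole token.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration, not part of de.cxp.ocs.
class RemoveValuesSketch {

    // Splits the value into tokens, drops every token that the removal
    // pattern matches, and joins the remaining tokens again.
    // Assumption: the pattern has to match the entire token.
    static String removeMatchingTokens(String value, String removeRegEx,
            String wordSplitRegEx, String wordJoinSeparator) {
        Pattern removePattern = Pattern.compile(removeRegEx);
        return Arrays.stream(value.split(wordSplitRegEx))
                .filter(token -> !removePattern.matcher(token).matches())
                .collect(Collectors.joining(wordJoinSeparator));
    }

    public static void main(String[] args) {
        // Settings from the configuration example: split and join on "/"
        System.out.println(removeMatchingTokens("shoes/size42/red", ".*\\d+.*", "/", "/"));
        // Documented defaults: split on whitespace, join with a single space
        System.out.println(removeMatchingTokens("cotton shirt 100 percent", ".*\\d+.*", "\\s+", " "));
    }
}
```

With the settings from the configuration example, the first call prints "shoes/red". The second call uses the documented defaults and prints "cotton shirt percent".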