HTML removal in pipeline

I am importing JSON data from a REST API into a flat CSV file, and a few of the columns contain HTML. The HTML is messing up the order of the columns in the output. Does StreamSets have the ability to remove HTML tags inside a field?
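To make the goal concrete, here is a minimal sketch in plain Python of the clean-up I have in mind. It is not StreamSets code: `strip_html`, `_TagStripper`, and the sample record with its `id` and `description` fields are hypothetical names used only for illustration. It assumes that dropping the tags and collapsing whitespace (including embedded newlines) is enough to keep each value in its own column.

```python
from html.parser import HTMLParser


class _TagStripper(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are ignored.
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_html(value):
    """Return value with HTML tags removed and whitespace collapsed."""
    stripper = _TagStripper()
    stripper.feed(value)
    stripper.close()
    # Collapse newlines and runs of spaces so the field stays on one CSV line.
    return " ".join(stripper.text().split())


# Hypothetical record, standing in for one JSON object from the API.
record = {"id": "17", "description": "<p>Hello <b>world</b></p>"}
record["description"] = strip_html(record["description"])
print(record["description"])
```

What I am looking for is the equivalent of `strip_html` applied to a single field inside the pipeline, before the record is written to the CSV destination.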

I looked at the Expression Evaluator and expression language documentation, but couldn't work out how to apply them to this problem:

https://streamsets.com/documentation/datacollector/3.4.3/help/datacollector/UserGuide/Expression_Language/ExpressionLanguage_overview.html#concept_p54_4kl_vq

https://streamsets.com/documentation/datacollector/3.4.3/help/datacollector/UserGuide/Processors/Expression.html#concept_zm2_pp3_wq