Replace data filter

David Megginson edited this page May 12, 2017 · 3 revisions

The Replace data filter allows you to replace a string or regular expression either throughout your dataset or in specific columns and rows.

If you need to make a lot of replacements, it may make sense to use the Replace data (mapping table) filter instead, so that you have an external Replacement map.

Options

Text or pattern to replace: (required) the text to replace (e.g. "Salud"). The text must match the entire cell, except that character case and whitespace are not significant.

Use a regular-expression pattern: if checked, the HXL Proxy will interpret "Text or pattern to replace" as a regular expression rather than plain text.

New text: (required) the new text to substitute for each match. If using regular expressions, this text may contain sequences like "\1" to insert groups captured by the regular-expression pattern.

Replace only in these columns: a list of tag patterns controlling the columns where the HXL Proxy will replace text (if omitted, replace in all tagged columns).

Replace only in rows matching this query: a row query controlling the rows where the HXL Proxy will replace text (if omitted, replace in all rows).
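
The difference between the plain-text and regular-expression modes can be sketched in a few lines of Python. This is only an illustration of the behaviour described in the options above, not the HXL Proxy's own code: the function names `normalise` and `replace_cell` are hypothetical, the exact whitespace normalisation is an assumption, and applying the pattern with `re.sub` semantics is likewise assumed.

```python
import re


def normalise(value: str) -> str:
    """Trim, collapse internal whitespace, and lower-case (assumed normalisation)."""
    return " ".join(value.split()).lower()


def replace_cell(cell: str, original: str, replacement: str, use_regex: bool = False) -> str:
    """Return the cell value after applying a single replacement rule."""
    if use_regex:
        # Assumption: the pattern is applied as in re.sub, so "\1"-style
        # sequences in the replacement refer to groups in the pattern.
        return re.sub(original, replacement, cell)
    # Plain text: the whole cell must match, ignoring case and whitespace.
    if normalise(cell) == normalise(original):
        return replacement
    return cell


whole_cell = replace_cell("  salud ", "Salud", "Health")
partial = replace_cell("Salud publica", "Salud", "Health")
regrouped = replace_cell("2017-05", r"(\d{4})-(\d{2})", r"\2/\1", use_regex=True)
```

In this sketch `whole_cell` becomes "Health", `partial` is left unchanged because the plain text does not match the entire cell, and `regrouped` shows the captured groups being reordered.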

Example

Replace "Coast" with "Coastal Region" in the column matching the tag pattern "#adm1+name".

Before

#org | #sector | #adm1+name | #adm1+code | #targeted
UNICEF | Education | Coast | X001 | 5000
Save the Children | Education | Plains | X002 | 300
IOM | CCCM | Coast | X001 | 1500
UNICEF | Protection | Plains | X002 | 8000

After

#org | #sector | #adm1+name | #adm1+code | #targeted
UNICEF | Education | Coastal Region | X001 | 5000
Save the Children | Education | Plains | X002 | 300
IOM | CCCM | Coastal Region | X001 | 1500
UNICEF | Protection | Plains | X002 | 8000
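
For readers who want to check the example by hand, the following is a minimal Python sketch that reproduces the Before and After tables. It does not call the HXL Proxy. The function `replace_in_column` is hypothetical, and it simplifies tag-pattern matching to an exact comparison with the hashtag, which is an assumption made only for this illustration.

```python
def normalise(value: str) -> str:
    """Trim, collapse internal whitespace, and lower-case (assumed normalisation)."""
    return " ".join(value.split()).lower()


def replace_in_column(rows, tag, original, replacement):
    """Replace whole-cell matches of `original` in columns whose hashtag equals `tag`."""
    header, *data = rows
    # Simplification: exact hashtag comparison instead of full tag-pattern matching.
    indexes = [i for i, hashtag in enumerate(header) if hashtag == tag]
    result = [list(header)]
    for row in data:
        new_row = list(row)
        for i in indexes:
            if normalise(new_row[i]) == normalise(original):
                new_row[i] = replacement
        result.append(new_row)
    return result


before = [
    ["#org", "#sector", "#adm1+name", "#adm1+code", "#targeted"],
    ["UNICEF", "Education", "Coast", "X001", "5000"],
    ["Save the Children", "Education", "Plains", "X002", "300"],
    ["IOM", "CCCM", "Coast", "X001", "1500"],
    ["UNICEF", "Protection", "Plains", "X002", "8000"],
]

after = replace_in_column(before, "#adm1+name", "Coast", "Coastal Region")
```

Only the "#adm1+name" column is examined, so a cell containing "Coast" in any other column would be left untouched.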

Use cases

This filter is especially useful for correcting predictable errors on the fly. For example, if some of your data providers use "WHO" and some use "World Health Organization", you can replace one with the other to make the dataset consistent (and repeat that correction automatically whenever the data is updated).

Note also the benefit of this filter for dealing with different transliterations and language conventions, e.g. "Tchad" vs "Chad", or "Taiz" vs "Ta'izz" for the city of "تعز" in Yemen.
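
As a rough illustration of these use cases, the Python sketch below normalises the variants mentioned above with regular expressions. The patterns, the choice of "WHO", "Chad" and "Taiz" as the canonical spellings, and the names `REPLACEMENTS` and `clean` are all assumptions made for the example. Each (pattern, replacement) pair stands in for one application of the filter with the regular-expression option checked.

```python
import re

# Hypothetical rules: each pattern must match the whole (trimmed) cell.
REPLACEMENTS = [
    (re.compile(r"^World Health Organi[sz]ation$", re.IGNORECASE), "WHO"),
    (re.compile(r"^Tchad$", re.IGNORECASE), "Chad"),
    (re.compile(r"^Ta'?izz?$", re.IGNORECASE), "Taiz"),
]


def clean(value: str) -> str:
    """Return the canonical spelling for a known variant, else the value unchanged."""
    text = value.strip()
    for pattern, replacement in REPLACEMENTS:
        if pattern.match(text):
            return replacement
    return value


raw = ["World Health Organization", "WHO", "Tchad", "Ta'izz", "Taiz", "Plains"]
cleaned = [clean(value) for value in raw]
```

Here `cleaned` contains only the canonical spellings plus the untouched value "Plains". Because the rules are reapplied every time the list is processed, the same correction carries over automatically when the source data is updated.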
