Replacement maps
A replacement map defines a series of changes to make to a dataset automatically via the Replace data (mapping table) filter.
A simple replacement map looks like this:
#x_pattern | #x_substitution | #x_tag |
---|---|---|
Gunea | Guinea | #adm1 |
Guina | Guinea | #adm1 |
Water & Sanitation | WASH | #sector |
OMS | WHO | #org |
You can use replacement maps to correct common spelling mistakes (like "Guina"), standardise terminology ("WASH"), or even convert languages ("WHO" instead of "OMS"). The HXL Proxy can handle replacement tables with thousands of rows, so this is an efficient way to capture common corrections and then repeat them every time someone sends you a new dataset.
- Matching against the original text (`#x_pattern`) is always case- and whitespace-insensitive. That means that if you type "wash" as the pattern, "Wash", "WASH", " WASH", and " wash" will all be replaced.
- The new value (`#x_substitution`) will always appear as provided.
- Replacements will take place only in the columns with the tag pattern(s) (`#x_tag`) specified.
- If you leave out `#x_tag` (or leave it blank), the replacement will take place in all columns.
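The rules above can be sketched in plain Python. This is only an illustration, not the HXL Proxy's own code: the function names `normalise` and `apply_literal_map` are hypothetical, whitespace-insensitivity is assumed to mean trimming and collapsing whitespace, and a tag pattern is simplified to an exact hashtag match with no attribute handling.

```python
def normalise(text):
    """Lower-case and trim/collapse whitespace (an assumed reading of
    "case- and whitespace-insensitive")."""
    return " ".join(text.split()).lower()


def apply_literal_map(tags, rows, replacement_map):
    """Return a copy of rows with literal replacements applied.

    tags: one HXL hashtag per column, e.g. ["#adm1", "#sector"]
    rows: list of lists of cell values
    replacement_map: (pattern, substitution, tag) tuples; an empty tag
        means the rule applies to every column
    """
    # Index rules by (normalised pattern, tag); the first rule wins on duplicates.
    lookup = {}
    for pattern, substitution, tag in replacement_map:
        lookup.setdefault((normalise(pattern), tag.strip()), substitution)

    result = []
    for row in rows:
        new_row = []
        for tag, value in zip(tags, row):
            key = normalise(value)
            if (key, tag) in lookup:        # rule restricted to this column's tag
                new_row.append(lookup[(key, tag)])
            elif (key, "") in lookup:       # rule with no #x_tag: all columns
                new_row.append(lookup[(key, "")])
            else:
                new_row.append(value)       # no rule matched: keep the value
        result.append(new_row)
    return result


replacement_map = [
    ("Gunea", "Guinea", "#adm1"),
    ("Guina", "Guinea", "#adm1"),
    ("Water & Sanitation", "WASH", "#sector"),
    ("OMS", "WHO", "#org"),
]
tags = ["#adm1", "#sector", "#org"]
rows = [
    [" guina", "water & sanitation", "oms"],
    ["Guinea", "OMS", "UNICEF"],
]
cleaned = apply_literal_map(tags, rows, replacement_map)
```

In the second row, "OMS" sits in the `#sector` column, so the `#org` rule leaves it alone; the substitutions themselves come out exactly as written in the map.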
For advanced cases, replacement maps also support regular expressions. To indicate that a pattern is a regular expression rather than literal text, add a column with the `#x_regex` tag, and then put a truthy value (like "1" or "True") in any row that uses a regular expression, like this:
#x_pattern | #x_substitution | #x_tag | #x_regex |
---|---|---|---|
^Gu.?n.?a$ | Guinea | #adm1 | True |
Water & Sanitation | WASH | #sector | |
OMS | WHO | #org | |
Note the following for regular expressions:
- The regular expression (`#x_pattern`) does not need to match the whole value. If you want to make sure that it matches the whole value, use `^` at the start and `$` at the end.
- The replacement (`#x_substitution`) can contain references to groups from the regular expression, like `\1`.
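To make the anchoring and group-reference behaviour concrete, here is a minimal sketch using Python's standard `re` module. It assumes Python regular-expression syntax, assumes that only the matched portion of a value is rewritten, and accepts just the two truthy values named in the text ("1" and "True", case-insensitively); the helper names `is_regex_row` and `apply_rule` are hypothetical and are not part of the HXL Proxy.

```python
import re

# Values treated as truthy for #x_regex. Only "1" and "True" are named in the
# text; the real set of accepted values may be broader (assumption).
TRUTHY = {"1", "true"}


def is_regex_row(flag):
    """Return True if the #x_regex cell marks the row as a regular expression."""
    return flag.strip().lower() in TRUTHY


def apply_rule(value, pattern, substitution, regex_flag=""):
    """Apply one replacement-map row to a single cell value."""
    if is_regex_row(regex_flag):
        # re.sub rewrites every match found anywhere in the value, so an
        # unanchored pattern can change just part of a cell.
        return re.sub(pattern, substitution, value)
    # Literal rows: compare case- and whitespace-insensitively.
    if " ".join(value.split()).lower() == " ".join(pattern.split()).lower():
        return substitution
    return value


# Anchored pattern matching the whole value.
whole = apply_rule("Guina", r"^Gu.?n.?a$", "Guinea", "True")
# Unanchored pattern: only the matching part of the value is rewritten.
partial = apply_rule("Guina-Bissau", r"Gu.?n.?a", "Guinea", "1")
# Anchored pattern: no match on the longer value, so it is left unchanged.
anchored = apply_rule("Guina-Bissau", r"^Gu.?n.?a$", "Guinea", "1")
# Group references (\1, \2) in the substitution.
swapped = apply_rule("2015-03", r"^(\d{4})-(\d{2})$", r"\2/\1", "True")
# Blank #x_regex: the pattern is treated as literal text.
literal = apply_rule(" oms", "OMS", "WHO")
```

Under these assumptions, the unanchored rule turns "Guina-Bissau" into "Guinea-Bissau", while the anchored version leaves it alone, which is why `^` and `$` matter when a pattern is meant to cover the whole cell.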
#x_pattern: the literal text or regular expression to replace.
#x_substitution: the replacement text.
#x_tag: one or more HXL tag patterns to select the columns where replacement will take place (if not provided, replace text in all columns).
#x_regex: if truthy (e.g. "1" or "True"), then #x_pattern is a regular expression, and #x_substitution can make references to matching groups in the regular expression (e.g. "\3").
Learn more about the HXL standard at http://hxlstandard.org