-
Notifications
You must be signed in to change notification settings - Fork 4
Row formulas
A row formula takes other values in a row as input and produces output using functions and operators. Here is an example:
(#targeted+f + #targeted+m) / (#affected+f + #affected+m)
Currently, the Add column filter allows row formulas inside double braces {{..}}, but the HXL Proxy will likely support them in other contexts in the future.
Formulas recognise the following atomic values:
- numbers (e.g.
100
,-10.5
) - string literals (e.g.
"West Africa"
) - references to other columns (e.g. tag patterns like
#affected+f
)
String literals support the regular escape sequences like "\t" for tab or "\n" for newline.
Tag patterns referring to other columns expand differently depending on whether they're in a list of parameters for a function, or standing on their own. Consider the following data row:
#affected +f +children | #affected +f +adults | #affected +m +children | #affected +m +adults |
---|---|---|---|
400 | 200 | 300 | 100 |
Outside of a function's parameter list, a tag pattern expands to the first match in a row, reading left to right. So the formula
#affected+f + #affected+m
would evaluate to 700 (the first match for the tag pattern "#affected+f" is "#affected +f +children", and the first match for the tag pattern "#affected+m" is "#affected +m +children").
Inside a function's parameter list, a tag pattern expands to all matching values. So the formula
sum(#affected+f,#affected+m)
would evaluate to 1000, since all four columns match. It would be simpler, of course, to write the above as
sum(#affected)
This is equivalent (in this dataset) to writing
#affected+f+children + #affected+f+adults + #affected+m+children + #affected+m+adults
Formulas support the usual set of basic arithmetic operators, "+" (addition), "-" (subtraction), "*" (multiplication), "/" (division), and "%" (modulo divide, aka remainder), following normal order of operations. You can use parentheses for grouping to override order of operations, e.g.
(#targeted+f + #targeted+m) / (#affected+f + #affected+m)
These operators work only on numeric values: anything that can't be converted to a number is ignored. The result of these operators will always be a number ("0" by default).
As mentioned earlier, tag patterns expand to all matching values when included in a function's parameter list, so a simple formula like sum(#affected)
can operate on many columns. The following functions are currently available.
The result is the sum of all the numeric values in the expanded parameter list, or 0 if there are none.
The result is the product of all the numeric values in the expanded parameter list, or 0 if there are none.
The result is the minimum value in the expanded parameter list. If all of the non-empty values are numeric, the function will use numeric comparison; otherwise, it will use lexical comparison.
The result is the maximum value in the expanded parameter list. If all of the non-empty values are numeric, the function will use numeric comparison; otherwise, it will use lexical comparison.
The result is the average (mean) of all numeric values in the expanded parameter list, or an empty value (not 0) if there are none. The function ignores non-numeric cells, and takes the average of only the numeric ones.
If the argument is a number, round it to the nearest integer.
The result is a string joining all of the values in the expanded parameter list except for the first one, which the function uses as a separator. For example join("|", #sector)
will produce a concatenated string of values of all the #sector columns in the row, separated by "|". Unlike other functions, this one does not ignore empty values, but will also include them in the concatenated string.
Resolves to the current date in YYYY-MM-DD format. Does not make reference to any data in the row. You can use this to add a fixed column with the current date, or on the right side of a row query to test against the current date, e.g. "#date+end <= {{ today() }}"
The result is the difference between the two dates. The number is always positive and rounded to the nearest integer, and will depend on the unit specified: "D" for days, "W" for weeks, "M" for months (using 30-day periods), or "Y" for years (using 365-day periods). The default is "D" if you do not provide a unit.
TODO
Learn more about the HXL standard at http://hxlstandard.org