# Formulas

You can use formulas to create smarter forms with Survey123 for ArcGIS.

Previous questions must always be referred to in formulas with the format \${field_name}.

## Operators

Operators supported in Survey123 include the following:

OperatorDescriptionExample

.

.=1

|

Computes two node-sets

//book | //cd

+

6 + 4

-

Subtraction

6 - 4

*

Multiplication

6 * 4

div

Division

8 div 4

=

Equal

price=9.80

!=

Not equal

price!=9.80

<

Less than

price<9.80

<=

Less than or equal to

price<=9.80

>

Greater than

price>9.80

>=

Greater than or equal to

price>=9.80

or

Or

price=9.80 or price=9.70

and

And

price>9.00 and price<9.90

mod

Modulus (division remainder)

5 mod 2

Functions supported in Survey123 include the following:

FunctionDescriptionExample

selected(question, value)

Checks if answer is selected. Used for select_one and select_multiple questions.

selected(\${question_one}, 'a')

count-selected(question)

Returns the number of selected answers. Used for select_multiple questions.

count-selected(\${question_one})

string-length(question)

Returns the length of a nonempty string.

string-length(\${question_one})

substr(question, start, end)

Returns the substring beginning at the specified start and extends to the character at index end -1, where start and end begin at 0.

substr(\${question_one},1 ,2)

not(expression)

Returns a value if the expression is not evaluated.

not(selected(., 'yes'))

if(condition, a, b)

If true, returns a; otherwise, returns b.

if(selected(\${question_one}, 'yes') and selected(\${question_two}, 'yes'), 'yes', 'no')

true()

True

true()

false()

False

false()

uuid()

Returns a random UUID string.

uuid()

random()

Returns a random value between 0 (inclusive) and 1 (exclusive).

random()

today()

Returns today's date. Used in date questions.

today()

now()

Returns a time stamp for this instant. Used in time and dateTime questions.

now()

once()

If a question already has a value, returns the existing value. Useful when using random() or uuid() in a repeated question to ensure the value doesn't change when you browse through the repeat records in the form.

once(uuid())

boolean(question or value)

Returns true if the value provided is not null.

It's recommended to use boolean-from-string() instead.

boolean(\${question_one})

number(question or value)

Converts to number. Conversion varies depending on data type.

number(\${question_one})

int(question or value)

Converts to integer. Conversion varies depending on data type.

int(\${question_one})

string(question or value)

Converts to string. Conversion varies depending on data type.

string({\$question_one})

date()

Converts a number or string to a date object, without preserving time.

date('2017-05-28T04:39:02+10:00')

date-time()

Converts a number or string to a date object.

date-time('2017-05-28T04:39:02+10:00')

decimal-date-time()

Converts a date object into a decimal date-time number.

decimal-date-time(\${date_question})

decimal-time()

Converts a time object into a number representing a fractional day in the device's timezone.

decimal-time(\${time_question})

coalesce(value1, value2)

Returns the first nonempty value. Supports only two values.

coalesce(\${question_one},\${question_two})

concat(value1, value2, …)

Returns the concatenation of the string values.

concat(\${question_one}, ' and ', \${question_two})

max(value1, value2, ...)

Returns the maximum value in a given range, or to a single question across repeats.

max(\${question_one},\${question_two})

min(value1, value2, ...)

Returns the minimum value in a given range, or to a single question across repeats.

min(\${question_one},\${question_two})

sum(repeat)

Returns the sum of all responses to a given question across repeats.

sum(\${question})

count(repeat)

Returns the amount of responses to a given question across repeats.

count(\${question})

property()

Returns the declared device or user property.

pulldata()

Returns information saved in an external CSV, or in the properties of an answer.

pulldata('info','email','name', \${previous_question})

version()

Returns the version of the survey defined in the settings worksheet.

version()

boolean-from-string()

Returns true if the string provided is 'true' or 1'. Otherwise, returns false.

boolean-from-string(\${question_one})

contains(string, substring)

Returns true if the given string contains the substring.

contains(\${question_one}, 'red')

starts-with(string, substring)

Returns true if the given string starts with the substring.

starts-with(\${question_one}, 'The')

ends-with(string, substring)

Returns true if the given string ends with the substring.

ends-with(\${question_one}, 'hand.')

selected-at(question, number)

Used for select_multiple questions. Returns the name of the choice selected for the given number; for example, '2' will return the second selected choice.

selected-at(\${question_one}, 2)

jr:choice-name(choice, 'question')

Used for select_one and select_multiple questions. Returns the label associated with the name of the choice in the given question. Be aware the question must be defined inside quotes.

jr:choice-name(maybe, '\${question_one}')

join(separator, question)

Concatenates all answers to a given question in a repeat, separated by the given separator.

join(',', \${question_in_repeat})

## Constraints

Adding a constraint to a survey question will restrict the accepted inputs for a response. This can include a specific range of numbers, combinations of letters and numbers, or general pattern matching. In your spreadsheet, the constraint expression is entered into the constraint field and informational text is entered into the constraint_message column of the survey worksheet. In the constraint expression, the input for the question is always represented by a full stop.

For example, the following formula can be used to restrict the input of an integer field to positive numbers only:

.>= 0

This formula, when applied to a date field, will prevent the user from entering a value earlier than today:

.>= today()

Calculations can also be used in constraints. This formula will perform a calculation to prevent the user from selecting any dates between today and 14 days from today:

(.>= today()) and (.- (1000 * 24 * 60 * 60 * 14) <= today())

##### Note:

To avoid unexpected errors, if your formula includes decimal values between -1 and 1, be sure to include a leading zero in your values. Without the leading zero, the decimal can be mistaken for the operation character . (or -.). For example, the following expression would fail:

.> .25 and .< 24.25

The following expression would execute as expected:

.> 0.25 and .< 24.25

### Regular expressions

Regular expressions can be used in a wide variety of ways for pattern matching to constrain valid responses to a given format, or to ensure it contains specific content. This example requires that the response to the question include the word "pattern"somewhere in it.

regex(., 'pattern')

The period used at the start of these examples applies the expression to the current field. Adding the name of another field in its place will instead apply the regular expression to that field, which is ideal for relevant expressions. The comma acts as a separator between this field definition and the expression itself.

This regular expression ensures that the response exactly matches the word "pattern", with nothing before or after it.

regex(., '^pattern\$')

Regular expressions are ideal for confining the input of a question to a standard format. This example accepts only the input of a United States five-digit ZIP Code.

regex(., '^\d{5}\$')

This example matches the response to the current format of Indonesian number plates as in the following photo:

regex(., '^[A-Z]{1,2}\d{4}[A-Z]{2,3}\$')

More loosely standardized formats can require very complex regular expressions. This regular expression used by the Survey123 web app constrains the input of a string field to match the format of an email address:

regex(., '^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))\$')

This modification of the above expression constrains input to match the format of an email address, while accepting non-English characters:

regex(., '^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\u0400-\uffff\-0-9]+\.)+[a-zA-Z\u0400-\uffff]{2,}))\$')

This example, also using a regular expression from the Survey123 web app, limits a field to only accept responses that match the format of a web address:

regex(., ^((https?|ftps?)\\://|)(((?:(?:[\\da-zA-Z](?:[-\\da-zA-Z]{0,61}[\\da-zA-Z])?)\\.)+(?:[a-zA-Z](?:[-\\da-zA-Z]{0,61}[\\da-zA-Z])?)\\.?)|localhost)(\\:\\d+)?(/(?:[^?#\\s/]+/)*(?:[^?#\\s/]+(?:\\?[^?#\\s/]*)?(?:#[A-Za-z][\\w.:-]*)?)?)?)

For more information on regular expressions, refer to the Mozilla Developer Network's documentation. You can also refer to the types sheet in the Survey123 template to see a list of characters and their functions in regular expressions.

##### Note:

Survey123 uses the ODK representation of regular expressions. The only significant difference is that where JavaScript requires two backslashes (for example, \\d to specify a decimal), Survey123 only requires one (for example, \d for a decimal).

## Calculations

Calculations are performed in the calculation column for a question. Calculations are often associated with the calculate type of question, but they can also be applied to integer, decimal, text, and select_one questions. The outcome of the calculation can be used to populate relevant or constraint expressions by referring to the field name of the calculate question. Calculate type questions can be used to hold values that do not need to be displayed on the form but are included in the feature service.

The calculate question type is hidden and doesn't display on a form. This means it can also be used to hold values that don't need to be displayed on the form but are included in the feature service.

For example, you can create a question of type calculate, name it calc, and insert the following expression in its calculation column:

\${question_1} + \${question_2} + \${question_3}

Use the result to set the relevance for the next question:

\${calc} <= 100

Calculations can be used with the responses in date fields. This calculation will estimate the number of years between a date entered and today, ideal for calculating somebody's age:

int((today() - \${birth_date}) div (1000 * 24 * 60 * 60 * 365.25))

Sometimes you will only need part of a value, or a truncated version of a full answer. The substr operator will return only part of a string, defined by the numbers after it. The first character determines the starting point of the selection, while the second value determines the length (if no second value is present, it will continue until the end of the string). In this example, the substr operator removes all but the string from characters 10 to 15:

substr(\${previous_question}, 10, 15)

When the first number is negative, substr will start counting from the end of the string instead of the beginning. This example will return only the last five characters of the answer:

substr(\${previous_question}, -5)

The calculation column can also be used for aggregate functions in repeats. For more information, see Repeats.

### Complex mathematical functions

The calculation column can also handle more complex mathematical operations. This example determines the area of a plot given its radius, using the pi and square root functions:

Other mathematical functions you can use in Survey123 are listed in the following table:

FunctionDescriptionExample

pi()

Returns pi.

pi()

acos(value)

Returns the arccosine of the value.

acos(\${question_one})

asin(value)

Returns the arcsine of the value.

asin(\${question_one})

atan(value)

Returns the arctangent of the value.

atan(\${question_one})

cos(value)

Returns the cosine of the value.

cos(\${question_one})

sin(value)

Returns the sine of the value.

sin(\${question_one})

tan(value)

Returns the tangent of the value as degrees of an angle.

tan(\${question_one})

exp(value)

Returns the natural exponent of the value.

exp(\${question_one})

exp10(value)

Returns 10 to the power of the value.

exp10(\${question_one})

log(value)

Returns the natural logarithm of the value.

log(\${question_one})

log10(value)

Returns the base-ten logarithm of the value.

log10(\${question_one})

sqrt(value)

Returns the square root of the value.

sqrt(\${question_one})

atan2(value1, value2)

Returns the arctangent of the quotient of the values.

atan2(\${question_one},\${question_two})

round(value, power)

Returns the rounded value.

round(\${question_one}, 5)

pow(value, power)

Returns the value to the power specified.

pow(\${question_one}, 3)

The following best practices should be adhered to when using calculations:

• When using random(), consider adding a constant to avoid getting zero (0) as a result, for example, random()+0.5. A value of 0 may result in a blank answer.
• As with constraints, ensure that all decimal values between -1 and 1 in your formula have a leading 0, as leading with a decimal point will cause errors.
• The data type of a calculation result is dependant on the data type of each element of the calculation. If a calculation is performed on two integers, the calculation result is an integer. If a calculation includes a string data type, you may find that a + operator concatenate values instead of adding them. To avoid unexpected results, use the number() function to ensure string values in a calculation are treated as numbers. For example, the calculation to add question1 (of type integer) to question2 (of type text) would be \${question1} + \${integer(question2)}.
##### Note:

The XLSForm default binding type for a calculation question is string. To overwrite this default, you can explicitly designate the type in your spreadsheet. Add a column to your spreadsheet called bind::type and enter the required type (for example, int or decimal) for your question.

### Date formatting

The format-date function can be used in the calculation field to format date and time values. This can be useful when you want to display portions of dates to users, or persist them as strings.

In this example, the value contained in a previous time or dateTime question is returned in 24-hour time:

format-date(\${previous_time},'%H:%M')

The qualifiers that can be used in the format-date function are as follows:

QualifierDescription

%Y

4-digit year

%y

2-digit year

%m

%n

Numeric month

%b

Month name abbreviated

%d

%e

Day of month

%H

%h

Hour (24-hour time)

%M

%S

\$3

%a

Three letter short text day

%W

Week number

## Empty values

When using constraints and calculations that refer to other questions, consider what occurs when that question is empty (that is, it has no response). Empty values are represented as follows:

• NaN (not a number) for integer and decimal questions. This is a special value that represents the absence of a valid value.
• '' (an empty string) for text questions. The default data type for select_one, select_mulitple, and hidden questions is also text. When these question types are empty, they will contain an empty string.

Depending on whether the value is a number or text, the behavior in calculations differ.

In integer or decimal questions, the following behaviors occur:

• Any mathematical expression with a value of NaN will fail to complete, and the question will remain empty.
• The min() and max() functions will complete and ignore any NaN values.
• The NaN value compared to any other value will only be true in a calculation of an "is not equal" value comparison. All other expressions will result in false.

In text questions, the following behaviors occur:

• The concatenation of text questions will complete if empty values are present. For example, "Hello" + \${firstName} + ", how are you?" will result in "Hello , how are you?" when the question firstName is empty.
• The min() and max() functions will complete and ignore any empty strings.
• An empty text answer is equal to another empty text answer and is always less than any nonempty text.

You can determine if a question is empty using the string-length function. The string-length function can be used with all question types. For example, string-length(\${Question1}) will return 0 if Question1 is empty.

## Select_multiple questions

The responses to select_multiple question types are stored differently than others, with each checked answer entered in the order it was selected, separated by commas. For example, selecting answers A and B in that order will present the response as 'A,B'.

Some XLSForm features will not work with select_multiple questions. For example, if you enter the answer 'A' in the relevant column of a question referring to your select_multiple question, and the survey response is 'A,B', the response will not be considered relevant. In this case, the solution would be to use the selected() function, which will check if any one of the values appear in the list.

The following expression, when used in the relevant column of a question, would display the question if the user has selected 'A' as one of the responses in the referenced select_multiple question. Additional responses in the select_multiple question would not change this behavior.

selected(\${previous_question}, 'A')

## Work with external content

By using the pulldata() function in the calculation column of a question, you can preload data from a CSV file that you include in the media folder of your survey. The pulldata() function requires the following four parameters to be specified in order:

1. The name of the CSV file that contains the list of values. The name does not include the .csv filename suffix.
2. The name of the column in the CSV file that contains the value you want to return.
3. The name of the key field in the CSV file that you will use to look up the value.
4. The key value to look up in the key field.

These values can be defined directly or through variables defined elsewhere in the survey. In the following example, the calculation will return the email address of somebody named in a previous question from a CSV file called info:

pulldata('info','email','name', \${previous_question})

The same pulldata() function also works in the constraints column, preventing the user from submitting answers that are not in the CSV file. In the constraints column, this same formula would prevent the form from accepting any values that are not in the CSV file's name column.

There are some limitations to the pulldata() function. The name of the key field has the same limitation as the name column in the choices worksheet, meaning that these values can't have spaces or non-ASCII characters. Additionally, because these are CSV files, using a comma in any of these fields will cause your pulldata() function to produce incorrect results.

##### Note:

The pulldata() function can't be used to populate the values of select_multiple questions.

If values in your CSV file exceed 255 characters, you will need to enter a higher value in the bind::esri:fieldLength column for both the question you're populating with the CSV content and any questions being used as input to the pulldata() function. If your CSV file includes values that are larger than the maximum length of one of these fields, a survey response will fail to submit and will display a Code 1000 error.

### Other uses of pulldata

The pulldata() function can also extract properties provided by the device by providing two parameters: "@property" and the name of the property you intend to extract. This example will return the UTC offset of the device, helping users record the time zone that a survey was taken in.

pulldata("@property", 'utcoffset')

Geopoint questions can have their values extracted and deconstructed by the pulldata() function. For more information, see Geopoints.

Image questions can have EXIF and JSON data extracted by the pulldata() function. For more information, see Media.