XML is a common text-based markup language. ArcGIS Velocity can ingest IoT observation data expressed as XML from a variety of sources.
XML is supported as a data format for the following feed and data source types:
- Feeds: AWS IoT, Azure Event Hub, Azure Service Bus, HTTP Poller, HTTP Receiver, Kafka, MQTT, RabbitMQ, and WebSocket.
- Data sources: Amazon S3, Azure Blob Storage, and HTTP Poller.
Options for XML configuration
When configuring a feed or data source, sampling of the data occurs to determine the type of data being ingested. If sampling determines the data to be XML, additional properties regarding the XML configuration can be specified.
Root element
The Root element parameter specifies, if applicable, the root element of the XML structure in which messages are found. The root element should identify an object whose subelements are attributes. Leave this parameter blank if the XML structure contains all the messages under the root of the XML document.
For example, consider the following sample XML data with multiple elements:
<?xml version="1.0" encoding="UTF-8"?>
<flight>
<aircraftData>
<aircraftID>N-X-211</aircraftID>
<pilot>Charles L.</pilot>
<speed>212.28</speed>
<altitude>20200</altitude>
<heading>44.02</heading>
<latitude>52.653726</latitude>
<longitude>-21.885421</longitude>
<status>In flight</status>
<datetime>1589962024</datetime>
</aircraftData>
<aircraftData>
<aircraftID>N02061</aircraftID>
<pilot>Jane D.</pilot>
<speed>132.00</speed>
<altitude>13008</altitude>
<heading>264.54</heading>
<latitude>19.510083</latitude>
<longitude>-153.950114</longitude>
<status>In flight</status>
<datetime>1589962042</datetime>
</aircraftData>
<airportData>
<name>John F. Kennedy Int. Airport</name>
<IATA>JFK</IATA>
<ICAO>KJFK</ICAO>
<date>1589932800</date>
<location>
<latitude>40.643585</latitude>
<longitude>-73.781927</longitude>
</location>
</airportData>
</flight>
If a root element of aircraftData is specified, only the following data will be ingested by Velocity:
<aircraftData>
<aircraftID>N-X-211</aircraftID>
<pilot>Charles L.</pilot>
<speed>212.28</speed>
<altitude>20200</altitude>
<heading>44.02</heading>
<latitude>52.653726</latitude>
<longitude>-21.885421</longitude>
<status>In flight</status>
<datetime>1589962024</datetime>
</aircraftData>
<aircraftData>
<aircraftID>N02061</aircraftID>
<pilot>Jane D.</pilot>
<speed>132.00</speed>
<altitude>13008</altitude>
<heading>264.54</heading>
<latitude>19.510083</latitude>
<longitude>-153.950114</longitude>
<status>In flight</status>
<datetime>1589962042</datetime>
</aircraftData>
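The effect of the Root element parameter can be approximated outside Velocity. The following is a minimal Python sketch, not Velocity's implementation: the `select_records` helper and the shortened `SAMPLE` document are hypothetical, and the sketch assumes every element whose name matches the root element becomes one record.

```python
import xml.etree.ElementTree as ET

# Shortened version of the flight sample above (illustrative only).
SAMPLE = """<flight>
  <aircraftData>
    <aircraftID>N-X-211</aircraftID>
    <speed>212.28</speed>
  </aircraftData>
  <aircraftData>
    <aircraftID>N02061</aircraftID>
    <speed>132.00</speed>
  </aircraftData>
  <airportData>
    <name>John F. Kennedy Int. Airport</name>
  </airportData>
</flight>"""


def select_records(xml_text, root_element):
    """Return one dict per element named root_element (hypothetical helper)."""
    document = ET.fromstring(xml_text)
    records = []
    for record in document.iter(root_element):
        # Each direct subelement of the record becomes one attribute.
        records.append({child.tag: (child.text or "").strip() for child in record})
    return records


rows = select_records(SAMPLE, "aircraftData")
print(rows)
```

Under these assumptions, the airportData element is skipped because its name does not match the root element.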
The schema derivation in Velocity is illustrated below:
Flatten
The Flatten parameter determines whether any nested XML will be broken into separate fields. These new fields are named by appending subelements with an underscore to the higher level, or parent, element. In the following example, the location element contains two subelements: latitude and longitude. If flattened, the location element is broken into two fields, location_latitude and location_longitude.
If flattening, the XML can be nested with multiple levels of depth. The new flattened field names will append object names with an underscore until reaching the deepest level subelement.
For example, consider the following sample XML, specifically the codes and location elements:
<?xml version="1.0" encoding="UTF-8"?>
<airportData>
<airport>
<name>John F. Kennedy Int. Airport</name>
<codes>
<IATA>JFK</IATA>
<ICAO>KJFK</ICAO>
</codes>
<status>Operational</status>
<date>1589932800</date>
<location>
<latitude>40.643585</latitude>
<longitude>-73.781927</longitude>
</location>
</airport>
<airport>
<name>Los Angeles Int. Airport</name>
<codes>
<IATA>LAX</IATA>
<ICAO>KLAX</ICAO>
</codes>
<status>Delayed</status>
<date>1589932800</date>
<location>
<latitude>33.942619</latitude>
<longitude>-118.420942</longitude>
</location>
</airport>
</airportData>
If airport is specified as the root element and the Flatten check box is checked, the sample XML above is processed as follows, with the codes and location subelements broken into their respective fields:
<airport>
<name>John F. Kennedy Int. Airport</name>
<codes_IATA>JFK</codes_IATA>
<codes_ICAO>KJFK</codes_ICAO>
<status>Operational</status>
<date>1589932800</date>
<location_latitude>40.643585</location_latitude>
<location_longitude>-73.781927</location_longitude>
</airport>
<airport>
<name>Los Angeles Int. Airport</name>
<codes_IATA>LAX</codes_IATA>
<codes_ICAO>KLAX</codes_ICAO>
<status>Delayed</status>
<date>1589932800</date>
<location_latitude>33.942619</location_latitude>
<location_longitude>-118.420942</location_longitude>
</airport>
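The underscore naming rule described above can be sketched in a few lines of Python. This is an illustration of the rule only, not Velocity's code; the `flatten` function and the `AIRPORT` sample are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# One airport record from the sample above, shortened for illustration.
AIRPORT = """<airport>
  <name>John F. Kennedy Int. Airport</name>
  <codes>
    <IATA>JFK</IATA>
    <ICAO>KJFK</ICAO>
  </codes>
  <location>
    <latitude>40.643585</latitude>
    <longitude>-73.781927</longitude>
  </location>
</airport>"""


def flatten(element, prefix=""):
    """Map each leaf element to a field named parent_child (hypothetical helper)."""
    fields = {}
    for child in element:
        name = f"{prefix}_{child.tag}" if prefix else child.tag
        if len(child):
            # The child has subelements, so descend and extend the prefix.
            fields.update(flatten(child, name))
        else:
            fields[name] = (child.text or "").strip()
    return fields


flat = flatten(ET.fromstring(AIRPORT))
print(flat)
```

Because the function recurses, deeper nesting keeps appending names with underscores until a leaf subelement is reached.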
The schema is ingested in Velocity as illustrated below:
Flattening exemptions
The Flattening exemptions parameter allows you to specify one or more XML element names that will be left as a single string element and not broken into separate fields.
Consider the following sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<airportData>
<airport>
<name>John F. Kennedy Int. Airport</name>
<codes>
<IATA>JFK</IATA>
<ICAO>KJFK</ICAO>
</codes>
<status>Operational</status>
<date>1589932800</date>
<location>
<latitude>40.643585</latitude>
<longitude>-73.781927</longitude>
</location>
</airport>
<airport>
<name>Los Angeles Int. Airport</name>
<codes>
<IATA>LAX</IATA>
<ICAO>KLAX</ICAO>
</codes>
<status>Delayed</status>
<date>1589932800</date>
<location>
<latitude>33.942619</latitude>
<longitude>-118.420942</longitude>
</location>
</airport>
</airportData>
If airport is specified in the Root element parameter, the Flatten parameter is checked, and the Flattening exemptions parameter value of codes is specified, the sample XML above will be treated as illustrated below. Notice that the codes element is not broken up by flattening and IATA and ICAO are maintained as subelements.
<airport>
<name>John F. Kennedy Int. Airport</name>
<codes>
<IATA>JFK</IATA>
<ICAO>KJFK</ICAO>
</codes>
<status>Operational</status>
<date>1589932800</date>
<location_latitude>40.643585</location_latitude>
<location_longitude>-73.781927</location_longitude>
</airport>
<airport>
<name>Los Angeles Int. Airport</name>
<codes>
<IATA>LAX</IATA>
<ICAO>KLAX</ICAO>
</codes>
<status>Delayed</status>
<date>1589932800</date>
<location_latitude>33.942619</location_latitude>
<location_longitude>-118.420942</location_longitude>
</airport>
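A flattening exemption can be sketched as a name check before descending into an element. The Python below is a hedged illustration: `flatten_with_exemptions` is a hypothetical helper, and storing the exempt element as its serialized XML text is an assumption about what "a single string element" looks like, not confirmed Velocity behavior.

```python
import xml.etree.ElementTree as ET

# One airport record from the sample above, shortened for illustration.
AIRPORT = """<airport>
  <name>John F. Kennedy Int. Airport</name>
  <codes>
    <IATA>JFK</IATA>
    <ICAO>KJFK</ICAO>
  </codes>
  <location>
    <latitude>40.643585</latitude>
    <longitude>-73.781927</longitude>
  </location>
</airport>"""


def flatten_with_exemptions(element, exemptions=(), prefix=""):
    """Flatten nested elements, keeping exempt elements as one string field."""
    fields = {}
    for child in element:
        name = f"{prefix}_{child.tag}" if prefix else child.tag
        if child.tag in exemptions:
            # Assumption: the exempt element is kept whole as serialized XML text.
            fields[name] = ET.tostring(child, encoding="unicode").strip()
        elif len(child):
            fields.update(flatten_with_exemptions(child, exemptions, name))
        else:
            fields[name] = (child.text or "").strip()
    return fields


flat = flatten_with_exemptions(ET.fromstring(AIRPORT), exemptions=("codes",))
print(sorted(flat))
```

In this sketch, codes remains a single field while location is still split into location_latitude and location_longitude.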
The schema is ingested in Velocity as illustrated below:
Flatten arrays
The Flatten arrays parameter specifies whether arrays are flattened into separate fields. New fields are named by appending the array element name with an underscore and the name of items in the array element. When flattening arrays of elements, the XML can be nested with multiple levels of depth. The new flattened field names append the index of each array item with an underscore until reaching the deepest element.
For example, the following sample XML shows the tags array element:
<?xml version="1.0" encoding="UTF-8"?>
<flight>
<aircraftData>
<aircraftID>N-X-211</aircraftID>
<pilot>Charles L.</pilot>
<speed>212.28</speed>
<altitude>20200</altitude>
<heading>44.02</heading>
<latitude>52.653726</latitude>
<longitude>-21.885421</longitude>
<status>In flight</status>
<datetime>1589962024</datetime>
<tags>
<tag n="0">boeing</tag>
<tag n="1">vtol</tag>
<tag n="2">glider</tag>
</tags>
</aircraftData>
</flight>
If aircraftData is specified in the Root element parameter, and the Flatten arrays parameter is checked, the sample XML above will be processed as follows. Notice the items in the tags element are divided into respective fields:
<?xml version="1.0" encoding="UTF-8"?>
<flight>
<aircraftData>
<aircraftID>N-X-211</aircraftID>
<pilot>Charles L.</pilot>
<speed>212.28</speed>
<altitude>20200</altitude>
<heading>44.02</heading>
<latitude>52.653726</latitude>
<longitude>-21.885421</longitude>
<status>In flight</status>
<datetime>1589962024</datetime>
<tags_tag_0_n>0</tags_tag_0_n>
<tags_tag_0>boeing</tags_tag_0>
<tags_tag_1_n>1</tags_tag_1_n>
<tags_tag_1>vtol</tags_tag_1>
<tags_tag_2_n>2</tags_tag_2_n>
<tags_tag_2>glider</tags_tag_2>
</aircraftData>
</flight>
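The field names in the output above (tags_tag_0, tags_tag_0_n, and so on) follow a pattern that can be reproduced with a short Python sketch. The naming logic here is inferred from this one example and is an assumption, not documented Velocity behavior; `flatten_arrays` and `TAGS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Shortened aircraftData record containing the tags array.
TAGS = """<aircraftData>
  <aircraftID>N-X-211</aircraftID>
  <tags>
    <tag n="0">boeing</tag>
    <tag n="1">vtol</tag>
    <tag n="2">glider</tag>
  </tags>
</aircraftData>"""


def flatten_arrays(element, prefix=""):
    """Flatten repeated sibling elements into indexed fields (hypothetical helper)."""
    # Sibling elements that share a name are treated as items of an array.
    counts = {}
    for child in element:
        counts[child.tag] = counts.get(child.tag, 0) + 1

    fields, next_index = {}, {}
    for child in element:
        name = f"{prefix}_{child.tag}" if prefix else child.tag
        if counts[child.tag] > 1:
            index = next_index.get(child.tag, 0)
            next_index[child.tag] = index + 1
            name = f"{name}_{index}"  # for example, tags_tag_0
        for attribute, value in child.attrib.items():
            fields[f"{name}_{attribute}"] = value  # for example, tags_tag_0_n
        if len(child):
            fields.update(flatten_arrays(child, name))
        else:
            fields[name] = (child.text or "").strip()
    return fields


flat = flatten_arrays(ET.fromstring(TAGS))
print(flat)
```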
The schema is ingested in Velocity as illustrated below:
Array flattening exemptions
The Array flattening exemptions parameter allows you to specify one or more XML array names for which only the last item in the array is returned, instead of the array being divided into separate fields. Consider the following sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<flight>
<aircraftData>
<aircraftID>N-X-211</aircraftID>
<pilot>Charles L.</pilot>
<speed>212.28</speed>
<altitude>20200</altitude>
<heading>44.02</heading>
<latitude>52.653726</latitude>
<longitude>-21.885421</longitude>
<status>In flight</status>
<datetime>1589962024</datetime>
<tags>
<tag n="0">boeing</tag>
<tag n="1">vtol</tag>
<tag n="2">glider</tag>
</tags>
<setting>
<value>3</value>
<value>2</value>
<value>1</value>
</setting>
</aircraftData>
</flight>
If the Flatten arrays parameter is checked, and an Array flattening exemptions parameter value of setting_value is specified, the sample XML above is processed as illustrated below. In the setting array, only the last item in the array is returned. To indicate that multiple arrays are to be exempted from flattening, use commas in the Array flattening exemptions parameter to separate values.
<?xml version="1.0" encoding="UTF-8"?>
<flight>
<aircraftData>
<aircraftID>N-X-211</aircraftID>
<pilot>Charles L.</pilot>
<speed>212.28</speed>
<altitude>20200</altitude>
<heading>44.02</heading>
<latitude>52.653726</latitude>
<longitude>-21.885421</longitude>
<status>In flight</status>
<datetime>1589962024</datetime>
<tags_tag_0_n>0</tags_tag_0_n>
<tags_tag_0>boeing</tags_tag_0>
<tags_tag_1_n>1</tags_tag_1_n>
<tags_tag_1>vtol</tags_tag_1>
<tags_tag_2_n>2</tags_tag_2_n>
<tags_tag_2>glider</tags_tag_2>
<setting>
<value>1</value>
</setting>
</aircraftData>
</flight>
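The "last item only" behavior for exempt arrays can be sketched in Python as follows. This is an illustration under stated assumptions: `keep_last_items` is a hypothetical helper, and reading the value setting_value as "value items inside the setting element" is inferred from the example above.

```python
import xml.etree.ElementTree as ET

# Shortened aircraftData record containing the setting array.
RECORD = """<aircraftData>
  <aircraftID>N-X-211</aircraftID>
  <setting>
    <value>3</value>
    <value>2</value>
    <value>1</value>
  </setting>
</aircraftData>"""


def keep_last_items(record, exemptions):
    """Drop all but the last item of each exempt array (hypothetical helper)."""
    # Multiple exemptions are separated by commas, as in the parameter.
    for exemption in (part.strip() for part in exemptions.split(",")):
        if not exemption:
            continue
        # Assumption: "setting_value" means <value> items under <setting>.
        parent_path, _, item_name = exemption.rpartition("_")
        parent = record.find(parent_path.replace("_", "/")) if parent_path else record
        if parent is None:
            continue
        for item in parent.findall(item_name)[:-1]:
            parent.remove(item)
    return record


record = keep_last_items(ET.fromstring(RECORD), "setting_value")
values = [value.text for value in record.iter("value")]
print(values)
```

Splitting the exemption on its last underscore would misread element names that themselves contain underscores, so treat the path handling as a simplification.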
The schema is represented in Velocity as illustrated below:
Considerations and limitations
When working with XML formatted data in Velocity, there are several important considerations and limitations.
XML data with nested subelements
XML requires a hierarchy of elements that begins and ends with a root element. An unspecified number of additional elements and subelements can be under a root element. Currently, Velocity derives the schema from the primary elements found under the specified root. Any additional subelements under the primary elements are handled as concatenated strings. Further processing of the strings can be performed using Velocity analytic tools. Alternatively, you can use the Root element parameter to define a new root in the XML document.
XML samples compared to the entire schema
In an XML document, elements sharing an identical name do not always have the same subelements. This has implications when first configuring, editing, or eventually consuming an XML-based feed or data source. Velocity samples the data stream or files to determine the data format and schema. If all the samples returned in the sampling process are missing XML elements because those samples have a different schema, those fields are omitted from the feed or source data schema.
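The sampling limitation can be illustrated with a small Python sketch. It assumes, for illustration only, that the schema is the set of field names seen in the sampled messages; `derive_schema` and the `STREAM` messages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Three messages; only the third carries a <status> element.
STREAM = [
    "<aircraftData><aircraftID>N-X-211</aircraftID><speed>212.28</speed></aircraftData>",
    "<aircraftData><aircraftID>N02061</aircraftID><speed>132.00</speed></aircraftData>",
    "<aircraftData><aircraftID>N-X-300</aircraftID><status>In flight</status></aircraftData>",
]


def derive_schema(samples):
    """Collect field names, in first-seen order, from the given samples only."""
    schema = []
    for message in samples:
        for child in ET.fromstring(message):
            if child.tag not in schema:
                schema.append(child.tag)
    return schema


sampled_schema = derive_schema(STREAM[:2])  # sampling happens to see two messages
full_schema = derive_schema(STREAM)
missing = [name for name in full_schema if name not in sampled_schema]
print(missing)
```

Here the status field never appears in the sampled messages, so it is absent from the derived schema even though later messages contain it.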
XML file size
As a best practice, keep each XML file imported to Velocity under 100 MB. If a file is larger than 100 MB, it is recommended that you split it into separate files that are each under 100 MB.
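One way to split a large file is to group its records into several smaller documents before import. The Python below is a minimal sketch, not an Esri tool: `split_records` is a hypothetical helper, 100 MB is assumed to mean 100 * 1024 * 1024 bytes, and the example loads the whole document into memory, which may not suit very large files.

```python
import xml.etree.ElementTree as ET

MAX_BYTES = 100 * 1024 * 1024  # assumption: 100 MB counted as 100 MiB

DOC = (
    "<flight>"
    "<aircraftData><aircraftID>A</aircraftID></aircraftData>"
    "<aircraftData><aircraftID>B</aircraftID></aircraftData>"
    "<aircraftData><aircraftID>C</aircraftID></aircraftData>"
    "</flight>"
)


def split_records(xml_text, record_tag, max_bytes=MAX_BYTES):
    """Group records into documents whose records total at most max_bytes."""
    source = ET.fromstring(xml_text)
    chunks, current, size = [], ET.Element(source.tag), 0
    for record in source.iter(record_tag):
        record_size = len(ET.tostring(record))  # serialized size in bytes
        if size + record_size > max_bytes and len(current):
            # The current document is full; start a new one.
            chunks.append(current)
            current, size = ET.Element(source.tag), 0
        current.append(record)
        size += record_size
    if len(current):
        chunks.append(current)
    # The wrapper tags add a few bytes that the budget does not count.
    return [ET.tostring(chunk, encoding="unicode") for chunk in chunks]


# A tiny budget is used here so the split is visible on the small sample.
parts = split_records(DOC, "aircraftData", max_bytes=120)
print(len(parts))
```

Each returned string is a complete document with the original root element name, so it can be written to its own file.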