Include custom attributes in your output

AllSource 1.3

Available with LocateXT license.

When you extract locations from documents or text, the output feature class contains a point for each location found in the document. Fields in the attribute table store text extracted from the document before and after the location to provide context and help you evaluate it. If you chose to extract dates found in the document, the dates are also stored in feature attributes. You can also extract additional information of interest and store it in custom fields in the output feature class.

For example, documents containing firsthand accounts of a volcanic eruption may include words that are relevant to understanding the nature of the eruption, such as crackling, gas, haze, ash, explosion, steam, lava, and so on. You can define the words to recognize and extract, and place them in a custom field in the attribute table. Custom attributes are defined in a custom attribute file (.lxtca).

Learn about a location's default attributes

If you have a custom attribute file you want to use, add it to the Custom Attribute Files list, activate the file, and turn on the custom attributes toggle. When you extract locations to an existing map layer, new custom attribute fields are not added to the existing feature class's attribute table. When you extract locations to a new map layer, new custom attributes are included in the output feature class's attribute table.

The custom attribute file will be used to define custom fields in the output feature class's attribute table. If the specified content is found when scanning the input files or text, it is extracted and stored in the custom fields.

If you extract locations to an existing map layer and feature class instead, custom fields are not added to the existing feature class's attribute table. However, existing fields can be used to store the specified content if they have the correct data type.

Turn on or turn off custom attributes

When the custom attributes toggle is turned on and you extract locations to a new map layer, the custom attributes defined in the active custom attribute file are included in the output feature class's attribute table. When the custom attributes toggle is turned off and you extract locations to a new map layer, the output feature class's attribute table will only include the default attributes.

  1. In the Extract Locations pane, click the Properties tab.
  2. Do one of the following to turn the custom attributes toggle on or off:
    • Click the Options tab, and click the Custom attributes toggle.
    • Click the Extract attributes tab, click the Custom Attributes tab, and click the Create fields from custom attributes toggle.

    Click a toggle that is off to turn it on. Click a toggle that is on to turn it off.

Access the Custom Attributes tab

Access the Custom Attributes tab to activate a custom attribute file, create a custom attribute file, or manage your custom attribute files.

  1. In the Extract Locations pane, click the Properties tab.
  2. Do one of the following to access the Custom Attributes tab:
    • Click the Options tab, and click the arrow next to the Custom attributes toggle.
    • Click the Extract attributes tab, and click the Custom Attributes tab.

Define a custom attribute

When the Custom Attribute File dialog box first appears, the Attributes list is empty, but the dialog box is immediately ready to add new attributes to the file. Start typing in the form to define your new custom attribute and add it to the Attributes list when you are finished. To edit an existing attribute, select it in the attributes list and start editing; update the attribute when your changes are complete. If you start adding or editing an attribute and do not want to save your changes, cancel them—this clears the form and allows you to start defining a new attribute instead.

When an attribute has been added or an existing attribute has been updated, the attribute name appears in italic text with an asterisk (*), indicating that it has not been saved to the custom attribute file.

The four components of a custom attribute are as follows:

  • Storage—These properties determine how the field is defined in the attribute table when an output feature class is created.
  • Search options—These properties define how input documents are examined for information that can be extracted.
  • Keywords—These properties define what you are looking for in the input documents.
  • Capture options—If a keyword is found, these properties define what text is extracted from the document and stored in the field.

Storage

Properties determining how a custom attribute is stored in the output feature class are defined under the Attribute Information heading. The name provided in the Attribute Name text box appears in the Attributes list and is also used as the field's alias.

As you type a value for the attribute name, a corresponding value is added to the Field Name text box. The attribute name is adjusted to meet typical field naming requirements. For example, if you type Event Type in the Attribute Name text box, Event_Type appears in the Field Name text box. The field name can be changed to any suitable value.

All custom attributes are assigned the text data type when they are included in a feature class's attribute table. By default, the field size is set to store strings 254 characters in length. Change the value in the Field Length text box to a larger or smaller value, as appropriate.

If you always create geodatabase feature classes as output, provide field names and sizes that are appropriate for this type of data. If you later use the same custom attribute file and create a shapefile as output instead, field names and sizes will be truncated to the allowable limit for this type of data.
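
As a rough illustration of the naming and truncation behavior described above, the following Python sketch derives a field name from an attribute name and then clips the name and length to shapefile limits. The function names are hypothetical, and both rules (replacing each run of characters other than letters, digits, and underscores with one underscore, and plain truncation) are assumptions; the exact rules the software applies are not stated here. The limits of 10 characters for field names and 254 characters for text fields are the standard shapefile (dBASE) limits.

```python
import re

SHAPEFILE_NAME_LIMIT = 10    # dBASE field names hold at most 10 characters
SHAPEFILE_TEXT_LIMIT = 254   # dBASE text fields hold at most 254 characters


def derive_field_name(attribute_name):
    """Assumed rule: replace each run of non-word characters with one underscore."""
    return re.sub(r"\W+", "_", attribute_name.strip())


def fit_to_shapefile(field_name, field_length):
    """Assumed rule: plain truncation of the name and length to shapefile limits."""
    return field_name[:SHAPEFILE_NAME_LIMIT], min(field_length, SHAPEFILE_TEXT_LIMIT)


print(derive_field_name("Event Type"))                 # Event_Type
print(fit_to_shapefile("Eruption_Description", 500))   # ('Eruption_D', 254)
```

If the same custom attribute file may be used for both geodatabase and shapefile output, choosing field names of 10 characters or fewer avoids ambiguous truncated names.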

Search options

Properties defining how input documents are examined and how keywords are handled if they are found are specified under the Search Options heading. The search type determines how documents and text are examined for keywords. The Type drop-down list has the following two options:

  • Entire document—The entire document is scanned for the specified keywords. This is the default setting.
  • Near locations—When a location is found in a document, the text before and after the location is scanned for the specified keywords. The amount of text scanned before the location is determined by the value in the Characters Before text box. The amount of text scanned after the location is determined by the value in the Characters After text box. The Characters Before and Characters After text boxes are both set to 60 characters by default, for a total range of 120 characters.

When a keyword is found, how it is handled is determined by the value in the Matches drop-down list, which has the following options:

  • Keep only first—Only the first keyword found in the document or in the specified range is handled. This is the default.
  • Keep all—All keywords found in the document or in the specified range are handled.

Consider an input document describing schools in Redlands, CA that includes the following lines:

Date: February 7, 2019

Source: http://www.ed-data.org/district/San-Bernardino/Redlands-Unified

School: Redlands Senior High, Type: High School, Charter: N, Grades: 9-12, Location: 117.1717550°W 34.0552456°N, students: 2325, enrollDate: 2017/08/09, Established: 1891, address: 840 East Citrus Ave. Redlands CA 92374-5399

School: Citrus Valley High, Type: High School, Charter: N, Grades: 9-12, Location: 117.1922398°W 34.0816164°N, students: 2168, enrollDate: 2017/08/09, Established: 2008, address: 800 West Pioneer Ave. Redlands CA 92374-1509

This document has many locations, and many instances of the words Redlands and school. If two keywords are defined to extract these words, the following combination of options will produce the following results:

  • Entire document + Keep only first—Each location will have the same value. The first keyword found in the document will be extracted and recorded in the custom attribute. The custom attribute value will be Redlands.
  • Entire document + Keep all—Each location will have the same value. All instances of the keywords Redlands and school found in the document will be extracted and recorded in the custom attribute. Each piece of extracted text is separated by the pipe character (|) in the attribute value. The custom attribute value will be Redlands | School | Redlands | School | Redlands | School | School | Redlands
  • Near locations + Keep only first, checking a character range of 60 characters before the location and zero characters after the location—Each location will have the first keyword found in the specified range of characters. Both locations will have the custom attribute value School.
  • Near locations + Keep all, checking a character range of 100 characters before the location and zero characters after the location—Each location will have all instances of the keywords found in the specified range of characters. The first location will have the custom attribute value School | Redlands | School. The second location will have the custom attribute value School | School.
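
The interaction between the search type and the Matches setting can be simulated with a short Python sketch. This is not the software's implementation: the function names, the regular expression used to find coordinates, the case-insensitive keyword matching, and the assumption that the sample lines are separated by blank lines are all illustrative choices. It covers both entire-document cases, the keep-only-first case with a 60-character range, and the keep-all case with a 100-character range.

```python
import re

# Sample document from the text; blank lines between entries are assumed.
DOC = "\n\n".join([
    "Date: February 7, 2019",
    "Source: http://www.ed-data.org/district/San-Bernardino/Redlands-Unified",
    "School: Redlands Senior High, Type: High School, Charter: N, Grades: 9-12, "
    "Location: 117.1717550°W 34.0552456°N, students: 2325, enrollDate: 2017/08/09, "
    "Established: 1891, address: 840 East Citrus Ave. Redlands CA 92374-5399",
    "School: Citrus Valley High, Type: High School, Charter: N, Grades: 9-12, "
    "Location: 117.1922398°W 34.0816164°N, students: 2168, enrollDate: 2017/08/09, "
    "Established: 2008, address: 800 West Pioneer Ave. Redlands CA 92374-1509",
])

KEYWORDS = re.compile(r"redlands|school", re.IGNORECASE)   # two keywords
LOCATION = re.compile(r"\d+\.\d+°W \d+\.\d+°N")            # simplified coordinate finder


def keyword_hits(text):
    """Return every keyword occurrence in text, in document order."""
    return [m.group() for m in KEYWORDS.finditer(text)]


def format_value(hits, keep_all):
    """Join all hits with a pipe, or keep only the first one."""
    return " | ".join(hits if keep_all else hits[:1])


def entire_document(text, keep_all):
    """Every location receives the same value, taken from the whole document."""
    value = format_value(keyword_hits(text), keep_all)
    return [value for _ in LOCATION.finditer(text)]


def near_locations(text, before, after, keep_all):
    """Each location receives the keywords found in the characters around it."""
    values = []
    for loc in LOCATION.finditer(text):
        preceding = text[max(0, loc.start() - before):loc.start()]
        following = text[loc.end():loc.end() + after]
        hits = keyword_hits(preceding) + keyword_hits(following)
        values.append(format_value(hits, keep_all))
    return values


print(entire_document(DOC, keep_all=False))     # ['Redlands', 'Redlands']
print(near_locations(DOC, 60, 0, False))        # ['School', 'School']
print(near_locations(DOC, 100, 0, True))        # ['School | Redlands | School', 'School | School']
```

Scanning the text before and after each location separately, rather than joining the two pieces, avoids a false match formed across the gap where the location itself was.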

Keywords

The keyword portion of the custom attribute form is immediately ready to add new keywords to the list. Start typing in the form to define a new keyword and add it to the Keywords list when you are finished. To edit an existing keyword, select it in the keywords list and start editing; update the keyword when your changes are complete. If you start adding or editing a keyword and do not want to save your changes, cancel them—this clears the form and allows you to start defining a new keyword instead.

When a new keyword has been added or an existing keyword has been updated, the keyword name appears in the Keywords list in italic text with an asterisk (*), indicating that it has not been saved to the custom attribute file.

Type the text you are looking for in the Keyword text box. If the last character in the keyword is a whitespace character, it will be ignored when the keyword is evaluated.

If appropriate, check Case Sensitive. If the text extracted from the document should include the text specified in the Keyword text box, check Include in Capture.
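
The effect of these keyword settings can be sketched in Python. The function names and parameters below are hypothetical stand-ins for the Keyword text box and the Case Sensitive and Include in Capture check boxes, and the sketch assumes matching is case insensitive when Case Sensitive is unchecked. It also strips all trailing whitespace from the keyword, which is a slight generalization of the rule stated above.

```python
import re


def find_keyword(text, keyword, case_sensitive=False):
    """Return the (start, end) span of the first keyword match, or None."""
    pattern = re.escape(keyword.rstrip())          # trailing whitespace is ignored
    flags = 0 if case_sensitive else re.IGNORECASE
    match = re.search(pattern, text, flags)
    return None if match is None else match.span()


def capture(text, keyword, number, case_sensitive=False, include_in_capture=False):
    """Capture `number` characters after the keyword, optionally with the keyword."""
    span = find_keyword(text, keyword, case_sensitive)
    if span is None:
        return None
    start, end = span
    begin = start if include_in_capture else end
    return text[begin:end + number]


sample = "Established: 1891"
print(repr(capture(sample, "established: ", 5)))                           # ' 1891'
print(repr(capture(sample, "established:", 5, include_in_capture=True)))   # 'Established: 1891'
print(capture(sample, "established:", 5, case_sensitive=True))             # None
```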

Capture options

Properties determining which text is extracted from the document and stored in the field in the output feature class's attribute table are defined under the Capture Options heading. If the field size specified for the custom attribute is smaller than the extracted text, the value stored in the field will be truncated. The text is extracted as is from the document, from the last non-whitespace character of the keyword to the stopping point specified by the selected capture option. The extracted text will include whitespace characters.

The following six options define which text is extracted. Using the same input document described above, an example of the text extracted is provided for most options.

  • Capture only keyword—Only the text specified in the Keyword text box will be extracted. It is not necessary to check Include in Capture for the keyword to extract the specified keyword. For example, if the keyword is redlands-unified, the text redlands-unified will be extracted and stored in the custom field. This is the default.
  • Capture number of characters—When this option is selected, the Number text box is enabled. The specified number of characters is extracted. By default, one character is extracted. For example, if the keyword is established:, extract five characters to store a value that includes all four characters of the year, such as 1891; the first character stored in the field will be the space that follows the colon (:). If a line in the document is missing the space after the colon, the five characters captured from that line will be the year followed by the next character, for example 1957 followed by a comma.
  • Capture number of words—When this option is selected, the Number text box is enabled. Text up to the last character in the last specified word is extracted. By default, one word is extracted. For these purposes, a word is the text that occurs between two nonalphanumeric characters. For example, if the keyword is grades and you extract two words, the text : 9-12 will be extracted. The first word is 9 and the second word is 12.
  • Capture number of lines—When this option is selected, the Number text box is enabled. The specified number of lines are extracted. By default, one line is extracted. For these purposes, one line is the position following the last character of the keyword to the end of the current line. If more than one line is extracted, all characters on the following number of specified lines are also extracted.
  • Capture until blank line—Text up to the next blank line or the end of the document is extracted. For example, if the keyword is date and there is not a blank line in the document, all text until the end of the document is extracted. If there is a blank line following the source URL in the file, all text up to the blank line is extracted.
  • Search until stop string—When this option is selected, the Stop String Text text box is enabled. All text until the specified stop string is extracted. For example, if the keyword is type: and the stop string is a comma (,), the text in between, such as High School, will be extracted. With this option, the Case Sensitive and Include in Capture check boxes also become enabled for the stop string, which should be checked, as appropriate. With a keyword such as address: and a stop string such as 92374, the text up to and including the stop string will be extracted: 840 East Citrus Ave. Redlands CA 92374. If other addresses have a different ZIP Code, all text until the next occurrence of the specified ZIP Code or the end of the file will be extracted.
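
Several of the capture options above can be approximated with a few lines of Python. This is a sketch under stated assumptions, not the software's implementation: the function names are hypothetical, keywords are matched case insensitively, the stop string is matched case sensitively, and the sample line is an abbreviated version of the first school entry. Captured values keep the leading space that follows the keyword, as described above.

```python
import re

# Abbreviated version of the first school entry from the sample document.
LINE = ("School: Redlands Senior High, Type: High School, Charter: N, "
        "Grades: 9-12, Established: 1891, "
        "address: 840 East Citrus Ave. Redlands CA 92374-5399")


def after_keyword(text, keyword):
    """Return the text that follows the first match of keyword, or None."""
    match = re.search(re.escape(keyword.rstrip()), text, re.IGNORECASE)
    return None if match is None else text[match.end():]


def capture_characters(text, keyword, number=1):
    """Capture number of characters."""
    rest = after_keyword(text, keyword)
    return None if rest is None else rest[:number]


def capture_words(text, keyword, number=1):
    """Capture number of words; a word is a run of alphanumeric characters."""
    rest = after_keyword(text, keyword)
    if rest is None:
        return None
    match = re.match(r"(?:[^A-Za-z0-9]*[A-Za-z0-9]+){%d}" % number, rest)
    return match.group() if match else rest


def capture_until_stop(text, keyword, stop, include_stop=False):
    """Search until stop string; falls back to the rest of the text if absent."""
    rest = after_keyword(text, keyword)
    if rest is None:
        return None
    end = rest.find(stop)
    if end == -1:
        return rest
    return rest[:end + len(stop)] if include_stop else rest[:end]


print(repr(capture_characters(LINE, "established:", 5)))      # ' 1891'
print(repr(capture_words(LINE, "grades", 2)))                 # ': 9-12'
print(repr(capture_until_stop(LINE, "type:", ",")))           # ' High School'
print(repr(capture_until_stop(LINE, "address:", "92374", include_stop=True)))
```

The last call returns the address up to and including the stop string. A line whose ZIP Code does not contain the stop string would instead return everything to the end of the text, which mirrors the caution in the stop string description.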

Add an attribute

When the Custom Attribute File dialog box first appears, it is immediately ready to add new attributes to the custom attribute file. Start typing in the form to define your new custom attribute and add it to the Attributes list when you are finished. When an attribute has been added, the attribute name appears in italic text with an asterisk (*), indicating that it has not been saved to the custom attribute file.

If the dialog box was previously opened and you started editing an existing attribute, the name of that attribute appears in the Attribute Name text box. Click Cancel at the bottom of the form to stop editing the attribute. The form is cleared and allows you to start defining a new attribute instead.

  1. Create a custom attribute file or edit a custom attribute file.

    The Custom Attribute File dialog box appears. The Attribute Name text box shows the hint New attribute indicating the form is ready for you to define a new attribute.

  2. Set storage options that determine how the field will be stored in the output feature class.
    1. Type a name for the attribute in the Attribute Name text box.
    2. In the Field Name text box, adjust the name of the field that will be stored in the feature class, if appropriate.
    3. Change the value in the Field Length text box to a longer or shorter length as appropriate.
  3. Set search options that determine how the input document will be examined for the information you want to find.
    1. Click the Type drop-down list and set the scope of text in which to look for the keyword. Set the range of text using the Characters Before and Characters After text boxes, if appropriate.
    2. Click the Matches drop-down list and choose an option indicating if you want to find the first occurrence of a keyword in the input file or all occurrences.
  4. Add keywords to the custom attribute.
    1. Type the text you are looking for in the document in the Keyword text box.
    2. If the text identifying the keyword is case sensitive, check Case Sensitive.
    3. If the keyword should be included in the extracted text, check Include in Capture.
  5. Specify the capture options that define what content is extracted from the document and stored in the field.
    1. Click the Capture Type drop-down list and click the appropriate option determining how you will define what text to extract.
    2. Type a value in the Number text box or the Stop String Text text box, if one of these options becomes enabled for the selected capture type.
    3. If the Stop String Text text box is enabled, check Case Sensitive and Include in Capture, if appropriate.
  6. Click Add Keyword.

    The keyword is added to the Keywords list.

  7. Click Add Attribute to add this custom attribute definition to the Attributes list.
  8. Click Save to save the custom attribute to the custom attribute file.

Edit an attribute

To edit an existing attribute in the Custom Attribute File dialog box, select it in the attributes list and start editing. Update the attribute when your changes are complete. If you start editing an attribute and do not want to save your changes, cancel them—this clears the form and allows you to start over or edit another attribute instead.

When an existing attribute has been updated, the attribute name appears in italic text with an asterisk (*), indicating that it has not been saved to the custom attribute file.

  1. Create a custom attribute file or edit a custom attribute file.

    The Custom Attribute File dialog box appears. The Attributes list includes all custom attributes previously stored to the file. The dialog box is immediately ready to add a new attribute—the Attribute Name text box is empty.

  2. Hover over the attribute you want to edit in the Attributes list, and click the Edit button.

    The attribute's properties appear in the form on the right.

  3. Change how the field is stored in the output feature class by adjusting the values under the Attribute Information heading, if appropriate.
  4. Change how the input document will be scanned to find information by adjusting the values under the Search Options heading, if appropriate.
  5. Hover over the keyword you want to edit in the Keywords list, and click the Edit button.

    The keyword's properties appear in the form.

  6. Change the text you are looking for in the document by adjusting the values under the Keyword heading, if appropriate.
  7. Change how text is extracted from the document when a keyword is found by adjusting the values under the Capture Options heading, if appropriate.
  8. Click Update Keyword to keep your changes to this keyword, or click Cancel to stop editing the keyword.

    When updated, the text identifying the keyword in the Keywords list is modified to reflect your changes, if appropriate. The keyword appears in italic text with an asterisk (*), indicating that your changes have not been saved.

  9. Remove any keywords that are not effective in extracting the information you want. Hover over the keyword you want to remove in the Keywords list, and click the Remove button.
  10. Click Update Attribute to keep your changes to this attribute, or click Cancel to stop editing the attribute.

    When updated, the text identifying the attribute in the Attributes list is modified to reflect your changes, if appropriate. The attribute's name and the file name at the top of the dialog box both appear in italic text with an asterisk (*), indicating that your changes have not been saved.

  11. Click Save to update the custom attribute definitions in the custom attribute file.
  12. Click Close to stop editing the custom attribute file.

Tip:

You can double-click an attribute in the Attributes list to edit it. Similarly, you can double-click a keyword in the Keywords list to edit it.

Remove attributes

To remove an attribute from a custom attribute file, first edit the file. Hover over the attribute you want to remove in the Attributes list, and click the Remove button, or press Delete. You can also remove multiple attributes from a custom attribute file at once by following the steps below.

  1. Create a custom attribute file or edit a custom attribute file.

    The Custom Attribute File dialog box appears.

  2. Click the first attribute you want to remove.

    The attribute is selected.

  3. Press and hold Ctrl or Shift.
  4. While holding the key, click the other attributes you want to remove.
  5. Click the Remove button at the top of the Attributes list, or press Delete.

    All selected attributes are removed.

  6. Click Save to update the custom attribute definitions in the custom attribute file.
  7. Click Close to stop editing the custom attribute file.