Skip To Content

Solve a spatial problem

In this exercise, you are a business analyst for a consortium of colleges that wants to run a marketing campaign in states with high-value colleges. It's up to you to find states with colleges that have a good return on investment (ROI) for students. You'll use Insights for ArcGIS to analyze United States Department of Education College Scorecard data in the form of a feature layer to find relationships between the cost of college and earnings by graduates. In 45 minutes or less, you will do the following:

  • Create interactive maps, charts, and tables.
  • Apply an advanced filter to your data.
  • Use spatial and nonspatial analysis techniques to solve a problem.

Note:

To complete this exercise, your account must have the following specifications:

  • Creator or GIS Professional user type
  • Publisher or Administrator role, or equivalent custom role
  • Insights license

For more information, see Administer Insights.

Add data to your organization

The data for this analysis has been provided publicly on the ArcGIS website, where it can be downloaded to your machine. Follow these steps to access and load the data to your ArcGIS Online organization:

Note:

If you completed the Create your first workbook lesson, you can skip to the next section. Reopen your workbook, and drag the CollegeScorecard.Table1 dataset to the New page tab New page then create a map on the new page.

  1. Follow the link to the CollegeScorecard item.
  2. Click the Download button to download the item to your machine.
  3. Unzip the folder and save the Microsoft Excel file on your computer in a location that you can find easily.
  4. Sign in to your ArcGIS Online account.
  5. Open Insights and sign in to your account if necessary.
    Tip:

    You can access Insights through the gallery of apps in ArcGIS Online.

    Gallery of apps
  6. On the Workbooks page, click New workbook.
  7. In the Add To Page window, click Excel or CSV.
  8. Click Browse my computer and open the Excel file or drag the file onto the Add To Page window. Click Add.
  9. The workbook opens with the CollegeScorecard.Table1 dataset in the data pane.
  10. Click Dataset options Dataset options for the Excel file and choose Enable Location from the menu.
  11. The default method of enabling location is using coordinates. The LONGITUDE and LATITUDE fields from the Excel file are already populated.
  12. Click Run.
  13. Expand the dataset in the data pane.
    A location field is added to the dataset
  14. A new location field called Coordinates has been added to the dataset.
  15. Drag the dataset onto your page and drop it on the Map drop zone to create a map of United States colleges.
    Create a map
  16. Tip:

    If you prefer to use buttons, you can select a field from the dataset and click the Map button above the data pane.

  17. Click Untitled Workbook and replace it with a unique and useful title, such as US Colleges - Your Name. Including your name in the title will make your workbook easier to find if you share your work. Click Save.

Calculate return on investment

The return on investment (ROI) for colleges is calculated using cost and earnings after graduation. In this section, you will begin your analysis by calculating ROI for all United States colleges.

  1. Click the Action button Action on the map to open the Analytics pane.
  2. Click on the Find Answers tab and How is it related? to display spatial and nonspatial analysis capabilities.
  3. Open Calculate Ratio. For the numerator, choose EARNINGS, and for the denominator, choose COST. Name the field ROI and click Run.
  4. A data table appears, providing a view of your raw data. The ROI field is the last column in the table.
  5. Close the data table.
  6. Save your workbook.

Find states with above average ROI

Now that your dataset has a field for return on investment, you can start to find more information about which areas have a high ROI. In this section you will filter your data and determine which states have the highest ROI.

  1. On the map card, click the Card filter button Card filter. The New Filter pane opens.
  2. Tip:

    Using a card filter instead of a dataset filter will allow you to work with both the filtered and unfiltered data throughout your analysis.

  3. Click Advanced to open the Expression Filter pane.
  4. An advanced filter is an expression-based filter that allows you to create complicated queries or incorporate calculations into your filter.
  5. Enter the expression ROI>AVG(ROI) to query only the colleges with a greater than average return on investment. Click Apply.
  6. Close the Card Filters pane.
  7. A result dataset Advanced card filter result is added to the data pane with the same name as your original dataset. You can rename the result to distinguish it from the original.
  8. Hover over the result dataset and click the Rename dataset button Rename dataset.
  9. Rename the dataset Colleges_ROI and hit Enter on the keyboard to set the changes.
  10. There are several ways to analyze ROI within states. In this case you will use a stacked bar chart so that you can incorporate the state and type of college.
  11. Expand Colleges_ROI to display the fields. Select STATE and TYPE, drag them to the Chart drop zone, and drop them on Stacked Bar Chart.
    Stacked bar chart of STATE, TYPE, and COUNT
  12. A stacked bar chart is created showing the count of colleges with an above average ROI for each state and college type.
  13. Drag the bottom edge of the chart card down to make it easier to see all of the states.
  14. Click the Sort button Sort on the card toolbar and choose Sort Descending Sort descending.
  15. The chart now shows the state with the highest count of colleges with above average ROI at the top, and the rest of the states in descending order. Now you can change the axis labels to make the chart easier to understand.
  16. Click More More and Edit Labels Edit Labels, then click the axis to rename it. Name the x-axis (horizontal) Count of colleges with above average ROI and rename the y-axis (vertical) State and college type.
  17. You now have a bar chart that shows the count of high ROI colleges for each state and college type. In the next section you will analyze the colleges spatially.

Analyze the ROI spatially

    In the previous section you used a chart to calculate the count of colleges with above average ROI. Another way to calculate the count is through a spatial aggregation. Doing a spatial aggregation will allow you to display the counts on a map and incorporate spatial patterns into your analysis. In order to do a spatial aggregation, you will need a spatial dataset with the appropriate boundaries.
  1. Click Add to open the Add To Page pane.
  2. Click Boundaries to display the available boundary layers.
  3. Select the USA States (Generalized) dataset. Use the Search bar if necessary to reduce the number of datasets being displayed. Click Add.
  4. A map of US states is added to your page. You do not need to keep this map on the page in order to use it in your analysis.
  5. On the map of US states, click More More then Delete Delete.
  6. Drag the state dataset from the data pane onto the map of Colleges_ROI and drop it on Spatial aggregation.
  7. The Spatial Aggregation pane opens. By default, the aggregation will calculate the count of colleges in each state.
  8. Click Run.
    Count of colleges with above average return on investment, by state
  9. The Colleges_ROI result dataset is replaced with a new Spatial Aggregation 1 dataset. The map is updated to show the counts of high-ROI colleges for each state with proportional symbols. The map shows the same information as the chart, but without incorporating college type. Rather than using two methods of displaying the same information, you can create a map that shows the percent of colleges with a high ROI. In order to do that you will need a count of all colleges in each state.
  10. Drag the original College_Scorecard.Table1 dataset to the map and drop it in the Spatial aggregation drop zone. Click Run to calculate the count of colleges within each state.
  11. Tip:

    The count is calculated by default.

    A second Spatial Aggregation dataset is added to the data pane and the map updates to show the new Count of CollegeScorecard.Table1 field.

  12. Expand the Spatial Aggregation 2 dataset. There are two count fields: Count of Colleges_ROI, which includes the count of colleges with above average ROI in each state, and Count of CollegeScorecard.Table1, which includes the total count of colleges in each state.
  13. Click Rename dataset next to Spatial Aggregation 2 and name the dataset College counts.
  14. Open the Dataset options menu Dataset options and click View Data Table.
  15. The data table opens, displaying the raw data for the dataset. The data table can be used to calculate the percent of colleges that have an above average ROI.
  16. Click the + Field button to add a new field to the data table.
  17. Click New Field and update the field name to PercAboveAvgROI (Percent above average ROI).
  18. Click the Enter calculate function box and enter the equation (Count of Colleges_ROI/Count of CollegeScorecard.Table1)*100. Click Run and close the data table.
  19. Click the number field icon Number field next to the PercAboveAvgROI field and choose Rate/Ratio to change the field role.
  20. The new field will now be treated as a proportional value, rather than a quantity.
  21. Drag PercAboveAvgROI to the map to update the style.
    The percent of colleges with above average return on investment
  22. A choropleth map (a map styled with graduated colors) is created.
  23. Resize the map so that it fits onto your page and all of the states are visible.
  24. Save your workbook.

Change the classification on your map

Now that you have your map, you can start making conclusions about return on investment in different states. However, it's important to remember that your conclusions are going to be affected by the classification on the map. Therefore, you should check what classification is being used before you finish your analysis.

  1. Open the Layer options pane by expanding the layer name on the map.
  2. Click the Options tab Options.
  3. The default classification is natural breaks with 5 classes. Natural breaks is a useful classification for seeing natural groupings inherent in the data, but it may not be the best classification for this scenario.
  4. Click Natural Breaks to expand the Classification type menu.
  5. There are 6 classification options available: Natural Breaks, Equal Interval, Quantile, Standard Deviation, Unclassed, and Manual. An unclassed classification gives the map a continuous color ramp, rather than discrete classes. In this case it is probably best to have discrete classes so that the states can be analyzed in groups. An equal interval classification is useful for data with a known range, like percentages, because it allows you to group your data into set ranges (for example, a dataset with percentages could be divided into 5 intervals with ranges of 20 percent). A quantile classification divides the data into groups with an equal number of features, which makes it good for data that you want to display by rank. In this case, a ranked classification could be useful. A standard deviation classification is useful when you want to focus on the distance from the mean. While knowing the average could be useful in this scenario, it doesn't need to be the focus of your analysis. Finally, a manual classification can be used to create a custom classification scheme. Manually changing the classification can be useful for data with specific values that need to be taken into account or for standardizing the classification between multiple maps.

    On the surface, quantile and equal interval seem like the best options. However, this dataset has 51 features (50 states plus the District of Columbia), which makes it difficult to divide the data equally into a quantile classification. It may be best to try equal interval.

  6. In the Classification type menu, click Equal Interval.
  7. The classification updates. You can click the dividers to see the ranges of the intervals. The divisions are at 21, 34, 47, and 60. It makes sense to have an equal interval, but these values are not intuitive. It would make more sense to apply an equal interval to a full percentage range from 0 to 100. You will use intervals of 10.
  8. Change the number of classes to 8.
  9. Click the first slider and change the value from 16 to 10. Hit Enter on your keyboard.
  10. The divider moves to 10 and the Classification type updates to Manual.
  11. Change the other dividers to 20, 30, 40, 50, 60, and 70.
    The percent of colleges with above average return on investment, classified by return on investment
  12. Click on the legend tab to see the values. Click on values in the legend to select the features in each class.
  13. Tip:

    Change the Display field Display field for the Location field to STATE so that the state names will be displayed when you hover over them.

  14. Save your workbook.
  15. The highest percentage of above average ROI colleges is in Wyoming. The next highest percentages are in South Dakota and New Mexico. These states are all in the central part of the continental United States. As discussed in Create your first workbook, the central states, Hawaii, and Alaska had relatively few private for-profit and nonprofit schools, and a relatively large proportion of public colleges. You also see in the bar chart that the vast majority of high ROI colleges are public. It makes sense that states with a high proportion of public colleges would have a large percent of colleges with a high ROI.

Next steps

Now that you have finished your analysis, it is time to share your results with your colleagues. Use the quick exercise Share your analysis to continue the college scorecard scenario and share the results as a model and an interactive page.

Try more scenario-based exercises from Learn ArcGIS and the Applied Analysis team: