Solve a spatial problem

In this exercise, you are a business analyst for a consortium of colleges that wants to run a marketing campaign in states with high-value colleges. It's up to you to find states with colleges that have a good return on investment (ROI) for students. You'll use Insights for ArcGIS to analyze United States Department of Education College Scorecard data in the form of a feature layer to find relationships between the cost of college and earnings by graduates. In 25 minutes or less, you will do the following:

  • Visualize the data and results through interactive maps, charts, and tables.
  • Interact with, sort, filter, and update the visualizations on your page to ask more questions and find answers.
  • Apply spatial analytics, such as spatial aggregation, to summarize data using area features.

Before you begin

The data for this analysis is publicly available on the ArcGIS website, where you can download it to your machine. Follow these steps to access the data and load it into your ArcGIS Online organization:

  1. Follow the link to the College_Facts item.
  2. Click the Download button to download the item to your machine.
  3. Sign in to your ArcGIS Online organizational account.
  4. Add the zipped shapefile to ArcGIS Online using the Add Item drop-down menu. Include your name in the title so that the item will be unique in the organization. Add tags and click Add Item.
  5. Open Insights and sign in to your account if necessary.
    Tip:

    You can access Insights through the gallery of apps in ArcGIS Online.

After you sign in, the Workbooks page appears.

Create a workbook and add data

  1. From Workbooks, click New workbook. From the Content tab, choose the dataset you just saved and click Add.
  2. The dataset you added appears in the data pane, and a card appears on your page that shows United States colleges as points on a map.
  3. Click Untitled Workbook and replace it with a unique and useful title, such as College_Rankings_YourName. Including your name in the title will make your workbook easier to find if you share your work. Click Save.

Questions

How are costs distributed across United States regions?

As an analyst, you may want to start with the big picture. The map shows many points. It may be helpful to get a summary of costs by region to start.

  1. In the data pane, expand the College_Facts dataset.
  2. Fields from the dataset are listed. Each field has an icon that indicates the field's role, which is based on the type of data the field contains. The fields that will help answer the question above are the following:
    • region, which represents the part of the United States where the college is located, and is a string field
    • cost, which represents the average annual cost of attendance, and is a number field
  3. Hover over the region field in your dataset and click the circle that appears. Do the same for the cost field. Blue circles around the check marks indicate selected fields.
  4. Drag your selections to the Table drop zone that appears on your page.
    Note:

    If you prefer buttons to dragging fields, click Table above the data pane after you select your fields.

  5. A summary table appears as a card on your page.
  6. Instead of a sum of costs, average costs would be more helpful to know. Change the cost statistic from sum to average. Use the arrows beside the cost statistic to sort the costs in descending order.
  7. Now, switch the table to a chart. Click the Visualization type button on the card and choose Bar Chart.
  8. When you run analysis tools in Insights, the results are added to the data pane. Results are indicated with the Results icon. There is now a result dataset in the data pane for the bar chart you created.
  9. Save your workbook.
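
For readers who want to see the arithmetic behind the summary table, the following is a minimal Python sketch of averaging cost by region and sorting in descending order. It is not how Insights computes the table internally. The field names region and cost come from the dataset, but the sample rows and the function name average_cost_by_region are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; the real College_Facts layer has many more colleges.
colleges = [
    {"region": "New England", "cost": 30000},
    {"region": "New England", "cost": 26000},
    {"region": "Southwest", "cost": 12000},
    {"region": "Southwest", "cost": 16000},
    {"region": "Plains", "cost": 20000},
]


def average_cost_by_region(rows):
    """Return (region, average cost) pairs sorted by average cost, descending."""
    totals = defaultdict(lambda: [0.0, 0])  # region -> [sum of cost, row count]
    for row in rows:
        totals[row["region"]][0] += row["cost"]
        totals[row["region"]][1] += 1
    averages = [(region, total / count) for region, (total, count) in totals.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


summary = average_cost_by_region(colleges)
for region, avg in summary:
    print(f"{region}: {avg:,.0f}")
```

Switching the statistic from sum to average in the table corresponds to dividing each region's total by its row count, and the sort arrows correspond to the final sorted call.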

Snap quiz

  • Which region has the highest average school cost?
  • Which region has the lowest average school cost?
  • What is the average school cost across all regions?

Answers

What's the relationship between the cost of college and mean earnings after graduation?

Creating and interacting with a scatter plot is one way to see the relationships in your numeric data. The mean_earnings field represents average earnings of independent students working and not enrolled in college 10 years after entry.

  1. From the data pane, choose cost and mean_earnings. Drag your selections to the Scatter Plot drop zone that appears on your page.
  2. The cost field is on the x-axis (horizontal) and the mean_earnings field is on the y-axis (vertical).
    Tip:

    If cost is not on the x-axis, click the Switch axes button at the lower right corner of the card. The cost field then moves to the x-axis and the mean_earnings field moves to the y-axis.

  3. On the scatter plot, click Color by and choose type. Click the Legend button to show the chart legend.
  4. Colors indicate college types: Private For-Profit, Private Nonprofit, and Public.
  5. Hover over a couple of points that show high cost and high mean earnings.
    Tip:

    These points are at the upper right.

  6. Hover over a couple of points that show low cost and low mean earnings. Continue exploring the points in the chart.
  7. From the chart legend, click Private For-Profit. Next, click Private Nonprofit, and then Public. The category you select on the legend is reflected on the card.
  8. Save your workbook.
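
As a rough tabular counterpart to coloring the scatter plot by type, the sketch below groups colleges by type and averages cost and mean_earnings for each group. The field names type, cost, and mean_earnings come from the dataset. The sample values and the helper summarize_by_type are hypothetical and are not taken from the College Scorecard data.

```python
from statistics import mean

# Hypothetical sample rows; the values are made up for illustration only.
colleges = [
    {"type": "Private Nonprofit", "cost": 32000, "mean_earnings": 52000},
    {"type": "Private Nonprofit", "cost": 28000, "mean_earnings": 48000},
    {"type": "Private For-Profit", "cost": 24000, "mean_earnings": 38000},
    {"type": "Public", "cost": 14000, "mean_earnings": 36000},
    {"type": "Public", "cost": 12000, "mean_earnings": 34000},
]


def summarize_by_type(rows):
    """Return {type: (average cost, average mean_earnings)}."""
    groups = {}
    for row in rows:
        groups.setdefault(row["type"], []).append(row)
    return {
        college_type: (
            mean(r["cost"] for r in members),
            mean(r["mean_earnings"] for r in members),
        )
        for college_type, members in groups.items()
    }


by_type = summarize_by_type(colleges)
for college_type, (avg_cost, avg_earnings) in by_type.items():
    print(f"{college_type}: cost={avg_cost:,.0f}, mean_earnings={avg_earnings:,.0f}")
```

Clicking a legend item in Insights selects one of these groups on the chart. The group averages give a quick numeric check of what the colored clusters suggest visually.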

Snap quiz

  • Which type of college has the highest cost and the highest mean earnings?
  • Overall, which type of college tends to have the lowest cost and the lowest mean earnings?
  • What happens to your scatter plot when you click a legend item?
  • What happens to the rest of your cards on the page when you click a legend item?

Answers

How are the average costs for public colleges distributed across the data?

Filter your data to reduce the scope of your analysis. Maps paired with charts are an effective way to see how and where your data is distributed.

  1. From the data pane, hover over the type field in your dataset.
  2. Click the Dataset filter button that appears.
  3. Uncheck Select All to clear selections, check Public, and click Apply.
  4. The cards on your page are updated to reflect your filtered dataset.
  5. Drag the cost field onto your map (Card 1).
  6. The map updates to show cost by proportional symbols. This is hard to interpret. Changing the map style will improve clarity.
  7. Click the arrow next to cost in the legend.
    Note:

    You can display or remove the legend by clicking the Legend button on the map toolbar.

  8. The Layer options pane appears.
  9. Browse to the Options tab. Under Symbol Type, choose Counts and Amounts (Color).
  10. The map updates to show shaded points instead of proportional symbols.
  11. Change the Classification type from Natural Breaks to Standard Deviation to show schools above or below the average cost. Change the color ramp on the Style tab so that it displays below-average cost and above-average cost with different colors. You may also want to set the outline to no color.
  12. There are many points on your map, so hovering over points to see pop-ups is difficult. Interacting with a map using selections can reveal spatial patterns. From the Legend tab, you can make selections on the map based on classification.
  13. Click the top class (>1.5 Standard Deviation) to see where the high-cost colleges are located. Click each class to see the number and locations of points in each range.
  14. Click the Info button.
  15. The card flips over to show statistics. Summary statistics provide at-a-glance information. Among the colleges represented (almost 1,600), the minimum cost is $5,536. The maximum is $33,826, and the mean (average) is $15,014. Knowing the average range would be helpful in this analysis.
  16. Click the arrow to flip the card back.
  17. Click the Action button to open the Analytics pane, and then click the Find answers tab.
  18. Click How is it distributed? and click View Histogram.
  19. Under Choose a number field, choose cost and click Run.
  20. A histogram appears. Examine the histogram to answer the question below.
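
The filter, summary statistics, standard deviation classes, and histogram from the steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a description of what Insights does internally: the sample costs are made up, the class breaks at 0.5 and 1.5 standard deviations are assumed, the use of the population standard deviation is assumed, and the helpers std_dev_class and histogram are hypothetical names. The histogram helper also assumes the values are not all identical.

```python
from statistics import mean, pstdev

# Hypothetical sample rows; the values are made up for illustration only.
colleges = [
    {"type": "Public", "cost": 6000},
    {"type": "Public", "cost": 10000},
    {"type": "Public", "cost": 12000},
    {"type": "Public", "cost": 12000},
    {"type": "Public", "cost": 13000},
    {"type": "Public", "cost": 15000},
    {"type": "Public", "cost": 16000},
    {"type": "Public", "cost": 28000},
    {"type": "Private Nonprofit", "cost": 35000},
    {"type": "Private For-Profit", "cost": 22000},
]

# Dataset filter: keep only public colleges.
public_costs = [c["cost"] for c in colleges if c["type"] == "Public"]

# Summary statistics similar to those shown on the back of the card.
avg_cost = mean(public_costs)
sd_cost = pstdev(public_costs)  # population standard deviation (an assumption)


def std_dev_class(value, avg, sd):
    """Label a value by its distance from the mean in standard deviations.

    The class breaks are assumed; Insights may choose different ones.
    """
    z = (value - avg) / sd
    if z > 1.5:
        return "> 1.5 Std. Dev."
    if z > 0.5:
        return "0.5 to 1.5 Std. Dev."
    if z >= -0.5:
        return "-0.5 to 0.5 Std. Dev."
    if z >= -1.5:
        return "-1.5 to -0.5 Std. Dev."
    return "< -1.5 Std. Dev."


def histogram(values, bin_count):
    """Count values in equal-width bins; returns (low, high, count) tuples."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for value in values:
        # Clamp so the maximum value falls in the last bin.
        index = min(int((value - low) / width), bin_count - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i]) for i in range(bin_count)]


bins = histogram(public_costs, 4)
most_common = max(bins, key=lambda b: b[2])
print(f"min={min(public_costs)}, max={max(public_costs)}, mean={avg_cost}")
print(f"most common range: {most_common[0]:,.0f} to {most_common[1]:,.0f}")
```

The most common cost range is simply the bin with the largest count, which is the tallest bar in the histogram card.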

Snap quiz

  • What is the most common cost range among public colleges?

Answers

Note:

You do not need the histogram anymore, so delete it by clicking the Delete button at the upper right of the card. You can also remove the filter on the type field by reopening the dataset filter and clicking the Remove filter button.

How are costs and mean earnings distributed by state?

Filtering allows you to narrow your scope. In this workflow, you will also see how spatial aggregation can summarize key indicators by geography, and how interacting with more than one map allows you to see patterns with more than one variable.

  1. Filter the dataset to focus on the average cost range. In this case, colleges that cost $10,000–$20,000 will be the focus. In the data pane, click the cost field and click the Dataset filter button. Do one of the following:
    • Adjust the left slider to 10,000 and the right slider to 20,000.
    • Click the left slider and type 10,000 in the field, and then click the right slider and type 20,000 in the field.
  2. Click Apply.
  3. Your cards update to reflect your filtering. Next, perform spatial aggregation using a boundary available in Boundaries.
  4. Above the data pane, click + Add Data. Click Boundaries and search for States. Select USA States (Generalized), and click Add.
    Note:

    Using the generalized state boundaries will reduce the amount of time required to run the Spatial Aggregation tool.

  5. Drag the state boundaries onto the existing map and to the Spatial aggregation drop zone. By default, spatial aggregation provides a feature count, but you can calculate additional statistics.
  6. Click to expand Additional options. Choose mean_earnings, and change the statistic from sum to average. Then, choose cost and change the statistic from sum to average.
  7. Click Run and return to the data pane.
    Note:

    If your page contains a map of US states, you can delete the card.

  8. A result dataset is added to the data pane.
  9. Click the arrow next to the Count of College_Facts layer in the map legend to expand the Layer options. On the Options tab, under Style By, choose the average cost field.
  10. Under Symbol Type, choose Counts and Amounts (Color). Change the color ramp and classification to match the one used in your first map.
    Dive-in:

    Counts and Amounts (Color) should only be used on area features when the data is relative (for example, averages or proportions). If you do not have relative data, it is a best practice to divide your field by another field, such as total population or total area, to make your data relative. A Divide by field can be entered on the Options tab.

  11. If necessary, move the region-by-cost bar chart away from the map of average cost.
  12. In the data pane, expand the result dataset. From your results, choose Average mean_earnings, and drag it to the Map drop zone next to the map of average cost.
  13. Click the arrow next to the Average mean_earnings layer in the legend. Under Symbol Type, choose Counts and Amounts (Color) and change the classification and color ramp to match the map of average cost.
  14. Click the Sync extents button to sync your maps.
  15. Zoom in and pan around your maps to see which states have low cost and high mean earnings. Hover over states of interest for pop-up information. The pop-ups tell you whether the states are above or below average for cost or mean earnings.
  16. Save your workbook.
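
Conceptually, spatial aggregation assigns each point to the area that contains it and then summarizes the points per area. The sketch below shows that idea with a standard ray-casting point-in-polygon test. It is only an illustration: the two rectangular "states", the coordinates, the sample values, and the function names point_in_polygon and aggregate_by_area are all hypothetical, and the Spatial Aggregation tool in Insights may work differently.

```python
from statistics import mean


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular boundaries standing in for state polygons.
states = {
    "State A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "State B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}

# Hypothetical college points; the values are made up for illustration only.
colleges = [
    {"x": 2, "y": 3, "cost": 12000, "mean_earnings": 40000},
    {"x": 5, "y": 5, "cost": 18000, "mean_earnings": 50000},
    {"x": 7, "y": 8, "cost": 25000, "mean_earnings": 60000},  # outside the cost filter
    {"x": 12, "y": 4, "cost": 15000, "mean_earnings": 36000},
    {"x": 18, "y": 9, "cost": 11000, "mean_earnings": 30000},
]


def aggregate_by_area(points, areas, cost_range=(10000, 20000)):
    """Filter points by cost, then count and average them per containing area."""
    low, high = cost_range
    results = {}
    for name, polygon in areas.items():
        members = [
            p for p in points
            if low <= p["cost"] <= high and point_in_polygon(p["x"], p["y"], polygon)
        ]
        if members:
            results[name] = {
                "count": len(members),
                "avg_cost": mean(p["cost"] for p in members),
                "avg_mean_earnings": mean(p["mean_earnings"] for p in members),
            }
    return results


state_summary = aggregate_by_area(colleges, states)
for name, stats in state_summary.items():
    print(name, stats)
```

The count corresponds to the default feature count, and the two averages correspond to the additional statistics chosen under Additional options.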

Snap quiz

  • Name at least three states that have below-average cost and above-average mean earnings.

Answers

In which three states do colleges deliver the highest return on investment?

Exploring maps side-by-side allows you to compare low and high values of different variables at the same time. An easier way to determine the top three states is to calculate a return on investment variable based on average cost and average mean earnings.

  1. Click the map you made in the previous section, and click the Action button to open the Analytics pane.
  2. Click the Find answers tab and click How is it related?.
  3. Open the Calculate Ratio tool. For the numerator, choose average mean earnings, and for the denominator, choose average cost. Name the result field ROI and click Run.
  4. A data table appears, providing a view of your raw data. The ROI field is the last column on the right.
  5. Close the data table.
  6. Create a new map using the ROI field.
  7. From the results dataset, choose STATE and ROI and drag them to the Table drop zone. A summary table is created showing STATE and ROI.
  8. Click the Sort button for the ROI field to sort the summary table so that the states with the highest ROI are at the top.
  9. Save your workbook.
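
The ratio calculation and sort in the steps above amount to dividing average mean earnings by average cost for each state and ranking the results. A minimal sketch follows, assuming a per-state summary like the one produced by spatial aggregation. The state names, values, and the function name rank_by_roi are hypothetical.

```python
# Hypothetical per-state averages; the values are made up for illustration only.
state_summary = {
    "State A": {"avg_cost": 15000, "avg_mean_earnings": 45000},
    "State B": {"avg_cost": 13000, "avg_mean_earnings": 33000},
    "State C": {"avg_cost": 12000, "avg_mean_earnings": 48000},
}


def rank_by_roi(summary):
    """Return (state, ROI) pairs sorted from highest to lowest ROI.

    ROI here is average mean earnings (numerator) divided by average cost
    (denominator), matching the inputs given to the Calculate Ratio tool.
    """
    roi = {
        state: stats["avg_mean_earnings"] / stats["avg_cost"]
        for state, stats in summary.items()
    }
    return sorted(roi.items(), key=lambda pair: pair[1], reverse=True)


ranked = rank_by_roi(state_summary)
for state, value in ranked:
    print(f"{state}: ROI = {value:.2f}")
```

An ROI of 3.0 in this sketch means that average mean earnings are three times the average annual cost, so higher values indicate a better return.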

Snap quiz

  • Which states have the highest ROI?
  • Which states have the lowest ROI?

Answers

Next steps

Now that you have finished your analysis, it is time to share your results with your colleagues. Use the quick exercise Share your analysis to continue the college scorecard scenario and share the results as a model and an interactive page.

Quiz answers

How are costs distributed across United States regions?

  • Question: Which region has the highest average school cost?

    Answer: New England

  • Question: Which region has the lowest average school cost?

    Answer: Southwest

  • Question: What is the average school cost across all regions?

    Answer: $25,524

    You can display this value by adding Chart Statistics to the bar chart.

What's the relationship between the cost of college and mean earnings after graduation?

  • Question: Which type of college has the highest cost and the highest mean earnings?

    Answer: Private Nonprofit

  • Question: Overall, which type of college tends to have the lowest cost and the lowest mean earnings?

    Answer: Public

  • Question: What happens to your scatter plot when you click a legend item?

    Answer: All the points for the legend item are selected, for example, all red points.

  • Question: What happens to the rest of your cards on the page when you click a legend item?

    Answer: The map highlights only colleges of the selected type. The bar chart is unchanged because colleges of all types are in all regions.

    Tip:

    You can use the Enable cross filters button to filter the bar chart using the selection in the scatter plot.

How are the average costs for public colleges distributed across the data?

  • Question: What is the most common cost range among public colleges?

    Answer: $10,680–$13,251

How are costs and mean earnings distributed by state?

  • Question: Name at least three states that have below-average cost and above-average mean earnings.

    Answer: Any three of Washington, California, Wyoming, North Dakota, Nebraska, Kansas, Oklahoma, Texas, Maryland, Connecticut, Rhode Island, and Massachusetts

In which three states do colleges deliver the highest return on investment?

  • Question: Which states have the highest ROI?

    Answer: Utah, Wyoming, Connecticut, Delaware, and Washington

    Utah and Wyoming are tied for highest ROI

  • Question: Which states have the lowest ROI?

    Answer: New Hampshire, Alaska, West Virginia, New Jersey, and Georgia (not including the District of Columbia)