A multifile feature connection (MFC) allows you to quickly connect to data sources to visualize and analyze large datasets. An MFC provides functionality and flexibility to work with data and its formatting.
An MFC references a folder of one or more datasets. Datasets in an MFC are used as input feature data (points, polylines, polygons, and tabular data) to geoprocessing tools. When you create an MFC, an .mfc file is created. This file points to a directory of datasets and outlines the datasets and their schema in the MFC, including geometry and time information. You can browse for MFC datasets in geoprocessing tools and view MFC datasets on the map. The following are examples of when an MFC is appropriate:
- You have multiple shapefiles representing a large area. Each shapefile represents a subset of the area, and you want to use all of the shapefiles together.
- You receive a new .csv file daily with temperature measurements. You want to include the new .csv file as part of a dataset with your existing .csv files.
- You use data that has multiple fields representing the time of an event. You want to use all the fields to represent the time.
- You have parquet files that you want to use as input to geoprocessing tools.
The following are reasons to use an MFC as input to geoprocessing tools:
- You can represent multiple datasets of the same schema and file type as a single dataset.
- An MFC accesses the data when the analysis is run, so you can continue to add data to an existing dataset in an MFC without reregistering or publishing the data.
- You can modify the MFC to remove, add, or update which datasets are visible.
- MFCs are flexible in how time and geometry can be defined and allow for multiple time formats on a single dataset.
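Because an MFC reads the data when the analysis runs, adding a file to a dataset folder is all that is needed to include it. The following is a minimal sketch of the daily .csv scenario, assuming the schema can be approximated by comparing header rows. The function names `read_header` and `add_daily_file` are hypothetical helpers and are not part of any ArcGIS tool.

```python
import csv
import shutil
from pathlib import Path


def read_header(csv_path: Path) -> list[str]:
    """Return the first row of a delimited file as a list of column names."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def add_daily_file(new_file: Path, dataset_folder: Path) -> Path:
    """Copy new_file into dataset_folder if its header matches the existing files.

    Assumption: comparing header rows is a good enough stand-in for the
    schema check; field types are not inspected.
    """
    new_header = read_header(new_file)
    for existing in sorted(dataset_folder.glob("*.csv")):
        if read_header(existing) != new_header:
            raise ValueError(
                f"{new_file.name} does not match the schema of {existing.name}"
            )
    return Path(shutil.copy2(new_file, dataset_folder / new_file.name))
```

Rejecting a mismatched file before it is copied keeps the dataset folder consistent, so the existing MFC can keep reading it without changes.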
Supported data formats
Multifile feature connections support the following datasets:
- Delimited files (such as .csv, .tsv, and .txt)
- Shapefiles (.shp)
- Parquet files (.parquet)
  Note:
  Only unencrypted parquet files are supported. GeoParquet files are not supported.
- ORC files (.orc)
If you are using an MFC in GeoAnalytics Desktop tools, all of the formats listed above are supported. If you are using MFC datasets in any other geoprocessing tool, only delimited files, shapefiles, and parquet files are supported.
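The support rules above can be summarized as a small lookup. This is an illustrative sketch only: the tool-family keys and the `is_supported` function are hypothetical names, and the delimited extensions are limited to the three examples named in the list.

```python
from pathlib import Path

# Extensions named in the supported-formats list. The delimited entries are
# only the examples the list gives, so this set is illustrative, not complete.
DELIMITED = {".csv", ".tsv", ".txt"}

# Hypothetical keys for the two groups of tools described in the text.
SUPPORTED = {
    "geoanalytics_desktop": DELIMITED | {".shp", ".parquet", ".orc"},
    "other": DELIMITED | {".shp", ".parquet"},
}


def is_supported(file_name: str, tool_family: str) -> bool:
    """Return True if the file extension is listed for the given tool family."""
    return Path(file_name).suffix.lower() in SUPPORTED[tool_family]
```

The check looks at the extension only. It cannot tell whether a parquet file is encrypted or is a GeoParquet file, both of which are unsupported.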
Multifile feature connection terminology
The following table lists common terms for working with MFCs:
Term | Description |
---|---|
Multifile feature connection | The item representing the MFC file. You can expand the MFC to view its datasets and browse to it for use in geoprocessing tools. This connection is the ArcGIS AllSource interface for the MFC file. |
Multifile feature connection file | The file (.mfc) that is created and stored when you create an MFC using the Create Multifile Feature Connection tool. The file contains information about contained datasets and schemas, as well as geometry and time properties. When you view this file in ArcGIS AllSource, it is an MFC item. |
Multifile feature connection dataset | A dataset in the MFC. You can add this dataset to a map or use it as input to geoprocessing tools. |
Source location | The folder location registered as an MFC. This location contains one or more folders representing MFC datasets. Multifile feature connection tools do not modify this folder. |
Source data | The datasets registered in the MFC. Using an MFC, including running multifile feature connection tools, does not modify the source data. |
Structure the input data
To use datasets as inputs in an MFC, the data must be correctly structured. To prepare data for an MFC, organize the datasets as subfolders under a single source folder that you register. In this source folder, the name of each subfolder becomes the name of a dataset.
The image above represents the correct structure of an MFC. The source folder is registered, and each subfolder in the source folder represents a dataset. In this example, you would register the source folder, and three datasets would be included in the MFC: Dataset-1, Dataset-2, and Dataset-3.
Within each dataset subfolder, you can organize the data as needed. If a dataset subfolder contains multiple folders or files, all of its contents are read as a single dataset, and they must share the same schema and file type.
Note:
All files in a dataset folder must have the same schema. If a file has a different schema, it will not be used correctly in visualization and analysis.
In this example, the same three dataset folders have different content. Each dataset is described below:
- Dataset-1—This dataset is composed of a single file: D1-1. When Dataset-1 is used for visualization or analysis, a single shapefile will be used.
- Dataset-2—This dataset is composed of two text files: D2-1 and D2-2. When Dataset-2 is used for visualization or analysis, both text files will be used.
- Dataset-3—This dataset is composed of two folders: D3-Folder-1 and D3-Folder-2, each containing a single dataset, D3-1 and D3-2. When Dataset-3 is used for visualization or analysis, both D3-1 and D3-2 will be used.
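The way files roll up into the three datasets described above can be sketched with the Python standard library. This is an illustration of the folder convention, not how ArcGIS reads an MFC. The file extensions used for D1-1, D2-1, D2-2, D3-1, and D3-2 are assumptions, as are the function names and the `DATA_EXTENSIONS` set.

```python
from pathlib import Path

# Assumed set of primary data-file extensions; shapefile sidecar files
# (.shx, .dbf, and so on) are deliberately left out so they are not counted.
DATA_EXTENSIONS = {".csv", ".tsv", ".txt", ".shp", ".parquet", ".orc"}


def build_example(source_folder: Path) -> None:
    """Create empty placeholder files in the Dataset-1..3 layout (extensions assumed)."""
    layout = [
        "Dataset-1/D1-1.shp",
        "Dataset-1/D1-1.shx",
        "Dataset-1/D1-1.dbf",
        "Dataset-2/D2-1.txt",
        "Dataset-2/D2-2.txt",
        "Dataset-3/D3-Folder-1/D3-1.csv",
        "Dataset-3/D3-Folder-2/D3-2.csv",
    ]
    for relative in layout:
        target = source_folder / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


def collect_datasets(source_folder: Path) -> dict[str, list[Path]]:
    """Map each dataset name (a subfolder of the source folder) to its data files."""
    datasets: dict[str, list[Path]] = {}
    for dataset_folder in sorted(p for p in source_folder.iterdir() if p.is_dir()):
        # Every data file at any depth below the dataset folder belongs to it.
        datasets[dataset_folder.name] = sorted(
            f for f in dataset_folder.rglob("*")
            if f.is_file() and f.suffix.lower() in DATA_EXTENSIONS
        )
    return datasets


def mixed_type_datasets(datasets: dict[str, list[Path]]) -> list[str]:
    """Return the names of datasets whose files do not all share one file type."""
    return [
        name for name, files in datasets.items()
        if len({f.suffix.lower() for f in files}) > 1
    ]
```

Calling `collect_datasets` on a folder prepared with `build_example` groups D3-1 and D3-2 under Dataset-3 even though they sit in different subfolders. `mixed_type_datasets` flags any dataset folder that mixes file types.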
Note:
If a dataset is within a folder with a name that begins with an underscore (_), the dataset will be treated as hidden and will not be discoverable as a dataset. This does not apply to subfolders or files. Subfolder and file names can start with an underscore and the data will be used.
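The underscore rule in the note above applies only to the dataset folder itself. A minimal sketch of that distinction follows; the function names are hypothetical, and the code only mirrors the rule as stated rather than reproducing the behavior of any ArcGIS tool.

```python
from pathlib import Path


def visible_datasets(source_folder: Path) -> list[str]:
    """Names of dataset folders that remain discoverable under the underscore rule."""
    return sorted(
        child.name
        for child in source_folder.iterdir()
        if child.is_dir() and not child.name.startswith("_")
    )


def dataset_files(dataset_folder: Path) -> list[str]:
    """All files below a dataset folder, as paths relative to that folder.

    Underscore-prefixed subfolders and files are kept, because the rule
    does not apply below the dataset folder level.
    """
    return sorted(
        f.relative_to(dataset_folder).as_posix()
        for f in dataset_folder.rglob("*")
        if f.is_file()
    )
```

For a source folder containing `Dataset-1/_part/_a.csv` and `_Archive/old.csv`, `visible_datasets` lists only `Dataset-1`, while `dataset_files` still reports `_part/_a.csv` for that dataset.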
These are examples of how you can structure data. The number of files or folders doesn't change how the data is used for visualization or analysis. There is no advantage to adding a subfolder to or removing subfolders from each dataset folder; structuring the folders at that level is optional.
To start using multifile feature connections, see Use multifile feature connections.