To reduce storage requirements, you can compress vector file geodatabase feature classes and tables (collectively referred to as datasets) to a read-only format. Display and query performance for compressed data is comparable to that for decompressed data: some operations may be slightly faster and others slightly slower.
Two types of compression can be applied to file geodatabase data: lossless and nonlossless (or lossy). With lossless compression, no information is lost, regardless of the coordinate system or the types of attribute data the feature class or table contains, and all floating-point values are preserved. Lossy compression allows for up to 20 percent better compression of file geodatabase data, but floating-point values will be changed. Lossy compression is a good choice if you require maximum compression and either your data is not particularly accurate or you do not need to maintain its full precision, for example, if you're compressing data at a scale of 1:1,000,000 or greater.
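As a rough analogy for what it means for floating-point values to be changed, the sketch below stores a 64-bit value in fewer bits and reads it back. This is not the geodatabase's compression algorithm, which the text above does not describe; the function names and the sample value are hypothetical and only illustrate trading precision for size.

```python
import struct


def roundtrip_float64(value: float) -> float:
    """Pack a value into 8 bytes and unpack it again (no precision lost)."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


def roundtrip_float32(value: float) -> float:
    """Pack a value into 4 bytes and unpack it again (precision is lost)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


x = 123456.789012345  # hypothetical coordinate or attribute value
kept = roundtrip_float64(x)
changed = roundtrip_float32(x)

print(kept == x)          # the 8-byte round trip returns the same value
print(changed == x)       # the 4-byte round trip does not
print(abs(changed - x))   # size of the change introduced
```

Whether a change of that size matters depends on the accuracy of the data, which is the judgment the paragraph above asks you to make.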
Where compressed datasets differ from decompressed data is in editing: a compressed dataset is read-only and therefore cannot be edited or modified in any way except for changing its name and modifying attribute indexes and metadata.
Compression is ideally suited to datasets that do not require further editing. However, if required, a compressed dataset can always be decompressed to return it to its original, read/write format.
Compress data in ArcGIS AllSource
You can compress a geodatabase, feature dataset, stand-alone feature class, or table using the Compress File Geodatabase Data geoprocessing tool and decompress using the Uncompress File Geodatabase Data geoprocessing tool. Both tools are in the File Geodatabase toolset, in the Data Management toolbox.
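The same tool can be scripted. The following is a minimal sketch, assuming the tool is exposed in Python as `arcpy.management.CompressFileGeodatabaseData` and accepts the keyword strings shown; confirm both against the tool reference for your release. The helper name `compress_datasets`, its `tool` parameter, and the sample path are hypothetical. The `tool` parameter exists only so the logic can be exercised without an ArcGIS installation.

```python
def compress_datasets(paths, lossless=True, tool=None):
    """Run the Compress File Geodatabase Data tool on each path.

    Returns the list of paths that were passed to the tool.
    """
    if tool is None:
        # arcpy is only importable in an ArcGIS Python environment.
        import arcpy
        tool = arcpy.management.CompressFileGeodatabaseData

    # Assumption: these are the keyword strings the tool expects.
    keyword = "Lossless compression" if lossless else "Non-lossless compression"

    processed = []
    for path in paths:
        tool(path, keyword)
        processed.append(path)
    return processed


if __name__ == "__main__":
    # Dry run with a stand-in tool that only prints what would be called.
    compress_datasets(
        ["C:/data/census.gdb"],  # hypothetical path
        lossless=True,
        tool=lambda path, keyword: print("would compress", path, "with", keyword),
    )
```

Decompressing would follow the same pattern with the Uncompress File Geodatabase Data tool substituted for the compress tool.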
Benefits of compression
Compressed file geodatabase data takes up less disk space than decompressed data while still offering comparable display and query performance.
The amount of compression possible in feature classes and tables can range from a negligible amount to a ratio exceeding 4:1. The makeup of coordinates and the number of attribute fields and their contents determine the amount of compression possible.
The most important factor for feature classes is the average number of vertices per feature. Points and simple two-vertex lines compress more than lines or polygons with many vertices. A feature class of address points or roads with few vertices, for example, may compress by a ratio of 3:1, whereas a feature class of rivers or soil data with many vertices per feature may only compress by a ratio of 3:2. Features with many vertices already store efficiently when decompressed, and as a result, there is less potential for compression. Attribute fields also play a role in determining the amount of compression: text, integer, and date fields compress better than floats and doubles.
The following examples compare decompressed and compressed format file sizes. The feature class achieving the greatest amount of compression relative to its original size, Europe places, is a point feature class. The feature class with the least amount of compression, U.S. traffic analysis zones, is a polygon feature class; Mexico roads, a line feature class with many vertices per feature, also compresses comparatively little.
Comparison of standard and compressed formats
Feature class | Decompressed size | Compressed size | Compression ratio |
---|---|---|---|
Europe places (61,541 point features, 14 fields) | 6.2 MB | 0.67 MB | 9.3 |
U.S. census blocks (8,205,055 point features, 11 fields) | 705 MB | 80 MB | 8.8 |
California roads (2,092,079 line features, 29 fields) | 329 MB | 60 MB | 5.5 |
Europe rails (383,531 line features, 12 fields) | 58 MB | 9.7 MB | 6.0 |
Calgary addresses (285,285 point features, 8 fields) | 21 MB | 6.4 MB | 3.3 |
Calgary buildings (319,000 polygon features, 9 fields) | 48 MB | 20 MB | 2.4 |
U.S. rivers and streams (2,844,231 line features, 9 fields) | 878 MB | 288 MB | 3.0 |
U.S. counties (3,140 polygon features, 57 fields) | 1.6 MB | 0.8 MB | 2.0 |
Europe water (232,375 polygon features, 10 fields) | 176 MB | 70 MB | 2.5 |
U.S. traffic analysis zones (166,747 polygon features, 10 fields) | 68 MB | 35 MB | 1.9 |
Mexico roads (5,847 line features, 7 fields) | 3.5 MB | 1.6 MB | 2.2 |
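The compression ratio column is the decompressed size divided by the compressed size. As a small worked sketch, the following recomputes it for three rows of the table and also expresses the result as a percentage of space saved; the function names are hypothetical.

```python
def compression_ratio(decompressed_mb: float, compressed_mb: float) -> float:
    """Decompressed size divided by compressed size."""
    if compressed_mb <= 0:
        raise ValueError("compressed size must be positive")
    return decompressed_mb / compressed_mb


def space_saved_percent(decompressed_mb: float, compressed_mb: float) -> float:
    """Share of the original size that compression removes, in percent."""
    return (1 - compressed_mb / decompressed_mb) * 100


# (decompressed MB, compressed MB) taken from the table above.
rows = {
    "Europe places": (6.2, 0.67),
    "U.S. census blocks": (705, 80),
    "Europe rails": (58, 9.7),
}

for name, (before, after) in rows.items():
    ratio = compression_ratio(before, after)
    saved = space_saved_percent(before, after)
    print(f"{name}: ratio {ratio:.1f}, {saved:.0f}% saved")
```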
Tables usually compress by a ratio exceeding 2:1. Redundancy is the most important factor; fields with values that often don't change from one record to the next compress better than fields with many unique values. As with feature classes, text, integer, and date fields compress better than floats and doubles.
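The effect of redundancy can be illustrated with a general-purpose compressor. The sketch below uses Python's `zlib` purely as a stand-in, since the text above does not say which algorithm file geodatabase compression uses. It compares a column in which every record holds the same value with a column of the same raw size in which every value is unique. The column contents are made up for the illustration.

```python
import hashlib
import zlib


def raw_and_compressed_size(values):
    """Return (raw byte count, zlib-compressed byte count) for a column."""
    raw = "\n".join(values).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


record_count = 10_000

# A field whose value does not change from one record to the next.
repetitive = ["Residential"] * record_count

# A field with unique 11-character values (same raw size as above).
unique = [hashlib.sha256(str(i).encode()).hexdigest()[:11] for i in range(record_count)]

for label, column in (("repetitive", repetitive), ("unique", unique)):
    raw, packed = raw_and_compressed_size(column)
    print(f"{label}: {raw} bytes raw, {packed} bytes compressed, ratio {raw / packed:.1f}")
```

The exact numbers will differ from what a geodatabase achieves, but the direction of the difference is the point: repeated values leave far more room for compression than unique ones.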
While you can compress data of any size, compression is most useful when applied to large amounts of data. Compressing large datasets or many medium- or small-sized datasets can yield significant storage savings, which can be helpful when you're pressed for disk space or are trying to fit data onto a CD or DVD. For example, an 8.9 GB file geodatabase of U.S. census data can compress down to 3.4 GB—small enough to fit onto a DVD.
When minimizing storage requirements, keep in mind that file geodatabase compression may not be the only option at your disposal. If your data is stored at an x,y resolution that is smaller than necessary, you can lower storage requirements by reloading the data to a larger resolution before compressing. For example, if you have a dataset that stores at the default resolution of 1/10 millimeter, but you know the data is only accurate to 1 meter, you could reload the data to a 1-meter resolution. As an example, reloading the 1/10-millimeter-resolution Calgary buildings feature class to a 1-meter resolution reduces storage from 48 to 31 MB. Compressing the 31 MB feature class further reduces the size of the data to 12 MB.
The effects of x,y resolution on storage
Feature class | Decompressed size | Compressed size |
---|---|---|
Calgary buildings, 0.0001-meter resolution | 48 MB | 20 MB |
Calgary buildings, 1.0-meter resolution | 31 MB | 12 MB |
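To make the idea of resolution concrete, the sketch below snaps a coordinate to grids of different cell sizes. It is only an illustration of what a coarser resolution does to a coordinate value, not a description of how the geodatabase stores coordinates internally; the function name and the sample coordinate are hypothetical.

```python
def snap_to_resolution(coord: float, resolution: float) -> float:
    """Snap a coordinate to the nearest multiple of the given resolution."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return round(coord / resolution) * resolution


x = 705123.456789  # hypothetical x-coordinate in meters

fine = snap_to_resolution(x, 0.0001)   # 1/10 millimeter grid
coarse = snap_to_resolution(x, 1.0)    # 1 meter grid

print(fine)
print(coarse)
```

If the data is only accurate to 1 meter, the digits discarded by the coarser grid carry no real information, which is why reloading to that resolution is a reasonable way to reduce storage.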
To reload a feature class to a different resolution, export the data to a new feature class. Right-click the feature class in the Catalog pane and choose Export > To Geodatabase (choose the Multiple command to export multiple feature classes at once). Specify the new resolution on the Environments dialog box before you export. For more information, see XY Resolution and Z Resolution.
What you can compress
You can compress a geodatabase, feature dataset, stand-alone feature class, or table. When you compress a geodatabase, all feature classes and tables within it are compressed. When you compress a feature dataset, all its feature classes are compressed. Any item that cannot be compressed is skipped. The following table lists the items that can be compressed and those that cannot.
File geodatabase data | Able to compress |
---|---|
Geodatabase | Yes (All vector feature classes and tables in the geodatabase compress.) |
Catalog dataset | Yes |
Feature class (stand-alone only) | Yes |
Feature dataset | Yes (All vector feature classes in the feature dataset compress.) |
Mosaic dataset | Yes (The dataset compresses, but the mosaicked imagery files to which the dataset links do not.) |
Network dataset | Yes |
Oriented imagery dataset | Yes (The dataset compresses, but the imagery files to which the dataset links do not.) |
Parcel fabric | Yes |
Raster dataset | No |
Table | Yes |
Terrain | No |
Topologies | Yes |
Trace network | Yes |
Trajectory dataset | Yes |
Utility network | Yes |
Note:
- If any of the dataset types listed above contain one or more of the following field data types, the dataset will not be compressed:
  - Big integer
  - Date only
  - Time only
  - Timestamp offset
  - 64-bit object IDs
- You cannot individually compress or decompress a feature class in a feature dataset. You compress or decompress the feature dataset, which compresses or decompresses all objects in the feature dataset.
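The table and note above can be read as a set of eligibility rules. The following is a minimal sketch of those rules as a pre-check you might run on a description of a dataset before calling the compress tool. It does not inspect a real geodatabase; the function name, its parameters, and the way field types are spelled are hypothetical and simply mirror the wording used above.

```python
# Dataset types the table above marks as not compressible.
NON_COMPRESSIBLE_TYPES = {"Raster dataset", "Terrain"}

# Field data types that prevent compression, per the note above.
BLOCKING_FIELD_TYPES = {"Big integer", "Date only", "Time only", "Timestamp offset"}


def can_compress(dataset_type, field_types=(), has_64bit_object_ids=False,
                 in_feature_dataset=False):
    """Return (eligible, reason) for one dataset description."""
    if dataset_type in NON_COMPRESSIBLE_TYPES:
        return False, f"{dataset_type} cannot be compressed"
    if dataset_type == "Feature class" and in_feature_dataset:
        # Only stand-alone feature classes compress individually.
        return False, "compress the containing feature dataset instead"
    blocking = sorted(BLOCKING_FIELD_TYPES.intersection(field_types))
    if blocking:
        return False, "blocking field types: " + ", ".join(blocking)
    if has_64bit_object_ids:
        return False, "64-bit object IDs are not supported"
    return True, "eligible"


print(can_compress("Table", ["Text", "Date only"]))
print(can_compress("Feature class", ["Text", "Double"]))
print(can_compress("Feature class", ["Text"], in_feature_dataset=True))
```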
Restrictions when working with compressed data
You cannot edit a compressed feature class or table. In addition, you cannot modify the following properties:
- Coordinate system information
- Subtypes, attribute domains, and default values
- Fields and their properties
- Representations
The only properties that can be modified are the alias of the feature class or table and attribute indexes.
Compressed feature datasets allow you to add decompressed feature classes through operations such as creating an empty feature class, copying and pasting, and importing. This produces a mixed state in which some feature classes in the feature dataset are compressed and others are not. If a feature dataset contains both compressed and decompressed feature classes, you will not be able to edit the decompressed feature classes. To edit a feature class in a feature dataset, all feature classes in the feature dataset must be decompressed.
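The mixed-state rule can be stated compactly: a feature class in a feature dataset is editable only if no feature class in that feature dataset is compressed. The sketch below expresses that rule over a plain dictionary; it does not query a geodatabase, and the function name and sample feature class names are hypothetical.

```python
def editable_feature_classes(compressed_by_name):
    """Given {feature class name: is_compressed} for one feature dataset,
    return the names that can currently be edited."""
    # A single compressed member blocks editing for the whole feature dataset.
    if any(compressed_by_name.values()):
        return []
    return sorted(compressed_by_name)


# A compressed feature dataset to which an empty feature class was added.
mixed = {"roads": True, "parcels": True, "new_sites": False}
print(editable_feature_classes(mixed))

# The same feature dataset after it has been decompressed.
decompressed = {name: False for name in mixed}
print(editable_feature_classes(decompressed))
```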
You can compress feature classes in relationship classes or topologies. However, there are some restrictions related to these:
- You cannot create a topology from compressed feature classes.
- When you compress only one side of a relationship class, you will not be able to edit the other side. This is because updating the decompressed side may require an automatic update in the read-only compressed side.
- You cannot modify the properties of a topology if its feature classes are compressed.