Find Identical (Data Management)

Summary

Reports any records in a feature class or table that have identical values in a list of fields, and generates a table listing the identical records. If the Shape field is specified, feature geometries will be compared.

The Delete Identical tool can be used to find and delete identical records.

Illustration

Find Identical tool illustration
Points with the OBJECTIDs of 1, 2, 3, 8, 9, and 10 are spatially coincident (blue highlight). The output table identifies those spatially coincident points that share the same CATEGORY.

Usage

  • Records are identical if values in the selected input fields are the same for those records. The values from multiple fields in the input dataset can be compared. If more than one field is specified, records are matched by the values in the first field, then by the values of the second field, and so on.

  • With feature class or feature layer input, specify the Shape field in the Field(s) parameter to compare feature geometries to find identical features by location. The XY Tolerance and Z Tolerance parameters are only valid when the Shape field is specified.

    If the Shape field is specified and the input features have m- or z-values enabled, the m- or z-values will also be used to determine identical features.

  • Check the Output only duplicated records parameter if you want only the duplicated records in the output table. If this parameter is unchecked, the output will have the same number of records as the input dataset.

  • The output table will contain the following fields:

    • IN_FID—The Object ID field value from the input dataset. This field can be used to join the records of the output table back to the input dataset.
    • FEAT_SEQ—The sequence number. Records from the input that have the same values will have the same FEAT_SEQ value while nonidentical records will have a unique sequential value. The FEAT_SEQ values have no relationship to IDs of input records.
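
The relationship between the compared fields, IN_FID, and FEAT_SEQ described above can be pictured with a small pure-Python sketch. It is an illustration only, not the tool's implementation: the function name find_identical, the only_duplicates flag, the sample rows, and the way sequence numbers are ordered are all assumptions made for this example.

```python
from collections import defaultdict


def find_identical(rows, fields, only_duplicates=False):
    """Return (IN_FID, FEAT_SEQ) pairs for rows given as (oid, attributes).

    Hypothetical helper that mimics the documented output table;
    it does not call arcpy.
    """
    # Group Object IDs by the tuple of values in the compared fields.
    groups = defaultdict(list)
    for oid, attrs in rows:
        key = tuple(attrs[name] for name in fields)
        groups[key].append(oid)

    out = []
    # Assumption: sequence numbers follow the sorted order of the value
    # tuples and are not renumbered when unique records are dropped.
    for seq, key in enumerate(sorted(groups), start=1):
        oids = groups[key]
        if only_duplicates and len(oids) < 2:
            continue  # skip records that have no identical partner
        out.extend((oid, seq) for oid in oids)
    return out


# Made-up sample rows: (Object ID, attribute values)
rows = [
    (1, {"ZONE": "A", "INTENSITY": 3}),
    (2, {"ZONE": "B", "INTENSITY": 1}),
    (3, {"ZONE": "A", "INTENSITY": 3}),
    (4, {"ZONE": "A", "INTENSITY": 2}),
]

all_records = find_identical(rows, ["ZONE", "INTENSITY"])
dup_records = find_identical(rows, ["ZONE", "INTENSITY"], only_duplicates=True)
print(all_records)
print(dup_records)
```

In this sketch, records 1 and 3 share one FEAT_SEQ value because both compared fields match, and the unfiltered result has one output row per input row, as the usage notes describe.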

Parameters

Label | Explanation | Data Type
Input Dataset

The table or feature class for which identical records will be found.

Table View
Output Dataset

The output table reporting identical records. The FEAT_SEQ field in the output table will have the same value for identical records.

Table
Field(s)

The field or fields whose values will be compared to find identical records.

Field
XY Tolerance
(Optional)

The x,y tolerance that will be applied to each vertex when evaluating whether there is an identical vertex in another feature.

This parameter is active when the Field(s) parameter value includes the Shape field.

Linear Unit
Z Tolerance
(Optional)

The z-tolerance that will be applied to each vertex when evaluating whether there is an identical vertex in another feature.

This parameter is active when the Field(s) parameter value includes the Shape field.

Double
Output only duplicated records
(Optional)

Specifies whether only duplicated records will be included in the output table.

  • Unchecked—All input records will have corresponding records in the output table. This is the default.
  • Checked—Only duplicate records will have corresponding records in the output table. The output will be empty if no duplicate is found.
Boolean
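
As a conceptual aid for the XY Tolerance and Z Tolerance parameters above, the following sketch compares two features vertex by vertex. It rests on assumptions: the helper names vertices_match and geometries_match are hypothetical, vertices are plain (x, y) or (x, y, z) tuples, and the use of planar distance for x,y and absolute difference for z is a guess at reasonable behavior. The tool's actual geometry comparison is not described in this topic and may differ.

```python
import math


def vertices_match(a, b, xy_tol=0.0, z_tol=0.0):
    """Return True if two vertices fall within the given tolerances.

    Vertices are (x, y) or (x, y, z) tuples (an assumption for this sketch).
    """
    if len(a) != len(b):
        return False
    # Assumption: x,y tolerance is tested against planar distance.
    if math.hypot(a[0] - b[0], a[1] - b[1]) > xy_tol:
        return False
    # Assumption: z tolerance is tested against the absolute z difference.
    if len(a) == 3 and abs(a[2] - b[2]) > z_tol:
        return False
    return True


def geometries_match(g1, g2, xy_tol=0.0, z_tol=0.0):
    """Compare two vertex lists position by position."""
    return len(g1) == len(g2) and all(
        vertices_match(a, b, xy_tol, z_tol) for a, b in zip(g1, g2)
    )


line_a = [(0.0, 0.0), (10.0, 0.0)]
line_b = [(0.01, 0.01), (10.0, 0.01)]

print(geometries_match(line_a, line_b))               # exact comparison
print(geometries_match(line_a, line_b, xy_tol=0.02))  # within 0.02 units
```

With a zero tolerance the two lines differ, while a tolerance of 0.02 units treats them as identical in this sketch.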

arcpy.management.FindIdentical(in_dataset, out_dataset, fields, {xy_tolerance}, {z_tolerance}, {output_record_option})
Name | Explanation | Data Type
in_dataset

The table or feature class for which identical records will be found.

Table View
out_dataset

The output table reporting identical records. The FEAT_SEQ field in the output table will have the same value for identical records.

Table
fields
[fields,...]

The field or fields whose values will be compared to find identical records.

Field
xy_tolerance
(Optional)

The x,y tolerance that will be applied to each vertex when evaluating whether there is an identical vertex in another feature.

This parameter is enabled when the fields parameter value includes the Shape field.

Linear Unit
z_tolerance
(Optional)

The z-tolerance that will be applied to each vertex when evaluating whether there is an identical vertex in another feature.

This parameter is enabled when the fields parameter value includes the Shape field.

Double
output_record_option
(Optional)

Specifies whether only duplicated records will be included in the output table.

  • ALL: All input records will have corresponding records in the output table. This is the default.
  • ONLY_DUPLICATES: Only duplicate records will have corresponding records in the output table. The output will be empty if no duplicate is found.
Boolean

Code sample

FindIdentical example 1 (Python window)

The following Python window script demonstrates how to use the FindIdentical function in immediate mode.

import arcpy

# Find identical records based on a text field and a numeric field.
arcpy.FindIdentical_management("C:/data/fireincidents.shp", "C:/output/duplicate_incidents.dbf", ["ZONE", "INTENSITY"])
FindIdentical example 2 (stand-alone script)

The following stand-alone script demonstrates how to use the FindIdentical function to identify duplicate records of a table or feature class.

# Name: FindIdentical_Example2.py
# Description: Finds duplicate features in a dataset based on location (Shape field) and fire intensity

import arcpy

arcpy.env.overwriteOutput = True

# Set workspace environment
arcpy.env.workspace = "C:/data/findidentical.gdb"

# Set input feature class
in_dataset = "fireincidents"

# Set the fields upon which the matches are found
fields = ["Shape", "INTENSITY"]

# Set xy tolerance
xy_tol = ".02 Meters"

out_table = "duplicate_incidents"

# Execute Find Identical 
arcpy.FindIdentical_management(in_dataset, out_table, fields, xy_tol)
print(arcpy.GetMessages())
FindIdentical example 3 (stand-alone script)

The following stand-alone script demonstrates the use of the optional output_record_option parameter. If the parameter value is ONLY_DUPLICATES, all unique records are removed, keeping only the duplicates for the output.

# Name: FindIdentical_Example3.py
# Description: Demonstrates the use of the optional parameter Output only duplicated records.

import arcpy

arcpy.env.overwriteOutput = True

# Set workspace environment
arcpy.env.workspace = "C:/data/redlands.gdb"

in_data = "crime"
out_data = "crime_dups"

# The XY Tolerance and Z Tolerance parameters are skipped here.
# Because they are skipped, any optional parameter that follows them
# must be passed by keyword (parameter name).
arcpy.FindIdentical_management(in_data, out_data, ["Shape"], output_record_option="ONLY_DUPLICATES")

print(arcpy.GetMessages())
FindIdentical example 4 (stand-alone script)

The following stand-alone script reads the output of the FindIdentical function and groups identical records by the FEAT_SEQ field value.

import arcpy

from itertools import groupby
from operator import itemgetter

# Set workspace environment
arcpy.env.workspace = r"C:\data\redlands.gdb"

# Run Find Identical on feature geometry only.
result = arcpy.FindIdentical_management("parcels", "parcels_dups", ["Shape"])
    
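# List of all output records as [IN_FID, FEAT_SEQ] pairs (a list of lists).
# The data access cursor (arcpy.da.SearchCursor) is used in place of the legacy cursor.
out_records = []
with arcpy.da.SearchCursor(result.getOutput(0), ["IN_FID", "FEAT_SEQ"]) as cursor:
    for in_fid, feat_seq in cursor:
        out_records.append([in_fid, feat_seq])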

# Sort the output records by FEAT_SEQ value.
# Example of out_records: [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]
out_records.sort(key=itemgetter(1))

# Records after sorting by FEAT_SEQ: [[3, 1], [1, 1], [2, 2], [5, 3], [4, 3]]
# Records with the same FEAT_SEQ value fall into the same group (that is, they are identical).
identicals_iter = groupby(out_records, itemgetter(1))

# Build a list of identical groups, one list of IN_FID values per group.
# Example identical groups: [[3, 1], [2], [5, 4]]
# That is, IN_FID 3 and 1 are identical, and IN_FID 5 and 4 are identical.
identical_groups = [[item[0] for item in data] for (key, data) in identicals_iter]

print(identical_groups)
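
As a possible alternative to the sort-then-groupby approach in example 4, the same grouping can be sketched with a dictionary keyed by FEAT_SEQ, which avoids the sort and keeps the sequence number next to each group. The helper name group_by_feat_seq is hypothetical, and the records reuse the sample values from the comments in example 4 rather than real tool output.

```python
from collections import defaultdict


def group_by_feat_seq(out_records):
    """Map each FEAT_SEQ value to the list of IN_FID values that share it."""
    groups = defaultdict(list)
    for in_fid, feat_seq in out_records:
        groups[feat_seq].append(in_fid)
    return dict(groups)


# Sample [IN_FID, FEAT_SEQ] pairs taken from the comments in example 4
out_records = [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]

groups = group_by_feat_seq(out_records)

# Keep only the groups that contain more than one input record.
duplicates = {seq: fids for seq, fids in groups.items() if len(fids) > 1}
print(duplicates)  # {1: [3, 1], 3: [5, 4]}
```

Filtering on group size gives a result comparable to running the tool with ONLY_DUPLICATES, but applied after the fact to a full output table.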
