Suppose I have two disjoint groups ("islands") of polygons (for example, census areas in two non-adjacent districts). My data might look something like this:
>>> import geopandas as gpd
>>> from shapely.geometry import Polygon
>>>
>>> # First island: three polygons in the lower left
>>> p1=Polygon([(0,0),(10,0),(10,10),(0,10)])
>>> p2=Polygon([(10,10),(20,10),(20,20),(10,20)])
>>> p3=Polygon([(10,10),(10,20),(0,10)])
>>>
>>> # Second island: three polygons in the upper right
>>> p4=Polygon([(40,40),(50,40),(50,30),(40,30)])
>>> p5=Polygon([(40,40),(50,40),(50,50),(40,50)])
>>> p6=Polygon([(40,40),(40,50),(30,50)])
>>>
>>> df=gpd.GeoDataFrame(geometry=[p1,p2,p3,p4,p5,p6])
>>> df
                                        geometry
0        POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))
1  POLYGON ((10 10, 20 10, 20 20, 10 20, 10 10))
2          POLYGON ((10 10, 10 20, 0 10, 10 10))
3  POLYGON ((40 40, 50 40, 50 30, 40 30, 40 40))
4  POLYGON ((40 40, 50 40, 50 50, 40 50, 40 40))
5         POLYGON ((40 40, 40 50, 30 50, 40 40))
>>>
>>> df.plot()

I want the polygons on each island to share an identifier (it can be arbitrary) that marks their group. For example, the three polygons in the lower left corner could have IslandID = 1, and the three polygons in the upper right corner could have IslandID = 2.
I developed a way to do this, but I wonder if this is the best / most efficient way. I do the following:
1) Create a GeoDataFrame whose geometries are the component polygons of the unary union of all the polygons. This gives me two polygons, one for each "island".
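>>> # .geoms lists the component polygons; a MultiPolygon is not directly iterable in Shapely 2.x
>>> SepIslands=gpd.GeoDataFrame(geometry=list(df.unary_union.geoms))
>>> SepIslands.plot()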

2) Create an identifier for each group.
>>> SepIslands['IslandID']=SepIslands.index+1
3) Spatially join the islands to the source polygons so that each polygon gets the identifier of the island it belongs to.
>>> # drop(columns=...) replaces the positional axis argument, which pandas 2.x no longer accepts
>>> Final=gpd.tools.sjoin(df, SepIslands, how='left').drop(columns='index_right')
>>> Final
                                        geometry  IslandID
0        POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))         1
1  POLYGON ((10 10, 20 10, 20 20, 10 20, 10 10))         1
2          POLYGON ((10 10, 10 20, 0 10, 10 10))         1
3  POLYGON ((40 40, 50 40, 50 30, 40 30, 40 40))         2
4  POLYGON ((40 40, 50 40, 50 50, 40 50, 40 40))         2
5         POLYGON ((40 40, 40 50, 30 50, 40 40))         2
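Rolled into one helper, the three steps above might look like the sketch below. The function name `label_islands` and its `id_col` parameter are made up for illustration, and the sketch assumes the input has no existing column with that name. It also handles the case where everything merges into a single island, because the union is then a plain Polygon rather than a MultiPolygon. One caveat: the join uses the default "intersects" test, so a polygon that touches a neighbouring island at only a single point would match both islands and appear twice in the result.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.ops import unary_union


def label_islands(gdf, id_col="IslandID"):
    """Return a copy of gdf with an arbitrary island ID per polygon (hypothetical helper)."""
    # Step 1: merge all polygons; each connected group becomes one part.
    merged = unary_union(list(gdf.geometry))
    # A single island comes back as a Polygon, several as a MultiPolygon.
    parts = list(merged.geoms) if hasattr(merged, "geoms") else [merged]

    # Step 2: number the islands 1..n (the numbering itself is arbitrary).
    islands = gpd.GeoDataFrame(
        {id_col: list(range(1, len(parts) + 1))}, geometry=parts, crs=gdf.crs
    )

    # Step 3: spatial join so every source polygon picks up its island's ID.
    joined = gpd.sjoin(gdf, islands, how="left")
    return joined.drop(columns="index_right")


# Same sample data as above: two islands of three polygons each.
polys = [
    Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
    Polygon([(10, 10), (20, 10), (20, 20), (10, 20)]),
    Polygon([(10, 10), (10, 20), (0, 10)]),
    Polygon([(40, 40), (50, 40), (50, 30), (40, 30)]),
    Polygon([(40, 40), (50, 40), (50, 50), (40, 50)]),
    Polygon([(40, 40), (40, 50), (30, 50)]),
]
df = gpd.GeoDataFrame(geometry=polys)
result = label_islands(df)
print(result)
```

Which island ends up with ID 1 depends on the order in which the union returns its parts, so the IDs should be treated as group labels only.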
Is this really the best / most efficient way to do this?