Make sure you actually do use spatial indexes

Ever ran some GIS analysis in QGIS and it took longer than a second? Chances are that your data did not have spatial indexes for QGIS to utilise and that it could have been magnitudes faster.

I realised just today, after years of using QGIS, that it did not automatically create a spatial index when saving a Shapefile. And because of that, lots of GIS stuff I did in the past, involving saving subsets of data to quick’n’dirty Shapefiles, was slower than necessary.

Sadly QGIS does not communicate lack of spatial indexing to the user in any way. I added a feature request to make Processing warn if no indexing is available.

An example: Running ‘Count points in polygon’ on 104 polygons with 223210 points:

  • Points in original GML file: 449 seconds
    • GML is not a format for processing but meant for data transfer, never ever think of using it for anything else
  • Points in ESRI Shapefile: 30 seconds
  • Points in GeoPackage: 3 seconds
  • Points in ESRI Shapefile with spatial index: 3 seconds
    • Same Shapefile as before but this time I had created a .qix index

So yeah, make sure you don’t only use a reasonable format for your data. And also make sure you do actually have an spatial index.

For Shapefiles, look for a .qix or .sbn side-car file somewhere in the heap of files. In QGIS you can create a spatial index for a vector layer either using the “Create spatial index” algorithm in Processing or the button of the same title in the layer properties’ source tab.

PS: GeoPackages always have a spatial index. That’s reason #143 why they are awesome.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.