Let's look at a couple of examples using ArcGIS functions. There are a number of caveats, or gotchas, when using multiprocessing with ArcGIS. It is important to cover them up front because they can result in hundreds of Pro sessions opening and locking up your PC, and because they affect the ways in which we can write our code.
Esri describes a number of best practices for multiprocessing with arcpy. These include:
- Use the "memory" (Pro) or "in_memory" (legacy, but still works) workspace to store temporary results because, as noted earlier, memory is faster than disk.
- Avoid writing to file geodatabase (FGDB) data types and GRID raster data types. These data formats can often cause schema locking or synchronization issues because they do not support concurrent writing; that is, only one process can write to them at a time. You might have seen a version of this problem in arcpy previously if you tried to modify a feature class in Python while it was open in ArcGIS. That problem is magnified if you have an FGDB and you're trying to write many feature classes to it at once. Even if all of the feature classes are independent, you can only write them to the FGDB one at a time.
So, bearing these two points in mind, we should make use of memory workspaces wherever possible and avoid writing to FGDBs in our worker functions. We can still use an FGDB in our master function, for example to merge a number of shapefiles or even individual FGDBs back into a single source.
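To make the two points concrete, here is a minimal sketch of how a worker and a master function could be split up. It is an illustration under assumptions, not Esri's recommended implementation: the function names (`worker_output_path`, `clip_worker`, `run_clips`), the choice of the Clip tool, and all parameters are hypothetical. Each worker keeps its intermediate result in the memory workspace and writes its final result to its own shapefile; only the master function writes to the FGDB.

```python
import os
from multiprocessing import Pool


def worker_output_path(out_folder, oid):
    """Give every task its own shapefile so no two processes share an output."""
    return os.path.join(out_folder, f"clip_{oid}.shp")


def clip_worker(task):
    """Clip the input features by a single polygon; runs in a child process."""
    import arcpy  # imported here so each child process loads its own arcpy

    oid, clipper_fc, to_clip_fc, out_folder = task
    layer = f"clipper_{oid}"          # layer holding just this one polygon
    temp_fc = rf"memory\clip_{oid}"   # intermediate result stays in memory
    out_shp = worker_output_path(out_folder, oid)
    try:
        oid_field = arcpy.Describe(clipper_fc).OIDFieldName
        arcpy.management.MakeFeatureLayer(clipper_fc, layer, f"{oid_field} = {oid}")
        arcpy.analysis.Clip(to_clip_fc, layer, temp_fc)
        # Final result goes to a per-task shapefile, not to a shared FGDB.
        arcpy.management.CopyFeatures(temp_fc, out_shp)
        return out_shp
    finally:
        # A pooled process handles many tasks, so clean up after each one.
        for item in (layer, temp_fc):
            if arcpy.Exists(item):
                arcpy.management.Delete(item)


def run_clips(oids, clipper_fc, to_clip_fc, out_folder, merged_fc, processes=4):
    """Master function: farm out the clips, then merge the results once."""
    import arcpy

    tasks = [(oid, clipper_fc, to_clip_fc, out_folder) for oid in oids]
    with Pool(processes=processes) as pool:
        shapefiles = pool.map(clip_worker, tasks)
    # Only this (parent) process writes to the FGDB, so there is no
    # concurrent writing and no schema lock contention.
    arcpy.management.Merge(shapefiles, merged_fc)
    return merged_fc
```

On Windows, the call to `run_clips` belongs under an `if __name__ == "__main__":` guard, so that child processes importing the script do not try to start pools of their own. The paths you pass in (input feature classes, output folder, and the merged feature class inside an FGDB) are yours to choose.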