CAD to GIS Blog
One of the constant “battles” in GIS is moving data over from a CAD package. Often, we need to get CAD data into a format that is suitable for a database, and/or for displaying on a mobile device or in a corporate GIS system. Whilst ArcGIS Pro or QGIS will happily open CAD files, this is only half the battle. This is the first in a series of blogs outlining how we converted a CAD dataset into a true, 3D GIS database, ready for office and field use.
Below, you can see some highway drainage CAD data loaded into ArcGIS Pro. Looks pretty ok really, right?
One major drawback of CAD is duplication of data. The image above is a series of green lines, but that isn’t the whole story. Not even close. When we turn off the top layer, the output looks different straight away (see below).
This pattern continues for a few layers, so we instantly have a decent-sized job tidying up the data (this is an excerpt from an A road). Added to that, we have the outlines of everything in the polygon layers. However, that’s still not the entire story. Let’s zoom in and focus on one of the circles in the original map.
We can see that it’s not a symbolised point like we’d expect in GIS; it’s a polyline circle. On top of that, inside the circle there are squares, each of which, in this specific example, is made up of four separate lines. This creates yet another job in the data cleansing exercise. We don’t just want to remove this data; we need to create a point at the centre of each of the circles, so we have a point to tie the attribution to. Then, we need to clear the circles from the line layer, without compromising data integrity.
Now, my default is to work smart, not hard. I don’t like the idea of going through the entire dataset and removing each of these circles by hand. If it were just the one junction, it’d be a different story. But this network was in the region of 150 km, with circles roughly once every 50 m, in each direction.
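To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes the spacing is uniform and that "in each direction" means two directions of travel; both are simplifications for illustration, not figures taken from the dataset itself.

```python
# Rough estimate of how many circles would need removing by hand.
# Assumptions (illustrative only): uniform spacing, two directions.
network_length_m = 150 * 1000   # roughly 150 km of network
spacing_m = 50                  # one circle roughly every 50 m
directions = 2                  # "in each direction"

estimated_circles = network_length_m // spacing_m * directions
print(estimated_circles)  # 6000
```

That is in the order of thousands of circles, before counting the squares drawn inside them.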
So, what was the best way of getting this task done?
The features of ArcGIS Pro I found most useful were the select functions. Drilling down into the data, we can start to recognise patterns. I’d initially hoped that the basic attribution would highlight every single circle entity, but no. When digitising in CAD, it is down to the person inputting the data what entity type they describe it as. This varies across big datasets, as different people do different parts. My next thought was around size.
I made an assumption: if I didn’t want to go through and delete each of these circles, the person who drew them wouldn’t have gone through and drawn each one manually. So I looked for a pattern in the sizes/lengths of the polylines. Lo and behold, the pattern emerged.
I managed to find all of the circles using the “Select By Attributes” tool, as they shared roughly (give or take the odd mm) the same length. I then created centre points from the selected lines, and my point layer had a starting point. I could then remove the circles with one swift press of the delete key (and a trip to the kitchen to make a brew whilst several thousand records were removed).
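The logic of that step can be sketched in plain Python. This is a minimal illustration, not the ArcGIS Pro workflow itself: it assumes each polyline is a list of 2D (x, y) vertices, and the function names, the target length of 3.14 and the tolerance of 0.01 are all hypothetical values chosen for the example.

```python
import math

def polyline_length(vertices):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))

def is_candidate_circle(vertices, target_length, tolerance):
    """True if the polyline's length is within tolerance of the target."""
    return abs(polyline_length(vertices) - target_length) <= tolerance

def centre_point(vertices):
    """Mean of the vertices, ignoring the repeated closing vertex.

    This matches the circle's centre only when the vertices are
    evenly spaced around it.
    """
    closed = len(vertices) > 1 and vertices[0] == vertices[-1]
    pts = vertices[:-1] if closed else vertices
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def make_circle(cx, cy, radius, segments=36):
    """Helper that builds a closed polyline circle for the demo data."""
    pts = [
        (cx + radius * math.cos(2 * math.pi * i / segments),
         cy + radius * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]
    return pts + [pts[0]]

# Demo data: two CAD-style circles and one ordinary 50 m line.
lines = [
    make_circle(100.0, 200.0, 0.5),
    [(0.0, 0.0), (50.0, 0.0)],
    make_circle(150.0, 200.0, 0.5),
]

# Hypothetical target length and tolerance for the circles.
TARGET_LENGTH = 3.14
TOLERANCE = 0.01

selected = [l for l in lines if is_candidate_circle(l, TARGET_LENGTH, TOLERANCE)]
remaining = [l for l in lines if not is_candidate_circle(l, TARGET_LENGTH, TOLERANCE)]
centres = [centre_point(l) for l in selected]
```

Here `centres` holds one point per circle, ready to carry the attribution, and `remaining` is the line layer with the circles removed. The tolerance is what absorbs the "give or take the odd mm" variation; set it too wide and genuine network lines of a similar length would be caught as well, which is one reason a manual check afterwards is still worthwhile.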
Recaffeinated, I then used the same logic to remove all of the squares, and a series of other features including the outlines of polygons. Afterwards, there were a few stragglers that required some slightly smarter uses of GIS (no, I’m not giving everything away), and then a manual check across the entire network. In the end, the line feature class went from about 1.8 million individual records to more like 6,000. To be precise, we ended up with a feature class 0.345% of the size of the original.
The next blog is going to focus on symbolising the data from the attribution, and understanding how we can interpret the minimal information generally available in CAD exports.
Written by Ben Smith