Unlock Your Data
GDOT’s Office of Transportation Data (OTD) team was attempting to capture and catalog data from design files. Because the team members were not CAD users, their approach was to take plotted paper drawings from a project and work through each sheet to find features they might be interested in cataloging: number of lanes, lane widths, shoulder location and widths, median location, surface materials and so on. The work was made harder because they were not design engineers, and drafting “noise” often led to misunderstanding and misinterpretation of the design elements. The road characteristics they did find would then be manually added to their preferred GIS. The approach was time-consuming and not very accurate.
Ideally, a computer should do this kind of work: read the drawings for us, catalog the features automatically, and then publish them to GIS.
Imagine if we could unlock the data stored in 20 to 30 years of project designs and make it available in a searchable, usable spatial database. And imagine if this could be done in a few hours rather than years. At Phocaz, our CLIP project for GDOT is focused on identifying road characteristics important to OTD, but we have already discovered other applications, such as maintenance and construction. We are also finding that the technology we are building to support this work can be applied to more than just drawings. For now, our goal is to make finding road characteristics from project designs easier and faster, and to make that information accessible to more people.
Machine Learning and AI
Today it might seem evident that using machine learning and artificial intelligence (ML/AI) could help us. However, upon our first investigation into the tools available we discovered that a lot of progress has been made towards discovering features from LIDAR and photographic images, but much less work has been done to train an AI to read a schematic drawing.
As a human, I can quickly identify what line work in a drawing represents: roads, edge of pavement, shoulder locations, sign panels, sign posts, striping, pavement markings and other design features. When the same drawing is presented to a computer for analysis, the black and white imagery of a PDF is very confusing. The lack of context and the limited training material for this type of image make training an AI very difficult. At the time of this update (August 2022), we are able to identify and extract some features from PDFs. Improving this technology will be key to unlocking more design data from older projects that no longer have the original DGN or DWG files.
Using the CAD drawing we can start to make some inferences based on levels/layers and proximity. We can traverse the highway and identify lane groupings. From this we can then determine lane center lines. Lane groupings can be visualized as shown in the image to the right. In this example there are shapes that clearly identify features like lanes and medians. This technique helps us better understand what the computer sees and provides some clues as to how to identify each lane.
Still, while these results are compelling, we don’t yet have enough information to make good decisions about what each lane represents: is it a shoulder, a bike lane, a turn lane or a through lane? We need more tools.
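The proximity idea described above can be illustrated with a minimal sketch. It assumes the longitudinal edge lines at one station have already been reduced to lateral offsets from a reference baseline; the function name and the sample offsets are hypothetical and are not part of CLIP. Adjacent edges are paired into candidate lanes, and each lane's width and center offset follow directly.

```python
from typing import Dict, List


def lanes_from_edge_offsets(offsets: List[float]) -> List[Dict[str, float]]:
    """Pair adjacent edge lines into candidate lanes.

    `offsets` are lateral distances of each edge line from a reference
    baseline (hypothetical input; units are whatever the drawing uses).
    """
    ordered = sorted(offsets)
    lanes = []
    # Each pair of neighboring edges bounds one candidate lane.
    for left, right in zip(ordered, ordered[1:]):
        lanes.append({
            "left": left,
            "right": right,
            "width": right - left,
            "center": (left + right) / 2.0,  # offset of the lane center line
        })
    return lanes


if __name__ == "__main__":
    for lane in lanes_from_edge_offsets([24.0, 0.0, 12.0, 30.0]):
        print(lane)
```

In this example the last candidate is only 6 units wide, which hints at a shoulder or bike lane rather than a travel lane, but width alone cannot settle what the lane represents.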
New Tools for AI Training
Phocaz has developed new tools directed at solving the problem of reading schematic drawings. Two of these tools are interactive. With them we can sample images from drawings that we want the AI to learn. We want to be able to find basic symbols like turn arrows, but we also need to find more organic shapes and systems, such as medians.
CLIP Boxer – teaches an AI about any symbol used by the client: think cells like turn arrows, route markers, light poles, sign poles, etc. It provides a GUI for adding key points and bounding boxes around the features we want the AI to learn about.
To anticipate a question from the reader: “Why can’t you just read the cell name?” One limitation is the case where a right turn arrow is mirrored and used as a left turn arrow. Or consider the case where a cell has been dropped (or a block exploded). In both cases we can’t trust the name, or even count on it being available.
CLIP Rig – teaches an AI about more organic designs: things that don’t have consistent shapes, like medians, crosswalks, and roundabouts.
Both programs create output that can be consumed by an AI.
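The mirrored-arrow case mentioned above shows why a cell name alone can mislead. As a rough sketch, assuming a placed cell exposes its 2×2 rotation/scale matrix, a negative determinant indicates a reflection. The function names and the cell name used here are hypothetical, not taken from CLIP or any CAD API. This check does nothing for dropped cells or exploded blocks, where neither name nor transform survives, which is why recognizing the geometry itself matters.

```python
from typing import Sequence

Matrix2 = Sequence[Sequence[float]]


def is_mirrored(m: Matrix2) -> bool:
    """Return True when the 2x2 placement matrix includes a reflection."""
    (a, b), (c, d) = m
    # A negative determinant means orientation is flipped (mirrored).
    return a * d - b * c < 0


def effective_turn(cell_name: str, m: Matrix2) -> str:
    """Guess the turn direction from a hypothetical cell name and its matrix."""
    # Assumption for illustration: names contain RIGHT or are treated as left.
    base = "right" if "RIGHT" in cell_name.upper() else "left"
    if is_mirrored(m):
        return "left" if base == "right" else "right"
    return base


if __name__ == "__main__":
    identity = [[1, 0], [0, 1]]
    mirror_x = [[-1, 0], [0, 1]]  # reflection across the y-axis
    print(effective_turn("RIGHT_TURN_ARROW", identity))
    print(effective_turn("RIGHT_TURN_ARROW", mirror_x))
```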
The third tool is a server application that facilitates the use of multiple ML/AI technologies. From our development team:
“HEMLOCK is an AI/ML inference server designed to provide centralized access to various machine learning techniques through a single interface. Support for cross-language interoperability enables HEMLOCK to employ a larger ecosystem of AI technologies.”
Results for Everyone, iTwin
With a trained AI we can detect turn lanes, bike lanes and crosswalks, count the number of lanes and the number of street lights, and find sign panels, sign posts, and any symbol used by GDOT. Using iTwin, we can make the results readily available in a browser, which is easier for our users to consume and opens the results to a more diverse set of users. In the image to the right we show how the AI is able to identify turn symbols, group them into logical sets, and determine direction of travel. The iTwin viewer provides a searchable report and synchronizes it with the graphic view, zooming in to the selected result.
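The article does not say how detected symbols are grouped into logical sets. One simple possibility, shown here only as an illustration, is single-linkage grouping by distance: any two detections closer than a chosen gap end up in the same set. The function name, the coordinates, and the `max_gap` threshold are all assumptions.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def group_by_proximity(points: List[Point], max_gap: float) -> List[List[Point]]:
    """Group points so that chains of neighbors within max_gap share a set."""
    groups: List[List[Point]] = []
    for p in points:
        # Find every existing group that has a member close to p.
        near = [
            g for g in groups
            if any(hypot(p[0] - q[0], p[1] - q[1]) <= max_gap for q in g)
        ]
        # Merge p with all nearby groups into one new group.
        merged = [p]
        for g in near:
            merged.extend(g)
        groups = [g for g in groups if not any(g is n for n in near)]
        groups.append(merged)
    return groups


if __name__ == "__main__":
    # Hypothetical turn-arrow positions along a corridor.
    arrows = [(0.0, 0.0), (5.0, 0.0), (100.0, 0.0), (104.0, 0.0)]
    for group in group_by_proximity(arrows, max_gap=10.0):
        print(sorted(group))
```

A distance threshold is a crude stand-in for whatever the trained model and its post-processing actually do. Determining direction of travel would need additional information, such as symbol orientation.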