Lambda IT - Blog

Digitizing old maps with OpenCV and TensorFlow

In today's blog post we look at the problem of digitizing old topographic maps that exist only on paper. Doing this by hand is very time-consuming, so we explored solutions using computer vision and machine learning.

Published on 17 Feb. 2021 by David Strahm

This is an enlarged section of a topographic map. The goal is to extract the contour lines as well as the horizontal and vertical reference grid and save them as separate layers. Note that there are no colors; all we have is the grayscale image of the original map.

Input

First attempt

Our first idea was to implement a line-tracing algorithm that traces individual lines and groups them. This is it in action:

Line tracing

This works reasonably well in some cases. But in areas with a lot of detail and tight curves, it fails to trace the lines correctly because the number of edge cases is practically unbounded.
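To make the idea concrete, here is a minimal sketch of a greedy line tracer on a tiny binary grid. It is not our actual implementation: the names `trace_line` and `trace_all_lines` are hypothetical, the grid is a plain list of lists instead of an OpenCV image, and the walk simply takes the first unvisited neighbor it finds.

```python
# 8-connected neighbor offsets (row, col), including diagonals.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def trace_line(grid, start, visited):
    """Follow line pixels (value 1) from `start` until no unvisited neighbor is left.

    Hypothetical helper: a greedy walk without any handling of junctions or gaps.
    """
    rows, cols = len(grid), len(grid[0])
    path = [start]
    visited.add(start)
    current = start
    while True:
        for dr, dc in NEIGHBORS:
            r, c = current[0] + dr, current[1] + dc
            if (0 <= r < rows and 0 <= c < cols
                    and grid[r][c] == 1 and (r, c) not in visited):
                current = (r, c)
                visited.add(current)
                path.append(current)
                break
        else:
            return path  # dead end: no unvisited line pixel next to `current`


def trace_all_lines(grid):
    """Scan the grid row by row and trace every line pixel that was not visited yet."""
    visited = set()
    lines = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == 1 and (r, c) not in visited:
                lines.append(trace_line(grid, (r, c), visited))
    return lines


# Two separate strokes: one diagonal, one short vertical.
grid = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
lines = trace_all_lines(grid)
```

A greedy walk like this has no rule for junctions, gaps, or text touching a line, and it splits a line in two if the scan happens to start in its middle. That is where the edge cases begin.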

For example, take a look at this segment, where there are many closely spaced lines as well as text and other noise. Even for a human it is difficult to identify the correct lines.

Noise

Even after adding logic to remove the horizontal and vertical grid lines and the small text noise, we still had cases where our rule set did not apply and the algorithm broke. In the red circles, the algorithm failed to connect line segments because of text or other noise that was not correctly removed in preprocessing.

Result tracing
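One common way to remove small text noise is to drop connected blobs below a size threshold. The sketch below shows that idea on a small binary grid. It is an illustration under assumptions, not the rule set we used: `remove_small_components` and `min_size` are hypothetical names, and in practice OpenCV's `cv2.connectedComponentsWithStats` would do the labeling on a real image.

```python
from collections import deque


def remove_small_components(grid, min_size):
    """Return a copy of `grid` with 8-connected blobs smaller than `min_size` cleared.

    `grid` is a list of rows holding 0 (background) or 1 (ink).
    `min_size` is a hypothetical tuning parameter (blob size in pixels).
    """
    rows, cols = len(grid), len(grid[0])
    cleaned = [row[:] for row in grid]
    visited = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in visited:
                continue
            # Breadth-first search collects all pixels of this blob.
            blob = []
            queue = deque([(r, c)])
            visited.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                blob.append((cr, cc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] == 1 and (nr, nc) not in visited):
                            visited.add((nr, nc))
                            queue.append((nr, nc))
            # Blobs below the threshold are treated as noise and erased.
            if len(blob) < min_size:
                for br, bc in blob:
                    cleaned[br][bc] = 0
    return cleaned


noisy = [
    [1, 1, 1, 1, 1],  # a line: 5 pixels, kept
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # a speck: 1 pixel, removed
    [0, 0, 0, 1, 1],  # a small mark: 2 pixels, removed
]
cleaned = remove_small_components(noisy, min_size=3)
```

A size filter of this kind can only remove noise that is isolated. A character that touches a contour line becomes part of the line's blob and survives the filter.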

Improving noise removal with machine learning

Our next goal was to remove as much noise as possible using a neural network trained with TensorFlow / Keras, so that the line-tracing algorithm has an easier job.

From the original image, we (manually) extracted layers of information as input for the neural network.

In this case:

  • Red: Topographic contour lines
  • Green: Reference grid
  • Blue: Rock formations

Training input
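Before training, a color-coded label image like this has to be turned into one class id per pixel. The following sketch shows one way to do that. The exact RGB values, the class order, and the name `label_image_to_class_ids` are assumptions made for illustration; only the red / green / blue assignment comes from the list above.

```python
# Assumed color coding. The exact RGB values are a guess based on the list above.
CLASS_COLORS = {
    "contour": (255, 0, 0),  # red
    "grid": (0, 255, 0),     # green
    "rock": (0, 0, 255),     # blue
}
BACKGROUND = 0


def label_image_to_class_ids(label_image):
    """Map each RGB pixel to a class id: 0 = background, 1 = contour, 2 = grid, 3 = rock."""
    color_to_id = {color: i + 1 for i, color in enumerate(CLASS_COLORS.values())}
    return [
        [color_to_id.get(pixel, BACKGROUND) for pixel in row]
        for row in label_image
    ]


# A 2x2 label image: red, black / green, blue.
label_image = [
    [(255, 0, 0), (0, 0, 0)],
    [(0, 255, 0), (0, 0, 255)],
]
class_ids = label_image_to_class_ids(label_image)
```

Any pixel whose color is not in the table falls back to the background class, so anti-aliased or slightly off-color pixels in a hand-drawn label would need extra handling.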

After training the network for only a short amount of time, we already achieved high accuracy with only a small number of training images:

512/512 [==============================] - 95s 185ms/step - loss: 0.0111 - accuracy: 0.9961
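Assuming the `accuracy` value in this log is a standard per-pixel accuracy (the post does not state the metric or the network architecture), it is the share of pixels whose predicted class matches the label. The sketch below computes that number on a toy example, together with a per-class recall. Both function names are hypothetical.

```python
def pixel_accuracy(true_ids, predicted_ids):
    """Fraction of pixels whose predicted class id equals the labeled class id."""
    total = 0
    correct = 0
    for true_row, pred_row in zip(true_ids, predicted_ids):
        for t, p in zip(true_row, pred_row):
            total += 1
            correct += t == p
    return correct / total


def class_recall(true_ids, predicted_ids, class_id):
    """Fraction of pixels labeled `class_id` that were also predicted as `class_id`."""
    labeled = 0
    found = 0
    for true_row, pred_row in zip(true_ids, predicted_ids):
        for t, p in zip(true_row, pred_row):
            if t == class_id:
                labeled += 1
                found += p == class_id
    return found / labeled if labeled else 0.0


# Toy example: class 1 is a thin vertical line, everything else is background.
true_ids = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
predicted_ids = [
    [0, 0, 1, 0],
    [0, 0, 0, 0],  # one line pixel was missed
]
accuracy = pixel_accuracy(true_ids, predicted_ids)
recall_line = class_recall(true_ids, predicted_ids, class_id=1)
```

In this toy case 7 of 8 pixels match, while only 1 of the 2 line pixels is found. Because thin lines cover few pixels, a per-class number can be a useful companion to the overall accuracy.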

Result

Below are the prediction results for some example maps. Each color represents the prediction for one of the classes the network was trained on.

There is now much less noise. In addition, we can now save individual layers - for example the contour lines layer - and use only this layer for line extraction. There are still some problems with intersecting lines, but overall the line extraction should now be much easier.

Prediction
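Saving an individual layer from such a prediction can be sketched as follows. The code assumes the network outputs one score per class for every pixel, in the order background, contour, grid, rock. That output format and the name `extract_layer` are assumptions, not something taken from our code.

```python
def extract_layer(probabilities, class_id):
    """Return a binary mask that is 1 where `class_id` has the highest score.

    `probabilities[r][c]` is a list of per-class scores for one pixel
    (assumed order: background, contour, grid, rock).
    """
    mask = []
    for row in probabilities:
        mask_row = []
        for scores in row:
            # Index of the highest score; ties go to the lower class id.
            best = max(range(len(scores)), key=scores.__getitem__)
            mask_row.append(1 if best == class_id else 0)
        mask.append(mask_row)
    return mask


# A 2x2 prediction with four class scores per pixel.
probabilities = [
    [[0.90, 0.05, 0.03, 0.02], [0.10, 0.80, 0.05, 0.05]],
    [[0.20, 0.10, 0.60, 0.10], [0.05, 0.70, 0.20, 0.05]],
]
contour_mask = extract_layer(probabilities, class_id=1)
grid_mask = extract_layer(probabilities, class_id=2)
```

The resulting contour mask is a plain binary image again, so it can be handed to the line extraction step in place of the noisy original.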

Conclusion

We first used only classical computer vision techniques to extract information from an image. This worked well for smaller and simpler images. As the complexity of input images increased, so did the number of edge cases. Adding a machine learning step did not make the classical approach redundant, but greatly increased the quality of the input image by removing most of the unwanted noise.

Author

David Strahm

Software Developer

Often on the bike or in the mountains.

david.strahm@lambda-it.ch

+41 31 550 18 26
