Part 1: Detecting Truck Parking Lots on Satellite Images

This post describes a student group project developed within the Data Science Lab undergraduate course of the Vienna University of Economics and Business, co-supervised by Trustbit.

Student project team: Michael Fixl, Josef Hinterleitner, Felix Krause and Adrian Seiß

Supervisors: Prof. Dr. Axel Polleres (WU Vienna), Dr. Vadim Savenkov (Trustbit)

Introduction

Real-time truck tracking is crucial in logistics: to enable accurate planning and provide reliable estimates of delivery times, operators build detailed profiles of loading stations, including expected durations of truck loading and unloading as well as resting times. Yet how can an exact truck status be derived from mere GPS signals? Knowing the exact position and shape of truck parking lots helps to determine whether a truck is being loaded or is just waiting nearby. Often, however, truck parking lots are not fully recorded. In this post we describe a machine learning approach to detecting parking lot shapes on satellite images. If you would like to check out the details of the project or reproduce it, the code can be found on GitHub.

Building a dataset

Our first task is to obtain an annotated dataset of satellite images, so we resort to open data from OpenStreetMap to get both imagery and parking lot annotations, via the open dataset published by Google in BigQuery. To increase the sample size, we use two satellite imagery sources recorded at different points in time, which therefore contain different image information. As you can see in the samples below, the images of the same parking lot differ in resolution as well as in time of recording (visible, for example, in the size of the tree in the center of the image). Thus, we can use almost all filtered parking lot shape annotations (shown below in blue) twice. The result is a dataset of slightly more than 1000 satellite images of truck parking lots with corresponding parking lot shape data, ready to be used for training models.

The approach

With the training data at hand, we create a model capable of predicting the exact shape of parking lots. We approach this task with segmentation techniques. These methods divide an image into regions of predefined classes, so-called segments. As input they take a matrix with the RGB values of each pixel of an image and, for training, a matrix with the label of each pixel (called the mask). After training, the model assigns every pixel of an image to an object class and returns a matrix with the predicted class of each pixel.
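To make the notion of a mask concrete, here is a minimal, dependency-free sketch of how a polygon annotation could be turned into a per-pixel label matrix. It is an illustration only, not the project's actual preprocessing: the function names, the ray-casting approach and the example outline are all assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, width, height):
    """Return a height x width mask: 1 where the pixel centre lies inside the polygon."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical parking lot outline in pixel coordinates (x, y)
lot = [(1, 1), (5, 1), (5, 4), (1, 4)]
mask = rasterize(lot, width=6, height=5)
for row in mask:
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 1, 1, 1, 1, 0]
# [0, 1, 1, 1, 1, 0]
# [0, 1, 1, 1, 1, 0]
# [0, 0, 0, 0, 0, 0]
```

In practice a library routine would do this far faster; the point here is only the shape of the data: one label per pixel, aligned with the image.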

In the following we assess five commonly used image segmentation techniques: Mask R-CNN, U-Net, FPN, LinkNet and PSPNet. To simplify the task, we first train the models in a small baseline setting using a ResNet50 backbone with pre-trained weights, a sample of the full dataset and restricted training time. The “mean intersection over union” (mIoU) metric is used to compare the models. For each image, IoU is the ratio of the intersection of the predicted mask and the true parking area to their union; the final metric is the average IoU over the test image dataset.
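The metric described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the project's evaluation code; the function names, the toy masks and the convention for two empty masks are assumptions.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks (lists of rows of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match
    return intersection / union if union else 1.0


def mean_iou(pred_masks, true_masks):
    """Average the per-image IoU over a set of images."""
    scores = [iou(p, t) for p, t in zip(pred_masks, true_masks)]
    return sum(scores) / len(scores)


# Two toy "images": the first prediction overlaps partially, the second exactly
truth_a = [[1, 1, 0], [1, 1, 0]]
pred_a = [[1, 1, 1], [0, 0, 0]]   # intersection 2 pixels, union 5 pixels
truth_b = [[0, 1], [0, 1]]
pred_b = [[0, 1], [0, 1]]         # identical masks

print(f"{iou(pred_a, truth_a):.2f}")                                  # 0.40
print(f"{mean_iou([pred_a, pred_b], [truth_a, truth_b]):.2f}")       # 0.70
```

Because the score is averaged per image, a small parking lot that is missed completely weighs as much as a large one that is missed completely.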

Key findings of the comparison

Assessing Mask R-CNN

Our first candidate is Mask R-CNN. In contrast to the other models considered, Mask R-CNN identifies each object instance of a particular type separately, rather than the union of all pixels belonging to a given class. You can see this ability in the images below, where every predicted parking lot has its own color.
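The difference between the two output types can be shown with a small sketch. One plausible way to score instance predictions with a per-class metric such as mIoU is to merge the per-instance masks into a single class mask; the post does not say how this was done, so the function below and its toy data are assumptions for illustration.

```python
def merge_instances(instance_masks):
    """Union of per-instance binary masks -> one binary class mask."""
    height = len(instance_masks[0])
    width = len(instance_masks[0][0])
    return [
        [1 if any(m[r][c] for m in instance_masks) else 0
         for c in range(width)]
        for r in range(height)
    ]


# Two hypothetical parking lots detected as separate instances
lot_1 = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
lot_2 = [[0, 0, 0, 1],
         [0, 0, 0, 1]]

semantic = merge_instances([lot_1, lot_2])
print(semantic)  # [[1, 1, 0, 1], [1, 1, 0, 1]]
```

The merged mask no longer tells the two lots apart, which is exactly the information a semantic segmentation model does not provide.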

As we can see in the samples, the performance of this architecture was not very convincing for our task, and training took up to seven times longer than for the architectures discussed below. The model often detects rooftops and streets as truck parking lots and frequently fails to recognize the true parking areas. As expected, the mIoU of approximately 26% is quite low, so Mask R-CNN is not shortlisted for the final experiment. Let’s hope that other techniques produce better results for our problem.

Assessing semantic segmentation models

The remaining four models, namely U-Net, FPN, LinkNet and PSPNet, all belong to the class of semantic segmentation architectures. These architectures usually consist of an encoder and a decoder. While the encoder uses filters to extract features from an image, the decoder generates the final output, a mask of the predictions. The architectures differ in the exact implementation and structure of the encoder and decoder, which in turn influences the final predictions [1].
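As a very rough illustration of how data flows through an encoder and a decoder, the toy sketch below shrinks a feature map with 2x2 max pooling and restores the original size with nearest-neighbour upsampling. Real architectures use learned convolutional filters and differ in how the two halves are connected; none of this code comes from the project, and the function names are made up.

```python
def encode(feature_map):
    """Toy encoder step: 2x2 max pooling halves height and width."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, w - 1, 2)]
        for r in range(0, h - 1, 2)
    ]


def decode(feature_map):
    """Toy decoder step: nearest-neighbour upsampling doubles height and width."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


image = [[1, 2, 0, 0],
         [3, 4, 0, 0],
         [0, 0, 5, 6],
         [0, 0, 7, 8]]

compressed = encode(image)
restored = decode(compressed)
print(compressed)  # [[4, 0], [0, 8]]
print(restored)    # [[4, 4, 0, 0], [4, 4, 0, 0], [0, 0, 8, 8], [0, 0, 8, 8]]
```

The restored map has the input's size but has lost fine detail, which hints at why the way a decoder recovers spatial information matters for the sharpness of predicted parking lot outlines.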

After numerous test runs on Google Colab, the PSPNet architecture turned out to perform best, with a promising mIoU of 69% already in the baseline setting and a rather low training time of just a few minutes. The runner-up in our comparison was LinkNet with an mIoU of 65%, while the other two candidates, FPN (58%) and U-Net (50%), performed noticeably worse.

Let’s now see what optimizing the PSPNet architecture can bring. With additional data and hyperparameter tuning we obtain a decent performance increase and reach an mIoU of 73.65%. This increase in predictive power is also clearly visible in the sample images below. Sometimes, however, the PSPNet model fails to recognize the parking area correctly, as in the rightmost image.

Conclusion

Overall, PSPNet showed stunning accuracy on the test set compared to the other algorithms tested. However, once we use out-of-sample data, performance is not very convincing. In the next blog post, we will therefore try to improve generalizability and also test whether the code is easily transferable to other machines.

References:

[1] Source papers:

U-Net: U-Net: Convolutional Networks for Biomedical Image Segmentation

FPN: Feature Pyramid Networks for Object Detection

LinkNet: LinkNet: Exploiting Encoder Representations for Efficient Semantic...

PSPNet: Pyramid Scene Parsing Network

Image sources:
Esri, Maxar, Earthstar Geographics, CNES/Airbus DS, and the GIS User Community
