CYBERTEC PostgreSQL Logo

Analyzing historical flight data with MobilityDB

04.2022 / Category: / Tags: |

MobilityDB is on the move

Since my last article about MobilityDB, the overall project has further developed and improved.
This is a very good reason to invest some time and showcase MobilityDB’s rich feature stack by analyzing historical flight data from OpenSky-Network, a non-profit organisation which has been collecting air traffic surveillance data since 2013.

From data preparation and data visualization to analysis – this article covers all the steps needed to quick-start analyzing spatio-temporal data with PostGIS and MobilityDB together.
For a basic introduction to MobilityDB, please check out previous MobilityDB post first.

The article is divided into four sections:

  • Software requirements
  • Data allocation and preparation
  • Analysis
  • Outlook

Software requirements

First, you need a current PostgreSQL database, equipped with PostGIS and MobilityDB. I recommend using the latest releases available for your OS of choice, even though MobilityDB works with older releases of PostgreSQL and PostGIS, too. Alternatively, use a docker container from https://registry.hub.docker.com/r/codewit/mobilitydb for those who don’t want to build MobilityDB from scratch.
Second, you'll need a tool to copy our raw flight data served as huge csv files to our database. For this kind of task, I typically use ogr2ogr, a command-line tool shipped with gdal.
Lastly, to visualize our results graphically, we’ll utilize our good old friend “Quantum GIS, a feature-rich GIS-client, which is available for various operating systems.

Here's a quick summary of my setup:

  • Ubuntu (20.04.3),
  • PostgreSQL (13),
  • PostGIS (3.2.1),
  • MobilityDB (1.0.0),
  • ogr2ogr (3.0.4),
  • QGIS (3.20.3)

Data allocation and preparation

The foundation of our analysis is historical flight data, offered by OpenSky-Network free of charge for non-commercial usage. OpenSky offers snapshots of the previous Monday's complete state vector data for the last 6 months. These data sets cover information about time (update interval is one second), icao24, lat/lon, velocity, heading, vertrate, callsign, onground, alert/spi, squawk, baro/geoaltitude, lastposupdate and lastcontact. A detailed description can be found at here.

Enough theory; let’s start by downloading historical datasets for our analysis. The foundation for this post are 24 csv files covering flight data for the day 2022-02-28, which can be downloaded from here.

We’ll continue by creating a new PostgreSQL database, with extensions for PostGIS and MobilityDB enabled. The database will store our state vectors.

From OpenSky’s dataset descriptions, we inherit a data structure for a staging table, which stores unaltered raw state vectors.

From a directory containing the uncompressed csv files, state vectors can now be imported to our flightanalysis database as follows:

for f in ls *.csv; do
ogr2ogr -f PostgreSQL PG:"user=postgres dbname=flightanalysis" $f -oo AUTODETECT_TYPE=YES -nln tvectors --config PG_USE_COPY YES
done

How many vectors did we import?

flightanalysis=# select count(*) from tvectors;
count
--------------
52261548
(1 row)

To analyse our vectors with MobilityDB, we must now turn our native position data into trajectories.
MobilityDB offers various data types to model trajectories, such as tgeompoint, which represents a temporal geometry point type.

To track individual flights from airplanes, trajectories must be generated by selecting vectors by icao24 and callsign ordered by time.

Initially, timestamps (column time) are represented as a unix timestamp in our staging table. For ease of use, we’ll translate unix timestamps to timestamps with time zones first.

Finally, we can create our trajectories by aggregating locations by icao24 and callsign ordered by time.

To visualize our aggregated vectors in QGIS, someone must extract the vectors’ raw geometries from tgeompoint, as this data type is not supported natively out of the box from QGIS.
To create a geometrically simplified version for better performance, we’ll utilize st_simplify on top of trajectory. It’s worth mentioning that simplification methods from MobilityDB also exist, which simplify the whole trajectory and not only its geometry.

From the images below, you can see the vast number of vectors these freely available datasets contain just for one single day.

You can see in the image that the data is noisy, and must be further cleaned and filtered. Gaps in coverage and recording lead to trajectories spanning the world - and subsequently lead to wrong and misleading results. Nevertheless, we’ll skip this extensive cleansing step today (but might cover this as a separate blog post) and move on with our analysis focusing on a rather small area.

Figure 1 Flight vectors world, 2022-02-28
Figure 1 Flight vectors world, 2022-02-28

Analysis

Let’s investigate the trajectories and play through some interesting scenarios.

Number of distinct airframes (identified by icao24)

Average flight duration

Flight vectors intersecting with Iceland between [2022-02-28 00:00:00+00, 2022-02-28 03:00:00+00]

For this kind of analysis, download and import country borders from Natural Earth first.

Figure 2 Vectors intersecting with Iceland: MobilityDB Blog
Figure 2: Vectors intersecting with Iceland

Duration of flyover for flight vectors intersecting with Iceland between [2022-02-28 00:00:00+00, 2022-02-28 03:00:00+00]

Figure 3 Clipped vectors intersecting with Iceland: Mobility DB
Figure 3 Clipped vectors intersecting with Iceland
Figure 4 Clipped vector intersecting with Iceland, icao24 '4cc2c5' callsign 'ICE1046'
Figure 4: Clipped vector intersecting with Iceland, icao24 "4cc2c5" callsign "ICE1046"

Border crossing for individual airframe and callsign while intersecting Iceland between [2022-02-28 00:00:00+00, [2022-02-28 03:00:00+00]

Note that in this case, the plane did not leave the country during this trip.

Figure 5 Border crossing, icao24 '4cc2c5' callsign 'ICE1046' - MobilityDB Blog
Figure 5: Border crossing, icao24 "4cc2c5" callsign "ICE1046"

Outlook

MobilityDB’s rich feature stack drastically simplifies analysis on spatio-temporal data. If I may be allowed to imagine further uses for it, various businesses come to my mind where it could be applied in a smart and efficient manner. Toll systems, surveillance, logistics - just to name a few. Finally, its upcoming release will pave the way for new usages, from research and test utilization to production scenarios.

I hope you enjoyed this session. Learn more about using PostGIS by checking out my blog on spatial datasets based on the OpenStreetMap service

In case you want to learn more about PostGIS and Mobility, just get in touch with the folks from CYBERTEC. Stay tuned!

4 responses to “Analyzing historical flight data with MobilityDB”

  1. I have a few queries:

    1. How to use velocity, heading and altitude fields to precisely depict/present the flights on map.
    2. What sort of different analysis can be drawn from this type of data?

    Would you please guide about it?

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2025
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram