Shaping Tomorrow - Balancing AI Innovation and Data Privacy in Transportation Modelling


In the dynamic digital world, the intersection of artificial intelligence (AI) and data privacy has
become a critical area of innovation, discussion, and at times, debate. This is particularly true in the
field of transportation modelling, where the potential of AI to revolutionize our understanding and
management of traffic patterns, public transit systems, and logistics is still largely untapped.

As a modelling professional who has spent a significant portion of my career working with various
algorithms, I've been a first-hand witness to the transformative power of these technologies.
However, I've also seen the challenges that come with protecting the vast amounts of data that fuel
AI development. I recall working on PREDEX, a Goleyo software project that used AI to optimize
the impact of roadworks on network routing in a major city. The potential for efficiency
improvement was immense, but so were the concerns about protecting the privacy of those whose
data was being used. This experience, among others, has made me acutely aware of the delicate
balance that must be struck between leveraging AI for progress and ensuring robust data protection.
In this article, we'll delve into this balance, particularly in the context of AI in transportation
modelling, and discuss some innovative approaches to maintaining privacy while still harnessing the
power of AI.

The Impact of AI on Transportation Modelling

Before we delve into how we maintain privacy, it's important to understand where AI can make an
impact in transport modelling. AI has the ability to process and analyze large datasets in real time,
leading to the development of dynamic transportation models that can adapt to changing conditions
on the fly. For instance, Google Maps uses AI to analyze traffic data from millions of devices to
predict congestion and suggest alternative routes to drivers, helping to save time and avoid traffic.

AI has also made it possible to integrate and analyze diverse data sources, from weather patterns to
social media posts, to gain a more holistic understanding of transportation systems. This has led to
more robust and accurate models that can account for a wide range of factors influencing
transportation patterns. For example, the City of San Francisco is using AI to integrate data from a
variety of sources, including weather data, traffic data, and social media data, to better understand
how people are moving around the city. This data is being used to develop more efficient and
effective transportation policies.

Furthermore, AI has been instrumental in advancing autonomous vehicles. Machine learning
algorithms are at the heart of these vehicles, enabling them to navigate complex environments and
make split-second decisions.

The benefits of these advancements are manifold. For individuals, AI-powered transportation
modelling can mean less time spent in traffic and more efficient use of public transit. For cities, it can
lead to better transportation planning and infrastructure development. There is every reason to
expect transportation modelling to benefit substantially from AI.

However, we must also recognize that while AI models can provide short-term predictions with
reasonable accuracy, predicting long-term trends and outcomes in transportation can be less reliable
due to the numerous variables and uncertainties that can change over time. The quality and
availability of historical data, the constant changes in travel patterns due to technological
advancements, changes in policy, and unpredictable factors like natural disasters or pandemics, all
make accurate long-term prediction a complex task.

Data protection regulation makes predicting both short-term and long-term trends and outcomes in
transportation even more challenging. For instance, data protection can limit the amount of data
that is available for analysis, because data protection laws often restrict the collection and use
of personal data. The GDPR, for example, requires a lawful basis, such as the individual's consent,
before personal data can be collected and used. This can make it difficult to collect the large
datasets that are often needed to make accurate predictions.

All these difficulties make it all the more important to balance AI innovation with data protection.
This means ensuring that AI in transport modelling is used in a way that respects individuals' privacy
rights and freedoms.

The Importance of Data Protection

I will not delve too deeply into data protection here. For this article, we just need an idea of
the key data protection elements in the EU, UK and US.

In the EU, the General Data Protection Regulation (GDPR) is the main data protection law. It sets out
principles for the collection, use, and storage of personal data, and gives individuals a number of
rights, including the right to be informed, the right of access, the right to rectification, the right
to erasure (also known as the right to be forgotten), the right to restrict processing, the right to
data portability, and the right to object.

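In the UK, the Data Protection Act 2018 is the main data protection law, and it sits alongside the
UK GDPR. The UK regime is based on the EU GDPR, but there are some differences between the two.
The individual rights listed above, including the right to erasure (the right to be forgotten), are
nonetheless retained in the UK GDPR.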

In the US, there is no single federal data protection law. However, there are a number of federal and
state laws that protect personal data, such as the Fair Credit Reporting Act (FCRA), the Health
Insurance Portability and Accountability Act (HIPAA), and the Children's Online Privacy Protection
Act (COPPA). In addition to these laws, there are also industry-specific data protection standards
such as the Payment Card Industry Data Security Standard (PCI DSS).

All three jurisdictions impose penalties on organizations that violate the law, in addition to the usual
reputational damage that comes with a violation. Generally, data protection involves the collection, use, and
storage of personal data in a manner that safeguards the privacy of individuals.

It's also crucial to remember that data protection isn't just about preserving individuals' privacy. It's
also about safeguarding individuals' rights and freedoms. By ensuring data protection, we can help
guarantee that AI is used in a way that benefits everyone.

Ultimately, the goal of data protection is to strike a balance between protecting individuals' privacy
and ensuring that the data available for analysis is accurate and reliable. This can be a challenging
balance to achieve, but it's important to strive for a solution that benefits all parties involved,
especially in the context of transportation models.

Striking the Balance

It would normally be safe to assume that most modelling consultancies, software vendors, and data
vendors comply with the different data protection regimes in the US, UK, and EU. However, this isn't
always straightforward. For example, the EU GDPR applies to organizations that process the
personal data of individuals located in the EU, regardless of the organization's location. This means
that even organizations outside of the EU may have to comply with the GDPR if they process the
personal data of individuals in the EU, for instance when offering them goods or services or
monitoring their behaviour. In addition, due to recent calls for more oversight of AI, data protection
regulations in the US, UK, and EU are likely to become much stricter in the future. Hence, a new
technical approach towards AI in transport modelling is much needed.

Balancing AI innovation in transport models with data protection is a complex task that requires a
multifaceted approach. For this discussion, the focus will be on developing AI for transport modelling
that is privacy-preserving. This means developing systems that collect and use the minimum amount
of personal data necessary to achieve their intended purpose.
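
As a rough illustration of what using the minimum amount of personal data can look like, the Python
sketch below strips a raw trip record down to a coarse origin grid cell and an hour-of-day bin before
counting trips, and suppresses groups with very few trips. The record fields (device_id, lat, lon,
timestamp), the 0.01-degree cell size, and the min_count threshold are all hypothetical choices made
for this example, not a description of any particular product or a legal standard.

```python
import math
from collections import Counter
from datetime import datetime


def minimise_record(record, cell_size_deg=0.01):
    """Reduce a raw trip record to a coarse origin cell and an hour-of-day bin.

    The device identifier, exact coordinates, and exact timestamp are dropped.
    """
    ts = datetime.fromisoformat(record["timestamp"])
    cell = (
        math.floor(record["lat"] / cell_size_deg),
        math.floor(record["lon"] / cell_size_deg),
    )
    return {"origin_cell": cell, "hour": ts.hour}


def aggregate_trips(records, min_count=2):
    """Count trips per (cell, hour) and suppress groups smaller than min_count."""
    counts = Counter(
        (m["origin_cell"], m["hour"]) for m in map(minimise_record, records)
    )
    return {key: n for key, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Hypothetical raw records, invented for this example.
    raw = [
        {"device_id": "a1", "lat": 51.5074, "lon": -0.1278,
         "timestamp": "2024-03-01T08:15:00"},
        {"device_id": "b2", "lat": 51.5079, "lon": -0.1272,
         "timestamp": "2024-03-01T08:40:00"},
        {"device_id": "c3", "lat": 51.5200, "lon": -0.1000,
         "timestamp": "2024-03-01T17:05:00"},
    ]
    print(aggregate_trips(raw))
```

A demand model that only needs trips per zone per hour can be fed the aggregated counts, so the
device identifiers and exact locations never have to leave the collection step.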

Current and future developments in regulation can drive innovation in privacy-enhancing
technologies. These could include techniques like differential privacy, federated learning, or
synthetic data generation, which can be used to protect privacy while still allowing AI transport
models to learn from data.

Privacy-enhancing AI-based Transport Software

The following paragraphs look in turn at AI transport models based on differential privacy,
federated learning, and synthetic data generation.

Differential privacy (DP) is a mathematical framework for protecting the privacy of individuals when
their data is used in machine learning models. It ensures that the addition or removal of a single
individual's data does not significantly change the output of the model, typically by adding
calibrated random noise to the data, to query results, or to the training process. This makes it
difficult to identify individuals in a dataset, even if the dataset is large and contains sensitive
information. However, DP has some limitations. The added noise can make it harder for machine
learning models to learn accurate patterns, which reduces their accuracy. It can be computationally
expensive for the relevant stakeholders. It can also be difficult to implement, and there is a risk of
introducing more errors into the machine learning model.
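
One common way to add such noise is the Laplace mechanism, sketched here in Python for a simple
count query. The traffic count, the epsilon values, and the assumption that each traveller
contributes at most one trip to the count (a sensitivity of 1) are illustrative choices for this
example only. A smaller epsilon means more noise and stronger privacy.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count, epsilon, rng):
    """Release a count with Laplace noise calibrated to the privacy budget epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Assumption: adding or removing one person changes the count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    true_trips = 1250  # hypothetical trips on one road link in one hour
    for eps in (0.1, 1.0, 10.0):
        noisy = dp_count(true_trips, eps, rng)
        print(f"epsilon={eps}: released count = {noisy:.1f}")
```

With this setup the typical size of the error is about 1/epsilon trips, which is the
privacy-versus-accuracy trade-off in its simplest form.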

While differential privacy may introduce noise and potentially reduce the accuracy of machine
learning models, it's often a matter of finding the right balance. The level of noise introduced can be
adjusted to find a sweet spot between privacy protection and model accuracy. Additionally,
advancements in computational efficiency and error reduction techniques are continuously being
made, making differential privacy an increasingly viable option for many applications.

Federated learning is a machine learning technique that allows multiple parties to train a machine
learning model on their own data without sharing the data with each other. This is done by having
each party train a local model on their own data, and then sharing the updates to the local models
with each other. This allows the parties to train a more accurate model than they would be able to
train on their own data, while still protecting the privacy of their data.
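
The training loop described above can be sketched as a simple form of federated averaging in plain
Python. Everything here is hypothetical: three parties (say, three transit operators) each hold
private (distance, travel time) pairs, fit a one-variable linear model locally, and send only the
model weights to a coordinator that averages them. The use of a central coordinator, the function
names, the learning rate, and the generated data are all assumptions made for illustration.

```python
import random


def local_update(weights, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one party's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Average the parties' weights, weighted by how many records each holds."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


def make_client_data(n, rng):
    """Hypothetical private data: travel time = 2.0 * distance + 5.0 + noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2.0 * x + 5.0 + rng.gauss(0, 0.5)))
    return data


def train_federated(clients, rounds=300):
    """Only model weights cross party boundaries; the raw records never do."""
    weights = (0.0, 0.0)
    sizes = [len(c) for c in clients]
    for _ in range(rounds):
        updates = [local_update(weights, c) for c in clients]
        weights = federated_average(updates, sizes)
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    clients = [make_client_data(50, rng) for _ in range(3)]
    w, b = train_federated(clients)
    print(f"learned slope={w:.2f}, intercept={b:.2f}")
```

The shared model should land close to the slope and intercept used to generate the data, even though
no party ever sees another party's records.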

However, federated learning can be less accurate than traditional, centralized machine learning.
This is because each local model is trained on a smaller dataset and the updates are aggregated,
which can reduce the accuracy of the final model. It can also be more computationally expensive,
because every party has to train and maintain a local model on its own data, especially when these
local models are hosted in a cloud-based environment. Finally, it can be more difficult to
implement, because the parties must agree on a protocol for sharing model updates.

Despite the potential computational costs and implementation challenges, federated learning opens
up new possibilities for collaborative learning without compromising data privacy. Techniques such
as model compression and efficient update aggregation can help mitigate some of the
computational challenges. Furthermore, the potential development of standardized protocols and
open-source federated learning platforms can ease implementation difficulties.

Synthetic data generation is the process of creating artificial data that resembles real data. Synthetic
data can be used for a variety of purposes, such as training machine learning models, testing
software, and simulating real-world scenarios. However, synthetic data generation can be
challenging because it is difficult to create synthetic data that is both realistic and representative of
the real world.
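
A very simple form of synthetic data generation fits a statistical model to the real records and
then samples new records from it. The Python sketch below does this with a two-variable Gaussian
over hypothetical (distance, travel time) pairs. It is a deliberately basic statistical stand-in
rather than a deep generative model, and the function names and generated "real" data are
assumptions for illustration.

```python
import math
import random
import statistics


def fit_bivariate_gaussian(rows):
    """Estimate means, standard deviations, and correlation of (x, y) pairs."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return {
        "mean_x": mean_x, "mean_y": mean_y,
        "std_x": std_x, "std_y": std_y,
        "rho": cov / (std_x * std_y),
    }


def sample_synthetic(params, n, rng):
    """Sample n artificial (x, y) pairs that share the fitted statistics."""
    rho = params["rho"]
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mean_x"] + params["std_x"] * z1
        # Mixing z1 into y reproduces the fitted correlation between x and y.
        y = params["mean_y"] + params["std_y"] * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical "real" trips: distance in km, travel time in minutes.
    real = []
    for _ in range(500):
        d = rng.uniform(1, 20)
        real.append((d, 3.0 * d + rng.gauss(0, 4.0)))
    params = fit_bivariate_gaussian(real)
    synthetic = sample_synthetic(params, 5, rng)
    for d, t in synthetic:
        print(f"distance={d:.1f} km, time={t:.1f} min")
```

The synthetic rows match the overall means and correlation of the real data, but a Gaussian can
still produce implausible records, such as negative distances, which is a small-scale example of
the realism problem.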

While generating realistic and representative synthetic data can be challenging, advancements in AI
and machine learning are making this process increasingly sophisticated. Techniques such as
Generative Adversarial Networks (GANs) are capable of producing highly realistic synthetic data.
Equally, GANs can be difficult to train, and it can take a long time to train them to generate realistic
data.

While each of these privacy-enhancing techniques (differential privacy, federated learning, and
synthetic data generation) comes with its own set of challenges, it's important to remember that
they also offer unique opportunities to harness the power of AI in transportation modelling while
protecting individual privacy. I expect market reality will ultimately decide the fate of these AI
models, based on their cost, availability, and ease of use.

Looking Ahead

The use of AI in transportation modelling will become more widespread. As AI technology continues
to develop, it will become more feasible to use AI to model transportation systems.

The need for data protection will become more important. This is because AI models often rely on
large amounts of data, and this data could be used to track individuals' movements or identify them.

New privacy-preserving techniques will be developed to address this need. These techniques will
allow AI models to be trained and used without compromising individual privacy.

Personally, I see a greater focus on ethical AI ahead. This means these AI models will be developed
and used in a way that respects individual rights and freedoms. In time, the market will naturally
demand it.

Professionals in the field should learn about data protection so that they can understand how to
protect individual privacy when using AI for transportation modelling. You can be sure that Goleyo
will be ready for these changes. Watch this space.

