
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/313860069

Port ETA Prediction based on AIS Data

Conference Paper · May 2016


4 authors, including:

Thomas Mestl
DNVGL

All content following this page was uploaded by Thomas Mestl on 21 February 2017.



15th International Conference on

Computer and IT Applications in the Maritime Industries

COMPIT’16
Lecce, 9-11 May 2016

Edited by Volker Bertram

Port ETA Prediction based on AIS Data
Thomas Mestl, DNV GL, Høvik/Norway, thomas.mestl@dnvgl.com
Kay Dausendschön, DNV GL, Hamburg/Germany, kay.dausendschoen@dnvgl.com

Abstract

This paper presents three methodologies for port ETA prediction based on AIS data, together with a
quantitative assessment of their application. The first two, ETA-for-Liners and ETA-for-Tramps, apply
to vessels in port. ETA-for-Liners can be used to predict the next n ports and their ETAs, assuming
the future operations of the vessel can be predicted based on its history. ETA-for-Tramps can be used
to predict the next ports together with their probabilities and their ETAs. This method is based on the
behavior of similar vessels that were in the same (current) port as the vessel in question. The third
method applies to ships at sea. It uses the latest position of the ship (based on its AIS signal) and
compares its progress to historic voyages (AIS traces). A reliable prediction of a vessel’s estimated
time of arrival in port (ETA) could benefit numerous stakeholders in the logistics chain.

1. Introduction

Knowing the arrival time of a vessel at a specific port is of interest to many stakeholders, including
port authorities, port operators, bunker providers and ship operators. We distinguish between
short-term and long-term ETA (= Estimated Time of Arrival) prediction. Short term is in the range of
hours to days, whereas long term is from weeks to months ahead. The ETA problem actually consists of
two problems: what is the next port, and what is the ETA for this port?

2. ETA prediction

2.1. Sources of ETA information

From an outsider perspective, not only the ETA is unknown but also what port the vessel is heading
to. No single source supplies all the required ETA information for all ships. Instead, ETA information
from various sources may be merged for more reliable and complete insight:

• Notification to port authorities – An EU directive requires notifying local port authorities 24 h
prior to arrival. Some ports make this information public, e.g. Hong Kong, Montreal or
Dubai.
• Piloting – Public information about ordered pilots offers ETAs, usually in the range of hours
but sometimes also weeks ahead.
• DNV GL Navigator/Port Clearance – Useful only for historical data, as data are transferred only
on arrival in port.
• Port schedules from port operators – Mainly for cruise vessels which have a rather stable
schedule.
• Vessel schedules from operators – Many ship operators publish their schedules, especially for
containerships. Given arrival times are generally “planned”, not “actual”.
• Departure/Arrival search engines – Several search engines on the internet give presumably
planned, not actual arrival times, e.g. www.linescape.com/, www.jocsailings.com
• AIS messages – AIS messages contain free-text fields <Destination port> and <ETA>. These
are filled in manually. As a rule rather than an exception, these fields are not filled in at
all, filled in wrongly, e.g. “Haburg” (instead of Hamburg) or “going home”, not updated, or
still contain the previous destination port. Our own research has shown that, e.g., for the
port of Hamburg only 4% of the ships filled in the destination correctly.
• AIS provider – Vesseltracker claims that “based on arbitrary positions on the globe we
calculate distances, journey times and ETA”. Marine Traffic offers an interactive, map-based
service providing ETAs and offers even an alert service. However, there is no public
information on how these providers determine their ETA predictions.
• DNV GL’s in-house AIS-based insight system – DNV GL has built a large database with
global AIS data since 2012 for all vessels with an AIS transponder. Data samples are
spaced 6-10 minutes apart. The system identifies presence in ports, and in some cases even
terminals, automatically. The system has been used for a variety of applications supporting
ship operators in improving their competitiveness, as described e.g. in Dausendschön (2015).

2.2. Sailing patterns and effect on ETA prediction

We might distinguish between two extremes: one where the future is totally predictable when
knowing the history of the vessel, and one where the future of a vessel cannot be predicted based on
its history.

• Recurring patterns – Liners (e.g. containerships) serve a fixed route over a foreseeable time.
They feature recurring port patterns, Fig. 1. In this case, vessel voyage histories can be used
to predict future destination ports and their ETAs. However, occasionally ports might be
skipped and served by sister vessels instead.
• Erratic patterns – Tramps (e.g. bulk carriers) have no recurring patterns. Predicting the next
port (and ETA) based on the vessel’s history is usually not possible; a different approach is
needed.

Fig. 1: Example of liner port pattern (source: Maersk)

3. ETA prediction algorithms for ships in port

3.1. ETA for liner shipping

Here we assume that the history of a vessel (port sequences and times) suffices to predict its future.
The next port is determined by looking for a chain of previous ports that uniquely determines the next
port. The algorithm does not work if the current port has never been visited before. The ETA for the
predicted next port is based on the typical sailing time of the vessel between the current port and the
next port plus the typical port stay time in the current port. The “typical” time is taken as the median
rather than the mean of the historically recorded times. If the next port cannot be determined uniquely,
the algorithm gives probabilities for possible next ports along with their ETAs.
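
The following is a minimal Python sketch of the idea described above, not the implementation used in the DNV GL system. It assumes that the port history is a plain list of port codes ending with the current port, that the longest matching chain of previous ports is preferred (falling back to shorter chains), and that times are given in hours. The function names `predict_next_port` and `predict_eta` and the parameter `max_chain` are hypothetical.

```python
from collections import Counter
from statistics import median


def predict_next_port(history, max_chain=3):
    """Return {next_port: probability} for a vessel whose port history
    (oldest first) ends with the current port.

    Assumption: the longest chain of most recent ports that occurred
    earlier in the history is used; shorter chains are a fallback.
    """
    for k in range(min(max_chain, len(history)), 0, -1):
        chain = tuple(history[-k:])
        # Count which port followed each earlier occurrence of the chain.
        followers = Counter(
            history[i + k]
            for i in range(len(history) - k)
            if tuple(history[i:i + k]) == chain
        )
        if followers:
            total = sum(followers.values())
            return {port: n / total for port, n in followers.items()}
    return {}  # current port never visited before: no prediction possible


def predict_eta(arrival_h, port_stays_h, sailing_times_h):
    """ETA (hours) = arrival in current port + median historic port stay
    + median historic sailing time to the predicted next port."""
    return arrival_h + median(port_stays_h) + median(sailing_times_h)


# Illustrative data only: a vessel cycling through three ports.
ports = ["A", "B", "C", "A", "B", "C", "A", "B"]
print(predict_next_port(ports))
print(predict_eta(0.0, [10, 12, 50], [30, 32, 31]))
```

Using the median keeps a single unusually long port stay (50 h in the example data) from shifting the estimate, which is the reason the text gives for preferring it over the mean.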

3.2. ETA for tramp shipping

Here we assume that historic information from similar vessels in the same port can be used. This
approach results in much larger uncertainties. It depends on the number of similar vessels which are in
a similar situation, and especially on the definition of “similar”. If we look only at vessels of the same
type and size, operating in the same market segment (same cargo, comparable size of operating
company, etc.) we might end up with too small a data set. If “similar” is interpreted too widely, we end
up comparing apples with pears, as a tanker may have different destination ports from a bulk carrier.
Even with the right balance between too strict and too loose, the inherent uncertainty may still be too
high to provide actionable information.

Filtering criteria for “similar” ships include:


• Ship type (e.g. bulk carrier)
• Time period since current time (e.g. within last 6 months)
• Calling at same terminals / berths (indicative of cargo type, e.g. coal)
• …

Generally, anything that constrains the variability in vessels improves predictability of next ports and
the corresponding ETAs. Ship operator / owner characteristics may give some additional insight.
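
As an illustration of such filtering, the sketch below selects historic departures of “similar” vessels from the current port and derives next-port probabilities and a median sailing time per port. It is only a sketch under stated assumptions: the record layout (`Voyage`), the function name `tramp_next_ports`, the default six-month window and the sample data are all hypothetical and not taken from the system described in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class Voyage:
    """One historic departure from the current port (hypothetical layout)."""
    ship_type: str    # e.g. "bulk carrier"
    berth_type: str   # e.g. "coal", indicative of cargo type
    days_ago: int     # age of the record relative to now
    next_port: str    # port the vessel sailed to
    sailing_h: float  # sailing time to that port in hours


def tramp_next_ports(voyages, ship_type, berth_type=None, max_days_ago=180):
    """Return {next_port: (probability, median sailing time in hours)}
    based only on voyages that pass the similarity filters."""
    times_by_port = defaultdict(list)
    for v in voyages:
        if v.ship_type != ship_type or v.days_ago > max_days_ago:
            continue
        if berth_type is not None and v.berth_type != berth_type:
            continue
        times_by_port[v.next_port].append(v.sailing_h)
    total = sum(len(t) for t in times_by_port.values())
    return {
        port: (len(t) / total, median(t))
        for port, t in times_by_port.items()
    }


# Illustrative records only.
sample = [
    Voyage("bulk carrier", "coal", 30, "NLRTM", 20.0),
    Voyage("bulk carrier", "coal", 60, "NLRTM", 24.0),
    Voyage("bulk carrier", "grain", 90, "GBHUL", 40.0),
    Voyage("bulk carrier", "coal", 400, "BRARB", 500.0),  # too old
    Voyage("tanker", "oil", 10, "NOBGO", 48.0),           # wrong ship type
]
print(tramp_next_ports(sample, "bulk carrier"))
print(tramp_next_ports(sample, "bulk carrier", berth_type="coal"))
```

Adding the berth filter shrinks the candidate list but also the data set, which is the trade-off between too strict and too loose a definition of “similar” discussed above.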

4. ETA for ships on voyage

The methodology presented here relies heavily on information from the voyage database. As the
voyage database only contains finished sailings, it is by design not up-to-date! Once a vessel has left
port and started its voyage to the next port, at some point it will become apparent what the next
destination port will be, Fig. 2. Then the next port and ETA prediction can be updated. We could
either update the list and probabilities of the next ports as given by ETA-for-Liners/Tramps, or we
could discard this information and predict potential next ports based on the historic traces (either of
the vessel in question or of a selection of vessels) that are found in the vicinity of the current position.
The latter approach would result in a longer list of probable next ports and correspondingly lower
probabilities for each port.

Fig. 2: While the vessel is in Port X, the ETA algorithm may suggest two potential next ports Y1 and
Y2 with probabilities P1 and P2. Once the vessel has sailed for a time TAIS and reached position PosAIS
it is apparent that the vessel is on a route to port Y1. This means the probability of going to port Y1
has changed to P1 = 1 and the ETA should be recomputed based on the current position PosAIS.

As ships often sail on common sailing routes, this information can be used to update the list of next
possible ports (and their probabilities). If a vessel is on a specific route leading to a set of possible
next ports we can immediately exclude ports that will not be reached from that route, Fig. 1. So when
a vessel passes a route branching point/area the next-port list can be updated. In general, the longer the
vessel is on its way the more branching points will be passed and the more reliable the prediction will
become.
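
The update at a branching point can be sketched as follows. This is a minimal illustration, assuming that the set of ports still reachable from the vessel’s current route is already known and that the remaining probabilities are simply renormalized so that they sum to one. The renormalization is our assumption here, not a step stated in the text, and the function name `update_after_branch` is hypothetical.

```python
def update_after_branch(port_probs, reachable_ports):
    """Drop ports that can no longer be reached from the current route
    and renormalize the probabilities of the remaining ports.

    port_probs: {port: probability} from the in-port prediction.
    reachable_ports: set of ports still reachable after the branching point.
    """
    kept = {p: pr for p, pr in port_probs.items() if p in reachable_ports}
    total = sum(kept.values())
    if total == 0:
        return {}  # none of the predicted ports is reachable any more
    return {p: pr / total for p, pr in kept.items()}


# Situation of Fig. 2: two candidate ports, then the route to Y2 is ruled out.
print(update_after_branch({"Y1": 0.6, "Y2": 0.4}, {"Y1"}))
```

With a single reachable port left, its probability becomes 1, which corresponds to the situation described for Fig. 2.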

Natural constrictions in the shipping geography, Fig. 3, often make it impossible to separate different
sailing routes before the constrictions have been passed. E.g., from Las Palmas the next possible ports
may be Rotterdam, Hamburg or Copenhagen. The English Channel bundles all the sailing routes to
these ports, so it becomes apparent where the ship will go only after it has passed the Channel.
Other such natural constrictions include the Strait of Gibraltar, Strait of Malacca, Singapore Strait,
Cape of Good Hope, Suez Canal, Panama Canal, etc.

Fig. 3: Natural constrictions in sailing patterns (source: www.revistamilitar.pt)

For updating the potential next ports based on the current ship’s position, the general sailing route
network with its branching points/areas can be used, Fig. 4. Sailing routes can be extracted from
global AIS data. They can be tailored to ship size, ship type, selected time period or even season.

Fig. 4: General sailing routes derived from AIS samples

The algorithm for updating next port(s) is based on historic AIS traces. Our approach works even if
very few historic traces are available. It provides the probability that the vessel in question is on
route R1 to port P1 or on route R2 to port P2 (or, in general, on route Ri). Whether the vessel is on
route R1 or R2 is to a large degree answered by how close its current position and heading are to Ri.
A vessel may be heading towards P2 even if it is on the route to P1, as two routes may cross. So in
some cases, additional information must be processed to come to the correct prediction.
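
One possible way to turn proximity and heading into route probabilities is sketched below. The paper does not specify the scoring, so everything here is an assumption made for illustration: each route is represented by historic AIS samples (latitude, longitude, heading), a sample counts as a match if it lies within `max_km` of the current position and within `max_deg` of the current heading, and probabilities are the relative match counts. All names and thresholds are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def heading_diff(h1, h2):
    """Smallest absolute difference between two headings in degrees."""
    d = abs(h1 - h2) % 360
    return min(d, 360 - d)


def route_probabilities(pos, heading, routes, max_km=20.0, max_deg=30.0):
    """routes: {route_name: [(lat, lon, heading), ...]} of historic samples.
    Returns {route_name: probability}; all zeros means 'unknown' route."""
    matches = {
        name: sum(
            1
            for lat, lon, h in samples
            if haversine_km(pos[0], pos[1], lat, lon) <= max_km
            and heading_diff(heading, h) <= max_deg
        )
        for name, samples in routes.items()
    }
    total = sum(matches.values())
    if total == 0:
        return {name: 0.0 for name in matches}
    return {name: n / total for name, n in matches.items()}


# Illustrative samples: R2 crosses the position but on an opposite heading.
routes = {
    "R1": [(50.05, 0.05, 65.0), (50.0, 0.1, 75.0)],
    "R2": [(50.02, -0.02, 250.0)],
}
print(route_probabilities((50.0, 0.0), 70.0, routes))
```

Including the heading in the match criterion is what separates two crossing routes in this sketch; position alone would assign weight to both.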

What if the vessel is heading towards a port which was not predicted by the ETA algorithm or if no
historic AIS traces are available for that vessel? Every time new AIS arrival data appear, the next port,
probability and corresponding ETA are computed and the corresponding entries in the sailing table
are updated.

Then the probabilities for next ports are computed based on historic traces of the vessel. If a vessel is
on known routes/towards known ports, the list of next ports, their probabilities and ETAs can be
updated in the sailing table. If the probabilities for all known routes are zero, the vessel is heading
towards an ‘unknown’ port. Then we could extract AIS traces near the current location for all similar
ships and derive potential routes. For these routes potential next ports and associated probabilities and
ETAs are again derived. Taking in additional information (e.g. available terminals or port histories for
sister vessels) allows reducing the potential port list.

The voyage database cannot be used to update the ETA. In order to update the ETA one has to extract
from the large AIS database all historic sailings of the considered and similar vessels between ports
X→Y, Fig. 5. The updated ETA is the typical remaining sailing time of all the ships that were close
to the considered location, i.e. for each such historic sailing j,
$\mathrm{ETA}_j = T_{\mathrm{end},j} - T_{\mathrm{AIS},j}$.

Fig. 5: Case where general sailing routes approach is not applicable
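
The remaining-time estimate can be sketched as follows. This is a simplified illustration, assuming that each historic sailing X→Y is a list of (time in hours, latitude, longitude) samples whose last sample is the arrival, that “close” means within a fixed radius of the current position, and that “typical” is again taken as the median. The function names and the 25 km radius are hypothetical.

```python
import math
from statistics import median


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def remaining_sailing_time_h(traces, pos, radius_km=25.0):
    """Median of T_end - T_AIS over all historic traces that passed within
    radius_km of pos. Returns None if no trace came close enough.

    traces: list of traces, each [(t_h, lat, lon), ...] ending at arrival.
    """
    remaining = []
    for trace in traces:
        t_end = trace[-1][0]
        # Sample of this trace that lies closest to the current position.
        t_ais, lat, lon = min(trace, key=lambda s: haversine_km(pos, (s[1], s[2])))
        if haversine_km(pos, (lat, lon)) <= radius_km:
            remaining.append(t_end - t_ais)
    return median(remaining) if remaining else None


# Illustrative traces only; the third one never comes near the position.
traces = [
    [(0, 28.1, -15.4), (50, 40.0, -10.0), (100, 50.0, 0.0), (120, 51.9, 4.1)],
    [(0, 28.1, -15.4), (60, 40.1, -10.0), (110, 50.05, 0.05), (134, 51.9, 4.1)],
    [(0, 28.1, -15.4), (80, 36.0, -6.0), (100, 36.1, -5.3)],
]
print(remaining_sailing_time_h(traces, (50.0, 0.0)))
```

The updated ETA is then the time of the latest AIS position plus the returned remaining sailing time.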

5. Validation

5.1. Validation of ETA-for-Liners

As a test set, 136 container vessels were chosen, representing collectively some 25,000 port transitions.
We used the last 10% of their known port transitions to benchmark the algorithm. Fig. 6 gives an
overview of the number of port transitions for each vessel in the test set. About 10 vessels had very
few port transitions (i.e. short port histories), which made it difficult to predict the next port correctly.
In this validation all vessels are assumed to be in port, i.e. no vessel is under way.

Fig. 6: Histogram of number of port transitions for the test set of containerships

Fig. 7: Certainty of prediction assigned by the ETA tool for correct predictions (left) and false
positives (right)

The test shows that 2/3 of the next ports could be identified correctly. Of the next ports that were
identified incorrectly, ~1/3 were visited for the first time, so the history of the vessels could
not be used to predict the next port at all. Fig. 7 shows how certain the algorithm is in its
prediction of the next port. For 82% of the successfully predicted ports, the ETA algorithm assigned
a certainty of >90%, Fig. 7 (left). In cases where the algorithm was wrong about the
next port (false positives), the certainty assigned to it was significantly lower, Fig. 7 (right).

Fig. 8 shows the error of the ETA estimation, i.e. the difference between the calculated ETA and the
actual arrival time. For 64% of the correctly identified next ports, the error was less than ±12 h, and
for 91% less than ±1.5 days. The typical (median) error was 8 h. This can be considered a good result
given the median sailing time of the vessels.
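
Error statistics of this kind can be computed as in the following sketch. It assumes predicted and actual arrival times are available as numbers of hours on a common time axis; the function name `eta_error_summary` and the sample values are hypothetical and do not reproduce the paper’s data.

```python
from statistics import median


def eta_error_summary(predicted_h, actual_h, tolerances_h=(12, 36)):
    """Return (median absolute error in hours,
    {tolerance: share of predictions with |error| < tolerance})."""
    abs_errors = [abs(p - a) for p, a in zip(predicted_h, actual_h)]
    shares = {
        tol: sum(e < tol for e in abs_errors) / len(abs_errors)
        for tol in tolerances_h
    }
    return median(abs_errors), shares


# Illustrative values only (hours).
predicted = [10, 20, 30, 40]
actual = [12, 15, 60, 40]
print(eta_error_summary(predicted, actual))
```

The default tolerances of 12 h and 36 h correspond to the ±12 h and ±1.5 day bands quoted above.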

Fig. 8: Error in the ETA for next ports that were identified correctly

Note that this next port and ETA prediction methodology depends entirely on the assumption that the
history of the vessel alone indicates its future behavior. If, due to macro-economic changes, the ship
owner/charterer changes the routes of their vessel then the ETA methodology will perform poorly. On
the other hand, the tool could be used to detect alterations/trends in route patterns and therefore
changes in the macro economy with changes in trading patterns.

5.2. Comparing ETA-for-Liners with ETA-for-Tramps

In order to compare our two approaches (ETA-for-Liners and ETA-for-Tramps) we establish a
baseline by using the vessel’s history to predict its next ports, i.e. using ETA-for-Liners on
tramps.

The test set contained 345 bulk carriers that visited dry-bulk terminals in Hamburg port in 2015.
As before, the last 10% of the port visits were used to test the algorithm. The correct next port prediction
rate was only 10%. (Note that we consider a prediction to be successful if the correct port is contained
in the list of predicted ports even if not with highest probability.) Almost 60% of all next ports in the
test sets were visited for the first time (in the chosen time period), which confirms that the history of
tramps is ill suited for predicting future ports.

In order to get an indication of the success rate of ETA-for-Tramps, preferably a large number of bulk
ports should be chosen and the corresponding largest probability of the next port should be computed.
The resulting distribution of largest probabilities would provide an estimate of the next port prediction
accuracy of the ETA-for-Tramps. Unfortunately, this task is very time consuming, in the range of
hours for one port. We therefore restricted our validation to the study of some selected ports, 5 in
Europe (DEHAM, DEBRE, GBHUL, RUMMK, ITRAN), 2 in South America (BRARB, ARBUE)
and 1 in the US (USNNS), Table I.

Table I: Maximum probability of predicting the next port correctly based on ETA-for-Tramps

Port     # vessels   Max. correct probability for next port prediction
DEHAM    345          7.8%
ITRAN    282          8.0%
USORF    214          6.6%
RUMMK    194          5.7%
USNNS    147         18.4%
BRARB    143          7.6%
DEBRE    111         18.6%
GBHUL     29         10.2%
ARBUE     28         20.0%
MEAN     166         11%

In general, the more vessels are in a port, the more difficult it gets to predict the next port correctly
(28 vessels give 20% probability whereas 345 vessels give 7.8%). The average correct next-port
prediction probability is near 11%. This is in roughly the same range as the 10% obtained with the
ETA-for-Liners baseline.

5.3. Deep dive for a tramp example

This case study provides more details with respect to correct next-port predictions of both approaches.
We look at the Hamburg example from the previous section, applying successive constraints:

• Constraint 1 – dry cargo only: 345 bulk carriers visited dry-bulk berths in Hamburg port in
2015. For Hamburg, there are 127 possible next ports with a probability of correct prediction
of only 0.3% - 7.8%. If we base our prediction on the vessel history (ETA-for-Liners), we can
correctly predict the next ports for 5.5% of the ships, which is only slightly less than the
ETA-for-Tramps approach. (As before, correct means that the actual next port is contained in
the next-port list suggested by the algorithm, not necessarily having the highest probability.)
• Constraint 2 – in addition requiring chemicals berths: Based on these constraints the
ETA-for-Tramps approach resulted in a higher probability of 2.4% - 14.3% for predicting the
correct next port. In comparison, if we base our prediction on the history of the vessels
(ETA-for-Liners), we are correct for 5% of the involved ships, which is less than half of that
for the ETA-for-Tramps approach.
• Constraint 3 – in addition requiring self-discharging bulk: Constraining even further, the
ETA-for-Tramps approach resulted in a probability of 12.5% - 25% for predicting the next
port correctly. Using the ETA-for-Liners approach, we are correct in 20% of the next ports,
which is only slightly less than the ETA-for-Tramps approach.

6. Conclusions

For vessels with non-recurring port sequence patterns, and if data about many port visits are available,
the prediction rate of the ETA-for-Tramps method seems to be in the same range as that of the
ETA-for-Liners approach, i.e. 10-11% correct prediction of the next port. By introducing constraints
(i.e. additional knowledge) the ETA-for-Tramps method provides a significantly better prediction rate
than using the history of the vessel. This is plausible, as filtering out vessels that are not similar should
provide a more usable data set and hence better prediction. However, using too many/too stringent
constraints will result in a single data set – that of the vessel in question. Which method gives the best
prediction depends on the nature of the vessel, i.e. whether it shows more Tramp-like (erratic) or
Liner-like (regular) behavior. In practice, we suggest applying both methods, as we often don’t know
whether the vessel in question behaves regularly or irregularly. In economically challenging times, ship
owners may jump on available offerings and a vessel may then behave as a Tramp, whereas in better
times the charterer may want to make sure he has a vessel and books it further in advance. The same
vessel will then show a more regular, predictable pattern.

References

DAUSENDSCHÖN, K. (2015), Big Data – Business Insight Building on AIS Data, 14th Conf.
Computer and IT Applications in the Maritime Industries (COMPIT), Ulrichshusen
