 
36
 
PERVASIVE
 
computing
 
Published by the IEEE CS
1536-1268/11/$26.00 © 2011 IEEE
Large-Scale Opportunistic Sensing
esimig Oigi-Dsiio Flowsusig Mobil phoLoio D
Origin-destination (OD) matrices represent one of the most important sources of information used in the strategic planning and management of transportation networks. A precise calculation of OD matrices is an essential component in enabling administrative authorities to optimize the use of their transportation networks, not only to benefit users on their daily journeys, but also to plan the investments required to adapt these infrastructures to envisaged future needs. Traditionally, urban planning and transportation engineering use household questionnaires or census and road surveys to develop methodologies for OD matrix estimation. This approach has two main drawbacks:
• calculating an OD matrix, from the initial data gathering to the exploitation of the first results, can take years and produce only a snapshot of the travel demand; and
• the collected data has shortcomings in terms of both spatial and temporal scale.
Sensor-based OD estimation methods use street sensors such as loop detectors and video cameras together with traffic-assignment models. Analogous methods have been developed using probe vehicles, in which vehicle traces serve as data sources [1,2]. Those methods are, however, limited by the fact that models are often underdetermined because the number of parameters to be estimated is typically larger than the number of monitored network links [3].
On the other hand, the wide deployment of pervasive computing devices (mobile phones, smart cards, GPS devices, digital cameras, and so on) provides unprecedented digital footprints, telling where people are and when they’re there. Earlier projects have used different methodologies for detecting the presence and movement of crowds through their digital footprints (Flickr photos, mobile phone logs, smart card records, and taxi/bus GPS traces) [4–6]. This fine-grained analysis could dramatically increase our understanding of the use of space and daily commuting flows for urban mobility planning and management. Thus, it’s no surprise that the idea of using mobile phones to monitor traffic conditions isn’t new, as we discuss in the “Related Work in Analyzing Traffic Flow” sidebar.
Although the results from these other studies show great potential for using cellular probe trajectory information to estimate travel demand, all methods must overcome several shortcomings before they can be put into practice. Indeed, as
Using an algorithm to analyze opportunistically collected mobile phone location data, the authors estimate weekday and weekend travel patterns of a large metropolitan area with high accuracy.
Francesco Calabrese and Giusy Di Lorenzo
IBM Dublin Research Laboratory
Liang Liu and Carlo Ratti
Massachusetts Institute of Technology
 
OCTOBER–DECEMBER 2011
Yi Zhang and his colleagues note [7], field tests are needed for the following reasons:
• real coverage areas of cell phone towers are quite different from simulated ones, and vary from urban to rural areas;
• validation of methods to determine a trip’s origin and destination should be performed using real individual mobility data;
• real mobility and calling patterns should be included in the analysis because they crucially influence the methods’ performance;
• existing OD matrices should be used as ground truth to verify the correctness of the estimated results.
Our methodology uses opportunistically collected mobile phone location data to estimate dynamic OD matrices. We address these concerns using a real mobility and calling dataset from 1 million mobile phone users. We use the Boston metropolitan area as a case study and validate our methodology using census survey data for both county and census-tract levels [8]. To our knowledge, both the methodology developed and the data precision and amount are unique.
Mobil pho Ds
The considered dataset consists of anonymous location measurements generated each time a device connects to the cellular network, including:
• when a call is placed or received (both at the beginning and end of a call);
• when a short message is sent or received;
• when the user connects to the Internet (for example, to browse the Web, or through email programs that periodically check the mail server).
In this article, we refer to these events as network connections. The events represent a superset of those in the call detail records used elsewhere [9,10].
We analyzed 829 million mobile location measurements for 1 million devices collected by AirSage (www.airsage.com). This data included not only the ID of the cell tower the mobile phone was connected to, but also an estimation of its position within the cell, which is generated through triangulation by AirSage’s Wireless Signal Extraction technology. Each location measurement $m_i \in M$ is characterized by a position $p_{m_i}$ expressed
Related Work in Analyzing Traffic Flow
Several related studies on analyzing traffic flow have been published in recent years.
Raffaele Bolla and Franco Davoli presented a model for estimating traffic using an algorithm that calculates traffic parameters on the basis of mobile phone location data [1]. Researchers in Rome developed a case study for real-time urban monitoring using aggregated mobile phone data to monitor traffic and movement of vehicles and pedestrians [2]. Randal Cayford and Tigran Johnson analyzed the main parameters to be considered, namely precision, metering frequency, and the number of localizations necessary to achieve accurate traffic descriptions [3]. Several companies worldwide, including ITIS Holdings (Britain), Delcan (Canada), CellInt (Israel), and AirSage and IntelliOne (USA), have begun developing commercial applications of mobile phone-based traffic monitoring.
With the specific goal of measuring OD flows, different mobile phone signaling datasets have been considered and simulated to evaluate the feasibility of estimating trips. Initial work used billing data, consisting of cell phone tower information every time a phone received or made a call [4]. Other research has used mobile phone positions every two hours to infer trips [5], location updates to infer mobile phone movement [6], and cell phone tower handover information [7]. A recent effort estimated the daily OD demand using simulated cellular probe trajectory information (extracted from location updates, handover, and transition of timing advance values) and tested the methodology via the VisSim simulation [8].
References
1. R. Bolla and F. Davoli, “Road Traffic Estimation from Location Tracking Data in the Mobile Cellular Network,” Proc. IEEE Wireless Comm. and Networking Conf., vol. 3, IEEE Press, 2000, pp. 1107–1112.
2. F. Calabrese et al., “Real-Time Urban Monitoring Using Cell Phones: A Case Study in Rome,” IEEE Trans. Intelligent Transportation Systems, vol. 12, no. 1, 2011, pp. 141–151.
3. R. Cayford and T. Johnson, “Operational Parameters Affecting Use of Anonymous Cellphone Tracking for Generating Traffic Information,” Proc. Transportation Research Board Ann. Meeting, 2003.
4. J. White and I. Wells, “Extracting Origin Destination Information from Mobile Phone Data,” International Conference on Road Transportation and Control, IEE, 2002, pp. 30–34.
5. C. Pan et al., “Cellular-Based Data-Extracting Method for Trip Distribution,” J. Transportation Research Board, vol. 1945, 2006, pp. 33–39.
6. N. Caceres, J. Wideberg, and F. Benitez, “Deriving Origin Destination Data from a Mobile Phone Network,” Intelligent Transport Systems, vol. 1, no. 1, 2007, pp. 15–26.
7. K. Sohn and D. Kim, “Dynamic Origin-Destination Flow Estimation Using Cellular Communication System,” IEEE Trans. Vehicular Technology, vol. 57, no. 5, 2008, pp. 2703–2713.
8. Y. Zhang et al., “Daily O-D Matrix Estimation Using Cellular Probe Data,” Proc. Transportation Research Board Ann. Meeting, 2010.
 
www.computer.org/pervasive
in latitude and longitude and a timestamp $t_{m_i}$.
To infer trips from these measurements, we first characterized the individual calling activity and verified that it’s frequent enough to allow monitoring the user’s movement over time with a fine enough resolution. For each user, we measured the interevent time, that is, the time interval between two consecutive network connections (similar to what Marta González and her colleagues measured [10]). The average interevent time measured for the entire population was 260 minutes, much lower than González and her colleagues’ measurement (500 minutes) because we’re also considering mobile Internet connections. Because the distribution of interevent times for a user spans several temporal scales, we further characterized each calling activity distribution by its first and third quantile and the median. Figure 1 shows the distribution of the first and third quantile and the median for all available users in the dataset. The arithmetic average of the medians is 84 minutes (the geometric average of the medians is 10.3 minutes), small enough to detect changes of location where the user stops for as little as 1.5 hours.
Mobile-phone-derived location data has lower resolution than GPS data. Internal and independent testing suggests an average uncertainty radius of 320 meters, and a median of 220 meters. Moreover, at some peak usage periods, additional location errors can be introduced when users are automatically transferred by the network from the closest cellular tower to one that’s further away but less heavily loaded.
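As a rough illustration of the per-user characterization described above, the following Python sketch computes interevent times and their summary statistics. It is not the authors' code: the function names, the input layout (a dictionary mapping a user ID to connection timestamps in seconds), and the reading of "first and third quantile" as the 25th and 75th percentiles are all assumptions made for this example.

```python
import math
from statistics import quantiles


def interevent_minutes(timestamps):
    """Gaps, in minutes, between consecutive connection times given in seconds."""
    ordered = sorted(timestamps)
    return [(later - earlier) / 60.0 for earlier, later in zip(ordered, ordered[1:])]


def user_summary(timestamps):
    """25th percentile, median, and 75th percentile of one user's interevent times.

    Returns None when the user has fewer than two gaps to summarize.
    """
    gaps = interevent_minutes(timestamps)
    if len(gaps) < 2:
        return None
    q1, med, q3 = quantiles(gaps, n=4, method="inclusive")
    return {"q1": q1, "median": med, "q3": q3}


def averages_of_medians(events_by_user):
    """Arithmetic and geometric mean of the per-user median interevent times.

    Assumption: users with a zero median are skipped, because the geometric
    mean is only defined for positive values.
    """
    summaries = (user_summary(ts) for ts in events_by_user.values())
    medians = [s["median"] for s in summaries if s and s["median"] > 0]
    arithmetic = sum(medians) / len(medians)
    geometric = math.exp(sum(math.log(m) for m in medians) / len(medians))
    return arithmetic, geometric
```

Because interevent times span several orders of magnitude, the arithmetic mean of the medians is pulled up by the least active users, whereas the geometric mean stays closer to a typical user. This is consistent with the gap between the 84-minute and 10.3-minute figures quoted in the text.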
Oigi-Dsiioesimio Mhod
The procedure or estimating dynamicOD matrices consists o two steps: tripdetermination and origin-destinationestimation.To alleviate the eects o localizationerrors and event-driven location measure-ments on individual trip determination,we apply a low-pass lter with a 10-minuteresampling rate to the raw data. This ol-lows an approach tested with data romRome, Italy.
11
In addition, because ewerlocalization errors might still generatectitious trips, we adapt a preprocess-ing step used to analyze GPS traces. Thisstep uses clustering to identiy minoroscillations around a common location.The approach used to handle loca-tion errors and identiy meaningullocations in a user’s travel history canbe understood as ollows:
• We begin with a measurement series $M_s = \{m_q, m_{q+1}, \ldots, m_z\} \in M^{z-q+1}$, $z > q$, derived from a series of network connections over a certain time interval $\Delta T = t_{m_z} - t_{m_q} > 0$.
• We define an area with radius $\Delta S$ (in this case, 1 km to account for the localization errors estimated by AirSage), such that $\max \mathrm{distance}(p_{m_i}, p_{m_j}) < \Delta S \;\; \forall\, q \le i, j \le z$.
• All consecutive points $p_{m_j}$, $m_j \in M_s$, for which this condition holds can be fused together such that the centroid becomes a virtual location $p_s = \frac{1}{z-q+1} \sum_{i=q}^{z} p_{m_i}$ (the centroid of the points), that is, a trip’s origin or destination.
• Once the virtual locations are detected, we can evaluate the stops (virtual locations) and trips as paths between users’ positions at consecutive virtual locations. Each trip $\mathrm{trip}(u, o, d, t)$ is characterized by user ID $u$, origin location $o$, destination location $d$, and starting time $t$.
Once trips are extracted, we use the following procedure to derive OD flows:
• The geographical area under analysis is divided into regions: $region_i$, $i = 1, \ldots, n$.
Figure 1. Caracterizatio o idividua caig activity or te etire popuatio,i terms o time betee to etor coectio evets. Graps so tedistributios o te media (soid ie), frst quatie (das-dotted ie), ad tirdquatie (dased ie) o idividua iterevet time.
10
–2
10
–1
10
0
10
1
10
2
10
3
10
4
10
5
00.010.020.030.040.050.060.070.080.090.10Interevent time (minutes)
        D        i      s       t      r        i        b      u       t        i      o      n
First quantile Median Third quantile
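The trip-determination steps listed earlier (fusing consecutive measurements that stay within the 1 km radius into a virtual location, then reading trips as paths between consecutive virtual locations) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the greedy left-to-right scan, the haversine distance, the centroid taken as the plain mean of latitude and longitude, the choice of the last measurement at the origin as the trip's starting time, and every class and function name are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Measurement:
    lat: float  # latitude, degrees
    lon: float  # longitude, degrees
    t: float    # timestamp, seconds


@dataclass
class Stop:
    lat: float      # centroid latitude of the fused measurements
    lon: float      # centroid longitude of the fused measurements
    t_start: float  # time of the first fused measurement
    t_end: float    # time of the last fused measurement


def distance_km(a: Measurement, b: Measurement) -> float:
    """Great-circle (haversine) distance between two measurements, in km."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def _append_if_stop(stops: List[Stop], group: List[Measurement]) -> None:
    # A virtual location needs at least two measurements and a positive time span.
    if len(group) >= 2 and group[-1].t > group[0].t:
        n = len(group)
        stops.append(Stop(sum(m.lat for m in group) / n,
                          sum(m.lon for m in group) / n,
                          group[0].t, group[-1].t))


def detect_stops(series: List[Measurement], delta_s_km: float = 1.0) -> List[Stop]:
    """Fuse runs of consecutive measurements whose pairwise distances stay below delta_s_km."""
    stops: List[Stop] = []
    group: List[Measurement] = []
    for m in sorted(series, key=lambda x: x.t):
        # Checking the new point against every point already in the run keeps
        # the maximum pairwise distance of the run below the threshold.
        if all(distance_km(m, g) < delta_s_km for g in group):
            group.append(m)
        else:
            _append_if_stop(stops, group)
            group = [m]
    _append_if_stop(stops, group)
    return stops


Trip = Tuple[str, Tuple[float, float], Tuple[float, float], float]


def extract_trips(user_id: str, stops: List[Stop]) -> List[Trip]:
    """One (user, origin, destination, start time) tuple per pair of consecutive stops."""
    return [(user_id, (o.lat, o.lon), (d.lat, d.lon), o.t_end)
            for o, d in zip(stops, stops[1:])]
```

Isolated measurements between two stops, such as a single connection made while travelling, form runs of length one and are dropped rather than treated as origins or destinations.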