
Sustainable Cities and Society

Impact of transportation network companies on urban congestion: Evidence from large-scale trajectory data
--Manuscript Draft--

Manuscript Number: SCS_2019_2571R1

Article Type: Full Length Article

Keywords: urban traffic congestion; transportation network companies; emissions; trajectory data

Abstract: We collect vehicle trajectory data from major transportation network companies (TNCs)
in New York City (NYC) in 2017 and 2019, and we use the trajectory data to
understand how the growth of TNCs has impacted traffic congestion and emissions in
urban areas. From 2017 to 2019, the number of for-hire vehicles (FHVs) increased
by over 48%. This results in an average citywide speed reduction of 22.5% on
weekdays, and the average speed in Manhattan decreased from 11.76 km/h to 9.56
km/h in March 2019. The heavier traffic congestion has led to 136% more NOx, 152%
more CO and 157% more HC emissions per kilometer traveled by the FHV sector. Our
results show that traffic conditions are consistently worse across different times of
day and at different locations in NYC. We also connect the
number of available FHVs with the reduction in travel speed between the two years of
data and explain how the rise of TNCs may affect traffic congestion in terms of moving
speed and congestion time. The findings of our study provide valuable insights for
different stakeholders and decision-makers in framing regulation and operation policies
towards more effective and sustainable urban mobility.

Response to Reviewers (without Author Details)

Impact of transportation network companies on urban congestion: Evidence from large-scale trajectory data

Responses to Reviewers
Dear editor and all three reviewers:

The authors would like to thank you for your insightful comments. We really appreciate your efforts in thoroughly
examining our submitted manuscript, and your input has helped immensely in improving the quality of our paper.
We have made comprehensive revisions to address all your comments. We have marked major changes in our revised
manuscript in blue. A summary of the revisions follows:
1. We have expanded the scope of our study from Manhattan to the entire NYC (except for Staten Island, where few
trips were recorded). All descriptions of the data and the numerical experiments have been updated to reflect this change. We
hope that this change delivers more comprehensive analyses and leads to more convincing findings for the
entire NYC, compared with our previous discussion, which was limited to Manhattan.

2. We have changed the approach for calculating the borough-wide and citywide average travel speed. In our previous
manuscript, the speed was calculated as the fleet-wise average (i.e., the sum of the distances of all
identified activities divided by the sum of their trip times). Since the vehicles are
not uniformly distributed across the city, this calculation puts higher weights on places with more vehicles and
leads to a biased estimate of the actual citywide speed. In this revised manuscript, we first calculate the
speed of each hexagon, and then we measure the citywide and the borough-wide speed as the average of the
hexagonal mean speeds. This should more accurately reflect the actual citywide traffic state. This change also
applies to the calculation of other speed and emission metrics.

3. A description of the identified activities is added in our manuscript and can be found on page 11, Section 4.1 –
Overview of identified activities. This section provides the spatial distribution of the number of identified
activities and the summary statistics of the activities in 2017 and 2019. In addition, this section also explains how
we address the differences in the amount of data collected between the two years.

4. We have updated the background facts on New York City and added a table to summarize the changes between
2017 and 2019.

5. Additional discussions are added for the borough-wide changes between 2017 and 2019.

6. We have corrected several grammar and spelling errors and improved the writing of our revised manuscript.
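
As a hedged illustration of item 2 above (not the code used in the study), the following Python sketch contrasts the fleet-wise average with the average of per-hexagon speeds. The field names `hex_id`, `km` and `hours`, the toy values, and the choice to compute each hexagon's speed as total distance over total time within that hexagon are assumptions made for illustration only.

```python
from collections import defaultdict

def fleet_average_speed(activities):
    """Fleet-wise average: total distance divided by total time over all activities."""
    total_km = sum(a["km"] for a in activities)
    total_h = sum(a["hours"] for a in activities)
    return total_km / total_h

def hexagon_mean_speed(activities):
    """Average of per-hexagon speeds: each hexagon counts once,
    regardless of how many activities were observed in it."""
    km = defaultdict(float)
    hours = defaultdict(float)
    for a in activities:
        km[a["hex_id"]] += a["km"]
        hours[a["hex_id"]] += a["hours"]
    speeds = [km[h] / hours[h] for h in km]
    return sum(speeds) / len(speeds)

# Toy data (hypothetical): hexagon "A" is congested and heavily sampled,
# hexagon "B" is free-flowing and sparsely sampled.
activities = (
    [{"hex_id": "A", "km": 1.0, "hours": 0.1}] * 9  # nine activities at 10 km/h
    + [{"hex_id": "B", "km": 3.0, "hours": 0.1}]     # one activity at 30 km/h
)

fleet = fleet_average_speed(activities)   # about 12 km/h, pulled toward hexagon A
hexmean = hexagon_mean_speed(activities)  # about 20 km/h, both hexagons weigh equally
print(f"fleet-wise: {fleet:.1f} km/h, hexagon mean: {hexmean:.1f} km/h")
```

In this toy case the fleet-wise value sits close to the speed of the heavily sampled hexagon, which is the bias described in item 2.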

We hope the changes made in this revised manuscript sufficiently address the reviewers’ concerns. The
remainder of this response letter presents the detailed responses to each comment from the individual reviewers.
Response to reviewer 1:
Comments from Reviewer 1:
This is a interesting study on transportation network and urban congestion using very detailed data in Manhattan, New
York City. Yet, I have several concerns for this paper and I am looking forward to review the revised manuscript.
Response:
We thank the reviewer for carefully reviewing our paper. Please find our responses to each of your
concerns below.

1. There is a mistake (or my misunderstanding) in Table 2. Throughout the paper, the authors used CO, HC and NOx,
while in Table 2, it is VOC rather than HC.
Response:
In most cases, VOC refers to HC that are volatile, that is, that vaporize into the atmosphere at room temperature and
atmospheric pressure. We apologize for the inconsistent usage of VOC and HC. We have changed all
occurrences of VOC to HC accordingly.

2. In Table 2, how do you get the emission parameters in Table 2? What is the definition of “small vehicles”? Are
Uber cars necessary to be “small vehicles”?
Response:
Table 2 in our previous manuscript is now Table 3 in the revised manuscript.

The emission parameters were obtained from the widely used and well-calibrated COPERT emission model (reference
[32] in our revised manuscript).

In our manuscript, “small vehicle” refers to the vehicle type Gasoline PCs (passenger cars) defined in the COPERT
model. It is possible that not all for-hire vehicles match the Gasoline PC vehicle type defined
in the COPERT model. Unfortunately, statistics on the vehicles most commonly used by Uber drivers are not available.
We therefore assumed in our study that most Uber vehicles belong to the Gasoline PC category, which is reasonable
from an economic perspective. In addition, our focus is on the changes from 2017 to 2019. We therefore
noted in our revised manuscript that (see page 8, lines 210-216):

“Note that not all Uber vehicles may comply with the Euro 3 standards and there is no available data to understand
the type of vehicles in Uber fleet. In addition, Euro 3 standards may not fully comply with the US EPA standards and
hence the exact value calculated for emission and fuel consumption may not be taken as the accurate measure for
NYC. Nevertheless, the change of standard will only affect the model parameters but not the relationship between
velocity and the corresponding fuel consumption and emission and the obtained results still capture the relative
change between 2017 and 2019.”

We hope this sufficiently addresses the reviewer’s concern.

3. Why does the relationship shown in Figure 5 matter?


Response:
In this revision, we have changed how we present the relationship between moving activity (MA) speed and stationary
activity (SA) ratio, and the updated results can be found in Figure 5 of the revised manuscript.

This figure is mainly used to distinguish two different cases of stationary activities (SA): (1) SA may occur because
drivers are waiting on the side of the street (e.g., waiting in the parking lot at the airport), or (2) SA may be the actual result of
heavier traffic congestion. For case (1), the obtained travel speed will be lower than the actual travel speed since
parked vehicles do not add to traffic congestion. It is therefore necessary to separate these two cases to evaluate whether the
obtained speed is accurate. It also helps identify the places where case (1) is mostly observed.

To distinguish between (1) and (2), we may use the relationship between MA speed and SA ratio. The following three
scenarios correspond to normal traffic conditions (case (2)):
1. Both MA speed and SA ratio are lower than normal. This represents the traffic condition where vehicles are
moving slowly but not making frequent stops, which can be observed in areas with low speed limits.
2. MA speed is higher than normal and SA ratio is lower than normal. This represents the ideal traffic condition
where vehicles are moving smoothly.
3. MA speed is lower than normal and SA ratio is higher than normal. This represents the worst traffic condition,
where vehicles are moving slowly and making frequent stops.
None of these three scenarios corresponds to case (1), in which a large number of Uber drivers park on the side of the street waiting
for passengers. Case (1) is instead captured by the observations located in the upper-left corner
of the scatter plots shown in Figure 5.

In this revised manuscript, we report that there are few observations where drivers park and wait for passengers in
Manhattan. Nevertheless, we suspect that many drivers in the Bronx, Brooklyn and Queens may prefer to park and wait for
future orders, and we found that this usually takes place in peripheral areas with fewer trips and less traffic. We have
also updated the discussion related to Figure 5, which can be found on page 13, lines 276-304.
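
The distinction drawn in this response can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the manuscript: it assumes that “normal” means the median over all hexagons, that the remaining combination (MA speed above normal together with SA ratio above normal) is the one that signals case (1), and that the field names `hex_id`, `ma_speed` and `sa_ratio` and the toy values are hypothetical.

```python
from statistics import median

def classify_cells(cells):
    """Label each hexagon by comparing its MA speed and SA ratio
    with the median over all hexagons (assumed reference for 'normal')."""
    ma_ref = median(c["ma_speed"] for c in cells)
    sa_ref = median(c["sa_ratio"] for c in cells)
    labels = {}
    for c in cells:
        fast = c["ma_speed"] > ma_ref
        many_stops = c["sa_ratio"] > sa_ref
        if fast and many_stops:
            # Vehicles move freely when moving, yet spend much time stationary.
            labels[c["hex_id"]] = "possible park-and-wait (case 1)"
        elif fast:
            labels[c["hex_id"]] = "smooth traffic (scenario 2)"
        elif many_stops:
            labels[c["hex_id"]] = "slow with frequent stops (scenario 3)"
        else:
            labels[c["hex_id"]] = "slow but steady (scenario 1)"
    return labels

# Hypothetical values: MA speed in km/h, SA ratio as a fraction of time.
cells = [
    {"hex_id": "h1", "ma_speed": 10.0, "sa_ratio": 0.1},
    {"hex_id": "h2", "ma_speed": 30.0, "sa_ratio": 0.1},
    {"hex_id": "h3", "ma_speed": 10.0, "sa_ratio": 0.6},
    {"hex_id": "h4", "ma_speed": 30.0, "sa_ratio": 0.6},
]
labels = classify_cells(cells)
for hex_id, label in labels.items():
    print(hex_id, label)
```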

4. The authors asserted that “Our claim can be supported by the relatively better traffic condition during morning
peak (8AM), evening peak (6PM) and night hour (10PM) when there are fewer cruising FHVs.” Why and how? In
addition, where can we see the “relatively better traffic condition during morning peak (8AM), evening peak (6PM)
and night hour (10PM) when there are fewer cruising FHVs”?
Response:
We thank the reviewer for the comment. This argument was based on our previous biased calculation of the speed metrics,
and we apologize for the inaccurate description. We have corrected the speed calculation as well as the corresponding
statement. As a result, this argument has been removed, and our updated discussion of the traffic condition and the
impact of cruising FHVs can be found on page 16, lines 329-362.

5. The focus and analysis of this paper are about Manhattan, NYC. Can the findings in this paper be generalized to
other regions?
Response:
We appreciate this feedback from the reviewer. This revision extends the analysis from Manhattan to the
entire NYC area. We report that the conclusions are largely consistent with our previous manuscript. Meanwhile, we
are able to identify new findings that were not in our previous manuscript (e.g., the park-and-wait
behavior in Queens, Brooklyn and the Bronx). Nevertheless, our primary goal is to examine scientifically whether the increase in
FHVs leads to worse traffic conditions and to explain the underlying reasons. The data-driven framework can
be generalized and applied to other regions to understand the impact of FHVs. However, we cannot guarantee that
the same findings will be observed in other places since the change in traffic conditions is not merely a function of the
number of FHVs. The land use pattern, the mobility pattern, the level of service of public transit and other factors
vary from city to city, and they determine how a city may be affected by the increase in FHVs. In our study,
NYC is an ideal place to objectively evaluate the impact of FHVs since other major factors do not vary significantly
over our study period.
Response to reviewer 2:
Comments from Reviewer 2:
This paper uses trajectory data scraped from the Uber and Lyft APIs to measure speeds and calculation tailpipe
emissions in Manhattan. It does this for 2017 and 2019, reporting on the change in both. The authors find that the
average speed decreases by 11% on weekdays, and attribute that speed decrease to the increase in TNCs.
The data set itself is an important contribution. There are few available data sets on TNC use, especially for detailed
trajectories, so that is a valuable resource. They show quite a large speed decrease over a short period of time. That
period of time corresponds to a large increase in TNC use, so I would expect that their main conclusion that TNCs
increase congestion is correct. In addition, it is consistent with other literature that they cite.
Nonetheless, there are several weaknesses in this analysis that should be addressed before it is published. This includes
both questions of the analysis itself, and clarity about what set of trips it applies to. You can others to poke holes in
your analysis after it is published, so you are best off addressing any concerns now, rather than arguing about them
later.
Response:
We sincerely thank the reviewer for these very comprehensive comments. We have revised the manuscript
thoroughly, and we hope the following responses sufficiently address your concerns.
Once again, we thank the reviewer for these suggestions; your comments have helped us tremendously to
improve our manuscript.

1. There are a number of places where the scope of your analysis is not clear. You should be precise about this so the
reader knows exactly what the numbers you reference refer to. In some cases, you say it, but it doesn’t shine through
clearly. In other cases it needs to be made explicit. Some specific examples include:
1) On page 2 you talk about population and employment in NYC, but limit your analysis to Manhattan. You should
be consistent.
Response:
We thank the reviewer for pointing out these unclear statements. During data collection, Uber trip data for the
whole of NYC were collected. As specified in the introduction, given the higher and denser coverage
of trips in Manhattan than in other regions of NYC, and the fact that the heaviest congestion occurs in Manhattan,
we previously limited our discussion to Manhattan.

Following the reviewer’s comment, we have now changed the scope of our study to cover the entire NYC area (except
for Staten Island, due to the small number of trips and the limited data collected there).

2) Your data collection is for all of NYC, but your analysis is limited to Manhattan. Why not do the analysis for
the whole city and report it separately by borough? Yes, Manhattan will have a bigger difference because it is more
congested and has more TNCs, but the differences by borough would add to the paper.
Response:
We have now extended the study area to the entire NYC (except for Staten Island), and all our results have been updated
accordingly.

3) The 11.3% speed reduction is just the difference from 2017 to 2019, associated with the increase in FHV. The
speed reduction associated with all FHV should be bigger than this, right? That is worth noting.
Response:
We use the FHVs as probe vehicles of the citywide traffic, and the differences in traffic conditions between 2017
and 2019 that are experienced by the FHVs should reflect the actual changes that result from the increase in
FHVs over these two years. In order to understand the total changes since FHVs entered the market, we would need to compare
the traffic in 2009 vs 2019, which is not the focus of our study. We also note that other factors, such as employment
and vehicle ownership, have changed over such a long time period. The results in our study are a more accurate
characterization of the marginal impact of FHVs, as many other factors stay the same.

4) The emissions increases are “per kilometer traveled by the FHV sector”, but it is specific to cruising FHV, right?
You are assuming that it applies to the full sector, which is a reasonable assumption, but should be made explicit.
Response:
Yes, the emissions are calculated from the cruising FHVs’ trajectories, and we made the assumption that this also
applies to the full sector. Following your suggestion, we have corrected the statement in our revised manuscript; the
revised statement can be found on page 13, lines 272-275:

“If we assume the metrics calculated from cruising FHVs also apply to the full FHV sector, and project these values
onto the increase of FHV trips while assuming the same average distance per trip, these translate into that FHVs have
introduced 152% more CO, 157.7% more HC and 136.5% more NOx for every kilometer they traveled.”

5) If the VKT of FHV increases, the total emissions increase from FHV will be quite a bit bigger. This is worth
noting.
Response:
This is correct, but our data are not sufficient to quantify the changes in total VKT between 2017 and 2019. We
therefore qualify our statement by stating that “while assuming the same average distance per trip…”

6) If the emissions increase per KM is due to increased congestion, shouldn’t that increase apply to all vehicles, not
just FHV? If you are being conservative in your application, that is fine, but that is also worth noting.
Response:
We now state explicitly that our discussion here applies to FHVs only, and we have revised that paragraph accordingly. We hope this removes the ambiguity.

7) At one point in the paper you talk about collecting data for both Uber and Lyft, but at another point you talk about
doing the analysis just for Uber. Which is it?
Response:
We thank the reviewer for pointing this out. The data we collected in 2017 were primarily from Uber. We changed
our collection approach in 2019, which also fetched comprehensive Lyft data. Since we have little Lyft data from 2017,
we decided to use only Uber data for the analysis in this study. Uber trajectories are representative of
the TNC sector in NYC due to Uber’s dominant market share. We added the following statement to the revised manuscript to
make it clear that only Uber data are used:

“(Lines 124-125 on page 5) In this study, only the trajectory data from Uber are used as it is the dominant TNC in
NYC with approximately 70% market share[31].”

2. This whole analysis is based on trajectory data, but that trajectory data is just for those vehicles that are cruising. It
does not include TNCs that have a passenger. Cruising is probably 40-50% of the TNC VKT. (I believe the new cap
is 35% and they are struggling to get to that.) So the total TNC VKT is quite a bit higher than you show. A method
to infer the TNC passenger-serving trips can be found in:
Cooper, Drew, Joe Castiglione, Alan Mislove, and Christo Wilson. “Profiling TNC Activity Using Big Data.” In TRB
Annual Meeting, 2018.
You could do more with the paper if you also applied that approach to calculate the total set of trips. In your case,
this could be calibrated to align with the TLC data. (I very much appreciate your validation against the TLC data.)
You would also do well to discuss the implications of the data being only for cruising vehicles.
Response:
We thank the reviewer for this comment. We have carefully reviewed the suggested study, and the proposed method
is indeed interesting and sound for inferring vehicle miles traveled by TNCs. However, we decided not to include
it in our study for the following reasons:
1. The method is well established in the suggested study, and the methodology itself would not be a contribution of
our study.
2. The focus of our study is to infer the impact of TNCs on road traffic conditions, and the approach we adopted is to
measure the microscopic behavior (e.g., stationary activity and moving activity) of each TNC vehicle on the road. In
this regard, having the inferred TNC trips is not sufficient. Instead, we would also need to predict the actual trajectories
of the occupied trips, which is itself a very challenging problem, and we would not be able to validate the predicted
trajectories. This may affect the credibility of this study.
3. On the other hand, since street hailing is prohibited for TNCs, we believe the behavior of cruising trips is largely
consistent with that of occupied trips in most areas of NYC. In fact, we are able to compare the average speed
(2017) calculated from the cruising trajectories with the NYC mobility report [1], and we observe that the values are quite
consistent in the Manhattan area (our value is 11.66 km/h or 7.28 mph, compared with 7.1 mph reported by NYC DOT).
We believe this finding provides additional support for the validity of our approach.
Following the reviewer’s suggestion, we have improved our discussion of the implications of using trajectory data in this
revised manuscript. Our new results discuss whether the road condition inferred from cruising trips is representative of the
actual road condition. The added discussion can be found on page 13, lines 276-304, and it
is associated with the revised Figure 5 in this manuscript. The major conclusion is that by using the cruising data, we
may underestimate the actual travel speed and hence overestimate the emissions and energy consumption. The
primary reason is that some FHV drivers choose to park on the side of the street and wait for future orders.
While this may still affect the moving traffic, not all of the SA periods actually contribute to the slowing of the
travel speed. We also report that this issue is minor in Manhattan, and the results in Manhattan are therefore very
reliable.

3. The sampling rate in 2017 is 5 s and the sampling rate in 2019 is 1 minute. What are the implications of this? You
are going from dense probe data to sparse probe data, so how much does this affect the emissions calculations among
other factors. There is a risk here that part of the change you are measuring is due to the difference in sampling, rather
than due to a difference in the real world. You could test for this by resampling the 2017 data to 1 minute intervals,
and then applying the analysis to both years with the sparsely sampled data.
Response:
We really appreciate this feedback. Indeed, the change in sampling frequency led to different amounts of data collected
in these two years, as we mentioned in our manuscript:

(Lines 153-154, page 6) The 470 data collection stations fetched around 100 GB data per day in 2017 and 5.17 GB
data per day in 2019.

The 100 GB of daily data in 2017 contained a large number of repetitions (e.g., the same trajectory appeared 4 or 5
times for a single vehicle). After removing these repeated samples, we ended up with approximately 10 GB of data per
day. We then proceeded with the activity identification for both years, and we further sampled 30% of the identified
activities from 2017 (uniformly at random) following the suggestion from the reviewer. This gives us a similar number of
activities for both years. Please see Figure 3 in our revised manuscript for details on the spatial distribution and
histogram of the number of activities identified. In summary, we have 123 million activities from 2017 and
117 million from 2019, which are used to obtain all the results in our paper.

We have also added the following statement in this revised manuscript (page 11, lines 243-245):

“As mentioned earlier, due to the change in data collection frequency, there exists a significant difference in the
amount of data collected in 2017 and 2019. To overcome this issue and deliver fair comparison, we perform sampling
from the 2017 data and include 30% of the total identified activities. This results in similar number of total activities
identified in 2017 and 2019….”
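
The two data-reduction steps described in this response (removing repeated samples, then keeping a uniformly random 30% of the 2017 activities) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the pipeline used in the study: records are represented as hypothetical `(vehicle_id, timestamp)` tuples, the fixed seed is only there to make the example reproducible, and the sketch samples the deduplicated records directly, whereas in the manuscript the sampling is applied to the activities identified from them.

```python
import random

def deduplicate(records):
    """Drop exact repeats while keeping the first occurrence in order."""
    seen = set()
    unique = []
    for r in records:
        if r not in seen:
            seen.add(r)
            unique.append(r)
    return unique

def sample_fraction(items, fraction, seed=0):
    """Uniform random sample, without replacement, of round(fraction * len(items)) items."""
    rng = random.Random(seed)
    k = round(fraction * len(items))
    return rng.sample(items, k)

# Hypothetical raw feed: 1,000 distinct records, each repeated four times.
raw = [(vehicle, t) for vehicle in range(10) for t in range(100)] * 4

unique = deduplicate(raw)               # 1,000 records remain
sampled = sample_fraction(unique, 0.3)  # 300 records, 30% of the deduplicated set
print(len(raw), len(unique), len(sampled))
```

Sampling without replacement keeps every retained record distinct, so the reduced 2017 set stays a subset of what was actually observed.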

4. Using the TNC data as probe vehicles, you show that the speed decreases between 2017 and 2019. You also note
that FHV use increases between these two years and say that the increase in FHV results in the speed decrease. Yes,
this seems like the obvious explanation, but I still recommend being careful in your language. You are showing that
these happen at the same time, but not necessarily the causation—there could be some other factor in there as well.
So rather than “results in” maybe “associated with” or “correlated with” or some other language.
Response:
We thank the reviewer for this suggestion. We have revised our language carefully and modified the wording
accordingly. In most places, we replaced the phrase “results in” with “contributes to”. We have also revised our
statement on the relationship between the number of Uber drivers and the travel speed as follows (page 18,
lines 379-382):
“While correlation does not necessarily imply causation, if we may eliminate the impact from other contributing
factors such as those shown in Table 1, the strong negative correlation likely hints that the increase in FHV drivers
is the primary contributing factor to the citywide worse traffic congestion and emission.”

5. Along those same lines, you argue on page 2 that basically nothing else changes in NYC over this period. It seems
entirely plausible that the TNC change overwhelms these other changes, but nonetheless, this is a fairly weak argument.
I recommend being as specific as possible to the times and locations of observed congestion increases:
1) How does the spatial distribution of population and employment change over this two year period? Is Manhattan
different? Is midtown different?
Response:
We would like to use data that are as accurate as possible, but demographic and socioeconomic data are typically collected
through census surveys; they are not updated as frequently as the TNC trajectory data and do not reflect changes at the
same fine spatial resolution. In this regard, the most accurate statistics we can retrieve are at the borough level [6,7]. We see that
the trend in each borough is the same as the trend in the citywide statistics. We attach here the change
in labor supply and population of each borough for your reference:

Change of total population:

Change of employed population:

              Bronx      Brooklyn     Manhattan    Queens       All
End of 2018   584,200    1,188,700    901,900      1,144,200    3,819,000
2017          576,600    1,174,900    891,800      1,130,300    3,773,600
Change (%)    1.32       1.17         1.13         1.23         1.2

2) How many vehicles enter in 2017 vs 2019, (2010 is not of interest here).
Response:
Since we have extended the study to cover the entire NYC area, we now report the number of standard vehicle registrations
in NYC instead of the number of vehicles entering Manhattan. The data are obtained from the New York Department of
Motor Vehicles [8], and we provide a summary of the changes in each borough as follows:

              Bronx      Brooklyn     Manhattan    Queens       All
End of 2018   248,120    456,759      221,180      721,426      1,647,485
2017          247,957    453,515      227,709      722,850      1,652,031
Change (%)    0.07       0.72         -2.87        -0.2         -0.28
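
For readers who wish to check the “Change (%)” rows in the employment and vehicle registration tables of this response, the short Python sketch below recomputes them from the tabulated counts. The helper names `pct_change` and `change_row` are introduced here for illustration; the figures are copied from the tables, with the “All” column taken as the sum of the four boroughs.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0

def change_row(table):
    """Per-borough percentage changes, followed by the change of the four-borough total."""
    per_borough = [round(pct_change(o, n), 2)
                   for o, n in zip(table["2017"], table["2018"])]
    total = round(pct_change(sum(table["2017"]), sum(table["2018"])), 2)
    return per_borough + [total]

# Column order: Bronx, Brooklyn, Manhattan, Queens.
employed = {
    "2017": [576_600, 1_174_900, 891_800, 1_130_300],
    "2018": [584_200, 1_188_700, 901_900, 1_144_200],
}
registered = {
    "2017": [247_957, 453_515, 227_709, 722_850],
    "2018": [248_120, 456_759, 221_180, 721_426],
}

employed_change = change_row(employed)
registered_change = change_row(registered)
print(employed_change)    # [1.32, 1.17, 1.13, 1.23, 1.2]
print(registered_change)  # [0.07, 0.72, -2.87, -0.2, -0.28]
```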


3) You have 2019 data on TNC trips from your own data. That would be better to report than 2018.
Response:
We thank the reviewer for this insightful comment. The TNC trips from our own data were only an estimate
inferred from the cruising trajectories and may not reflect the actual number of trips in NYC. Since the NYCTLC has
released recent data reported by TNCs, we have updated our statistics accordingly to reflect the actual numbers from
the TNC sector in March 2019. The updated table can be found on page 3 (Table 1).

4) The statement that there are no major transportation projects sounds like a stretch. It would be better to list the
biggest projects, and then argue that they are not that substantial.
Response:
We thank the reviewer for this important remark. In reporting the number of major transportation projects, our
intent was to indicate whether there were significant transportation infrastructure changes during the two years that may have had an
important influence on traffic conditions. For projects such as regular transportation infrastructure
maintenance, we assume that they do not have a significant influence on citywide traffic conditions or fuel
consumption. Major transportation projects are defined here as “projects that include the major alteration
of roadways, including the addition or removal of vehicular lanes, for a considerable distance” [3]. Based on the
information provided by the NYC Department of Transportation, no new major projects were reported in Manhattan
between 2017 and 2019. The latest major project reported in Manhattan was in 2013 [3].

5) What about the subway maintenance/reliability issues? Could that cause an increase in congestion? Why or why
not?
Response:
Unfortunately, we do not have access to data on the maintenance and reliability of the subway or transit systems in NYC. However, we
believe these factors affect commuters’ decisions to ride transit, and some commuters may eventually shift to TNCs
and taxis. Such a trend is also captured by the drop in subway and bus ridership we reported. This is eventually reflected
in the increase in the number of FHVs and TNC trips on the road, which is captured by our analysis.

6) What about the decline in transit ridership? Fewer people on transit may mean more congestion. (It may also be
that they end up on TNC.)
These are things that should be explored/discussed.
Response:
We thank the reviewer for this important remark. Based on this suggestion, we explored the transit ridership
trend in NYC. It is reported that weekday subway ridership in NYC was 5.44 million in 2018,
a decline of about 2.6% compared with 2017 (143,000 fewer riders per day). Weekday bus ridership in
NYC also dropped by 5.9% from 2017 to 2018 (1.81 million). (Page 2)
It is worth noting that while total TNC trips in NYC kept increasing, transit ridership was declining. This
suggests that people tend to choose FHVs over public transit modes with the rapid
development of TNCs. In the present work, we concentrate on the changes in road traffic conditions
and fuel consumption caused by the rapid growth of TNCs over the two-year period from 2017 to 2019. The decrease
in transit usage could be another outcome of the rapid development of TNCs, and its impact on road traffic conditions
is eventually reflected in the trajectory data from FHVs.

6. The paper would benefit from a table with some additional metrics, such as the total TNC VMT, the total TNC trips,
and other details you could report.
Response:
While we do not have the value for TNC VMT, we have included in the revised manuscript a table that summarizes the
major statistics serving as the background information of the study. The new table (Table 1 on page 3) summarizes the
changes in population, employment, vehicle registration and the ride-hailing market. In addition, we added
a section giving an overview of the identified activities so that readers can better understand the
spatial distribution and histogram of the processed activities and the differences between SA and MA. This new section
can be found on page 11 (Section 4.1 Overview of identified activities).

7. On page 12, you discuss two possible explanations for the difference between V and VMA. I am not convinced
that 11 am is actually more congested than 9 am. I don’t follow your argument as presented in Figure 5. Your first
explanation—that drivers are more likely to stop and wait in the mid-day seems more likely. I recognize that I am
saying that without putting forth detailed data to counter, but I think you need to strengthen this argument if you want
to use it.
Response:
We appreciate this comment, which helps us clarify some confusion related to our previous results. Note that the
previously reported average speed, MA speed, SA rate, fuel consumption (FC) and emissions were fleet averages. That is, each reported value
in our previous manuscript is a weighted average of the values in each hexagon, where the weight is determined by the
number of activities in each hexagon. Consequently, the fleet average is affected by the spatial distribution of FHVs and the
identified activities. We identify more activities in (1) areas where there is a high demand for FHVs and (2) time
periods where there is more FHV supply than demand. The reason for (2) is that more cruising trips
can be collected when supply is excessive, and this corresponds exactly to the 11 am time period. In this regard,
if we take the fleet average, we overweight the heavily congested areas (more vehicles), and the result is a biased
estimate of the citywide speed.

In this revised manuscript, we now calculate the speed, MA speed, SA rate, and the emission
metrics as the average of local values (no longer the fleet average). This definition is closer to our definition of
city-wide speed, and the updated results are now aligned with the peak and off-peak hour traffic conditions for both weekdays
and weekends (see Figure 6 and Figure 7 on page 15). We hope this change will sufficiently address the reviewer’s
concern.
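
The difference between the two averaging schemes can be illustrated with a minimal Python sketch. The hexagon speeds and activity counts are made-up values chosen only for illustration, the function names are hypothetical, and we assume here that "average of local values" means an unweighted mean over hexagons.

```python
def fleet_average(speeds, counts):
    """Activity-weighted mean: each hexagon is weighted by its number of activities."""
    total = sum(counts)
    return sum(s * n for s, n in zip(speeds, counts)) / total


def local_average(speeds):
    """Unweighted mean of the per-hexagon values (assumed meaning of 'local average')."""
    return sum(speeds) / len(speeds)


# Illustrative data: one congested hexagon with many activities and
# two faster hexagons with few activities.
speeds = [8.0, 20.0, 26.0]   # km/h per hexagon (hypothetical)
counts = [900, 60, 40]       # identified activities per hexagon (hypothetical)

fleet = fleet_average(speeds, counts)
local = local_average(speeds)
```

With these illustrative numbers the fleet average is pulled toward the congested hexagon, which is the bias described in the response above.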

8. The figures are hard to read, especially when printed in black-and-white. The light gray line is especially
challenging. I had to search through the text to remember the difference between V, VMA and RSA. These could be
both improved and bigger.
Response:
We have revised all our figures to improve their readability. The particular figure that the reviewer mentioned now corresponds to
Figures 6 and 7 in our revised manuscript. We have changed the gray line to a shaded area for better visual results.

9. It would be helpful to include any possible data in the supplemental materials. You may not wish to provide the
full trajectory data due to privacy concerns, but certainly, you can provide the data that underlies each of your figures,
so users can plot that themselves.
Response:
We appreciate this comment from the reviewer, and we are happy to provide supplementary data containing
the aggregated information calculated from the trajectory data that is used for the analysis in our study. The data
will be provided in JSON format and contain detailed information on the corresponding metrics in each hexagonal
area (1,371 areas in total) for each 15-minute time interval.

I will finish by saying that I would like to see this paper get published, but I think it needs more work to make it solid.
Response:
Once again, we would like to thank the reviewer for your insightful suggestions and comments. Your input has
tremendously helped us to improve the quality of our manuscript. And we hope our responses will sufficiently address
all your concerns.
Response to reviewer 3:
Comments from Reviewer 3:
This is certainly an interesting study, dealing with an important issue called “Congestion and Emissions”. However,
in my opinion, it consists of major issues in terms of result interpretation and policy implications. I have provided my
major concerns for the paper below in a concise way. Minor comments regarding grammatical errors can be looked
by a good proofreader:
1. For estimation of COPERT model, the author has used two different equations for computing fuel consumption for
moving activities (FCMA) and stationary activities (FCSA) respectively. The author has used two different factors
velocity and time for computing FCMA and FCSA respectively (Why FCMA depends on velocity and FCSA depends
on idle time – this point should be discussed thoroughly with proper references).
Response:
We separate stationary activities (SA) from moving activities (MA) in the trajectories so that we may make the best
use of the high resolution trajectories to restore the stop-and-go traffic states, which will also contribute to more
accurate characterization of fuel consumption and emissions for urban traffic. The primary reason for separating SA
and MA is that the engine works differently when the vehicle is in motion versus when it is idling.

In our study, MA represents the moving status of a vehicle (location change) and SA represents stop or idling status
of a vehicle (fixed location). For MA status, fuel consumption and emissions can be estimated according to moving
parameters (velocity), and the parameters for this state are obtained from the COPERT model (reference [36]) using
the calculated moving velocity. SA status, in contrast, refers to the idle engine state, with lower fuel consumption but
high emissions of CO, HC and NOx. The following reference (also reference [35] in our revised manuscript) is
the source from which we obtained the corresponding parameters for engines in the idle state:

Akcelik, Smit and Besley (2012). Calibrating fuel consumption and emission models for modern vehicles.
http://www.sidrasolutions.com/Cms_Data/Contents/SIDRA/Folders/Resources/Articles/Articles/~contents/8
KR2VYBBFM8VPNLS/AKCELIK_Fuel-EmissionModels-IPENZ2012-.pdf

We have added the following statement in our revised manuscript to clarify why MA and SA are
introduced to calculate the emission and energy consumption (page 7, lines 177-183):

“In addition, the functionality of engines differs between the idle state and when the vehicle is in motion. MA and SA will
therefore result in a more accurate characterization of fuel consumption and emissions for urban traffic, where MA
can be used with emission models for vehicles in motion and SA can be used with emission models for the idle engine state
to comprehensively evaluate the actual emissions and fuel consumption. Studies have shown that this approach
can achieve over 88% accuracy when using a macroscopic emission model [29] and over 94% accuracy when using a
microscopic emission model [30] when compared to actual fuel consumption.”

We hope this may well address the concern from the reviewer.

2. While defining moving and stationary activities, the author has used velocity as a threshold parameter. The author
should have explained why the ‘velocity’ has been chosen as a deciding factor for defining moving and stationary
activities and why it has been fixed as 5.
Response:
MAs and SAs are separated based on the distance between adjacent GPS track points. The threshold is determined
based on the positional error of GPS (which is often about 5 m). To avoid overestimating MA, we treat any spatial
displacement shorter than 7 m as a stationary activity. Since consecutive trajectory records in our data are
4-6 seconds apart (we added this to our revised manuscript, on page 7, lines 188-189), the 7 m distance gap converts into
a speed of about 5 km/h for each activity segment. We are able to validate the effectiveness of this threshold, as shown
in our newly added Figure 3(c) on page 11. We can verify that this threshold completely separates the SAs from the
MAs, and over 90% of the classified SAs have a speed below 1 km/h.
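
As a quick check of this conversion, assuming a representative gap of 5 seconds (the midpoint of the reported 4-6 second range; this choice is our assumption for illustration):

```latex
v = \frac{7\ \text{m}}{5\ \text{s}} = 1.4\ \text{m/s} = 1.4 \times 3.6\ \text{km/h} \approx 5\ \text{km/h}
```

At the two ends of the range, the same 7 m displacement corresponds to about 6.3 km/h (4-second gap) and 4.2 km/h (6-second gap).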

We have also included the following statement to clarify the velocity threshold (on page 7, lines 193-194):
“The threshold of V_{i,i+1} < 5 for separating SA and MA is selected to mitigate GPS errors that may lead to the
false classification of actual identities.”
3. Conclusion should have included more or detailed discussion on “effect of TNC growth on emissions”. The author
should have provided a better insight on how this research can help the policy makers/planners for effective and
sustainable transportation planning.
Response:
We appreciate this feedback from the reviewer, and we acknowledge the importance of highlighting the implications
of the energy consumption and emission issues based on the results of our study. We have now added a paragraph
in the conclusion section as a discussion on the effect of TNC growth on emissions, as follows (page 20, lines
436-442):
“As a major byproduct of the worse traffic conditions, our results highlight emerging energy consumption and
emissions issues from the TNC sector. We have shown in our study that the increase in FHVs and the number of trips
has led to 136% more NOx, 152% more CO and 157% more HC emissions per kilometer traveled by the FHV sector
within two years. This finding is obtained under a conservative assumption where the duration and distance of each
passenger trip stay the same. In reality, however, the revealed decrease in MA speed and increase in SA ratio are
indicative of longer trip duration as well as longer cruising time before an FHV may reach the next passenger. We
may speculate from this observation that the actual contribution of the TNC sector could be much higher than the
reported values in this study. In this regard, immediate actions should be taken against the overgrowth of TNCs in
urban areas. Based on our results, there are two practical directions that may help to mitigate the energy and emission
issues. First, as a short-term measure, the entry of FHVs in heavily congested areas should be strictly regulated. We
have shown that more FHVs contribute to not only slow-moving speed but also more congestion and stop-and-go
traffic. The latter is the primary source of tailpipe emissions and regulating FHV service in congested areas helps to
avoid the additive effect of more traffic and worse emissions per individual vehicle. But more importantly, considering
the large number of trips served by the TNC sector, policies should be framed to encourage and facilitate the adoption
of alternative fuel vehicles in the ride-hailing industry, which can achieve significant long-term savings in energy
and emission costs.”
Reference
[1] NYC Department of Transportation. New York City Mobility Report,
https://www1.nyc.gov/html/dot/downloads/pdf/mobility-report-2019-print.pdf, 2019.
[2] Taxi and Limousine Commission. Improving Efficiency and Managing Growth in New York's For-Hire Vehicle
Sector[J]. 2019.
[3] NYC Department of Transportation. https://www1.nyc.gov/html/dot/html/about/major-transportation-proj.shtml.
[4] Demir E, Bektaş T, Laporte G. A comparative analysis of several vehicle emission models for road freight
transportation[J]. Transportation Research Part D: Transport and Environment, 2011, 16(5): 347-357.
[5] Ntziachristos, Leonidas, et al. "COPERT: a European road transport emission inventory model." Information
technologies in environmental engineering. Springer, Berlin, Heidelberg, 2009. 491-504.
[6] Current estimates of New York City’s population for July 2018. https://www1.nyc.gov/site/planning/planning-
level/nyc-population/current-future-populations.page
[7] Labor statistics for the New York City region. https://www.labor.ny.gov/stats/nyc/.
[8] Statistical data and summaries, New York Department of Motor Vehicles. https://dmv.ny.gov/about-
dmv/statistical-summaries
Revised Manuscript with Changes Marked (without Author
Details)

Impact of transportation network companies on urban congestion:


Evidence from large-scale trajectory data

Abstract

We collect vehicle trajectory data from major transportation network companies (TNCs) in New York
City (NYC) in 2017 and 2019, and we use the trajectory data to understand how the growth of TNCs
has impacted traffic congestion and emissions in urban areas. By mining the large-scale trajectory data
and conducting a case study in NYC, we confirm that the rise of TNCs is the major contributing factor
that makes urban traffic congestion worse. From 2017 to 2019, the number of for-hire vehicles (FHVs)
has increased by over 48% and served 90% more daily trips. This results in an average citywide speed
reduction of 22.5% on weekdays, and the average speed in Manhattan decreased from 11.76 km/h in
April 2017 to 9.56 km/h in March 2019. The heavier traffic congestion may have led to 136% more NOx,
152% more CO and 157% more HC emissions per kilometer traveled by the FHV sector. Our results show
that the traffic condition is consistently worse across the different times of day and at different locations
in NYC. And we build the connection between the number of available FHVs and the reduction in travel
speed between the two years of data and explain how the rise of TNC may impact traffic congestion
in terms of moving speed and congestion time. The findings in our study provide valuable insights for
different stakeholders and decision-makers in framing regulation and operation policies towards more
effective and sustainable urban mobility.
Keywords: Transportation network companies, Trajectory data, Urban traffic congestion, Emission

1 1. Introduction

2 Transportation network companies (TNCs), which connect travelers with drivers through app-based
3 platforms, have expanded rapidly in recent years. Based on a recent report, TNCs have more than
4 doubled the overall size of the for-hire ride services sector since 2012, making the for-hire vehicle (FHV)
5 sector a major provider of urban transportation services by the end of 2018 [1]. The popularity of TNCs
6 is the result of numerous advantages including improved convenience, higher flexibility, shorter waiting
7 time and lower trip fare as compared to traditional taxi services. However, the overgrowth of TNCs
8 also brings new concerns and challenges for urban traffic management. Although TNCs claim that they
9 help to reduce congestion, official reports and many studies have enumerated signs of road traffic getting
10 worse after the emergence of TNCs. It is reported that private-ride TNC services (Uber, Lyft) have
11 introduced an overall 180 percent more traffic to urban road networks and added billions of vehicle miles
12 traveled (VMT) in the nation’s largest metro areas [1]. Another recent study also asserted that TNCs
13 are the biggest contributor to the growth of traffic congestion in San Francisco [2]. These studies
14 depict the big picture of the influence of the overgrowing TNCs on traffic congestion and their findings

Preprint submitted to Sustainable Cities and Society December 15, 2019


15 largely agree with the impression among the general public. However, understanding the precise impact
16 of TNCs on urban traffic is intrinsically difficult, as the change of traffic condition can be the result of
17 the compounding effect of many other factors including population, employment, and change of road
18 network capacity, let alone the rise of TNCs. Moreover, TNCs rarely release data that are of sufficient
19 spatial resolution and temporal coverage to allow for tracing their service and evaluating their impacts.
20 Despite its difficulties, understanding the effects of TNCs on traffic conditions has become an in-
21 creasingly important topic for transportation planners and policymakers especially in large cities. Our
22 interpretation of the TNC effects will be directly reflected in the way we regulate TNCs and how we may
23 integrate them into the existing transportation system [3]. And our decisions and policies will largely
24 affect the mobility needs of millions of urban travelers and even the livings of hundreds of thousands
25 of TNC drivers. A previous study suggested that TNCs have the potential for reducing road traffic
26 by replacing individual trips with ride-sharing services [4]. But recent research indicated that rapidly
27 increasing TNCs have a negative effect on traffic conditions by attracting transit riders [5]. In particular,
28 the influence of the entry of TNCs on congestion was assessed based on historical area-level panel data.
29 Erhardt et al. [2] studied the impact of TNCs on traffic congestion through a before-and-after evaluation
30 of the 2010 and 2016 traffic conditions. While they specifically took the change of population, employ-
31 ment and road network into consideration, their results may be largely affected by their counterfactual
32 case in 2016 which was projected from the 2010 baseline with no TNC trips using San Francisco’s travel
33 demand model.
34 In this study, we design a controlled experiment for gaining accurate insights into the impact of TNCs
35 on urban road traffic by scraping the data from TNC platforms in New York City (NYC) in 2017 and
36 2019. We limit our discussion to four major boroughs (Brooklyn, Bronx, Manhattan, and Queens) in
37 NYC and argue that the rise of TNCs is the foremost contributing factor to the statistically significant
38 changes, if any, of the road traffic condition based on the following facts:

39 1. We eliminate the impact of the population since NYC’s total population declined from 8.623 million
40 in 2017 to 8.399 million as of July 2018 [6].
41 2. We eliminate the impact due to employment changes as the labor force and employment are of the
42 identical level in both years (4.13 million and 4.11 million) for NYC.
43 3. There are no major transportation projects reported in NYC since 2014 according to NYCDOT [7].
44 4. Registration of standard vehicles declined from 1,913,663 in 2017 to 1,912,468 in 2018 [8].
45 5. The number of TNC drivers increased from 58,900 in April 2017 to 87,600 in March 2019 (48.7%
46 more) [9].
47 6. The number of daily TNC trips increased from 393,918 in April 2017 to 769,729 in March 2019
48 (95.4% more) [9].
49 7. The number of medallion taxis remains the same but the number of daily trips decreased from
50 334,865 in April 2017 to 252,634 in March 2019 (24.5% fewer) [9].
51 8. Transit usage in NYC experienced a drop from 2017 to 2018. It is reported that daily weekday
52 subway ridership in NYC was 5.44 million in 2018, which declined by about 2.6% compared with

2
53 2017 (143,000 fewer riders per day). Weekday bus ridership in NYC also experienced a drop
54 of 5.9% from 2017 to 2018 (1.81 million).

Table 1: Background facts in NYC

Item                                       2017        2019
Population (million)                       8.623       8.399 (End of 2018)
Employment (million)                       4.13        4.11 (End of 2018)
Standard vehicles registration             1,913,663   1,912,468 (End of 2018)
Daily weekday subway ridership (million)   5.58        5.44 (End of 2018)
Daily weekday bus ridership (million)      1.92        1.81 (End of 2018)
Number of TNC drivers                      58,900      87,600
Number of daily TNC trips                  393,918     769,729
Number of daily taxi trips                 334,865     252,634

55 These facts help to narrow the dominant contributing factor down to the rise of TNCs if we
56 observe any meaningful changes in road traffic conditions. To obtain the most precise understanding of
57 road traffic conditions, we have scraped one month of FHV trajectory data in April 2017 and one month
58 of FHV trajectory data in March 2019. And we use the trajectory data from Uber, the largest TNC
59 in NYC, for further analysis. The scraped trajectory data contain the GPS record of the online Uber
60 drivers every few seconds and can be used to visualize and quantify the spatiotemporal change of traffic
61 conditions. The large amount of data we collected helps to obtain findings that are statistically
62 meaningful. We then classify the trajectory data into moving activities and stationary activities for
63 fine-level analysis of the time spent in congestion and speed during travel. We introduce macroscopic
64 energy models to further calculate the change in fuel consumption and emission during the two years.
65 Through comprehensive numerical experiments, we conclude that the increase of FHVs contributes to
66 significant speed reduction in NYC with a daily average drop of 22.5% on weekdays. As for Manhattan,
67 the average speed declines from 11.76 km/h to 9.56 km/h on weekdays and from 14.98 km/h to 13.51
68 km/h on weekends in less than two years. We report that the increased traffic congestion, along with the
69 growing number of TNC trips, has doubled the tailpipe emissions from the TNC sector since 2017.
70 The rest of the study is organized as follows. We briefly review related literature on trajectory
71 analysis in the next section. Section 3 introduces the main methods used in this study, including the
72 developed data collection method, the validation of data quality, activity identification from trajectory
73 data and energy and emission calculation. Section 4 presents comprehensive results and discussion on
74 understanding the FHVs’ impact. Finally, we summarize key findings and future directions in section 5.

75 2. Literature

76 With the rapid development of data collection methods and availability of traffic-related big data in
77 cities, estimating city-level fuel consumption using vehicle trajectory data has gained a lot of interest.

3
78 GPS trajectory data have been widely used to understand mobility patterns [10, 11] and travel behav-
79 ior [12], discover flexible routes [13] and monitor real-time traffic situation (visualize traffic jam) [14]
80 due to their advantages of large coverage, good continuity, low cost and rich information about vehicles’
81 movements. In recent years, GPS trajectory data were used for large-scale fuel consumption estimation
82 to provide a more accurate vision of national or regional level vehicular emissions. Shang et al. [15]
83 calculated the gas consumption and emissions using GPS trajectories generated by over 32,000 taxis in
84 Beijing over a period of two months based on the estimated travel speed of each road segment using a
85 context-aware matrix factorization approach. Du et al. [16] explored the fuel consumption pattern and
86 analyzed the temporal and spatial distribution characteristics of average fuel consumption in Beijing
87 using large samples of historical floating vehicle trajectory data, where a fuel consumption forecasting
88 model was established using the back-propagation neural network. Gately et al. [17] quantified the
89 CO2 emissions from traffic congestion and identified local hotspots with highly elevated annual emissions at
90 regional scales using a large database of hourly vehicle trajectory data from road vehicles on 280,000
91 road segments in eastern Massachusetts. Luo et al. [18] analyzed the energy consumption and emissions
92 and their spatial-temporal distribution in Shanghai using GPS trajectory data obtained from taxis.
93 Vehicular emission models can be summarized as two types: macroscopic models [19, 20] and micro-
94 scopic models [21, 22], which focus on different aspects of vehicle emissions calculations and analysis.
95 For large-scale fuel consumption estimation, macroscopic models are usually used where emissions fac-
96 tors are modeled as functions of the average speed of vehicles [23]. However, these estimations do not
97 consider different driving modes or driving patterns which have been proved to have an obvious effect
98 on vehicle fuel consumption [24]. For example, engine start [25] or idling speed [26] will increase vehicle
99 exhaust emissions. Lack of consideration of these parameters may lead to erroneous estimations. For
100 large-scale emissions estimation, such erroneous estimations may result in a misunderstanding of the
101 overall traffic states and emission levels in the region. Since GPS trajectory data can reveal detailed
102 information about vehicle driving modes and traffic states, they provide the possibility of
103 identifying different driving activities that will influence vehicle fuel consumption [27, 28]. In this paper, a
104 two-step integrated emission estimation method [29, 30] that incorporates driving activities (considered
105 in microscopic models) into COPERT model (macroscopic model) is adopted to provide more accurate
106 fuel consumption estimation of Manhattan using GPS trajectory data obtained from Uber. With this
107 method, driving activities of each vehicle are first specified as moving activities and stationary activities.
108 COPERT model is then applied to calculate the emissions of all trajectories considering both types of
109 driving activities of each vehicle. The integrated estimation method ensures more accurate emissions
110 and fuel consumption estimation in a city-level scheme and at the same time provides a more detailed
111 sense of TNC’s influence on traffic conditions.

112 3. Method

113 3.1. Data Collection


114 To gain insights into the impact of FHVs on urban traffic, we develop a data crawler, which simulates
115 the ride requesting behavior on the mobile app, to fetch the trajectory data from major TNCs including

4
116 Uber and Lyft. Our data crawler sends the trip starting location as the pingClient message to TNCs’
117 mobile API and receives back the sequences of coordinates of eight closest online FHV drivers as well as
118 the surge price (SP) and estimated time of arrival (ETA). Online vehicles refer to those that are available
119 for picking up passengers and the vehicles will no longer be recognized if they start a trip or if they go
120 offline. The collected trajectories therefore capture the cruising behavior of FHVs. But different from
121 taxis where street hailing is permitted, FHVs cruise to the next pick up location assigned by the platform
122 and the data therefore well reflect the actual traffic condition. By placing a sufficient number of data
123 collection stations with proper spacing and collection frequency, we are able to collect abundant vehicle
124 trajectories to restore the citywide operation dynamics of FHVs. In this study, only the trajectory data
125 from Uber are used as it is the dominant TNC in NYC with approximately 70% market share [31].
126 Trajectory records collected include the information of timestamp (in Unix), latitude, longitude, driver
127 ID (only first 6 letters shown here), product ID (e.g. UberX and UberXL) and bearing. A sample of
128 collected trajectory records can be seen in Table 2.

Table 2: Sample data records

Product ID   Driver ID   Epoch           Bearing   Latitude   Longitude
2083         b97fed      1491760511750   344       40.67387   -73.80141
39           657dbb      1491753163395   209       40.77918   -73.95079
694          6b25cd      1491748277252   299       40.78273   -73.9495
4            73c3f4      1491732814910   191       40.71448   -74.01372
39           5f486       1491733990716   299       40.75755   -73.96903
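
For readers who wish to work with records of this form, a minimal Python sketch of parsing one row of Table 2 is given below. The class and function names are hypothetical, the whitespace-separated layout simply mirrors the table, and we assume that the 13-digit epoch values are Unix timestamps in milliseconds and that the bearing is in degrees.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class TrajectoryRecord:
    product_id: str
    driver_id: str
    timestamp: datetime   # timezone-aware UTC time
    bearing: int          # assumed to be in degrees
    latitude: float
    longitude: float


def parse_record(line: str) -> TrajectoryRecord:
    """Parse one whitespace-separated row laid out as in Table 2."""
    product_id, driver_id, epoch_ms, bearing, lat, lon = line.split()
    # Assumption: the epoch is in milliseconds (13 digits in the sample rows).
    ts = datetime.fromtimestamp(int(epoch_ms) / 1000, tz=timezone.utc)
    return TrajectoryRecord(product_id, driver_id, ts,
                            int(bearing), float(lat), float(lon))


# First row of Table 2.
record = parse_record("2083 b97fed 1491760511750 344 40.67387 -73.80141")
```

Under the millisecond assumption, the first sample row falls on April 9, 2017 (UTC), which is consistent with the 2017 collection period.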

129 We conduct citywide data collection in NYC and the data analyzed in this study were collected from
130 April 7 to May 3 (6 AM to 11 PM) in 2017 and February 7 to March 13 (24 hours) in 2019 from the Uber
131 API. The data collection was performed once every 5 seconds for each data collection station
132 in 2017, and this interval was set to 1 minute in 2019 due to a change in the functional mechanism
133 of the Uber API. As suggested in [32], Uber may dynamically alter the ID assigned to each driver, and
134 the data collected therefore do not contain private information related to any individual driver. Moreover,
135 the data collection stations only send pingClient messages to the Uber server to obtain nearby vehicles’
136 trajectories without actually requesting a ride. Hence our data collection was conducted in an ethical
137 manner that neither compromised any private driver or passenger information nor sent real ride requests that
138 may disturb Uber operations.
139 We set the same data collection station configuration in 2017 and 2019 which consists of 470 stations.
140 The number and spatial placement of the stations are carefully calibrated to ensure sufficient coverage
141 of the actual operation dynamics. In the beginning, we randomly placed a set of data collection stations
142 spreading over the entire NYC area, with each location having two data collection stations, and sent
143 pingClient message every 5 minutes for 12 consecutive hours. The test results suggested that over
144 99.99% of feedback messages between the two stations at the same location were exactly the same. And
145 we therefore assigned one data collection station per location. Another set of experiments was conducted

5
146 to identify appropriate spacing between two adjacent data collection stations. We used historical taxi
147 demand distributions to divide the whole study area into three sub-regions based on the trip demand
148 level. We varied the spacing from 100m to 1,500m between two adjacent stations in each sub-region
149 and deployed 9 neighboring stations in each region to measure data repetition among the 9 stations for
150 a 12-hour data collection. Finally, we chose the largest spacing that reached at least 40% repetition.
151 The resulting distribution of the data collection stations and the sampled spatial trajectory coverage are
152 shown in Figure 1.

[Figure 1 panels: (a) Data collection station; (b) Visualization of sampled Uber trajectory in 2019; (c) Comparison between inferred number of trips vs TLC reported Uber trips [33], with series TLC Record, ∆d=0.2, ∆d=0.4, ∆d=0.7, ∆d=1, and ∆d=0.2 with ∆t=120, plotted by time of day]

Figure 1: Study area and the configuration of data collection stations

153 The 470 data collection stations fetched around 100 GB data per day in 2017 and 5.17 GB data per
154 day in 2019. To validate the quality of the data, we infer the number of Uber trips from 2017 data and
155 compare this number with the FHV trips reported by NYCTLC [34] for every 15-minute time interval for
156 the entire NYC. In particular, we track the trajectory of each unique driver ID and consider that a trip took
157 place if (1) the time gap (∆t ≥ 60) and spatial displacement (∆d) between consecutive records exceed
158 a certain threshold or (2) the record was the last trajectory identified for the driver ID. The validation
159 results with various distance and time thresholds are presented in Figure 1c. While the collected data
160 by no means capture complete FHV operation information, we observe that the inferred number of trips
161 well resembles the reported trip level and the trip trend is closely aligned with the actual trip tendency
162 with the proper choice of distance and time threshold. This demonstrates the quality of the data we
163 collected and suggests that the data yield sufficient coverage of actual FHV dynamics.
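
The trip inference rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact procedure used in the study: the function names are hypothetical, ∆t is assumed to be in seconds and ∆d in kilometers, the default thresholds are taken from one of the settings shown in Figure 1c, and the haversine distance is used here in place of whatever distance measure was actually applied.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def infer_trip_count(records, min_gap_s=60.0, min_disp_km=0.2):
    """Count inferred trips for one driver ID.

    records: time-sorted list of (t_seconds, lat, lon) tuples.
    """
    if not records:
        return 0
    trips = 0
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(records, records[1:]):
        gap = t1 - t0
        disp = haversine_km(lat0, lon0, lat1, lon1)
        # Rule (1): the driver disappears long enough and reappears far enough away.
        if gap >= min_gap_s and disp > min_disp_km:
            trips += 1
    # Rule (2): the last record observed for this driver ID counts as a trip.
    return trips + 1


# Hypothetical driver: visible, gone for 5 minutes, then visible again about 2.8 km away.
records = [
    (0, 40.7500, -73.9900),
    (5, 40.7501, -73.9900),
    (305, 40.7700, -73.9700),
    (310, 40.7701, -73.9700),
]
n_trips = infer_trip_count(records)
```

Summing such counts over all driver IDs within each 15-minute interval would give a series comparable to the TLC records.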

6
164 Finally, we choose the data from Feb 27 to March 12 in 2019 and from April 12 to April 25 in 2017
165 and compare the change of traffic states in two years. This time selection is to ensure the dates are
166 comparable in the time vicinity given the availability and the quality of the data we collected. Moreover,
167 we only focus on investigating the change in Manhattan as the case study which is the borough of the
168 heaviest congestion and highest FHV trip level in NYC.

169 3.2. Activity identification

170 Based on the collected FHV trajectories, we next convert the trajectories into space-time path seg-
171 ments (STPS) following the method proposed in Kan et al. [29]. The main reason for STPS construction
172 is to identify different vehicle activities during a sequence of GPS records for accurately estimating the
173 trajectory speed and inferring energy consumption and emission. In particular, we focus on separating
174 stationary activities (SA) from moving activities (MA) in the trajectories so that we may make the best
175 use of the high-resolution trajectories to restore the stop-and-go traffic states. MA and SA will contribute
176 to differentiating between the amount of time urban traffic is caught in gridlock and the velocity of the
177 moving traffic. In addition, the functionality of engines differs between the idle state and when the vehicle
178 is in motion. MA and SA will therefore result in a more accurate characterization of fuel consumption
179 and emissions for urban traffic, where MA can be used with emission models for vehicles in motion and
180 SA can be used with emission models for the idle engine state to comprehensively evaluate the actual
181 emissions and fuel consumption. Studies have shown that this approach can achieve over 88% accuracy
182 when using a macroscopic emission model [29] and over 94% accuracy when using a microscopic emission
183 model [30] when compared to actual fuel consumption.
184 To best identify trajectory activities, we first preprocess the data to remove consecutive trajectory
185 records whose time gap is shorter than 2 seconds or longer than 15 seconds. The removal of short time
186 intervals helps to mitigate GPS errors. On the other hand, we may underestimate the number of SA
187 by including records with longer time gaps, as intermediate SA will be consolidated and reflected as MA if
188 these long time intervals are included. The resulting time gaps between consecutive trajectory records
189 mostly lie between 4 seconds and 6 seconds. The preprocessing eliminates around 15% of trip records in
190 the data and we then identify SA and MA based on the velocity (km/h) of the trip segment:

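\[ V_{i,i+1} = \frac{\lVert \mathrm{coord}_{i+1} - \mathrm{coord}_{i} \rVert}{t_{i+1} - t_{i}} \tag{1} \]

191 where $\lVert \cdot \rVert$ measures the Euclidean distance between consecutive trajectory records in kilometers. We
192 define the state of a trajectory segment as:

\[ S_{i,i+1} = \begin{cases} \mathrm{SA}, & \text{if } V_{i,i+1} < 5 \\ \mathrm{MA}, & \text{if } V_{i,i+1} \geq 5 \end{cases} \tag{2} \]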

193 The threshold of $V_{i,i+1} < 5$ km/h for separating SA and MA is selected to mitigate GPS errors that may lead
194 to the false classification of actual activities. We present two sample trajectories and their constructed
195 STPS and identified SA and MA in Figure 2. As seen in the figure, by using the velocity threshold, we
196 are able to accurately identify the non-moving or near non-moving activities as SA and the actual moving

197 trajectories as MA. After activity identification, we observe there are over 4.8 million daily activities for
198 2017 data and over 3.2 million daily activities for 2019 data between 7 AM and 11 PM. This large
number of activities is sufficient to obtain statistically meaningful results in the following sections.
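The preprocessing and classification steps of Eqs. (1) and (2) can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each record is a tuple of a timestamp in seconds and planar coordinates already projected to kilometers, and that a segment is discarded when its time gap falls outside the 2 to 15 second window. The function and constant names are hypothetical and are not taken from the original processing pipeline.

```python
from math import hypot

SPEED_THRESHOLD_KMH = 5.0          # Eq. (2): segments slower than this are SA
MIN_GAP_S, MAX_GAP_S = 2.0, 15.0   # time-gap window used in preprocessing


def classify_segments(records):
    """Classify consecutive record pairs as SA or MA.

    records: list of (t_seconds, x_km, y_km), sorted by time (assumed format).
    Returns a list of (speed_kmh, state) for every retained segment.
    """
    segments = []
    for (t0, x0, y0), (t1, x1, y1) in zip(records, records[1:]):
        gap_s = t1 - t0
        if gap_s < MIN_GAP_S or gap_s > MAX_GAP_S:
            continue  # drop segments outside the 2-15 s window
        dist_km = hypot(x1 - x0, y1 - y0)       # Euclidean distance in km
        speed_kmh = dist_km / (gap_s / 3600.0)  # Eq. (1), converted to km/h
        state = "SA" if speed_kmh < SPEED_THRESHOLD_KMH else "MA"
        segments.append((speed_kmh, state))
    return segments
```

Treating the filter at the segment level is one possible reading of the preprocessing step; a record-level filter would need an additional rule for which of the two records to drop.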

[Figure 2 panels: trajectory records plotted as time (s) against latitude and longitude, with SA and MA segments labeled. (c) Sample trajectory of length 103 meters; (d) Sample trajectory of length 149 meters]

Figure 2: Example of collected FHV trajectory and the constructed STPS

199

200 3.3. Estimating fuel consumption and emission

201 Total vehicle emission is usually categorized into cold emission and hot emission. Hot emission entails
202 the emission when the engine is operating at a normal temperature, and the cold emission denotes the
203 emission at transient thermal operation. In this study, we only consider hot emission due to lack of
204 data to classify cold start activities and also because hot emission usually dominates the total emission
205 for long trips. As reviewed in the earlier section, both MOVES and COPERT are popular models for
206 energy and emission calculation, and MOVES is specifically tailored to emission standards in the US.
207 Nevertheless, the MOVES model requires the calculation of vehicle specific power, which needs
208 second-by-second acceleration and engine specification data. This calls for trajectory interpolation
209 and is better suited for long trajectories. Our data primarily contain trajectories over short segments
210 (as shown in Figure 2) and interpolation may result in high estimation errors. As a consequence, we use
211 the COPERT model for fuel and emission calculation and assume all vehicles meet the Euro 3 standard with an
212 engine capacity of 1.4-2.0 L. Note that not all Uber vehicles may comply with the Euro 3 standards and there is
213 no available data to understand the type of vehicles in the Uber fleet. In addition, Euro 3 standards may
214 not fully comply with the US EPA standards and hence the exact value calculated for emission and fuel

215 consumption may not be taken as an accurate measure for NYC. Nevertheless, the change of standard
216 will only affect the model parameters but not the relationship between velocity and the corresponding
217 fuel consumption and emission and the obtained results still capture the relative change between 2017
218 and 2019.
219 Based on the aforementioned specifications, for MA, fuel consumption (denoted as $FC$, in g/km) can
220 be calculated based on the trajectory segment speed $V$ (km/h) as:

\[ FC_{MA} = \frac{217 + 0.253V + 0.00965V^{2}}{1 + 0.096V - 0.000421V^{2}} \tag{3} \]

221 As for the SA state, we estimate the fuel consumption based on vehicle idle time $T$ [35] as:

\[ FC_{SA} = 0.361\,\text{mL/s} \times 0.75\,\text{g/mL} \times T = 0.27\,\text{g/s} \times T \tag{4} \]

222 where the density of gasoline is taken as 0.75 g/mL.
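As a minimal sketch, Eqs. (3) and (4) can be written as two small Python functions. The function names are hypothetical; the coefficients are those stated in the equations, with speed in km/h and idle time in seconds.

```python
def fuel_ma_g_per_km(v_kmh):
    """Eq. (3): fuel consumption (g/km) of a moving segment at speed V (km/h)."""
    numerator = 217 + 0.253 * v_kmh + 0.00965 * v_kmh ** 2
    denominator = 1 + 0.096 * v_kmh - 0.000421 * v_kmh ** 2
    return numerator / denominator


def fuel_sa_g(idle_s):
    """Eq. (4): fuel burned (g) while idling for idle_s seconds."""
    idle_rate_ml_per_s = 0.361      # idle fuel flow rate
    gasoline_density_g_per_ml = 0.75
    return idle_rate_ml_per_s * gasoline_density_g_per_ml * idle_s
```

Evaluating `fuel_ma_g_per_km` at low speeds shows the per-kilometer consumption rising as speed falls, which is the mechanism through which lower MA speed translates into higher fuel use.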


223 As for hot emission, following the Tier 3 method of the COPERT model [36], the emission factor ($EF$, in
224 g/km) during the MA state is speed-dependent:

\[ EF_{MA} = (1 - RF)\,\frac{aV^{2} + bV + c + \frac{d}{V}}{eV^{2} + fV + g} \tag{5} \]

225 where $RF$ is the reduction factor. The corresponding parameters for measuring the $EF$ of CO, HC and
226 NOx are presented as follows:

Table 3: Emission parameters for small vehicles in the COPERT model

Item   a         b          c         d   e          f          g
CO     0         11.4       71.7      0   -0.248     35.4       1
NOx    6.53e-6   -1.49e-3   9.29e-2   0   3.97e-5    -1.22e-2   1
HC     1.2e-5    -1.1e-3    5.57e-2   0   -1.88e-4   3.65e-2    1

227 Finally, the calculation of $EF$ under the SA state takes the following form:

\[ EF_{SA} = \alpha \times T \tag{6} \]

228 where $T$ is the idle time and the parameter $\alpha$ (mg/s) for CO, NOx and HC is 13.889, 0.556 and
229 2.222, respectively [35].
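The emission calculations in Eqs. (5) and (6) can be sketched in Python using the parameters of Table 3 and the idle rates given above. The value of the reduction factor $RF$ is not stated here, so the sketch exposes it as an argument with a default of 0, which is an assumption for illustration; all names are hypothetical.

```python
# Table 3 parameters (a, b, c, d, e, f, g) for each pollutant
COPERT_PARAMS = {
    "CO": (0.0, 11.4, 71.7, 0.0, -0.248, 35.4, 1.0),
    "NOx": (6.53e-6, -1.49e-3, 9.29e-2, 0.0, 3.97e-5, -1.22e-2, 1.0),
    "HC": (1.2e-5, -1.1e-3, 5.57e-2, 0.0, -1.88e-4, 3.65e-2, 1.0),
}

# Idle emission rates alpha (mg/s) used in Eq. (6)
IDLE_RATE_MG_PER_S = {"CO": 13.889, "NOx": 0.556, "HC": 2.222}


def ef_ma_g_per_km(pollutant, v_kmh, rf=0.0):
    """Eq. (5): emission factor (g/km) at speed V (km/h).

    rf is the reduction factor; rf=0.0 is an assumed default, not a value
    taken from the study.
    """
    if v_kmh <= 0:
        raise ValueError("Eq. (5) contains d/V, so V must be positive")
    a, b, c, d, e, f, g = COPERT_PARAMS[pollutant]
    numerator = a * v_kmh ** 2 + b * v_kmh + c + d / v_kmh
    denominator = e * v_kmh ** 2 + f * v_kmh + g
    return (1 - rf) * numerator / denominator


def ef_sa_mg(pollutant, idle_s):
    """Eq. (6): emission (mg) while idling for idle_s seconds."""
    return IDLE_RATE_MG_PER_S[pollutant] * idle_s
```

Note that Eq. (5) yields grams per kilometer while Eq. (6) yields milligrams, so the two need a unit conversion before they are combined into a per-kilometer total.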

230 3.4. FHV as probe vehicles

231 Based on the previous discussions, we are able to use FHVs as probe vehicles for characterizing
232 the traffic condition in Manhattan with the large-scale trajectory data collected. Since Uber has a large
233 fleet of vehicles roaming around NYC, the performance metrics calculated from Uber vehicles will serve
234 as a close approximation of the actual metrics of all vehicles on the road. If we consider $P$ as the complete
235 trajectory data generated by the entire Uber fleet, then our collected data $P_C \subset P$ can be viewed
236 as a sub-population randomly drawn from $P$. As a consequence, the average performance metrics

237 calculated from our collected data are sample means drawn from the entire population, and the mean value
238 of the metrics obtained in our data will be close to the expected value in $P$ based on the law of large
239 numbers. This suggests that the traffic condition mined from our data can well represent the actual
240 traffic condition of the road network.
In this study, we are primarily interested in the spatiotemporal velocity metrics and the corresponding
energy and emission level. In particular, we propose to measure the following velocity metrics:
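\[ V_{i,t} = \frac{\sum_{k} D_{k}^{MA}}{\sum_{k} \left( T_{k}^{MA} + T_{k}^{SA} \right)}, \quad \text{if activity } k \text{ is at location } i \text{ within time } t \tag{7} \]

\[ V_{i,t}^{MA} = \frac{\sum_{k} D_{k}^{MA}}{\sum_{k} T_{k}^{MA}}, \quad \text{if activity } k \text{ is at location } i \text{ within time } t \tag{8} \]

\[ R_{i,t}^{SA} = \frac{\sum_{k} T_{k}^{SA}}{\sum_{k} \left( T_{k}^{MA} + T_{k}^{SA} \right)}, \quad \text{if activity } k \text{ is at location } i \text{ within time } t \tag{9} \]

241 where $V_{i,t}$, $V_{i,t}^{MA}$ and $R_{i,t}^{SA}$ represent the mean velocity, the mean MA speed (speed when the vehicle is
242 in motion) and the mean SA time (proportion of time spent in stationary traffic congestion), respectively.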
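A minimal Python sketch of Eqs. (7) to (9) for a single location and time window is given below. It assumes, for illustration, that each activity is a tuple of its state, distance in kilometers and duration in hours; the function name and input layout are hypothetical.

```python
def zone_metrics(activities):
    """Compute Eqs. (7)-(9) for one location i and time window t.

    activities: list of (state, distance_km, duration_h), where state is
    "MA" or "SA" (assumed format).
    Returns (mean_speed, ma_speed, sa_ratio); an entry is None when its
    denominator is zero.
    """
    d_ma = sum(d for s, d, _ in activities if s == "MA")  # sum of D_k^MA
    t_ma = sum(t for s, _, t in activities if s == "MA")  # sum of T_k^MA
    t_sa = sum(t for s, _, t in activities if s == "SA")  # sum of T_k^SA
    t_all = t_ma + t_sa

    mean_speed = d_ma / t_all if t_all > 0 else None  # Eq. (7)
    ma_speed = d_ma / t_ma if t_ma > 0 else None      # Eq. (8)
    sa_ratio = t_sa / t_all if t_all > 0 else None    # Eq. (9)
    return mean_speed, ma_speed, sa_ratio
```

Under these definitions the three metrics are linked by $V_{i,t} = V_{i,t}^{MA}\,(1 - R_{i,t}^{SA})$, so a drop in mean speed can come from slower moving traffic, more stationary time, or both.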

243 4. Results

244 4.1. Overview of identified activities

[Figure 3 panels: (a) Spatial distribution of identified activities, mapped by longitude and latitude for 2017 and 2019; (b) Histogram for zonal activity level (number of zones against number of activities, 2017 and 2019); (c) MA and SA speed distribution (proportion against speed in km/h, 2017 and 2019)]

Figure 3: Distribution of number of identified activities in 2017 and 2019.

245 As mentioned earlier, due to the change in data collection frequency, there exists a significant differ-
246 ence in the amount of data collected in 2017 and 2019. To overcome this issue and deliver fair comparison,
247 we perform sampling from the 2017 data and include 30% of the total identified activities. This results in
248 a similar number of total activities identified in 2017 and 2019, as shown in Figure 3a. There are over 123.8
249 million activities identified during our study period from the 2017 data and the corresponding value is
250 117.6 million for 2019, which suggests the same level of identified activities between the two years. These
251 activities cover the entire study area and we also can verify the expansion of Uber’s service coverage

252 areas from 2017 to 2019 based on the spatial distribution of the identified activities. In particular, we
253 report that 70% of the zones in 2017 and 74.5% of the zones in 2019 have more than 10,000 identified activities
254 (see Figure 3b), representing over 150 activities for each 15-minute time interval at each location. This
255 vast amount of activities delivers superior spatiotemporal coverage and ensures the obtained results are
256 statistically meaningful. Finally, we present in Figure 3c the validity of the identified SA and MA based
257 on equation 2, where we also measure the speed of SA from the spatial displacement and time gaps. We
258 can verify that over 90% of the identified SA have the speed lower than 1 km/h and there exists a small
259 fraction of SA with speed lower than 3 km/h which we suspect to be caused by GPS errors. On the
260 other hand, we observe that MA is perfectly separated from the SA based on the speed metric and we
261 can readily tell the differences between 2017 and 2019 data from the distributions of the corresponding
262 MA speed. We next present detailed analyses of the changes in traffic condition and emission based on
263 the identified activities.

264 4.2. Overall change in traffic condition

[Figure 4 panels: bar charts by borough (Queens, Bronx, Brooklyn, Manhattan, All) comparing 2017 and 2019. (a) Average Speed (km/h); (b) Average MA Speed (km/h); (c) Fuel Consumption (g/km); (d) CO emission (g/km); (e) HC emission (g/km); (f) NOx emission (g/km)]

Figure 4: Average borough-wide performance during weekdays

265 We first show the comparisons of daily average speed, energy consumption and emission across the
266 study area and the results can be found in Figure 4. One immediate observation from the results is
267 the deterioration of citywide traffic performances from 2017 to 2019, and such observation is consistent
268 across the four major boroughs in our study area. We find that the citywide average daily speed reduced
269 from 13.08 km/h in 2017 to 10.13 km/h in 2019, representing a significant drop of 22.6%. Among the
270 four boroughs, Manhattan and Brooklyn are the areas with the worst traffic condition and we observe
271 the average speed reduction around 19%. Meanwhile, we see a notable increase in energy consumption
272 and emission due to the worse borough-wide traffic condition. For each kilometer traveled, the vehicles

273 in NYC now consume 21 grams more gasoline and emit 1 more gram of CO, 0.15 more grams of HC and
274 0.04 more grams of NOx on average as compared to the 2017 state. If we assume the metrics calculated
275 from cruising FHVs also apply to the full FHV sector, and project these values onto the increase of FHV
276 trips while assuming the same average distance per trip, the results imply that FHVs have introduced
277 152% more CO, 157.7% more HC and 136.5% more NOx for every kilometer they traveled.
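As a worked check of the relative speed reductions, using the citywide average speeds quoted above (13.08 km/h and 10.13 km/h) and the Manhattan average speeds reported in this study (11.76 km/h and 9.56 km/h):

```latex
\[
\frac{13.08 - 10.13}{13.08} \approx 0.226,
\qquad
\frac{11.76 - 9.56}{11.76} \approx 0.187
\]
```

These correspond to the citywide drop of about 22.6% and the Manhattan reduction of around 19%.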

[Figure 5 panels: scatter plots of MA speed against SA time at 9 AM and 2 PM, for 2017 and 2019. (a) Manhattan (9 AM: 2.2%, 2 PM: 4.7%); (b) Brooklyn (9 AM: 6.3%, 2 PM: 7.6%); (c) Queens (9 AM: 9.1%, 2 PM: 13.3%); (d) Bronx (9 AM: 3.5%, 2 PM: 9.4%)]

Figure 5: Relationship between standardized SA time and standardized MA speed. The percentage of areas that exceeds
the predefined threshold is shown in the bracket.

278 There is, however, one particular drawback associated with the cruising trajectory data when it is
279 used for probing citywide traffic conditions. Since FHV drivers may choose to park by the side of the street
280 and wait for future orders from the platform, this may lead to overestimation of the actual number of
281 SA on the road, and the calculated average zonal speed therefore constitutes a lower bound
282 of the actual travel speed. By inspecting the relationship between $V_{i,t}^{MA}$ and $R_{i,t}^{SA}$, we may gain additional
283 insight on this particular issue. Specifically, when there is heavy congestion in certain areas, we should
284 observe low MA speed and high SA time which captures the stop-and-go traffic pattern during congestion
285 and gridlock. Similarly, when traffic is light, the trajectory data should reveal high MA speed and low SA
286 time. However, when Uber drivers choose to park and wait rather than cruise, we are likely to encounter
287 the anomalies with both high MA speed and high SA time. This motivates us to explore the percentage
288 of observations that fall into the latter abnormal state and reveal the park-and-wait behavior of FHV
289 drivers. By inspecting the 2017 and 2019 data, we find that the 75th percentile of MA speed across the
290 data is approximately 25 km/h and that for the SA time is around 0.7. We then set the thresholds
291 $\bar{V}^{MA} = 25$ km/h and $\bar{R}^{SA} = 0.7$ and measure the proportion of areas in each borough with both MA speed
292 and SA time being higher than the thresholds. The results are shown in Figure 5.
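The thresholding step described above can be sketched as follows. The sketch assumes each area is summarized by a pair of its MA speed (km/h) and SA time ratio; the function name and input layout are hypothetical, while the default thresholds of 25 km/h and 0.7 are those stated in the text.

```python
def park_and_wait_share(zones, v_ma_threshold=25.0, r_sa_threshold=0.7):
    """Share of areas with both MA speed and SA time above the thresholds.

    zones: list of (ma_speed_kmh, sa_ratio), one pair per area (assumed format).
    Returns a fraction between 0 and 1; 0.0 for an empty list.
    """
    if not zones:
        return 0.0
    flagged = sum(
        1 for v_ma, r_sa in zones
        if v_ma > v_ma_threshold and r_sa > r_sa_threshold
    )
    return flagged / len(zones)
```

Only the combination of high MA speed and high SA time is counted, since high SA time with low MA speed is the expected signature of congestion rather than of park-and-wait behavior.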

293 We report that during morning peak hours (9 AM) all boroughs present traces of the park-and-wait
294 behavior. Manhattan has the lowest value of 2.2% while Queens has the highest percentage of
295 park-and-wait observations (9.1%), followed by Brooklyn (6.3%) and the Bronx (3.5%). The percentage of
296 park-and-wait observations increases for all boroughs during the off-peak time (2 PM), where Queens
297 still has the highest value of 13.3%; we find a drastic jump in the Bronx to 9.4%, while that of Brooklyn
298 is 7.6%. These findings explain why the estimated speed in Brooklyn is lower than in Manhattan despite
299 the fact that Manhattan has the highest number of FHVs and passenger demand. In this situation,
300 the cruising trajectory data may slightly underestimate the average speed and overestimate the actual
301 energy consumption and emission. And the change in MA speed serves as a more accurate metric for
302 comparing the change of traffic conditions across different boroughs. On the other hand, the results also
303 indicate that the obtained traffic condition and emission are relatively accurate in Manhattan as there
304 are few park-and-wait observations. Moreover, the measured average speed of 11.76 km/h in 2017 is
305 well aligned with the speed of 11.2 km/h reported in the NYC mobility report [37]. We next zoom into
306 Manhattan borough and discuss how traffic conditions and emission change over time.
307 We summarize the time-varying performance metrics for both weekdays and weekends in Manhattan in
308 Figures 6 and 7. To measure the changes in traffic conditions, we plot the average speed, average MA
309 speed, and average SA time from the 2017 and 2019 data; the corresponding changes between the
310 two years are visualized by the shaded area. While the average weekday speed in Manhattan decreased
311 from 11.76 km/h to 9.56 km/h (Figure 4), this reduction can be further decomposed into two parts.
312 On one hand, more FHVs result in slower-moving speed so that there is a reduction of 9.2% in average
313 MA speed in Manhattan. On the other hand, there is also an increase in average SA time by 7.04%.
314 These provide strong evidence showing that the traffic condition is worse in 2019 than in 2017, and such
315 observation is consistent across different times of the day. As for the weekend, the mean speed is 14.98
316 km/h in 2017 and 13.51 km/h in 2019 respectively, suggesting a reduction of 9.8%. During weekdays,
317 we find that the peak hours (especially morning peak during 7-9 AM) have the worst traffic condition
318 and also suffer the greatest decline in average speed and MA speed. The changes during off-peak hours
319 are relatively minor. For the weekend, we observe that notable changes in mean speed mainly take place
320 during off-peak hours (7 AM to 12 PM on weekend) and during the nighttime period (7 PM to 10 PM).
321 The decreases in mean speed during weekdays and weekends correspond to the largest drop in travel
322 speed in Manhattan since 2015 [37] and eventually lead to higher fuel consumption and more tailpipe
323 emission across different times of the day. For Manhattan, we observe that, during weekdays, vehicles
324 consume 10.0% more gasoline and emit 12.0% more NOx, 16.1% more CO and 18.6% more HC
325 for each kilometer traveled in 2019 as compared to 2017. These results
326 highlight the critical traffic congestion issues related to the rise of TNCs in NYC, and possibly around the
327 world: the fast expansion of TNCs quickly saturates the road network, resulting in an increase in fuel
328 consumption and vehicle emission for all road traffic, and an even larger increase from the compound effect
329 of increasingly worse traffic conditions and more FHV trips.
[Figure 6 panels, each plotting the 2017 and 2019 values and the change of 2019 to 2017 (%) against hour of day (7 to 21): mean speed $V$ (km/h); mean MA speed $V^{MA}$ (km/h); mean SA rate $R^{SA}$; fuel consumption (g/km); NOx emission (g/km); CO emission (g/km); HC emission (g/km)]

Figure 6: Change of city-wide metrics on weekdays between 2017 and 2019

[Figure 7 panels, each plotting the 2017 and 2019 values and the change of 2019 to 2017 (%) against hour of day (7 to 21): mean speed $V$ (km/h); mean MA speed $V^{MA}$ (km/h); mean SA rate $R^{SA}$; fuel consumption (g/km); NOx emission (g/km); CO emission (g/km); HC emission (g/km)]

Figure 7: Change of city-wide metrics on weekend between 2017 and 2019

330 4.3. Spatial impact

[Figure 8 panels: maps at 8 AM, 11 AM, 2 PM, 6 PM and 10 PM. (a) Mean MA speed distribution in 2019 on weekdays; (b) Change of mean MA speed between 2017 and 2019 on weekdays; (c) Mean SA ratio distribution in 2019 on weekdays; (d) Change of mean SA ratio between 2017 and 2019 on weekdays]

Figure 8: Spatial distribution of mean speed and SA ratio on weekdays

331 We next present the spatial distributions of the mean MA speed and SA time in 2019 and their changes
332 as compared to 2017, and the results during weekdays are shown in Figure 8. Based on Figure 8a, we can
333 clearly distinguish between the traffic peak hours (8 AM and 6 PM) and off-peak periods. In particular,
334 we find that heavy traffic congestion in Manhattan persists throughout the daytime, whereas there are notable
335 differences in terms of the MA speed in other boroughs between peak and off-peak hours. Location-wise,
336 we observe that lower and middle Manhattan, as well as the areas in other boroughs that are adjacent
337 to Manhattan, suffer the heaviest congestion, with the average MA speed being less than
338 20 km/h. With the rise of TNCs, we notice a city-wide reduction of MA speed regardless of the

339 particular time of the day: 88.2%, 83.1%, 78.9%, 85.1% and 81.8% of all 1371 areas
340 have slower travel speed at 8 AM, 11 AM, 2 PM, 6 PM, and 10 PM, respectively. On weekends, the corresponding
341 values are 76.9%, 76.0%, 73.3%, 75.1% and 70.3%, respectively. These numbers indicate that the traffic
342 condition is more affected during weekdays than on weekends, and the congestion is worse during peak
343 hours on both weekdays and weekends.
344 Note that with an excessive number of FHVs on the road in cruising mode, these drivers
345 are likely to exhibit different driving behavior than other commuting drivers. Specifically, they need to
346 pay close attention to their smartphones for incoming passenger orders, which distracts them from the
347 road. They may also have to frequently merge into or diverge from the slow traffic to pick up passengers
348 or find temporary parking spots to save their cruising cost. These maneuvers introduce significant
349 disturbance to the already slow traffic and add frequent stop-and-go activities and shockwaves to
350 the traffic flow. This can be confirmed from the spatial distribution of the SA ratio, as shown in
351 Figure 8c. Unlike the findings from the distributions of the MA speed, the morning peak
352 hour has a lower level of SA ratio, and the SA ratio is found to be higher at times with a larger number of
353 cruising drivers (see x-axis in Figure 9 for the number of identified cruising drivers). This is likely due
354 to lower demand levels and more cruising FHVs in off-peak hours, resulting in greater disturbance to
355 the traffic and more stop-and-go traffic. The average SA ratio is 0.503 for 8 AM, and 0.542, 0.535, 0.532,
356 and 0.508 for 11 AM, 2 PM, 6 PM and 10 PM, respectively. For every 10 minutes of driving in NYC,
357 these numbers translate into over 5 minutes sitting in non-moving traffic and highlight the huge amount
358 of wasted time for the large number of daily travelers who drive themselves or rely on taxis and FHVs.
359 When compared to the 2017 scenario, 66.4%, 71.5%, 71.0%, 70.6% and 73.4% of the areas
360 at 8 AM, 11 AM, 2 PM, 6 PM and 10 PM experience an increased SA ratio. Finally, following our
361 previous discussion on the relationship between MA speed and SA ratio, we can also visually identify the
362 places with dominant park-and-wait behavior. And such behavior is primarily found in peripheral areas
363 of Brooklyn, the Bronx, and Queens with lower trip intensity and less traffic. It is easier for drivers to
364 spot a parking place in these places and wait for future orders from the ride-hailing platforms.

365 4.4. Active drivers and speed change

366 Finally, we build the connection between the change of vehicle speed and the number of available
367 Uber drivers on the road during weekdays. In particular, we track the active Uber drivers as the number
368 of unique driver IDs identified across the study areas for every 15-minute time interval. We separate
369 our daily observations and group similar time intervals to increase the number of individual observations
370 for comparison between 2017 and 2019. For each year, we use the observations from 8 consecutive time
371 intervals (every 2 hours) on each of the 10 weekdays. This gives 80 observations for each 2-hour time
372 period for both 2017 and 2019, and each observation contains the number of active Uber drivers and the
373 average speed in NYC. The relationship between active Uber drivers and the speed at different times
374 of the day is presented in Figure 9. One immediate finding based on the results is that the traffic
375 conditions in 2017 and 2019 are in entirely different states and the differences between the two years are
376 distinguishable even if we visualize the observations of all time periods in one plot (7:00-23:00). Moreover,

[Figure 9 panels: average speed (km/h) against the number of active Uber drivers for 2017 and 2019, with a fitted regression line for each time period]

Time period    Slope       R^2
7:00-23:59     -1.35E-03   0.53
7:00-9:00      -1.90E-03   0.67
9:00-11:00     -1.41E-03   0.83
11:00-13:00    -1.24E-03   0.85
13:00-15:00    -1.24E-03   0.80
15:00-17:00    -1.26E-03   0.76
17:00-19:00    -1.28E-03   0.66
19:00-21:00    -1.36E-03   0.80
21:00-23:00    -1.90E-03   0.88

Figure 9: city-wide average speed comparison across different time of the day

377 the two years of data are also linearly separable in all of the individual time periods. We calculate the
378 Pearson correlation coefficient between the number of active drivers and the average speed for each
379 two-hour period, and the resulting coefficients are -0.82, -0.91, -0.92, -0.89, -0.87, -0.81, -0.89 and -0.94
380 for the eight time periods, respectively. All these values are close to -1 and they suggest a significant negative
381 correlation between the number of FHV drivers and the average travel speed in Manhattan. While
382 correlation does not necessarily imply causation, if we may eliminate the impact from other contributing
383 factors such as those shown in Table 1, the strong negative correlation likely hints that the increase in
384 FHV drivers is the primary contributing factor to the citywide worse traffic congestion and emission.
385 Note that the active Uber drivers we identified from the data may only serve as a proxy of the total
386 number of Uber drivers in service. During the time with high passenger demand (e.g. 7:00-9:00, 17:00-
387 19:00), we may identify fewer drivers than during off-peak hours. This is because our data capture
388 cruising Uber drivers, and drivers are less likely to be in the cruising state when passenger
389 demand exceeds vehicle supply. Nevertheless, the results are still valid as we compare the same time
390 of the day in two different years. The number of identified drivers shown on the x-axis also echoes

391 our finding on the impact of excessive cruising drivers on travel speed and SA ratio (Figure 8). As
392 the observation is based on consistent results across all time periods over multiple days of data for
393 two years, this confirms that the increase of FHV trips and TNC drivers is a significant contributing
394 factor to urban traffic congestion. If we fit a simple linear regression model over the data, as shown
395 by the lines in Figure 9, the number of active Uber drivers alone may explain up to 88% (21:00-23:00
396 with $R^2 = 0.88$) of the variability in travel speed, and such a linear relationship
397 fits the observations well for most of the time periods. Based on the coefficients of active Uber drivers,
398 we further notice that the impact of the number of FHVs on travel speed can be categorized into two
399 cases depending on the number of cruising drivers in the city. In the first case (7:00-9:00, 21:00-23:00),
400 the impact is reflected by both the reduction in MA speed and the increase in SA time. In the
401 second case (9:00-21:00), the impact of FHVs is primarily reflected by the increase in SA ratio,
402 where the park-and-wait behavior of excessive cruising drivers as well as the disturbance to normal
403 traffic from a larger number of FHV trips together lead to worse traffic congestion and emission. In
404 general, in the first case, the increase in FHVs has a greater impact on the speed reduction
405 than in the second case, with the fitted coefficients being 50% higher.
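The correlation and regression analysis in this section can be sketched in plain Python as below. The function name is hypothetical and the sketch is not the code used to produce Figure 9; it takes paired observations of active drivers and average speed for one time period and returns the Pearson coefficient together with the slope, intercept and $R^2$ of a simple least-squares line.

```python
from math import sqrt


def pearson_and_fit(drivers, speeds):
    """Pearson r and least-squares line for speed as a function of drivers.

    drivers, speeds: equal-length sequences with at least two distinct values.
    Returns (r, slope, intercept, r_squared).
    """
    n = len(drivers)
    mean_x = sum(drivers) / n
    mean_y = sum(speeds) / n
    sxx = sum((x - mean_x) ** 2 for x in drivers)
    syy = sum((y - mean_y) ** 2 for y in speeds)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(drivers, speeds))

    r = sxy / sqrt(sxx * syy)             # Pearson correlation coefficient
    slope = sxy / sxx                     # change in speed per extra driver
    intercept = mean_y - slope * mean_x
    return r, slope, intercept, r * r     # R^2 equals r^2 for a one-variable fit
```

Because $R^2$ equals the squared Pearson coefficient in a one-variable linear regression, the two sets of reported values can be cross-checked: for 21:00-23:00, a coefficient of -0.94 corresponds to $R^2 \approx 0.88$.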

406 5. Conclusion

407 In this study, we collect and mine large-scale FHV data and provide a comprehensive understanding of
408 how the rise of TNCs impacts traffic congestion and emissions in urban areas. We choose Manhattan
409 in NYC as the study area and conduct analyses of the trajectory data in 2017 and 2019. We classify
410 stationary and moving activities from the trajectory data and calculate the mean speed, fuel
411 consumption and emission based on the classified MA and SA. Our results suggest that the increase
412 of FHV trips in NYC has resulted in an average citywide speed reduction of 22.5% on weekdays and the
413 average speed in Manhattan has decreased from 11.76 km/h in April 2017 to 9.56 km/h in March 2019.
414 Considering the increase of FHV trips over the two years, our results confirm that the increase of
415 TNC vehicles is one of the major contributing factors to the increase in traffic congestion. We also
416 articulate two different ways, which depend on the overall congestion level, in which the increase in FHVs
417 may affect traffic congestion with different magnitudes of speed reduction.
418 As a major byproduct of the worse traffic conditions, our results highlight emerging energy consump-
419 tion and emissions issues from the TNC sector. We have shown in our study that the increase in FHVs
420 and the number of trips has led to 136% more NOx, 152% more CO and 157% more HC emissions per
421 kilometer traveled by the FHV sector within two years. This finding is obtained under a conservative
422 assumption where the duration and distance of each passenger trip stay the same. In reality, however,
423 the revealed decrease in MA speed and increase in SA ratio are indicative of longer trip duration as
424 well as longer cruising time before an FHV may reach the next passenger. We may speculate from this
425 observation that the actual contribution of the TNC sector could be much higher than the reported
426 values in this study. In this regard, immediate actions should be taken against the overgrowth of TNCs
427 in urban areas. Based on our results, there are two practical directions that may help to mitigate the
428 energy and emission issues. First, as a short-term measure, the entry of FHVs in heavily congested areas

429 should be strictly regulated. We have shown that more FHVs contribute to not only slower moving speed
430 but also more congestion and stop-and-go traffic. The latter is the primary source of tailpipe emissions
431 and regulating FHV service in congested areas helps to avoid the additive effect of more traffic and worse
432 emissions per individual vehicle. More importantly, considering the large number of trips served by the
433 TNC sector, policies should be framed to encourage and facilitate the adoption of alternative fuel vehicles
434 in the ride-hailing industry, which can achieve significant long-term savings in energy and emission
435 costs.
436 We believe that the "failure" of TNCs in populated urban areas can be attributed to three primary
437 reasons. One straightforward reason is the overgrowth of the number of TNC vehicles, which exceeds the already
438 limited capacity of the urban road network. It is noted that the increase in the number of TNC drivers
439 contributes differently to traffic congestion as compared to regular commuters. This is reflected by
440 much more frequent merges and diverges for picking up and dropping off passengers, which introduce
441 significant disturbances to regular traffic flow and result in more stop-and-go traffic. As a consequence, TNC
442 vehicles not only add traffic but also reduce the capacity of the road network.
443 The second reason is the competitive nature of the TNC market. The market involves
444 competition among different service providers and the traditional taxi sector; it also includes competition
445 among the drivers of the same TNC platform [38]. Such competition adds another layer of inefficiency
446 if supply exceeds actual demand, which is often the case during off-peak periods of
447 passenger demand and corresponds to our analyses of the time periods with low average speed but a high
448 number of active drivers. We should be aware that TNCs’ prime time only accounts for approximately 6
449 hours (morning peak + evening peak) per day or 25% of the time daily. But for the rest of the day, there
450 are more drivers competing for fewer passengers, resulting in excessive cruising
451 miles and searching time. And we have pointed out in our analyses that TNC drivers will need to pay
452 attention to their smartphones during cruising and such distracted driving is one notorious causal factor
453 for traffic accidents.
454 Finally, we consider the lack of effective regulation and operation mode to be another reason. We
455 assert that the observation that “TNC worsens urban traffic congestion and emissions” should not be
456 viewed as contradicting the potential of TNCs for improving the efficiency and sustainability of our
457 urban mobility. Indeed, several studies have pointed out that TNC could be a highly effective solution
458 for efficient travel (e.g. 60% to 90% empty trips may be reduced if passengers and drivers are optimally
459 matched [39]) and have validated the effectiveness of properly designed ridesharing mechanisms [40, 41].
460 But at present, there is no evidence showing how efficiently TNC drivers and passengers are being
461 matched, and the ‘real’ ridesharing, which actually combines multiple single rides, only accounts for a
462 small share of the total number of TNC trips [42]. Apparently the current TNC practice, which is
463 primarily revenue driven, is still far from its optimal performance considering aspects of social benefits
464 and overall sustainability. It is therefore necessary to frame regulations to strike the balance between the
465 TNC’s business model and social welfare. And the findings in our study provide important insights for
466 evaluating the actual externalities from the TNC sector and will be valuable for decision and policymakers
467 in framing effective regulations. As an example, NYC recently started the congestion surcharge for TNC

468 and taxi trips entering Manhattan (south of 96th street) [43]. Our findings largely favor this regulation
469 as the first step to mitigate the congestion impacts from the TNC sector, but also suggest the possibility
470 for the surcharge to be varying spatially and temporally.

471 References

472 [1] Bruce Schaller. The new automobility: Lyft, uber and the future of american cities. 2018.

473 [2] Gregory D Erhardt, Sneha Roy, Drew Cooper, Bhargava Sana, Mei Chen, and Joe Castiglione. Do
474 transportation network companies decrease or increase congestion? Science advances, 5(5):eaau2670,
475 2019.

476 [3] Maarit Moran and Philip Lasley. Legislating transportation network companies. Transportation
477 Research Record, 2650(1):163–171, 2017.

478 [4] Susan Shaheen, Nelson Chan, Apaar Bansal, and Adam Cohen. Shared mobility: A sustainability
479 & technologies workshop: definitions, industry developments, and early understanding. 2015.

480 [5] Eun Hye Grace CHOI. The Effects of transportation network companies on traffic congestion. PhD
481 thesis, KDI School, 2017.

482 [6] NYC current and projected population., accessed June, 2019. Available online at https://www1.
483 nyc.gov/site/planning/planning-level/nyc-population/current-future-populations.
484 page.

485 [7] NYCDOT: major transportation projects, accessed Oct, 2019. Available online at https://www1.
486 nyc.gov/html/dot/html/about/major-transportation-proj.shtml.

487 [8] New York Department of Motor Vehicles. Statistical data and summaries.

488 [9] New York City Taxi and Limousine Commission. Aggregated reports - data reports monthly indi-
489 cators.

490 [10] Yu Zheng, Quannan Li, Yukun Chen, Xing Xie, and Wei-Ying Ma. Understanding mobility based
491 on gps data. In Proceedings of the 10th international conference on Ubiquitous computing, pages
492 312–321. ACM, 2008.

493 [11] Yu Zheng, Yukun Chen, Quannan Li, Xing Xie, and Wei-Ying Ma. Understanding transportation
494 modes based on gps data for web applications. ACM Transactions on the Web (TWEB), 4(1):1,
495 2010.

496 [12] Jinjun Tang, Han Jiang, Zhibin Li, Meng Li, Fang Liu, and Yinhai Wang. A two-layer model
497 for taxi customer searching behaviors using gps trajectory data. IEEE Transactions on Intelligent
498 Transportation Systems, 17(11):3318–3324, 2016.

499 [13] Favyen Bastani, Yan Huang, Xing Xie, and Jason W Powell. A greener transportation mode:
500 flexible routes discovery from gps trajectory data. In Proceedings of the 19th ACM SIGSPATIAL
501 International Conference on Advances in Geographic Information Systems, pages 405–408. ACM,
502 2011.

503 [14] Zuchao Wang, Min Lu, Xiaoru Yuan, Junping Zhang, and Huub Van De Wetering. Visual traffic
504 jam analysis based on trajectory data. IEEE transactions on visualization and computer graphics,
505 19(12):2159–2168, 2013.

506 [15] Jingbo Shang, Yu Zheng, Wenzhu Tong, Eric Chang, and Yong Yu. Inferring gas consumption
507 and pollution emission of vehicles throughout a city. In Proceedings of the 20th ACM SIGKDD
508 international conference on Knowledge discovery and data mining, pages 1027–1036. ACM, 2014.

509 [16] Yiman Du, Jianping Wu, Senyan Yang, and Liutong Zhou. Predicting vehicle fuel consumption
510 patterns using floating vehicle data. Journal of Environmental Sciences, 59:24–29, 2017.

511 [17] Conor K Gately, Lucy R Hutyra, Scott Peterson, and Ian Sue Wing. Urban emissions hotspots:
512 Quantifying vehicle congestion and air pollution using mobile phone gps data. Environmental pol-
513 lution, 229:496–504, 2017.

514 [18] Xiao Luo, Liang Dong, Yi Dou, Ning Zhang, Jingzheng Ren, Ye Li, Lu Sun, and Shengyong Yao.
515 Analysis on spatial-temporal features of taxis’ emissions from big data informed travel patterns: a
516 case of shanghai, china. Journal of cleaner production, 142:926–935, 2017.

517 [19] Kyoungho Ahn and Hesham Rakha. The effects of route choice decisions on vehicle energy consump-
518 tion and emissions. Transportation Research Part D: Transport and Environment, 13(3):151–167,
519 2008.

520 [20] Mohammad Amin Pouresmaeili, Iman Aghayan, and Seyed Ali Taghizadeh. Development of mash-
521 had driving cycle for passenger car to model vehicle exhaust emissions calibrated using on-board
522 measurements. Sustainable cities and society, 36:12–20, 2018.

523 [21] Maryam Shekarrizfard, Ahmadreza Faghih-Imani, Louis-Francois Tétreault, Shamsunnahar Yasmin,
524 Frederic Reynaud, Patrick Morency, Celine Plante, Louis Drouin, Audrey Smargiassi, Naveen Eluru,
525 et al. Regional assessment of exposure to traffic-related air pollution: Impacts of individual mobility
526 and transit investment scenarios. Sustainable Cities and Society, 29:68–76, 2017.

527 [22] Aarshabh Misra, Matthew J Roorda, and Heather L MacLean. An integrated modelling approach
528 to estimate urban traffic emissions. Atmospheric Environment, 73:81–91, 2013.

529 [23] Jianlei Lang, Shuiyuan Cheng, Ying Zhou, Yonglin Zhang, and Gang Wang. Air pollutant emissions
530 from on-road vehicles in china, 1999–2011. Science of The Total Environment, 496:1–10, 2014.

531 [24] Boski P Chauhan, GJ Joshi, and Purnima Parida. Driving cycle analysis to identify intersection
532 influence zone for urban intersections under heterogeneous traffic condition. Sustainable cities and
533 society, 41:180–185, 2018.

534 [25] Jean-Yves Favez, Martin Weilenmann, and Jan Stilli. Cold start extra emissions as a function of
535 engine stop time: Evolution over the last 10 years. Atmospheric Environment, 43(5):996–1007, 2009.

536 [26] SM Ashrafur Rahman, HH Masjuki, MA Kalam, MJ Abedin, A Sanjid, and H Sajjad. Impact of
537 idling on fuel consumption and exhaust emissions and available idle-reduction technologies for diesel
538 vehicles–a review. Energy Conversion and Management, 74:171–182, 2013.

539 [27] Eric Jackson, Lisa Aultman-Hall, Britt A Holmén, and Jianhe Du. Evaluating the ability of global
540 positioning system receivers to measure a real-world operating mode for emissions research. Trans-
541 portation research record, 1941(1):43–50, 2005.

542 [28] Carolien Beckx, Luc Int Panis, Davy Janssens, and Geert Wets. Applying activity-travel data for
543 the assessment of vehicle exhaust emissions: Application of a gps-enhanced data collection tool.
544 Transportation Research Part D: Transport and Environment, 15(2):117–122, 2010.

545 [29] Zihan Kan, Luliang Tang, Mei-Po Kwan, and Xia Zhang. Estimating vehicle fuel consumption and
546 emissions using gps big data. International journal of environmental research and public health,
547 15(4):566, 2018.

548 [30] Zihan Kan, Luliang Tang, Mei-Po Kwan, Chang Ren, Dong Liu, Tao Pei, Yu Liu, Min Deng, and
549 Qingquan Li. Fine-grained analysis on fuel-consumption and emission from vehicles trace. Journal
550 of cleaner production, 203:340–352, 2018.

551 [31] Taxi and ridehailing usage in New York City, accessed Sep, 2019. Available online at https:
552 //toddwschneider.com/dashboards/nyc-taxi-ridehailing-uber-lyft-data/.

553 [32] Le Chen, Alan Mislove, and Christo Wilson. Peeking beneath the hood of uber. In Proceedings of
554 the 2015 Internet Measurement Conference, pages 495–508. ACM, 2015.

555 [33] Xinwu Qian, Dheeraj Kumar, Wenbo Zhang, and Satish Ukkusuri. Understanding the operational
556 dynamics of mobility service providers: A case of Uber. ACM Transactions on Spatial Algorithms
557 and Systems (TSAS), 2019.

558 [34] TLC trip record data, accessed May, 2019. Available online at https://www1.nyc.gov/site/tlc/
559 about/tlc-trip-record-data.page.

560 [35] Rahmi Akçelik, Robin Smit, and Mark Besley. Calibrating fuel consumption and emission models
561 for modern vehicles. In IPENZ Transportation Group Conference, 2012.

562 [36] Leon Ntziachristos and Zissis Samaras. Methodology for the calculation of exhaust emissions, 2018.
563 Available online at https://www.emisia.com/utilities/copert/documentation/.

564 [37] NYC Department of Transportation. New York City mobility report, 2018.

565 [38] Xinwu Qian and Satish V Ukkusuri. Taxi market equilibrium with third-party hailing service.
566 Transportation Research Part B: Methodological, 100:43–63, 2017.

567 [39] Xianyuan Zhan, Xinwu Qian, and Satish V Ukkusuri. A graph-based approach to measuring the
568 efficiency of an urban taxi service system. IEEE Transactions on Intelligent Transportation Systems,
569 17(9):2479–2489, 2016.

570 [40] Paolo Santi, Giovanni Resta, Michael Szell, Stanislav Sobolevsky, Steven H Strogatz, and Carlo
571 Ratti. Quantifying the benefits of vehicle pooling with shareability networks. Proceedings of the
572 National Academy of Sciences, 111(37):13290–13294, 2014.

573 [41] Xinwu Qian, Wenbo Zhang, Satish V Ukkusuri, and Chao Yang. Optimal assignment and incentive
574 design in the taxi group ride problem. Transportation Research Part B: Methodological, 103:208–226,
575 2017.

576 [42] Uber says that 20% of its rides globally are now on UberPool, accessed
577 June, 2019. Available online at https://techcrunch.com/2016/05/10/
578 uber-says-that-20-of-its-rides-globally-are-now-on-uber-pool/?ncid=rss.

579 [43] Judge approves congestion pricing for New York City taxi, Uber and Lyft rides,
580 accessed June, 2019. Available online at https://abcnews.go.com/Business/
581 judge-approves-congestion-pricing-york-city-taxi-uber/story?id=60778450.

Revision highlight

Impact of transportation network companies on urban congestion:


Evidence from large-scale trajectory data

Revision Highlight

We summarize the highlights of the changes made in the revised manuscript
as follows:

1. We have expanded the scope of our study from Manhattan to the entire NYC
(except for Staten Island due to few trips). All descriptions of the data and the
numerical experiments have been updated to reflect this change. We hope that this
change will deliver more comprehensive analyses and lead to more convincing
findings for the entire NYC, as compared to our previous discussion that was
limited to Manhattan only.

2. We have changed the approach for calculating borough and citywide average
travel speed. In our previous manuscript, the speed was calculated as the
fleet-wise average (e.g., the summation of all distances of the identified activities
divided by the summation of all trip time of the identified activities). Since the
vehicles are not uniformly distributed across the city, such a calculation puts higher
weights on places with more vehicles and leads to a biased understanding of
the actual citywide speed. In this revised manuscript, we first calculate the speed
of each hexagon, and then we measure the citywide and the borough-wide speed
as the average of the hexagonal mean speed. This should more accurately reflect
the actual citywide traffic state. This change also applies to the calculation of
other speed and emission metrics.

3. A description of the identified activities is added in our manuscript and can be
found on page 11, Section 4.1 – Overview of identified activities. This section
provides the spatial distribution of the number of identified activities and the
summary statistics of the activities in 2017 and 2019. In addition, this section also
explains how we address the differences in the amount of data collected between
the two years.

4. We have updated the background facts of New York City and added a table to
summarize the changes between 2017 and 2019.

5. Additional discussions are added for the borough-wide changes between 2017 and
2019.

6. We have corrected several grammar and spelling errors and improved the writing
of our revised manuscript.
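
The difference between the two averaging schemes described in item 2 can be illustrated with a minimal sketch. The activity records, hexagon identifiers, and the choice to compute each hexagon's speed as total distance over total time are assumptions made for illustration only; they are not taken from the manuscript's data or code.

```python
from collections import defaultdict

# Hypothetical activity records: (hexagon_id, distance_km, duration_h).
# Three slow activities fall in a busy hexagon, one fast activity in a quiet one.
activities = [
    ("hex_a", 2.5, 0.25),
    ("hex_a", 2.5, 0.25),
    ("hex_a", 2.5, 0.25),
    ("hex_b", 5.0, 0.25),
]


def fleet_wise_speed(records):
    """Previous approach: total distance divided by total time over all activities."""
    total_distance = sum(distance for _, distance, _ in records)
    total_time = sum(duration for _, _, duration in records)
    return total_distance / total_time


def hexagon_mean_speed(records):
    """Revised approach: average of the per-hexagon speeds (equal weight per hexagon)."""
    distance = defaultdict(float)
    duration = defaultdict(float)
    for hex_id, d, t in records:
        distance[hex_id] += d
        duration[hex_id] += t
    # Assumption: a hexagon's speed is its total distance over its total time.
    speeds = [distance[h] / duration[h] for h in distance]
    return sum(speeds) / len(speeds)


print(fleet_wise_speed(activities), hexagon_mean_speed(activities))
```

In this toy case the fleet-wise value is pulled toward the hexagon with more vehicles, whereas the hexagonal mean weights both locations equally.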

We hope the changes made in this revised manuscript will sufficiently address the
concerns from the reviewers.
Manuscript (without Author Details)

Impact of transportation network companies on urban congestion:


Evidence from large-scale trajectory data

Abstract

We collect vehicle trajectory data from major transportation network companies (TNCs) in New York
City (NYC) in 2017 and 2019, and we use the trajectory data to understand how the growth of TNCs
has impacted traffic congestion and emissions in urban areas. By mining the large-scale trajectory data
and conducting a case study in NYC, we confirm that the rise of TNCs is the major contributing factor
that makes urban traffic congestion worse. From 2017 to 2019, the number of for-hire vehicles (FHV)
has increased by over 48% and served 90% more daily trips. This results in an average citywide speed
reduction of 22.5% on weekdays, and the average speed in Manhattan decreased from 11.76 km/h in
April 2017 to 9.56 km/h in March 2019. The heavier traffic congestion may have led to 136% more NOx,
152% more CO and 157% more HC emissions per kilometer traveled by the FHV sector. Our results show
that the traffic condition is consistently worse across the different times of day and at different locations
in NYC. And we build the connection between the number of available FHVs and the reduction in travel
speed between the two years of data and explain how the rise of TNC may impact traffic congestion
in terms of moving speed and congestion time. The findings in our study provide valuable insights for
different stakeholders and decision-makers in framing regulation and operation policies towards more
effective and sustainable urban mobility.
Keywords: Transportation network companies, Trajectory data, Urban traffic congestion, Emission

1 1. Introduction

2 Transportation network companies (TNCs), which connect travelers with drivers through app-based
3 platforms, have expanded rapidly in recent years. Based on a recent report, TNCs have more than
4 doubled the overall size of the for-hire ride services sector since 2012, making the for-hire vehicle (FHV)
5 sector a major provider of urban transportation services by the end of 2018 [1]. The popularity of TNCs
6 is the result of numerous advantages including improved convenience, higher flexibility, shorter waiting
7 time and lower trip fare as compared to traditional taxi services. However, the overgrowth of TNCs
8 also brings new concerns and challenges for urban traffic management. Although TNCs claim that they
9 help to reduce congestion, official reports and many studies have enumerated signs of road traffic getting
10 worse after the emergence of TNCs. It is reported that private-ride TNC services (Uber, Lyft) have
11 introduced an overall 180 percent more traffic to urban road networks and added billions of vehicle miles
12 traveled (VMT) in the nation’s largest metro areas [1]. Another recent study also asserted that TNCs
13 are the biggest contributor to the growth of traffic congestion in San Francisco [2]. These studies
14 depict the big picture of the influence of the overgrowing TNCs on traffic congestion and their findings

Preprint submitted to Sustainable Cities and Society December 15, 2019


15 largely agree with the impression among the general public. However, understanding the precise impact
16 of TNCs on urban traffic is intrinsically difficult, as the change of traffic condition can be the result of
17 the compounding effect of many other factors including population, employment, and change of road
18 network capacity, let alone the rise of TNCs. And TNCs rarely release data that are of sufficient
19 spatial resolution and temporal coverage to allow for tracing their service and evaluating their impacts.
20 Despite its difficulties, understanding the effects of TNCs on traffic conditions has become an in-
21 creasingly important topic for transportation planners and policymakers especially in large cities. Our
22 interpretation of the TNC effects will be directly reflected in the way we regulate TNCs and how we may
23 integrate them into the existing transportation system [3]. And our decisions and policies will largely
24 affect the mobility needs of millions of urban travelers and even the livelihoods of hundreds of thousands
25 of TNC drivers. A previous study suggested that TNCs have the potential for reducing road traffic
26 by replacing individual trips with ride-sharing services [4]. But recent research indicated that rapidly
27 increasing TNCs have a negative effect on traffic conditions by attracting transit riders [5]. In particular,
28 the influence of the entry of TNCs on congestion was assessed based on historical area-level panel data.
29 Erhardt et al. [2] studied the impact of TNCs on traffic congestion through a before-and-after evaluation
30 of the 2010 and 2016 traffic conditions. While they specifically took the change of population, employ-
31 ment and road network into consideration, their results may be largely affected by their counterfactual
32 case in 2016 which was projected from the 2010 baseline with no TNC trips using San Francisco’s travel
33 demand model.
34 In this study, we design a controlled experiment for gaining accurate insights on the impact of TNCs
35 on urban road traffic by scraping the data from TNC platforms in New York City (NYC) in 2017 and
36 2019. We limit our discussion to four major boroughs (Brooklyn, Bronx, Manhattan, and Queens) in
37 NYC and argue that the rise of TNCs is the foremost contributing factor to the statistically significant
38 changes, if any, of the road traffic condition based on the following facts:

39 1. We eliminate the impact of the population since NYC’s total population declined from 8.623 million
40 in 2017 to 8.399 million as of July 2018 [6].
41 2. We eliminate the impact due to employment changes as the labor force and employment are at
42 nearly identical levels in both years (4.13 million and 4.11 million) for NYC.
43 3. There are no major transportation projects reported in NYC since 2014 according to NYCDOT [7].
44 4. Registration of standard vehicles declined from 1,913,663 in 2017 to 1,912,468 in 2018 [8].
45 5. The number of TNC drivers increased from 58,900 in April 2017 to 87,600 in March 2019 (48.7%
46 more) [9].
47 6. The number of daily TNC trips increased from 393,918 in April 2017 to 769,729 in March 2019
48 (95.4% more) [9].
49 7. The number of medallion taxis remains the same but the number of daily trips decreased from
50 334,865 in April 2017 to 252,634 in March 2019 (24.5% fewer) [9].
51 8. Transit usage in NYC experienced a drop from 2017 to 2018. It is reported that daily weekday
52 subway ridership in NYC was 5.44 million in 2018, which declined by about 2.6% compared with

53 2017 (143,000 fewer riders per day). Weekday bus ridership in NYC also experienced a drop
54 of 5.9% from 2017 to 2018 (1.81 million).

Table 1: Background facts in NYC

Item                                        2017         2019
Population (million)                        8.623        8.399 (End of 2018)
Employment (million)                        4.13         4.11 (End of 2018)
Standard vehicles registration              1,913,663    1,912,468 (End of 2018)
Daily weekday subway ridership (million)    5.58         5.44 (End of 2018)
Daily weekday bus ridership (million)       1.92         1.81 (End of 2018)
Number of TNC drivers                       58,900       87,600
Number of daily TNC trips                   393,918      769,729
Number of daily taxi trips                  334,865      252,634
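
The relative changes quoted for the FHV and taxi sectors follow directly from the counts in Table 1. The short sketch below recomputes them; the function and variable names are illustrative, and the results may differ from the reported percentages in the last digit depending on rounding.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Counts taken from Table 1 (April 2017 vs. March 2019).
background = {
    "TNC drivers": (58_900, 87_600),
    "Daily TNC trips": (393_918, 769_729),
    "Daily taxi trips": (334_865, 252_634),
}

changes = {item: percent_change(*pair) for item, pair in background.items()}

for item, change in changes.items():
    print(f"{item}: {change:+.1f}%")
```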

55 These facts help to narrow the dominant contributing factor down to the rise of TNCs if we
56 observe any meaningful changes in road traffic conditions. To obtain the most precise understanding of
57 road traffic conditions, we have scraped one month of FHV trajectory data in April 2017 and one month
58 of FHV trajectory data in March 2019. And we use the trajectory data from Uber, the largest TNC
59 in NYC, for further analysis. The scraped trajectory data contain the GPS record of the online Uber
60 drivers every few seconds and can be used to visualize and quantify the spatiotemporal change of traffic
61 conditions. And the large amount of data we collected helps to obtain findings that are statistically
62 meaningful. We then classify the trajectory data into moving activities and stationary activities for
63 fine-level analysis of the time spent in congestion and speed during travel. We introduce macroscopic
64 energy models to further calculate the change in fuel consumption and emission during the two years.
65 Through comprehensive numerical experiments, we conclude that the increase of FHVs contributes to
66 significant speed reduction in NYC with a daily average drop of 22.5% on weekdays. As for Manhattan,
67 the average speed declines from 11.76 km/h to 9.56 km/h on weekdays and from 14.98 km/h to 13.51
68 km/h on weekends in less than two years. We report that the increased traffic congestion, along with the
69 growing number of TNC trips, has doubled the tailpipe emissions from the TNC sector since 2017.
70 The rest of the study is organized as follows. We briefly review related literature on trajectory
71 analysis in the next section. Section 3 introduces the main methods used in this study, including the
72 developed data collection method, the validation of data quality, activity identification from trajectory
73 data and energy and emission calculation. Section 4 presents comprehensive results and discussion on
74 understanding the FHVs’ impact. Finally, we summarize key findings and future directions in section 5.

75 2. Literature

76 With the rapid development of data collection methods and availability of traffic-related big data in
77 cities, estimating city-level fuel consumption using vehicle trajectory data has gained a lot of interest.

78 GPS trajectory data have been widely used to understand mobility patterns [10, 11] and travel behav-
79 ior [12], discover flexible routes [13] and monitor real-time traffic situation (visualize traffic jam) [14]
80 due to their advantages of large coverage, good continuity, low cost and rich information about vehicles’
81 movements. In recent years, GPS trajectory data were used for large-scale fuel consumption estimation
82 to provide a more accurate vision of national or regional level vehicular emissions. Shang et al. [15]
83 calculated the gas consumption and emissions using GPS trajectories generated by over 32,000 taxis in
84 Beijing over a period of two months based on the estimated travel speed of each road segment using a
85 context-aware matrix factorization approach. Du et al. [16] explored the fuel consumption pattern and
86 analyzed the temporal and spatial distribution characteristics of average fuel consumption in Beijing
87 using large samples of historical floating vehicle trajectory data, where a fuel consumption forecasting
88 model was established using the back-propagation neural network. Gately et al. [17] quantified the
89 emissions from traffic congestion and identified local hotspots with highly elevated annual emissions at
90 regional scales using a large database of hourly vehicle trajectory data CO2 from road vehicles on 280,000
91 road segments in eastern Massachusetts. Luo et al. [18] analyzed the energy consumption and emissions
92 and their spatial-temporal distribution in Shanghai using GPS trajectory data obtained from taxis.
93 Vehicular emission models can be summarized as two types: macroscopic models [19, 20] and micro-
94 scopic models [21, 22], which focus on different aspects of vehicle emissions calculations and analysis.
95 For large-scale fuel consumption estimation, macroscopic models are usually used where emissions fac-
96 tors are modeled as functions of the average speed of vehicles [23]. However, these estimations do not
97 consider different driving modes or driving patterns which have been proved to have an obvious effect
98 on vehicle fuel consumption [24]. For example, engine start [25] or idling speed [26] will increase vehicle
99 exhaust emissions. Lack of consideration of these parameters may lead to erroneous estimations. For
100 large-scale emissions estimation, such erroneous estimations may result in a misunderstanding of the
101 overall traffic states and emission levels in the region. Since GPS trajectory data can reveal detailed
102 information about vehicle driving modes and traffic states, they provide the possibility of
103 identifying different driving activities that will influence vehicle fuel consumption [27, 28]. In this paper, a
104 two-step integrated emission estimation method [29, 30] that incorporates driving activities (considered
105 in microscopic models) into the COPERT model (macroscopic model) is adopted to provide more accurate
106 fuel consumption estimation of Manhattan using GPS trajectory data obtained from Uber. With this
107 method, driving activities of each vehicle are first specified as moving activities and stationary activities.
108 The COPERT model is then applied to calculate the emissions of all trajectories considering both types of
109 driving activities of each vehicle. The integrated estimation method ensures more accurate emissions
110 and fuel consumption estimation in a city-level scheme and at the same time provides a more detailed
111 sense of TNC’s influence on traffic conditions.
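
The structure of such a two-step estimation can be outlined with a minimal sketch. Everything numeric here is a placeholder: the stationary-speed threshold, the speed-dependent emission factor, and the idle emission rate are hypothetical values chosen for illustration and are not COPERT coefficients or values used in this study.

```python
from dataclasses import dataclass

# Hypothetical threshold separating stationary from moving activities.
STATIONARY_SPEED_KMH = 1.0
# Placeholder idle emission rate (grams per second); not a calibrated value.
IDLE_RATE_G_PER_S = 0.002


@dataclass
class Segment:
    distance_km: float
    duration_s: float  # assumed to be strictly positive


def speed_kmh(segment):
    return segment.distance_km / (segment.duration_s / 3600.0)


def classify(segment):
    """Step 1: label a segment as a moving or a stationary activity."""
    return "stationary" if speed_kmh(segment) < STATIONARY_SPEED_KMH else "moving"


def moving_factor_g_per_km(v_kmh):
    """Placeholder average-speed emission factor; NOT the COPERT formula."""
    return 0.5 + 5.0 / v_kmh


def total_emission_g(segments):
    """Step 2: apply a speed-based factor to moving segments and an idle rate otherwise."""
    total = 0.0
    for seg in segments:
        if classify(seg) == "stationary":
            total += IDLE_RATE_G_PER_S * seg.duration_s
        else:
            total += moving_factor_g_per_km(speed_kmh(seg)) * seg.distance_km
    return total


example = [Segment(1.0, 360.0), Segment(0.0, 100.0)]
print(total_emission_g(example))
```

The point of the sketch is only the separation of concerns: activity identification comes first, and the emission model is then applied per activity type.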

112 3. Method

113 3.1. Data Collection


114 To gain insights on the impact of FHVs on urban traffic, we develop a data crawler, which simulates
115 the ride requesting behavior on the mobile app, to fetch the trajectory data from major TNCs including

116 Uber and Lyft. Our data crawler sends the trip starting location as the pingClient message to TNCs’
117 mobile API and receives back the sequences of coordinates of eight closest online FHV drivers as well as
118 the surge price (SP) and estimated time of arrival (ETA). Online vehicles refer to those that are available
119 for picking up passengers and the vehicles will no longer be recognized if they start a trip or if they go
120 offline. The collected trajectories therefore capture the cruising behavior of FHVs. But different from
121 taxis where street hailing is permitted, FHVs cruise to the next pick up location assigned by the platform
122 and the data therefore well reflect the actual traffic condition. By placing a sufficient number of data
123 collection stations with proper spacing and collection frequency, we are able to collect abundant vehicle
124 trajectories to restore the citywide operation dynamics of FHVs. In this study, only the trajectory data
125 from Uber are used as it is the dominant TNC in NYC with approximately 70% market share [31].
126 Trajectory records collected include the information of timestamp (in Unix), latitude, longitude, driver
127 ID (only first 6 letters shown here), product ID (e.g. UberX and UberXL) and bearing. The sample of
128 collected trajectory records can be seen in Table 2.

Table 2: Sample data records

Product ID   Driver ID   Epoch            Bearing   Latitude   Longitude
2083         b97fed      1491760511750    344       40.67387   -73.80141
39           657dbb      1491753163395    209       40.77918   -73.95079
694          6b25cd      1491748277252    299       40.78273   -73.9495
4            73c3f4      1491732814910    191       40.71448   -74.01372
39           5f486       1491733990716    299       40.75755   -73.96903
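
A record in the layout of Table 2 can be parsed as sketched below. The class and function names are hypothetical, the whitespace-separated input format is an assumption, and the 13-digit epoch values are assumed to be Unix timestamps in milliseconds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class TrajectoryRecord:
    product_id: int
    driver_id: str
    epoch_ms: int  # assumed: Unix time in milliseconds
    bearing: int
    latitude: float
    longitude: float

    @property
    def timestamp_utc(self):
        """Convert the millisecond epoch into a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.epoch_ms / 1000, tz=timezone.utc)


def parse_record(line):
    """Parse one whitespace-separated row in the column order of Table 2."""
    product_id, driver_id, epoch_ms, bearing, lat, lon = line.split()
    return TrajectoryRecord(
        int(product_id), driver_id, int(epoch_ms), int(bearing), float(lat), float(lon)
    )


record = parse_record("2083 b97fed 1491760511750 344 40.67387 -73.80141")
print(record.timestamp_utc.isoformat())
```

Under the millisecond assumption, the first sample row falls on April 9, 2017 (UTC), which is within the 2017 collection window.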

129 We conduct citywide data collection in NYC and the data analyzed in this study were collected from
130 April 7 to May 3 (6 AM to 11 PM) in 2017 and February 7 to March 13 (24 hours) in 2019 from Uber
131 API. The data collection was performed at the frequency of 5 seconds for each data collection station
132 in 2017, and this frequency was set to 1 minute in 2019 due to the change of functional mechanism
133 of Uber API. As suggested in [32], Uber may dynamically alter the ID assigned to each driver and
134 the data collected therefore do not contain private information related to any individual drivers. And
135 the data collection stations only send pingClient messages to the Uber server for obtaining nearby vehicles’
136 trajectories without actually requesting a ride. Hence our data collection was conducted in an ethical
137 manner that neither compromised any driver or passenger private information nor sent real ride requests which
138 may disturb Uber operations.
139 We set the same data collection station configuration in 2017 and 2019 which consists of 470 stations.
140 The number and spatial placement of the stations are carefully calibrated to ensure sufficient coverage
141 of the actual operation dynamics. In the beginning, we randomly placed a set of data collection stations
142 spreading over the entire NYC area, with each location having two data collection stations, and sent
143 a pingClient message every 5 minutes for 12 consecutive hours. The test results suggested that over
144 99.99% of feedback messages between the two stations at the same location were exactly the same. And
145 we therefore assigned one data collection station per location. Another set of experiments was conducted

146 to identify appropriate spacing between two adjacent data collection stations. We used historical taxi
147 demand distributions to divide the whole study area into three sub-regions based on the trip demand
148 level. We varied the spacing from 100m to 1,500m between two adjacent stations in each sub-region
149 and deployed 9 neighboring stations in each region to measure data repetition among the 9 stations for
150 a 12-hour data collection. Finally, we chose the largest spacing that reached at least 40% repetition.
151 The resulting distribution of the data collection stations and the sampled spatial trajectory coverage are
shown in Figure 1.

Figure 1: Study area and the configuration of data collection stations. (a) Data collection
station. (b) Visualization of sampled Uber trajectory in 2019. (c) Comparison between inferred
number of trips vs TLC reported Uber trips [33], with series TLC Record, ∆d = 0.2, ∆d = 0.4,
∆d = 0.7, ∆d = 1, and ∆d = 0.2 with ∆t = 120.

The 470 data collection stations fetched around 100 GB of data per day in 2017 and 5.17 GB of data
per day in 2019. To validate the quality of the data, we infer the number of Uber trips from the
2017 data and compare it with the FHV trips reported by NYCTLC [34] for every 15-minute interval
across NYC. In particular, we track the trajectory of each unique driver ID and consider that a
trip has taken place if (1) the time gap (∆t ≥ 60) and the spatial displacement (∆d) between
consecutive records exceed certain thresholds or (2) the record is the last trajectory record
identified for the driver ID. The validation results with various distance and time thresholds are
presented in Figure 1c. Although the collected data by no means capture complete FHV operation
information, the inferred number of trips closely resembles the reported trip level, and its trend
closely follows the actual trip trend when the distance and time thresholds are chosen properly.
This demonstrates the quality of the data we collected and suggests that the data yield sufficient
coverage of actual FHV dynamics.
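
The trip-inference rule described above can be sketched as follows. This is only an illustration
under stated assumptions, not the procedure used to produce Figure 1c: the record layout
(driver ID, timestamp in seconds, latitude, longitude), the function names `haversine_km` and
`infer_trip_count`, the use of great-circle distance, and the units (∆t in seconds, ∆d in
kilometers) are all hypothetical choices.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (an assumed distance measure)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def infer_trip_count(records, dt_min=60.0, dd_min=0.2):
    """Count inferred trips from (driver_id, t_seconds, lat, lon) records.

    Rule (1): a trip is counted when both the time gap and the displacement
    between consecutive records of the same driver ID reach the thresholds.
    Rule (2): the last record of each driver ID counts as one trip.
    """
    by_driver = defaultdict(list)
    for driver_id, t, lat, lon in records:
        by_driver[driver_id].append((t, lat, lon))

    trips = 0
    for points in by_driver.values():
        points.sort()  # order each driver's records by time
        for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
            if t1 - t0 >= dt_min and haversine_km(la0, lo0, la1, lo1) >= dd_min:
                trips += 1
        trips += 1  # rule (2): last record for this driver ID
    return trips
```

Counting the inferred trips per 15-minute interval and comparing them with the reported FHV trips
would then follow the validation described in the text.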

Finally, we select the data from February 27 to March 12 in 2019 and from April 12 to April 25 in
2017 and compare the change of traffic states between the two years. These periods were selected
so that the dates are comparable and close in the calendar, given the availability and the quality
of the data we collected. Moreover, our case study focuses on Manhattan, the borough with the
heaviest congestion and the highest FHV trip level in NYC.

3.2. Activity identification

Based on the collected FHV trajectories, we next convert the trajectories into space-time path
segments (STPS) following the method proposed in Kan et al. [29]. The main purpose of constructing
STPS is to identify the different vehicle activities within a sequence of GPS records, so that the
trajectory speed can be estimated accurately and energy consumption and emission can be inferred.
In particular, we focus on separating stationary activities (SA) from moving activities (MA) in
the trajectories so that we can make the best use of the high-resolution trajectories to restore
the stop-and-go traffic states. Separating MA from SA allows us to distinguish the amount of time
that urban traffic is caught in gridlock from the velocity of the moving traffic. In addition,
engines operate differently in the idle state and when the vehicle is in motion. MA and SA
therefore yield a more accurate characterization of fuel consumption and emission for urban
traffic: MA can be used with emission models for vehicles in motion, and SA can be used with
emission models for the idle engine state, to comprehensively evaluate the actual emissions and
fuel consumption. Studies have shown that, compared to actual fuel consumption, this approach can
achieve over 88% accuracy with a macroscopic emission model [29] and over 94% accuracy with a
microscopic emission model [30].

To best identify trajectory activities, we first preprocess the data to remove consecutive
trajectory records whose time gap is shorter than 2 seconds or longer than 15 seconds. Removing
the short time intervals helps to mitigate GPS errors. Removing the long time intervals avoids
underestimating the number of SA, because intermediate SA within a long gap would otherwise be
consolidated and reflected as MA. The resulting time gaps between consecutive trajectory records
mostly lie between 4 seconds and 6 seconds. The preprocessing eliminates around 15% of the trip
records in the data. We then identify SA and MA based on the velocity (km/h) of the trip segment:

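$$V_{i,i+1} = \frac{\lVert \mathrm{coord}_{i+1} - \mathrm{coord}_{i} \rVert}{t_{i+1} - t_{i}} \tag{1}$$

where $\lVert \cdot \rVert$ measures the Euclidean distance between consecutive trajectory records
in kilometers. And we define the state of trajectory segment as:

$$S_{i,i+1} = \begin{cases} SA, & \text{if } V_{i,i+1} < 5 \\ MA, & \text{if } V_{i,i+1} \geq 5 \end{cases} \tag{2}$$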

The threshold of 5 km/h for separating SA and MA is selected to mitigate GPS errors that may lead
to the misclassification of the actual activities. We present two sample trajectories, their
constructed STPS, and the identified SA and MA in Figure 2. As seen in the figure, the velocity
threshold allows us to accurately identify the non-moving or nearly non-moving activities as SA
and the actual moving trajectories as MA. After activity identification, we observe over 4.8
million daily activities in the 2017 data and over 3.2 million daily activities in the 2019 data
between 7 AM and 11 PM. This large number of activities is sufficient to obtain statistically
meaningful results in the following sections.
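
The preprocessing filter and the classification in equations 1 and 2 can be sketched as below.
The record layout, the names `planar_distance_km` and `classify_segments`, and the
equirectangular projection used to obtain a Euclidean distance in kilometers are assumptions made
for illustration; the text does not state how coordinates are projected.

```python
import math

SA_MA_THRESHOLD_KMH = 5.0          # threshold in equation 2
MIN_GAP_S, MAX_GAP_S = 2.0, 15.0   # time gaps kept by the preprocessing step


def planar_distance_km(lat1, lon1, lat2, lon2):
    """Euclidean distance in km after an equirectangular projection (assumed)."""
    r = 6371.0  # mean Earth radius in km
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * r
    dy = math.radians(lat2 - lat1) * r
    return math.hypot(dx, dy)


def classify_segments(records):
    """Classify consecutive records of one driver ID into SA or MA segments.

    records: list of (t_seconds, lat, lon) sorted by time.
    Returns a list of (state, speed_kmh, duration_s, distance_km).
    """
    segments = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(records, records[1:]):
        gap = t1 - t0
        if gap < MIN_GAP_S or gap > MAX_GAP_S:
            continue  # drop pairs with too short or too long a time gap
        dist = planar_distance_km(la0, lo0, la1, lo1)
        speed = dist / (gap / 3600.0)  # equation 1, in km/h
        state = "SA" if speed < SA_MA_THRESHOLD_KMH else "MA"  # equation 2
        segments.append((state, speed, gap, dist))
    return segments
```

Each returned tuple keeps the duration and distance of the segment, which are the quantities
needed later for speed, fuel and emission calculations.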

Figure 2: Example of collected FHV trajectory and the constructed STPS, plotted over Lat, Lon and
Time (s) with segments labeled MA or SA. (c) Sample trajectory of length 103 meters. (d) Sample
trajectory of length 149 meters.

3.3. Estimating fuel consumption and emission

Total vehicle emission is usually categorized into cold emission and hot emission. Hot emission is
the emission produced when the engine operates at a normal temperature, and cold emission is the
emission produced during transient thermal operation. In this study, we only consider hot emission
because we lack the data to classify cold-start activities and because hot emission usually
dominates the total emission for long trips. As reviewed in the earlier section, both MOVES and
COPERT are popular models for energy and emission calculation, and MOVES is specifically tailored
to emission standards in the US. Nevertheless, the MOVES model requires the calculation of vehicle
specific power, which needs second-by-second acceleration and engine specification data. This
requires trajectory interpolation and is better suited to long trajectories. Our data primarily
contain trajectories over short segments (as shown in Figure 2), and interpolation may result in
high estimation errors. As a consequence, we use the COPERT model for fuel and emission
calculation and assume that all vehicles fall under the Euro 3 standard with a capacity of
1.4-2.0 L. Note that not all Uber vehicles may comply with the Euro 3 standard, and no data are
available on the types of vehicles in the Uber fleet. In addition, the Euro 3 standard may not
fully match the US EPA standards, and hence the exact values calculated for emission and fuel
consumption should not be taken as accurate measures for NYC. Nevertheless, a change of standard
only affects the model parameters, not the relationship between velocity and the corresponding
fuel consumption and emission, so the obtained results still capture the relative change between
2017 and 2019.

Based on the aforementioned specifications, for MA, fuel consumption (denoted as FC (g/km)) can be
calculated from the trajectory segment speed V (km/h) as:

$$FC_{MA} = \frac{217 + 0.253V + 0.00965V^{2}}{1 + 0.096V - 0.000421V^{2}} \tag{3}$$

As for the SA state, we estimate the fuel consumption based on the vehicle idle time T [35] as:

$$FC_{SA} = 0.361\,\text{mL/s} \times 0.75\,\text{g/mL} \times T = 0.27\,\text{g/s} \times T \tag{4}$$

where the density of gasoline is taken as 0.75 g/mL.
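
Equations 3 and 4 translate directly into code. The sketch below is illustrative: the function
names are hypothetical, and the way MA and SA fuel are combined in `trajectory_fuel_g` (MA rate
multiplied by segment distance, plus idle fuel for SA duration) is an assumption, since the text
does not spell out the aggregation step.

```python
IDLE_FUEL_G_PER_S = 0.361 * 0.75  # mL/s times g/mL, as in equation 4


def fuel_ma_g_per_km(speed_kmh):
    """Fuel consumption of a moving segment in g/km (equation 3)."""
    v = speed_kmh
    return (217 + 0.253 * v + 0.00965 * v ** 2) / (1 + 0.096 * v - 0.000421 * v ** 2)


def fuel_sa_g(idle_s):
    """Fuel consumed while stationary for idle_s seconds, in grams (equation 4)."""
    return IDLE_FUEL_G_PER_S * idle_s


def trajectory_fuel_g(segments):
    """Total fuel in grams for (state, speed_kmh, duration_s, distance_km) segments.

    Assumed aggregation: MA fuel is the g/km rate times the segment distance,
    SA fuel is the idle rate times the segment duration.
    """
    total = 0.0
    for state, speed, duration_s, distance_km in segments:
        if state == "MA":
            total += fuel_ma_g_per_km(speed) * distance_km
        else:
            total += fuel_sa_g(duration_s)
    return total
```

At a segment speed of 10 km/h, equation 3 gives roughly 115 g/km, which is of the same order as
the fuel consumption values plotted in Figure 4c.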


As for hot emission, following the Tier 3 method of the COPERT model [36], the emission factor
(EF (g/km)) during the MA state is speed-dependent:

$$EF_{MA} = \frac{aV^{2} + bV + c + \frac{d}{V}}{eV^{2} + fV + g}\,(1 - RF) \tag{5}$$

where RF is the reduction factor. The corresponding parameters for measuring EF of CO, HC and
NOx are presented as follows:

Table 3: Emission parameters for small vehicles in the COPERT model

Pollutant  a        b         c        d  e         f         g
CO         0        11.4      71.7     0  -0.248    35.4      1
NOx        6.53e-6  -1.49e-3  9.29e-2  0  3.97e-5   -1.22e-2  1
HC         1.2e-5   -1.1e-3   5.57e-2  0  -1.88e-4  3.65e-2   1

Finally, the calculation of EF under the SA state takes the following form:

$$EF_{SA} = \alpha \times T \tag{6}$$

where T is the idle time and the parameter α (mg/s) for CO, NOx and HC are 13.889, 0.556 and
2.222 respectively [35].
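
A minimal sketch of equations 5 and 6 with the Table 3 parameters is given below. The names
`COPERT_PARAMS`, `ef_ma_g_per_km` and `ef_sa_mg` are hypothetical, and the default reduction
factor RF = 0 is an assumption because the text does not report the value used.

```python
# Table 3 parameters (a, b, c, d, e, f, g) for each pollutant
COPERT_PARAMS = {
    "CO": (0.0, 11.4, 71.7, 0.0, -0.248, 35.4, 1.0),
    "NOx": (6.53e-6, -1.49e-3, 9.29e-2, 0.0, 3.97e-5, -1.22e-2, 1.0),
    "HC": (1.2e-5, -1.1e-3, 5.57e-2, 0.0, -1.88e-4, 3.65e-2, 1.0),
}

# Idle emission rates alpha in mg/s (equation 6)
IDLE_ALPHA_MG_PER_S = {"CO": 13.889, "NOx": 0.556, "HC": 2.222}


def ef_ma_g_per_km(pollutant, speed_kmh, rf=0.0):
    """Hot emission factor of a moving segment in g/km (equation 5).

    rf is the reduction factor; rf=0.0 is an assumed default.
    """
    if speed_kmh <= 0:
        raise ValueError("speed must be positive for the d/V term")
    a, b, c, d, e, f, g = COPERT_PARAMS[pollutant]
    v = speed_kmh
    return (a * v * v + b * v + c + d / v) / (e * v * v + f * v + g) * (1 - rf)


def ef_sa_mg(pollutant, idle_s):
    """Emission in mg while stationary for idle_s seconds (equation 6)."""
    return IDLE_ALPHA_MG_PER_S[pollutant] * idle_s
```

Because d is zero for all three pollutants in Table 3, the d/V term vanishes here; it is kept so
that the function mirrors the general form of equation 5.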

3.4. FHV as probe vehicles

Based on the previous discussions, the large-scale trajectory data allow us to use FHVs as probe
vehicles for characterizing the traffic condition in Manhattan. Since Uber has a large fleet of
vehicles roaming around NYC, the performance metrics calculated from Uber vehicles serve as a
close approximation of the actual metrics of all vehicles on the road. If we consider $P$ as the
complete trajectory data generated by the entire Uber fleet, then our collected data
$P_C \subset P$ can be viewed as a sub-population randomly drawn from $P$. As a consequence, the
average performance metrics calculated from our collected data are sample means of the entire
population, and by the law of large numbers they will be close to the corresponding expected
values in $P$. This suggests that the traffic condition mined from our data can well represent
the actual traffic condition of the road network.

In this study, we are primarily interested in the spatiotemporal velocity metrics and the
corresponding energy and emission level. In particular, we propose to measure the following
velocity metrics:

$$V_{i,t} = \frac{\sum_k D_k^{MA}}{\sum_k \left(T_k^{MA} + T_k^{SA}\right)}, \quad \text{if activity } k \text{ is at location } i \text{ within time } t \tag{7}$$

$$V_{i,t}^{MA} = \frac{\sum_k D_k^{MA}}{\sum_k T_k^{MA}}, \quad \text{if activity } k \text{ is at location } i \text{ within time } t \tag{8}$$

$$R_{i,t}^{SA} = \frac{\sum_k T_k^{SA}}{\sum_k \left(T_k^{MA} + T_k^{SA}\right)}, \quad \text{if activity } k \text{ is at location } i \text{ within time } t \tag{9}$$

where $V_{i,t}$, $V_{i,t}^{MA}$ and $R_{i,t}^{SA}$ represent the mean velocity, the mean MA speed
(speed when the vehicle is in motion) and the mean SA time (proportion of time spent in stationary
traffic congestion), respectively.
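
The three metrics in equations 7 to 9 can be computed per location and time interval as sketched
below. The activity tuple layout and the name `zone_metrics` are assumptions for illustration;
distances are taken in kilometers and durations in seconds, so speeds come out in km/h.

```python
from collections import defaultdict


def zone_metrics(activities):
    """Compute (V, V_MA, R_SA) for every (zone, time_bin) pair.

    activities: iterable of (zone, time_bin, state, distance_km, duration_s),
    where state is "MA" or "SA".
    """
    # Accumulators per cell: [sum of D_MA in km, sum of T_MA in s, sum of T_SA in s]
    acc = defaultdict(lambda: [0.0, 0.0, 0.0])
    for zone, time_bin, state, distance_km, duration_s in activities:
        cell = acc[(zone, time_bin)]
        if state == "MA":
            cell[0] += distance_km
            cell[1] += duration_s
        else:
            cell[2] += duration_s

    metrics = {}
    for key, (d_ma, t_ma, t_sa) in acc.items():
        total_s = t_ma + t_sa
        if total_s <= 0:
            continue  # no usable duration in this cell
        v = d_ma / (total_s / 3600.0)                        # equation 7
        v_ma = d_ma / (t_ma / 3600.0) if t_ma > 0 else None  # equation 8
        r_sa = t_sa / total_s                                # equation 9
        metrics[key] = (v, v_ma, r_sa)
    return metrics
```

Note that `v_ma` is returned as `None` for a cell with no moving time, a design choice made here
to avoid division by zero rather than something specified in the text.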

4. Results

4.1. Overview of identified activities

Figure 3: Distribution of number of identified activities in 2017 and 2019. (a) Spatial
distribution of identified activities. (b) Histogram for zonal activity level (number of zones
against number of activities). (c) MA and SA speed distribution (proportion against speed in
km/h).

As mentioned earlier, the change in data collection frequency leads to a significant difference in
the amount of data collected in 2017 and 2019. To overcome this issue and deliver a fair
comparison, we sample from the 2017 data and include 30% of the total identified activities. This
results in a similar number of total activities identified in 2017 and 2019, as shown in
Figure 3a. Over 123.8 million activities are identified during our study period from the 2017
data and the corresponding value is 117.6 million for 2019, which suggests a similar level of
identified activities between the two years. These activities cover the entire study area, and the
spatial distribution of the identified activities also reveals the expansion of Uber's service
coverage from 2017 to 2019. In particular, 70% of the zones in 2017 and 74.5% of the zones in 2019
have more than 10,000 identified activities (see Figure 3b), representing over 150 activities for
each 15-minute interval at each location. This vast amount of activities delivers superior
spatiotemporal coverage and ensures that the obtained results are statistically meaningful.
Finally, Figure 3c presents the validity of the SA and MA identified based on equation 2, where we
also measure the speed of SA from the spatial displacement and time gaps. Over 90% of the
identified SA have a speed lower than 1 km/h, and a small fraction of SA have a speed lower than
3 km/h, which we suspect to be caused by GPS errors. On the other hand, MA is clearly separated
from SA based on the speed metric, and the differences between the 2017 and 2019 data can be
readily seen in the distributions of the corresponding MA speed. We next present detailed analyses
of the changes in traffic condition and emission based on the identified activities.
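
The 30% sampling of the 2017 activities can be illustrated with a short sketch. The text does not
state the sampling scheme, so uniform random sampling without replacement, the fixed seed, and the
name `subsample` are all assumptions.

```python
import random


def subsample(activities, fraction=0.30, seed=0):
    """Return a uniform random sample (without replacement) of the activities.

    activities must be a sequence such as a list; the seed is fixed only to
    make the illustration reproducible.
    """
    rng = random.Random(seed)
    k = round(len(activities) * fraction)
    return rng.sample(activities, k)
```

Sampling at the activity level keeps the spatial and temporal mix of the original data in
expectation, which is the property the comparison between the two years relies on.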

4.2. Overall change in traffic condition

Figure 4: Average borough-wide performance during weekdays for Queens, Bronx, Brooklyn, Manhattan
and all boroughs, in 2017 and 2019. (a) Average Speed (km/h). (b) Average MA Speed (km/h).
(c) Fuel Consumption (g/km). (d) CO emission (g/km). (e) HC emission (g/km). (f) NOx emission
(g/km).

We first compare the daily average speed, energy consumption and emission across the study area;
the results are shown in Figure 4. One immediate observation is the deterioration of citywide
traffic performance from 2017 to 2019, and this observation is consistent across the four major
boroughs in our study area. The citywide average daily speed fell from 13.08 km/h in 2017 to
10.13 km/h in 2019, a significant drop of 22.6%. Among the four boroughs, Manhattan and Brooklyn
have the worst traffic condition, with an average speed reduction of around 19%. Meanwhile, we see
a notable increase in energy consumption and emission due to the worse borough-wide traffic
condition. For each kilometer traveled, vehicles in NYC now consume 21 grams more gasoline and
emit 1 more gram of CO, 0.15 more grams of HC and 0.04 more grams of NOx on average compared to
2017. If we assume that the metrics calculated from cruising FHVs also apply to the full FHV
sector, and project these values onto the increase of FHV trips while assuming the same average
distance per trip, FHVs have introduced 152% more CO, 157.7% more HC and 136.5% more NOx for
every kilometer they traveled.
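
As a quick arithmetic check of the speed reductions quoted above, using only the citywide values
in this paragraph and the Manhattan weekday speeds of 11.76 km/h and 9.56 km/h reported in this
study:

$$\frac{13.08 - 10.13}{13.08} = \frac{2.95}{13.08} \approx 0.226 \;\; (\text{citywide}), \qquad \frac{11.76 - 9.56}{11.76} = \frac{2.20}{11.76} \approx 0.187 \;\; (\text{Manhattan})$$

The second value is consistent with the reduction of around 19% stated for Manhattan.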

Figure 5: Relationship between standardized SA time and standardized MA speed at 9 AM and 2 PM in
2017 and 2019. The percentage of areas that exceed the predefined threshold is shown in brackets.
(a) Manhattan (9 AM: 2.2%, 2 PM: 4.7%). (b) Brooklyn (9 AM: 6.3%, 2 PM: 7.6%). (c) Queens
(9 AM: 9.1%, 2 PM: 13.3%). (d) Bronx (9 AM: 3.5%, 2 PM: 9.4%).

There is, however, one particular drawback of using cruising trajectory data to probe citywide
traffic conditions. FHV drivers may choose to park by the side of the street and wait for future
orders from the platform. This may lead to an overestimation of the actual number of SA on the
road, so the calculated average zonal speed constitutes a lower bound of the actual travel speed.
Inspecting the relationship between $V_{i,t}^{MA}$ and $R_{i,t}^{SA}$ gives additional insight
into this issue. Specifically, when there is heavy congestion in certain areas, we should observe
low MA speed and high SA time, which captures the stop-and-go traffic pattern during congestion
and gridlock. Similarly, when traffic is light, the trajectory data should reveal high MA speed
and low SA time. However, when Uber drivers choose to park and wait rather than cruise, we are
likely to encounter anomalies with both high MA speed and high SA time. This motivates us to
measure the percentage of observations that fall into this abnormal state and thereby reveal the
park-and-wait behavior of FHV drivers. By inspecting the 2017 and 2019 data, we find that the 75th
percentile of MA speed across the data is approximately 25 km/h and that of the SA time is around
0.7. We then set the thresholds $\bar{V}^{MA} = 25$ and $\bar{R}^{SA} = 0.7$ and measure the
proportion of areas in each borough with both MA speed and SA time higher than the thresholds.
The results are shown in Figure 5.
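
The park-and-wait screening described above can be sketched as follows. The function names are
hypothetical, and the quartile helper relies on the default interpolation method of Python's
`statistics.quantiles`, which may differ from the percentile definition used in the study.

```python
import statistics


def upper_quartile(values):
    """75th percentile of a sample (needs at least two values)."""
    return statistics.quantiles(values, n=4)[2]


def park_and_wait_share(zones, v_ma_threshold=25.0, r_sa_threshold=0.7):
    """Share of areas with both MA speed and SA time above the thresholds.

    zones: iterable of (v_ma_kmh, r_sa) pairs, one per area.
    """
    zones = list(zones)
    if not zones:
        return 0.0
    flagged = sum(
        1 for v_ma, r_sa in zones
        if v_ma > v_ma_threshold and r_sa > r_sa_threshold
    )
    return flagged / len(zones)
```

In this sketch the thresholds default to the values of 25 km/h and 0.7 quoted in the text;
`upper_quartile` shows one way such thresholds could be derived from the pooled observations.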

We report that during the morning peak hour (9 AM) all boroughs present traces of the
park-and-wait behavior. Manhattan has the lowest value of 2.2%, while Queens has the highest
percentage of park-and-wait observations (9.1%), followed by Brooklyn (6.3%) and the Bronx (3.5%).
The percentage of park-and-wait observations increases in all boroughs during the off-peak time
(2 PM): Queens still has the highest value of 13.3%, the Bronx jumps drastically to 9.4%, and
Brooklyn reaches 7.6%. These findings explain why the estimated speed in Brooklyn is lower than in
Manhattan even though Manhattan has the highest number of FHVs and passenger demand. In this
situation, the cruising trajectory data may slightly underestimate the average speed and
overestimate the actual energy consumption and emission, and the change in MA speed serves as a
more accurate metric for comparing the change of traffic conditions across different boroughs. On
the other hand, the results also indicate that the obtained traffic condition and emission are
relatively accurate in Manhattan, where there are few park-and-wait observations. Moreover, the
measured average speed of 11.76 km/h in 2017 is well aligned with the speed of 11.2 km/h reported
in the NYC mobility report [37]. We next zoom into Manhattan and discuss how traffic conditions
and emission change over time.

We summarize the time-varying performance metrics for both weekdays and weekends in Manhattan in
Figures 6 and 7. To measure the changes in traffic conditions, we plot the average speed, average
MA speed, and average SA time from the 2017 and 2019 data; the corresponding changes between the
two years are visualized by the shaded area. While the average weekday speed in Manhattan
decreased from 11.76 km/h to 9.56 km/h (Figure 4), this reduction can be further decomposed into
two parts. On the one hand, more FHVs result in slower moving speed, with a reduction of 9.2% in
average MA speed in Manhattan. On the other hand, there is also an increase in average SA time of
7.04%. These provide strong evidence that the traffic condition is worse in 2019 than in 2017, and
this observation is consistent across different times of the day. On weekends, the mean speed is
14.98 km/h in 2017 and 13.51 km/h in 2019, a reduction of 9.8%. During weekdays, the peak hours
(especially the morning peak during 7-9 AM) have the worst traffic condition and also suffer the
greatest decline in average speed and MA speed, while the changes during off-peak hours are
relatively minor. On weekends, notable changes in mean speed mainly take place during off-peak
hours (7 AM to 12 PM) and during the nighttime period (7 PM to 10 PM). The decreases in mean speed
during weekdays and weekends correspond to the largest drop in travel speed in Manhattan since
2015 [37] and eventually lead to higher fuel consumption and more tailpipe emission across
different times of the day. In Manhattan, during weekdays, vehicles consume 10.0% more gasoline
and emit 12.0% more NOx, 16.1% more CO and 18.6% more HC for each kilometer traveled in 2019
compared to 2017. These results highlight the critical traffic congestion issues related to the
rise of TNCs in NYC, and possibly around the world: the fast expansion of TNCs quickly saturates
the road network, which increases fuel consumption and vehicle emission for all road traffic, with
an even more significant addition from the compound effect of increasingly worse traffic
conditions and more FHV trips.

Figure 6: Change of city-wide metrics on weekdays between 2017 and 2019. Panels: V (mean speed,
km/h), V^MA (mean MA speed, km/h), R^SA (mean SA rate), fuel consumption (g/km), NOx emission
(g/km), CO emission (g/km) and HC emission (g/km). Each panel shows the 2017 and 2019 values by
time of day (7 to 21) and the change of 2019 to 2017 (%).

Figure 7: Change of city-wide metrics on weekend between 2017 and 2019. Panels: V (mean speed,
km/h), V^MA (mean MA speed, km/h), R^SA (mean SA rate), fuel consumption (g/km), NOx emission
(g/km), CO emission (g/km) and HC emission (g/km). Each panel shows the 2017 and 2019 values by
time of day (7 to 21) and the change of 2019 to 2017 (%).

4.3. Spatial impact

Figure 8: Spatial distribution of mean speed and SA ratio on weekdays at 8 AM, 11 AM, 2 PM, 6 PM
and 10 PM. (a) Mean MA speed distribution in 2019 on weekdays. (b) Change of mean MA speed between
2017 and 2019 on weekdays. (c) Mean SA ratio distribution in 2019 on weekdays. (d) Change of mean
SA ratio between 2017 and 2019 on weekdays.

We next present the spatial distributions of the mean MA speed and SA time in 2019 and their
changes compared to 2017; the weekday results are shown in Figure 8. Based on Figure 8a, we can
clearly distinguish between the traffic peak hours (8 AM and 6 PM) and off-peak periods. In
particular, heavy traffic congestion in Manhattan persists throughout the daytime, whereas the MA
speed in other boroughs differs notably between peak and off-peak hours. Location-wise, lower and
middle Manhattan, as well as the areas in other boroughs that are adjacent to Manhattan, suffer
the heaviest congestion, with an average MA speed below 20 km/h. With the rise of TNCs, we notice
a city-wide reduction of MA speed regardless of the time of day: 88.2%, 83.1%, 78.9%, 85.1% and
81.8% of all 1371 areas have slower travel speed at 8 AM, 11 AM, 2 PM, 6 PM and 10 PM,
respectively. On weekends, the corresponding values are 76.9%, 76.0%, 73.3%, 75.1% and 70.3%.
These numbers indicate that the traffic condition is more affected on weekdays than on weekends,
and that the congestion is worse during peak hours on both weekdays and weekends.

Note that with an excessive number of FHVs on the road in cruising mode, these drivers are likely
to behave differently from other commuting drivers. Specifically, they need to pay close attention
to their smartphones for incoming passenger orders, which distracts them from the road. They may
also have to frequently merge into or diverge from the slow traffic to pick up passengers or to
find temporary parking spots that save their cruising cost. These behaviors undoubtedly introduce
significant disturbance to the already slow traffic and add frequent stop-and-go activities and
shockwaves to the traffic flow. This can be confirmed from the spatial distribution of the SA
ratio, as shown in Figure 8c. Unlike the findings from the distributions of the MA speed, the
morning peak hour has a lower SA ratio, and the SA ratio is higher at times with more cruising
drivers (see the x-axis in Figure 9 for the number of identified cruising drivers). This is likely
due to lower demand levels and more cruising FHVs in off-peak hours, resulting in greater
disturbance to the traffic and more stop-and-go traffic. The average SA ratio is 0.503 at 8 AM,
and 0.542, 0.535, 0.532 and 0.508 at 11 AM, 2 PM, 6 PM and 10 PM, respectively. For every 10
minutes of driving in NYC, these numbers translate into over 5 minutes sitting in non-moving
traffic and highlight the huge amount of time wasted by the large number of daily travelers who
drive themselves or rely on taxis and FHVs. Compared to the 2017 scenario, 66.4%, 71.5%, 71.0%,
70.6% and 73.4% of the areas experience an increased SA ratio at 8 AM, 11 AM, 2 PM, 6 PM and
10 PM, respectively. Finally, following our previous discussion on the relationship between MA
speed and SA ratio, we can also visually identify the places with dominant park-and-wait behavior.
Such behavior is primarily found in peripheral areas of Brooklyn, the Bronx, and Queens with lower
trip intensity and less traffic, where it is easier for drivers to find a parking place and wait
for future orders from the ride-hailing platforms.
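
The shares of areas with slower MA speed or higher SA ratio reported above are simple
zone-by-zone comparisons between the two years. A minimal sketch is given below; the dictionary
layout and the name `share_of_zones_changed` are assumptions, and restricting the comparison to
areas observed in both years is a choice made here, not a detail stated in the text.

```python
def share_of_zones_changed(values_2017, values_2019, direction="decrease"):
    """Share of areas whose metric decreased (or increased) from 2017 to 2019.

    values_2017, values_2019: dicts mapping an area ID to a metric value for
    one time of day, for example mean MA speed or SA ratio.
    Only areas present in both years are compared (an assumption).
    """
    common = values_2017.keys() & values_2019.keys()
    if not common:
        return 0.0
    if direction == "decrease":
        changed = sum(1 for z in common if values_2019[z] < values_2017[z])
    else:
        changed = sum(1 for z in common if values_2019[z] > values_2017[z])
    return changed / len(common)
```

Calling it with `direction="decrease"` on MA speed and with `direction="increase"` on SA ratio
would correspond to the two kinds of percentages listed in this section.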

4.4. Active drivers and speed change

Finally, we connect the change of vehicle speed to the number of available Uber drivers on the
road during weekdays. In particular, we measure the active Uber drivers as the number of unique
driver IDs identified across the study area in every 15-minute interval. We then group the daily
observations into similar time intervals to increase the number of individual observations for
the comparison between 2017 and 2019. For each year, we use the observations from 8 consecutive
time intervals (every 2 hours) on each of the 10 weekdays. This gives 80 observations for each
2-hour period in both 2017 and 2019, and each observation contains the number of active Uber
drivers and the average speed in NYC. The relationship between active Uber drivers and the speed
at different times of the day is presented in Figure 9. One immediate finding is that the traffic
conditions in 2017 and 2019 are in entirely different states: the differences between the two
years are distinguishable even if we visualize the observations of all time periods in one plot
(7:00-23:00), and the two years of data are also linearly separable in each of the individual time
periods. We calculate the Pearson correlation coefficient between the number of active drivers and
the average speed for each two-hour period, and the resulting coefficients are -0.82, -0.91,
-0.92, -0.89, -0.87, -0.81, -0.89 and -0.94, respectively. All these values are close to -1 and
suggest a significant negative correlation between the number of FHV drivers and the average
travel speed in Manhattan. While correlation does not necessarily imply causation, if we can
eliminate the impact of other contributing factors such as those shown in Table 1, the strong
negative correlation hints that the increase in FHV drivers is likely the primary contributing
factor to the worse citywide traffic congestion and emission.

Figure 9: City-wide average speed (km/h) against the number of active Uber drivers across
different times of the day, for 2017 and 2019, with the fitted slope and R² of each panel:

Time period   Slope       R²
7:00-23:59    -1.35E-03   0.53
7:00-9:00     -1.90E-03   0.67
9:00-11:00    -1.41E-03   0.83
11:00-13:00   -1.24E-03   0.85
13:00-15:00   -1.24E-03   0.80
15:00-17:00   -1.26E-03   0.76
17:00-19:00   -1.28E-03   0.66
19:00-21:00   -1.36E-03   0.80
21:00-23:00   -1.90E-03   0.88
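
The driver counting and the correlation step described above can be sketched as follows. The
record layout and the names `active_drivers_per_interval` and `pearson_r` are assumptions; the
correlation is written out by hand so the example depends only on the standard library.

```python
import math
from collections import defaultdict


def active_drivers_per_interval(records, interval_s=900):
    """Count unique driver IDs in each 15-minute (900 s) interval.

    records: iterable of (driver_id, t_seconds). Returns {interval_index: count}.
    """
    seen = defaultdict(set)
    for driver_id, t in records:
        seen[int(t // interval_s)].add(driver_id)
    return {index: len(ids) for index, ids in seen.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Pairing the driver count of each interval with the average speed of the same interval, and
grouping the pairs by two-hour period, gives the inputs for `pearson_r`.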
385 Note that the active Uber drivers we identified from the data may only serve as a proxy of the total
386 number of Uber drivers in service. During the time with high passenger demand (e.g. 7:00-9:00, 17:00-
387 19:00), we may identify fewer drivers than that during off-peak hours. This is because our data capture
388 cruising Uber drivers and the drivers are less likely in cruising state when there is more passenger
389 demand than the vehicle supply. Nevertheless, the results are still valid as we compare the same time
390 of the day in two different years. And the number of identified drivers as shown by the x-axis also echo

18
391 our finding on the impact of excessive drivers in cruising on travel speed and SA ratio (Figure 8). As
392 the observation is based on the consistent results across all time periods over multiple days of data for
393 two years, this confirms that the increase of FHV trips and TNC drivers are one significant contributing
394 factor to urban traffic congestion. If we fit a simple linear regression model over the data, as shown
395 by the lines in Figure 9, the number of active Uber drivers alone may explain up to 88% (21:00-23:00
396 with R2 = 0.88) of the variability for the reduction of travel speed and such linear relationship may
397 well fit the observations for most of the time periods. Based on the coefficients of active Uber drivers,
398 we further notice that the impact of the number of FHVs on travel speed can be categorized into two
399 cases depending on the number of cruising drivers in the city. In the first case (7:00-9:00, 21:00-23:00),
400 the impact of this is reflected by both the reduction in MA speed and the increase in SA time. For the
401 second state (9:00-21:00), the impact of FHV vehicles is primarily reflected by the increase in SA ratio,
402 where the park-and-wait behavior from excessive cruising drivers as well as the disturbance to normal
403 traffic from more number of FHV trips together lead to the worse traffic congestion and emission. In
404 general, for the first state, the increase in FHV vehicles has a greater impact on the speed reduction
405 than the second state with the fitted coefficients being 50% higher.
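
The correlation and regression step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the driver counts and speeds are made-up values for a single two-hour period, and `pearson_r` and `fit_line` are hypothetical helper names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 equals the squared Pearson coefficient.
    r_squared = pearson_r(x, y) ** 2
    return slope, intercept, r_squared


# Hypothetical observations (NOT from the paper): active drivers vs. speed in km/h.
drivers = [3000, 4000, 5000, 6000, 7000]
speed_kmh = [24.5, 22.0, 20.5, 18.0, 16.5]

r = pearson_r(drivers, speed_kmh)
slope, intercept, r2 = fit_line(drivers, speed_kmh)
print(f"r = {r:.2f}, slope = {slope:.2E} km/h per driver, R^2 = {r2:.2f}")
```

In this reading, the slope is the change in average speed (km/h) per additional active driver, which is how the per-panel slopes in Figure 9 can be compared across time periods.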

406 5. Conclusion

407 In this study, we collect and mine large-scale FHV data and provide a comprehensive understanding of
408 how the rise of TNCs impacts traffic congestion and emissions in urban areas. We choose Manhattan
409 in NYC as the study area and conduct analyses of the trajectory data in 2017 and 2019. We classify
410 stationary and moving activities from the trajectory data and calculate the mean speed, energy
411 consumption and fuel consumption based on the classified MA and SA. Our results suggest that the increase
412 of FHV trips in NYC has resulted in an average citywide speed reduction of 22.5% on weekdays and that the
413 average speed in Manhattan has decreased from 11.76 km/h in April 2017 to 9.56 km/h in March 2019.
414 Considering the increase of FHV trips over the two years, our results confirm that the increase of
415 TNC vehicles is one of the major contributing factors to the increase in traffic congestion. We also
416 articulate two different ways, depending on the overall congestion level, in which the increase in FHVs
417 may affect traffic congestion, with different magnitudes of speed reduction.
418 As a major byproduct of the worse traffic conditions, our results highlight emerging energy
419 consumption and emissions issues from the TNC sector. We have shown in our study that the increase in FHVs
420 and the number of trips has led to 136% more NOx, 152% more CO and 157% more HC emissions per
421 kilometer traveled by the FHV sector within two years. This finding is obtained under a conservative
422 assumption where the duration and distance of each passenger trip stay the same. In reality, however,
423 the revealed decrease in MA speed and increase in SA ratio are indicative of longer trip durations as
424 well as longer cruising time before an FHV reaches the next passenger. We may speculate from this
425 observation that the actual contribution of the TNC sector could be much higher than the values reported
426 in this study. In this regard, immediate actions should be taken against the overgrowth of TNCs
427 in urban areas. Based on our results, there are two practical directions that may help to mitigate the
428 energy and emission issues. First, as a short-term measure, the entry of FHVs into heavily congested areas
429 should be strictly regulated. We have shown that more FHVs contribute not only to slower moving speeds
430 but also to more congestion and stop-and-go traffic. The latter is the primary source of tailpipe emissions,
431 and regulating FHV service in congested areas helps to avoid the additive effect of more traffic and worse
432 emissions per individual vehicle. Second, and more importantly, considering the large number of trips served by the
433 TNC sector, policies should be framed to encourage and facilitate the adoption of alternative fuel vehicles
434 in the ride-hailing industry, which can achieve significant long-term savings in energy and emission
435 costs.
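
As a reading aid for the headline numbers above, the short sketch below converts them into relative changes and multipliers. It is plain arithmetic on the reported values; the helper names `percent_change` and `multiplier_from_increase` are hypothetical. Note that the Manhattan speeds (11.76 km/h to 9.56 km/h) correspond to a drop of roughly 18.7%, which is a separate quantity from the 22.5% citywide weekday figure.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent (negative = decrease)."""
    return (after - before) / before * 100.0


def multiplier_from_increase(percent_more):
    """Convert 'X% more' into a multiplicative factor, e.g. 100% more -> 2.0."""
    return 1.0 + percent_more / 100.0


# Average Manhattan speed reported for April 2017 and March 2019 (km/h).
manhattan_change = percent_change(11.76, 9.56)

# Reported per-kilometer emission increases for the FHV sector, in percent.
reported_increase = {"NOx": 136, "CO": 152, "HC": 157}
emission_factors = {
    pollutant: multiplier_from_increase(pct)
    for pollutant, pct in reported_increase.items()
}

print(f"Manhattan speed change: {manhattan_change:.1f}%")
for pollutant, factor in emission_factors.items():
    print(f"{pollutant}: about {factor:.2f} times the 2017 per-km emission")
```

Read this way, "136% more NOx" means that each kilometer traveled emits about 2.36 times the earlier amount, not 1.36 times.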
436 We believe that the “failure” of TNCs in populated urban areas can be attributed to three primary
437 reasons. One straightforward reason is the overgrowth of the number of TNC vehicles, which exceeds the already
438 limited capacity of the urban road network. It is noted that an increase in the number of TNC drivers
439 contributes differently to traffic congestion than an increase in regular commuters. This is reflected in
440 much more frequent merges and diverges for picking up and dropping off passengers, which introduce
441 significant disturbances to regular traffic flow and result in more stop-and-go traffic. As a consequence, TNC
442 vehicles not only add traffic but also reduce the capacity of the road network.
443 The second reason is the competitive nature of the TNC market. The market involves
444 competition among different service providers and the traditional taxi sector, as well as competition
445 among the drivers of the same TNC platform [38]. Such competition adds another layer of inefficiency
446 if supply exceeds actual demand, which is often the case during off-peak periods of
447 passenger demand and corresponds to our analyses of the time periods with low average speed but a high
448 number of active drivers. We should be aware that TNCs’ prime time only accounts for approximately 6
449 hours (morning peak + evening peak) per day, or 25% of the day. For the rest of the day, there
450 are more drivers competing for fewer passengers, resulting in excessive cruising
451 miles and searching time. We have also pointed out in our analyses that TNC drivers need to pay
452 attention to their smartphones while cruising, and such distracted driving is a notorious causal factor
453 in traffic accidents.
454 Finally, we consider the lack of effective regulation and operation mode to be another reason. We
455 assert that the observation that “TNC worsens urban traffic congestion and emissions” should not be
456 viewed as contradicting the potential of TNCs for improving the efficiency and sustainability of
457 urban mobility. Indeed, several studies have pointed out that TNCs could be a highly effective solution
458 for efficient travel (e.g. empty trips may be reduced by 60% to 90% if passengers and drivers are optimally
459 matched [39]) and have validated the effectiveness of properly designed ridesharing mechanisms [40, 41].
460 But at present, there is no evidence showing how efficiently TNC drivers and passengers are being
461 matched, and “real” ridesharing, which actually combines multiple single rides, only accounts for a
462 small share of the total number of TNC trips [42]. Apparently, the current TNC practice, which is
463 primarily revenue-driven, is still far from its optimal performance in terms of social benefits
464 and overall sustainability. It is therefore necessary to frame regulations that strike a balance between the
465 TNCs’ business model and social welfare. The findings in our study provide important insights for
466 evaluating the actual externalities of the TNC sector and will be valuable for decision-makers and policymakers
467 in framing effective regulations. As an example, NYC recently started a congestion surcharge for TNC
468 and taxi trips entering Manhattan (south of 96th Street) [43]. Our findings largely favor this regulation
469 as a first step to mitigate the congestion impacts of the TNC sector, but also suggest the possibility
470 of varying the surcharge spatially and temporally.

471 References

472 [1] Bruce Schaller. The new automobility: Lyft, Uber and the future of American cities. 2018.

473 [2] Gregory D Erhardt, Sneha Roy, Drew Cooper, Bhargava Sana, Mei Chen, and Joe Castiglione. Do
474 transportation network companies decrease or increase congestion? Science advances, 5(5):eaau2670,
475 2019.

476 [3] Maarit Moran and Philip Lasley. Legislating transportation network companies. Transportation
477 Research Record, 2650(1):163–171, 2017.

478 [4] Susan Shaheen, Nelson Chan, Apaar Bansal, and Adam Cohen. Shared mobility: A sustainability
479 & technologies workshop: definitions, industry developments, and early understanding. 2015.

480 [5] Eun Hye Grace Choi. The effects of transportation network companies on traffic congestion. PhD
481 thesis, KDI School, 2017.

482 [6] NYC current and projected population, accessed June, 2019.
483 Available online at
484 https://www1.nyc.gov/site/planning/planning-level/nyc-population/current-future-populations.page.

485 [7] NYCDOT: major transportation projects, accessed Oct, 2019. Available online at
486 https://www1.nyc.gov/html/dot/html/about/major-transportation-proj.shtml.

487 [8] New York Department of Motor Vehicles. Statistical data and summaries.

488 [9] New York City Taxi and Limousine Commission. Aggregated reports - data reports monthly indi-
489 cators.

490 [10] Yu Zheng, Quannan Li, Yukun Chen, Xing Xie, and Wei-Ying Ma. Understanding mobility based
491 on GPS data. In Proceedings of the 10th international conference on Ubiquitous computing, pages
492 312–321. ACM, 2008.

493 [11] Yu Zheng, Yukun Chen, Quannan Li, Xing Xie, and Wei-Ying Ma. Understanding transportation
494 modes based on GPS data for web applications. ACM Transactions on the Web (TWEB), 4(1):1,
495 2010.

496 [12] Jinjun Tang, Han Jiang, Zhibin Li, Meng Li, Fang Liu, and Yinhai Wang. A two-layer model
497 for taxi customer searching behaviors using GPS trajectory data. IEEE Transactions on Intelligent
498 Transportation Systems, 17(11):3318–3324, 2016.

499 [13] Favyen Bastani, Yan Huang, Xing Xie, and Jason W Powell. A greener transportation mode:
500 flexible routes discovery from GPS trajectory data. In Proceedings of the 19th ACM SIGSPATIAL
501 International Conference on Advances in Geographic Information Systems, pages 405–408. ACM,
502 2011.

503 [14] Zuchao Wang, Min Lu, Xiaoru Yuan, Junping Zhang, and Huub Van De Wetering. Visual traffic
504 jam analysis based on trajectory data. IEEE transactions on visualization and computer graphics,
505 19(12):2159–2168, 2013.

506 [15] Jingbo Shang, Yu Zheng, Wenzhu Tong, Eric Chang, and Yong Yu. Inferring gas consumption
507 and pollution emission of vehicles throughout a city. In Proceedings of the 20th ACM SIGKDD
508 international conference on Knowledge discovery and data mining, pages 1027–1036. ACM, 2014.

509 [16] Yiman Du, Jianping Wu, Senyan Yang, and Liutong Zhou. Predicting vehicle fuel consumption
510 patterns using floating vehicle data. Journal of Environmental Sciences, 59:24–29, 2017.

511 [17] Conor K Gately, Lucy R Hutyra, Scott Peterson, and Ian Sue Wing. Urban emissions hotspots:
512 Quantifying vehicle congestion and air pollution using mobile phone GPS data. Environmental
513 pollution, 229:496–504, 2017.

514 [18] Xiao Luo, Liang Dong, Yi Dou, Ning Zhang, Jingzheng Ren, Ye Li, Lu Sun, and Shengyong Yao.
515 Analysis on spatial-temporal features of taxis’ emissions from big data informed travel patterns: a
516 case of Shanghai, China. Journal of cleaner production, 142:926–935, 2017.

517 [19] Kyoungho Ahn and Hesham Rakha. The effects of route choice decisions on vehicle energy consump-
518 tion and emissions. Transportation Research Part D: Transport and Environment, 13(3):151–167,
519 2008.

520 [20] Mohammad Amin Pouresmaeili, Iman Aghayan, and Seyed Ali Taghizadeh. Development of Mashhad
521 driving cycle for passenger car to model vehicle exhaust emissions calibrated using on-board
522 measurements. Sustainable cities and society, 36:12–20, 2018.

523 [21] Maryam Shekarrizfard, Ahmadreza Faghih-Imani, Louis-Francois Tétreault, Shamsunnahar Yasmin,
524 Frederic Reynaud, Patrick Morency, Celine Plante, Louis Drouin, Audrey Smargiassi, Naveen Eluru,
525 et al. Regional assessment of exposure to traffic-related air pollution: Impacts of individual mobility
526 and transit investment scenarios. Sustainable Cities and Society, 29:68–76, 2017.

527 [22] Aarshabh Misra, Matthew J Roorda, and Heather L MacLean. An integrated modelling approach
528 to estimate urban traffic emissions. Atmospheric Environment, 73:81–91, 2013.

529 [23] Jianlei Lang, Shuiyuan Cheng, Ying Zhou, Yonglin Zhang, and Gang Wang. Air pollutant emissions
530 from on-road vehicles in China, 1999–2011. Science of The Total Environment, 496:1–10, 2014.

531 [24] Boski P Chauhan, GJ Joshi, and Purnima Parida. Driving cycle analysis to identify intersection
532 influence zone for urban intersections under heterogeneous traffic condition. Sustainable cities and
533 society, 41:180–185, 2018.

534 [25] Jean-Yves Favez, Martin Weilenmann, and Jan Stilli. Cold start extra emissions as a function of
535 engine stop time: Evolution over the last 10 years. Atmospheric Environment, 43(5):996–1007, 2009.

536 [26] SM Ashrafur Rahman, HH Masjuki, MA Kalam, MJ Abedin, A Sanjid, and H Sajjad. Impact of
537 idling on fuel consumption and exhaust emissions and available idle-reduction technologies for diesel
538 vehicles–a review. Energy Conversion and Management, 74:171–182, 2013.

539 [27] Eric Jackson, Lisa Aultman-Hall, Britt A Holmén, and Jianhe Du. Evaluating the ability of global
540 positioning system receivers to measure a real-world operating mode for emissions research. Trans-
541 portation research record, 1941(1):43–50, 2005.

542 [28] Carolien Beckx, Luc Int Panis, Davy Janssens, and Geert Wets. Applying activity-travel data for
543 the assessment of vehicle exhaust emissions: Application of a GPS-enhanced data collection tool.
544 Transportation Research Part D: Transport and Environment, 15(2):117–122, 2010.

545 [29] Zihan Kan, Luliang Tang, Mei-Po Kwan, and Xia Zhang. Estimating vehicle fuel consumption and
546 emissions using GPS big data. International journal of environmental research and public health,
547 15(4):566, 2018.

548 [30] Zihan Kan, Luliang Tang, Mei-Po Kwan, Chang Ren, Dong Liu, Tao Pei, Yu Liu, Min Deng, and
549 Qingquan Li. Fine-grained analysis on fuel-consumption and emission from vehicles trace. Journal
550 of cleaner production, 203:340–352, 2018.

551 [31] Taxi and ridehailing usage in New York City, accessed Sep, 2019. Available online at
552 https://toddwschneider.com/dashboards/nyc-taxi-ridehailing-uber-lyft-data/.

553 [32] Le Chen, Alan Mislove, and Christo Wilson. Peeking beneath the hood of Uber. In Proceedings of
554 the 2015 Internet Measurement Conference, pages 495–508. ACM, 2015.

555 [33] Xinwu Qian, Dheeraj Kumar, Wenbo Zhang, and Satish Ukkusuri. Understanding the operational
556 dynamics of mobility service providers: A case of Uber. ACM Transactions on Spatial Algorithms
557 and Systems (TSAS), 2019.

558 [34] TLC trip record data, accessed May, 2019. Available online at
559 https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page.

560 [35] Rahmi Akçelik, Robin Smit, and Mark Besley. Calibrating fuel consumption and emission models
561 for modern vehicles. In IPENZ Transportation Group Conference, 2012.

562 [36] Leon Ntziachristos and Zissis Samaras. Methodology for the calculation of exhaust emissions, 2018.
563 Available online at https://www.emisia.com/utilities/copert/documentation/.

564 [37] NYC Department of Transportation. New York City mobility report, 2018.

565 [38] Xinwu Qian and Satish V Ukkusuri. Taxi market equilibrium with third-party hailing service.
566 Transportation Research Part B: Methodological, 100:43–63, 2017.

567 [39] Xianyuan Zhan, Xinwu Qian, and Satish V Ukkusuri. A graph-based approach to measuring the
568 efficiency of an urban taxi service system. IEEE Transactions on Intelligent Transportation Systems,
569 17(9):2479–2489, 2016.

570 [40] Paolo Santi, Giovanni Resta, Michael Szell, Stanislav Sobolevsky, Steven H Strogatz, and Carlo
571 Ratti. Quantifying the benefits of vehicle pooling with shareability networks. Proceedings of the
572 National Academy of Sciences, 111(37):13290–13294, 2014.

573 [41] Xinwu Qian, Wenbo Zhang, Satish V Ukkusuri, and Chao Yang. Optimal assignment and incentive
574 design in the taxi group ride problem. Transportation Research Part B: Methodological, 103:208–226,
575 2017.

576 [42] Uber says that 20% of its rides globally are now on UberPool, accessed June, 2019.
577 Available online at
578 https://techcrunch.com/2016/05/10/uber-says-that-20-of-its-rides-globally-are-now-on-uber-pool/?ncid=rss.

579 [43] Judge approves congestion pricing for New York City taxi, Uber and Lyft rides, accessed June, 2019.
580 Available online at
581 https://abcnews.go.com/Business/judge-approves-congestion-pricing-york-city-taxi-uber/story?id=60778450.
