
Working Paper

2022/32/TOM

Don’t Fake It If You Can’t Make It: Driver Misconduct in Last Mile Delivery

Srishti Arora
INSEAD, srishti.arora@insead.edu

Vivek Choudhary
Nanyang Technological University, vivek.choudhary@ntu.edu.sg

Pavel Kireyev
INSEAD, pavel.kireyev@insead.edu

In the last two decades, last mile delivery (LMD) firms have seen immense growth due to the rise of e-commerce
and digitalization, leading to faster and cheaper deliveries. However, this growth has also resulted in new
challenges such as increased competition amongst LMD firms and heightened customer expectations. Operating
on thin margins, LMD firms strive to be efficient in successfully delivering orders in their first attempt because
reattempted deliveries are costly in terms of lost revenues, reverse logistics costs, and reputation. A key reason
for delivery failure is the misconduct of delivery workers (field executives, or FEs). For instance, an FE can claim
that ‘the customer was not available’ without even visiting the customer’s address and record this claim as the
reason for delivery failure (i.e., a fake remarked delivery). So far, few studies have explored the impact of FEs’
behavior on LMDs’ efficiency. Most attempts at improving LMD performance focus on technical enhancements
and incentive designs, assuming workers will follow the designed processes precisely. However, this assumption
may not hold, especially for LMD firms that mostly employ gig workers. To study and quantify the impact of FE
misconduct on LMD performance we collaborated with one of the largest LMD firms in India. Using instrumental
variable regression, we identify the effect of fake remarked deliveries on future productivity. Our results suggest
that operational losses due to fake remarks on a given day spill over to the subsequent day, reducing the next
day’s overall success rate of deliveries by 1.5%. This decrease is mostly driven by the reduction in first-time-right
deliveries, resulting in significant revenue losses for the LMD firms. We find evidence that opportunistic
circumstances, such as familiarity with the delivery area and cash on delivery parcels, exacerbate the detrimental
effect of fake remarks. We present some of the first results on the impact of aberrant behaviors on workers’
productivity in last mile logistics.

Keywords: Last mile Delivery; Platform Economy; Misconduct; Productivity

Electronic copy available at: http://ssrn.com/abstract=4151409

Working Paper is the author’s intellectual property. It is intended as a means to promote research to interested
readers. Its content should not be copied or hosted on any server without written permission from
publications.fb@insead.edu
Find more INSEAD papers at https://www.insead.edu/faculty-research/research
Copyright © 2022 INSEAD



1. Introduction

The global e-commerce industry is predicted to grow at an annual rate of 8% between 2020 and 2025 (Gläser
et al. 2021), leading to a 36% increase in the number of delivery vehicles in the top 100 cities by 2030
(World Economic Forum 2020). This unprecedented growth in e-commerce and digitalization has created
immense opportunities for last mile delivery (LMD) firms like DHL and FedEx as most of the e-commerce
giants rely on them to complete deliveries between sellers and customers.1 LMD is being studied in
academia with an ever-evolving definition, starting from the “extension of supply chains directly to the end
consumers” and now converging to the “last segment of a delivery process that can be a parcel locker,
collection point, or a consignee’s address” (Lim et al. 2018). However, the last mile remains a bottleneck
for many supply chains because it is massively fragmented (Winkenbach 2019) and there is a lack of
control, limited observability, and limitations in precise monitoring.2

This growth in e-commerce has resulted in additional challenges for LMD firms like increased
competition, lower margins, growing customer expectations and, to make things worse, competing in-house
delivery arms of giants like Amazon.3 Consumers are becoming more demanding in terms of service
expectations such as on-time delivery, without being willing to pay more for these requirements (Mangiaracina et
al. 2019). LMD firms employ freelance gig workers for their last mile operations, a significant
portion of whom are hired on contractual terms (Altenried 2019), which creates additional challenges.

Even though these challenges make the last mile the most expensive leg in the shipment lifecycle
(Seghezzi and Mangiaracina 2022), they also make it strategically important for businesses to have a
competitive edge (Harrington 2019, Lim et al. 2018). Consequently, last mile delivery efficiency is a
well-studied topic in multiple fields like economics, smart city operations, and sustainability. Existing research
focuses on alternative delivery solutions such as parcel lockers and drones, route optimization, facility
location, supply chain structures, innovative business models like crowdsourced logistics, and
environmental performance (Kiba-Janiak et al. 2021, Olsson et al. 2019). Most attempts at improving last
mile delivery performance focus on technical enhancements and incentive designs, assuming workers will
follow the designed process precisely and the smartphone apps used for monitoring movement and progress
will work with precision (i.e., GPS location will be accurate). However, these assumptions may not hold in
many cases, especially involving gig workers. While there is some work in consumer behavior that studies
consumer response to alternative delivery solutions (Wang et al. 2020), there is scant research on the impact

1 https://www.forbes.com/sites/forbesbusinessdevelopmentcouncil/2021/05/06/massive-growth-challenges-and-opportunities-for-third-party-logistics-post-pandemic
2 https://www.supplychainbrain.com/blogs/1-think-tank/post/32800-last-mile-delivery-challenges-and-how-to-solve-them
3 https://supplychaindigital.com/sustainability/analysis-amazon-sets-sights-house-delivery-takes-aim-3pls



of delivery workers’ behavior on last mile efficiency. The last mile involves a lot of human interactions
between delivery workers and customers. Delivery workers’ behavior, therefore, can significantly affect
firm reputation and, as a result, profitability. This makes it important to understand the effects of behavior
on delivery performance.

Worker misconduct can have a consequential impact on their performance and is costly to the firms
(Burbano and Chiles 2021). But relatively few papers study misconduct in an operational setting (Chan et
al. 2021). In LMD, the success rate of deliveries in the first attempt plays a significant role in determining
supply chain efficiency (Rautela et al. 2021).4 Therefore, studying the impact of delivery workers’
misconduct on delivery success is both important and interesting. In this paper, we not only quantify the
impact of misconduct of the delivery worker on performance but also explore the conditions that exacerbate
or mitigate the impact. To the best of our knowledge, we present some of the first results on the impact of
aberrant behaviors on workers’ productivity in last mile logistics.

To investigate this research question, we collaborated with one of the largest LMD firms in India to
analyze the behavior of their delivery workers, called Field Executives (FEs). The firm confirmed that one
of the key reasons for delivery failure is delivery workers’ misconduct. FEs attempt several parcels each
day but not all of them are successfully delivered. The FE provides a textual remark for each failed delivery
indicating the reason for failure. Each parcel is attempted a maximum of three times before it is sent back
to the e-commerce platform. Sending back to the merchant is costly in terms of lost revenues, reverse
logistics costs, and reputation. To minimize such returns, LMD firms allow multiple attempts as they expect
that a few failed deliveries will occur because of genuine reasons (e.g., the customer was not available when
the delivery was made). However, this flexibility provides an opportunity for the FE to intentionally not
deliver in their first attempt. An FE can enter a remark that ‘the customer was not available’ without even
visiting the customer’s address (fake remarked deliveries, henceforth). There can be several reasons for an
FE to enter a fake remark: she does not want to travel a long distance for a single delivery or does not want
to go to the region of the delivery address. Such fake remarked deliveries are prone to cause customer
complaints and even customer churn.

The problem of fake remarks is increasingly being recognized in the industry and several startups are
working on it.5,6 Yet again, the solutions are either technical (e.g., using smartphone sensors data) or require
manual verification (e.g., calling the customer after a failed delivery). In India, infrastructure constraints
fan the flames. In addition to the large volume of parcels, postal codes do not match the precise geocodes

4 https://postandparcel.info/93399/news/e-commerce/true-cost-implications-failed-deliveries/
5 https://www.clickpost.ai/blog/top-8-reasons-for-ndr-in-ecommerce-and-how-to-solve-them#title_11
6 https://www.shiprocket.in/blog/know-how-you-can-prevent-fake-delivery-attempts/



in non-metro Indian cities, and addresses are largely unstructured. To add to the complexity, GPS locations
provided by smartphones have a high degree of errors. These challenges dampen the benefits of technical
enhancements and provide ample opportunity for FEs to shirk as tracking becomes difficult. But this
problem is not just limited to India. There are many similar complaints globally regarding missing parcels,
incorrect delivery updates, and so on.7,8 Hence, understanding the impact of driver behavior on efficient
delivery is essential.

The natural approach to deter such behaviors is to enforce a financial penalty. However, the freelance
contractual nature of the job (with almost no cost of switching between several outside options) and the
surplus of job opportunities for gig workers in the nascent e-commerce industry9 make it very difficult to
implement financial penalties. Also, the downsides of increased monitoring for workers in the gig economy
make it infeasible to keep a check on workers’ behavior (Liang et al. 2022). Therefore, we study the effect
of fake-remarked deliveries on FE productivity and suggest ways to mitigate the effect. Employing an
instrumental variable regression approach, we identify the effect of fake remarked deliveries on future
productivity. We borrow from the peer effects literature and construct our instrument based on the fake
remark behavior of an FE’s peers, which we argue influences FE’s own fake remark activity but is unrelated
to the focal FE’s future productivity.
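To make the identification idea concrete, the following is a minimal two-stage least squares sketch on simulated data. It is not the specification estimated in this paper: the data-generating process, the variable names, the chosen effect size, and the omission of fixed effects and controls are all assumptions made for illustration. It only shows why an instrument that shifts an FE's own fake remarks, but is unrelated to the unobserved drivers of her productivity, can recover an effect that a naive regression misses.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(y, x):
    """Slope from regressing y on x with an intercept."""
    return cov(x, y) / cov(x, x)


def two_stage_least_squares(y, x, z):
    """2SLS slope of y on an endogenous x using a single instrument z."""
    # Stage 1: fitted values of x from a regression on the instrument.
    g1 = ols_slope(x, z)
    g0 = mean(x) - g1 * mean(z)
    x_hat = [g0 + g1 * zi for zi in z]
    # Stage 2: regress the outcome on the stage-1 fitted values.
    return ols_slope(y, x_hat)


# Simulated data (all values are assumptions, not estimates from the paper).
rng = random.Random(0)
n = 20_000
true_effect = -0.5
peer_fakes, own_fakes, next_day_success = [], [], []
for _ in range(n):
    z = rng.gauss(0, 1)                 # instrument: peers' fake remark behavior
    u = rng.gauss(0, 1)                 # unobserved confounder, e.g. FE motivation
    x = 0.8 * z + u + rng.gauss(0, 1)   # own fake remarks depend on peers and u
    y = true_effect * x - 2.0 * u + rng.gauss(0, 1)
    peer_fakes.append(z)
    own_fakes.append(x)
    next_day_success.append(y)

naive = ols_slope(next_day_success, own_fakes)
iv = two_stage_least_squares(next_day_success, own_fakes, peer_fakes)
print(f"naive OLS: {naive:.2f}, 2SLS: {iv:.2f}, simulated true effect: {true_effect}")
```

In this simulation the naive slope is biased because the confounder moves both fake remarks and next-day success, whereas the 2SLS slope stays close to the simulated effect as long as the instrument affects the outcome only through the FE's own fake remarks.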

Our empirical analysis suggests that fake remarks, an example of aberrant behavior, by an FE can have
a negative effect on her future productivity. Our estimates imply that an increase of one fake remarked
delivery today not only reduces success today (as expected) but also leads to a 1.5% decrease in overall
successful deliveries on the subsequent day, which can lead to million-dollar losses annually for the LMD
firm. This effect is explained by the suboptimal allocation of effort between the types of deliveries which
results in a decrease in first-time-right deliveries (FTR – successful deliveries on the first attempt). The
spillover effect of fake remarks results in a 1.51% decrease in FTR deliveries on the next day. Our analysis
suggests that this reduction in FTR deliveries is explained by the disproportionate amount of additional
effort invested by the FE to deliver parcels that she had fake remarked on the previous day. This hurts the
operational revenue of the LMD firm even more, because reattempts are costly, but does not hurt the FE as
a major part of her compensation is fixed.

Furthermore, we find evidence that the impact of fake remarks worsens in the presence of opportunistic
circumstances such as when the FE is familiar with the delivery area or for cash-on-delivery (COD) parcels.
Here, more plausible excuses exist for a failed delivery, i.e., there is a bigger opportunity for entering a fake remark.

7 https://www.yugatech.com/business/why-your-lazada-orders-are-flagged-as-delivered-even-if-theyre-not/
8 https://thepackageguard.com/amazon/amazons-delivery-service-can-sometimes-leave-guessing/
9 https://www.bcg.com/en-in/unlocking-gig-economy-in-india



These findings suggest several practical implications. Businesses should target first-time-right success in
addition to overall delivery success. Managers should take into account potential misconduct and situations
favorable to such behaviors while designing processes such as parcel allocation in our case. For example,
we know from the extant literature that location familiarity enhances productivity (Mao et al. 2019).
However, we find a counterintuitive result that the effect of fake remarks worsens when the FE is familiar
with the area. Therefore, FEs with a history of misconduct should be strategically allocated to newer areas
rather than the ones visited before.

We contribute to the literature that studies productivity and the impact of human behavior on productivity.
We not only study the impact of workers’ aberrant behavior in the last mile context but also show that the
consequences of such behavior go beyond immediate productivity loss. Our work highlights the need for
studying misconduct in delivery firms because of the limits of technical solutions to deter such behaviors.
We also contribute to the growing work on empirical behavioral operations (Gallien et al. 2016,
Simchi-Levi 2014, Tang 2016).

2. Last Mile Delivery Process

Our collaborator, the LMD firm, works with most of the e-commerce giants in India. The firm has over
2,000 delivery centers as of June 2021. When a consumer places an order on the partner e-commerce
website, the LMD firm picks up the parcel from the designated warehouse, and the parcel is received at the
delivery center via regional centers. The delivery center is the last stop for the parcel before it is delivered
to the customer by the FEs (i.e., the last mile leg which is the focus of this paper). The LMD firm employs
contractual FEs, delivering primarily using two-wheelers.



Figure 1: Dispatch Lifecycle

Once an FE signs up with the LMD firm, they need to visit the delivery center every morning. The
manager in charge of the center will assign a dispatch (assortment of parcels) to the FE for the day. After
the allocation, FEs decide on the sequence in which they will attempt to deliver the parcels. Naturally, most
of the FEs order the attempts based on distance from the delivery center10, but some may prioritize based
on the type of parcel. However, there is no recommendation or instructions provided to the FEs on how to
make the deliveries, meaning that FEs have full autonomy to choose the delivery sequence, including which
areas to visit and when. Not all attempted parcels are successfully delivered. Failed deliveries, also called
re-attempts, are taken back by the FE to the delivery center at the end of the day and are attempted again
on the subsequent day. Parcels attempted unsuccessfully three times are sent back to the merchant. FEs
are paid a fixed salary with a small variable component which is based on the successful deliveries made.

The process is outlined in Figure 1. Each day, the FE may handle three types of parcels – a) fresh parcels
that will be attempted for the first time on that day, b) fake remarked reattempt parcels that were fake
remarked earlier by the same FE and will be reattempted on that day and, c) genuine reattempt parcels that
were not delivered earlier by the same FE because of genuine reasons but will be attempted again on that
day.
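As a rough illustration of the three parcel types and the three-attempt limit, the sketch below labels a parcel from the history of its earlier failed attempts. The function names, the string labels, and the rule that any earlier fake remark makes the parcel a fake remarked reattempt are assumptions introduced here; only the three categories and the maximum of three attempts come from the process description.

```python
from typing import List

MAX_ATTEMPTS = 3  # from the text: a parcel is attempted at most three times


def parcel_type(prior_fake_flags: List[bool]) -> str:
    """Label a parcel for today from its earlier failed attempts by the same FE.

    prior_fake_flags has one entry per earlier failed attempt:
    True if that attempt's remark was classified as fake, False if genuine.
    """
    if not prior_fake_flags:
        return "fresh"  # will be attempted for the first time today
    if any(prior_fake_flags):
        # Assumption: a single earlier fake remark is enough for this label.
        return "fake_remarked_reattempt"
    return "genuine_reattempt"


def next_step(failed_attempts: int) -> str:
    """Decide what happens to a parcel after a failed attempt."""
    if failed_attempts >= MAX_ATTEMPTS:
        return "return_to_merchant"
    return "reattempt_next_day"


print(parcel_type([]))             # fresh
print(parcel_type([False, True]))  # fake_remarked_reattempt
print(parcel_type([False]))        # genuine_reattempt
print(next_step(3))                # return_to_merchant
```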

10 We also confirm this by interviewing 15 FEs.



Figure 2: Screenshot of the FE smartphone app. Panel c) Remarks for a Failed Delivery



Once a fresh parcel is assigned to the FE, it is assigned to the same FE the next day if it is not delivered
the previous day. As a result, the FE also has full information about how many attempts have been made
for each parcel. However, the allocation of the fresh parcels does not depend on the carryover reattempts
from the previous day. Each day, the center manager distributes the fresh deliveries to the FEs based on
their areas of allocation; therefore, the FEs do not control the number of parcels allocated on a day-to-day
basis.

To facilitate the delivery process, the firm provides a smartphone app to the FEs. Figure 2-a) provides
the screenshots of the FE app where parcels along with delivery details are displayed. FEs can record
delivery status and contact the consignee. Before making a delivery attempt, the FE is advised but not
mandated to call the customer. Calls can be made through the app, and the firm can track their duration.
When an FE makes a delivery attempt, they need to enter a remark to indicate whether the delivery was
successful and, if not, why. Figure 2-b) shows the screenshot of the app in the case of a successful
delivery. For unsuccessful deliveries, FEs have a pre-populated list of remarks on their app, and they simply
need to select the right one for each attempt. In Figure 2-c), some examples from this list - “Consignee
unavailable”, “Entry restricted area”, etc. - are shown.

According to the firm, the bulk of the returned parcels is due to fake remarks, which is also the case with
other LMD firms. Therefore, the firm conducts a daily audit of remarks against failed deliveries as well as
seller complaints of a false attempt. The center manager calls the consignees of the failed deliveries to
confirm the reason for the failure. Given the scale of the business, it is not possible to manually audit all
failed deliveries and calls are made at random. To identify whether the remark was fake for the remaining
failed deliveries, a multi-step automated process is employed based on two parameters: i) the distance of
the FE at the time of the remark from the consignee’s address and ii) records of the call that the FE makes
to the consignee from the app. The remarks are then categorized into four groups based on these two factors:
remarks that require both conditions to be met (the FE should have been near the consignee’s address and
have made a call of at least 15 seconds), remarks that require only one specific condition (proximity alone
or the call alone, giving two groups), and remarks that require either one of these conditions. In each of
the groups, the remarks that do not satisfy the required
conditions are classified as fake remarks. At the end of the day, FEs are required to answer for the failed
deliveries and are scolded for any misconduct that is caught during audit or through customer complaints.
This happens on the floor of the delivery center and in front of the co-workers. The FEs are not made aware
of the logic of fake remarks to prevent gaming. Only the fake remarks that are identified daily are
communicated to the respective FE. Additionally, weekly reports are generated to analyze the performance
of the FEs and the center.
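The automated audit described above can be sketched as follows. This is a minimal illustration and not the firm's actual logic, which is not disclosed: the 200-meter proximity threshold, the four rule names, and the data structure are assumptions. Only the two inputs (distance from the consignee's address and an in-app call of at least 15 seconds) come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class Rule(Enum):
    """How the two audit conditions combine for a group of remarks (hypothetical names)."""
    BOTH = "both"            # near the address AND a sufficiently long call
    NEAR_ONLY = "near_only"  # only proximity is required
    CALL_ONLY = "call_only"  # only the call is required
    EITHER = "either"        # at least one of the two


MIN_CALL_SECONDS = 15    # from the text
MAX_DISTANCE_M = 200.0   # assumed threshold; the text gives no distance value


@dataclass
class FailedAttempt:
    remark: str
    distance_m: float    # FE's distance from the consignee's address at remark time
    call_seconds: float  # duration of the in-app call to the consignee (0 if none)


def is_fake(attempt: FailedAttempt, rule: Rule) -> bool:
    """Return True when the attempt fails the conditions required for its group."""
    near = attempt.distance_m <= MAX_DISTANCE_M
    called = attempt.call_seconds >= MIN_CALL_SECONDS
    if rule is Rule.BOTH:
        satisfied = near and called
    elif rule is Rule.NEAR_ONLY:
        satisfied = near
    elif rule is Rule.CALL_ONLY:
        satisfied = called
    else:  # Rule.EITHER
        satisfied = near or called
    return not satisfied


# An FE who was 3.5 km away and never called cannot satisfy any of the rules.
far_no_call = FailedAttempt("Consignee unavailable", distance_m=3500.0, call_seconds=0.0)
print(is_fake(far_no_call, Rule.BOTH))    # True
print(is_fake(far_no_call, Rule.EITHER))  # True
```

Which remark belongs to which group is left out deliberately, since the text does not specify that mapping.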



3. Literature and Hypothesis Development

This paper draws on three research streams studying last mile delivery, people-centric operations, and
the impact of behavioral misconduct on worker productivity. The first stream, last mile delivery, is studied
in multiple fields such as economics, sustainability, and operations management. In a recent review of this
literature, Olsson et al. (2019) classified last mile logistics research into five different themes – emerging
trends and technology, operational optimization, supply chain structures, performance measurement, and
policy. Beyond these themes, there has recently been a lot of interest in last mile delivery research that
focuses on the behavior of delivery workers. For example, Mao et al. (2019) identified factors, such as the
driver's knowledge of the delivery area, that impact delivery performance for a meal delivery platform,
whereas Xu et al. (2020) investigated the moderating effects of ratings and penalties on workers’ work time
for a crowdsourced grocery delivery platform. We are not aware of research that studies workers’
misconduct and strategic behavior in last mile delivery productivity.

The second stream of literature our work speaks to is People Centric Operations (PCO), which is
defined as “the study of how people affect the performance of operational processes” (Roels and Staats
2021). PCO is gaining traction in all branches of operations management as recently reviewed by Fahimnia
et al. (2019). Field et al. (2018) highlighted that understanding employee behavior is a growing area of
research. Most PCO studies are either based on laboratory experiments or analytical modeling. These papers
talk about incorporating workers’ behavior in decision-making processes to maximize performance. Haruvy
et al. (2020) examined retailers’ behavior to investigate the effect of bargaining on contract performance in
supply chains, Cho et al. (2019) incorporated speedup and slowdown behaviors in workforce staffing
decisions, Armony et al. (2021) assessed the performance of pooling in the presence of servers’ degree of
customer ownership, and Jiang et al. (2021) studied the impact of regret aversion on workers’ relocation
decisions and system performance. Besides these analytical papers, a few empirical papers also study
employee behavior. For example, Chan et al. (2021) analyzed how workers’ misconduct varies in the
presence of counterproductive peers, Batt et al. (2019) examined the behavioral implications of work
environment on physician productivity in a hospital emergency department, and Kamalahmadi et al. (2021)
explored how schedule unpredictability impacts workers’ behavior. These papers do not study the impact
of workers’ misconduct on productivity and profitability. Moreover, even though the strategic behavior of
customers has been studied in detail (Gallino et al. 2022, Li et al. 2014), employees’ strategic behavior and
its effects on firm performance have not been extensively studied in the OM literature (Roels and Staats
2021). In one of the very few studies related to employees’ strategic behavior, Diwas et al. (2020) showed
how the impact of an increased workload on performance can be explained by a physician’s task completion



preferences. In contrast to this literature, we study the impact of aberrant behaviors on workers’ productivity
and investigate mitigation strategies.

The third stream of literature focuses on the misconduct of employees. Aberrant behaviors that lead to
worker productivity declines are formalized in multiple ways in both economics and business literature
(Eliyana and Sridadi 2020). Robinson and Bennett (1995) classified behaviors like taking excessive breaks
and intentionally working slow as production deviance, a subcategory of aberrant behaviors, whereas
Kidwell and Bennett (1993) defined shirking as a process in which employees withhold efforts for reasons
like self-interest and opportunism. They also hypothesized the reasons behind employees’ propensity to
shirk. We will list a few that are relevant in our context. First, the focal worker might withhold effort
because she believes that others are also planning to do so. Second, the worker will withhold effort when
she perceives that the task is less visible to the supervisor. Third, the worker’s effort is positively impacted
by the perceived lack of alternative employment opportunities. These reasons to withhold efforts are
applicable to LMD. Moreover, workers’ behavior is crucial in LMD. Overall, the literature indicates that
workers’ misconduct and peer effects should play a significant role in LMD efficiencies. The impact of
workers’ aberrant behaviors on their productivity is not researched at length in the OM literature and not
studied at all in last mile delivery context.

The LMD setting features a lack of granular audit and a high availability of alternative employment
opportunities if the FE decides to quit, leading to a high propensity to withhold effort or shirk. We study
the impact of one such measure of shirking, i.e., the “fake remarks” input by the FEs, on productivity. An
FE can have several reasons for entering a fake remark. She might have already achieved her daily target
of successful deliveries, or she might have to travel a long distance just to deliver a few parcels. We measure
productivity by the number of successful deliveries made by the FE because that is the most important
success metric for any LMD firm. Drawing on the literature discussed, we believe that fake remarks can
negatively impact workers’ productivity.

It is also important to understand what is driving this decrease in productivity. Of course, fake remarks
will lead to a decrease in the number of successful deliveries on the same day and create a backlog for the
next day. But how FEs prioritize and complete their tasks is also crucial to the performance of the firm.
Ackerman et al. (2020) discuss that effort is a disutility for an individual, and subjective judgment of the
disutility of a task results in different levels of effort. The effort allocation depends on a lot of factors
including personality, interest, and punishment/rewards associated with the job. Similarly, KC et al. (2020)
explain that, in the situation of stress or when facing threats, individuals use simpler rules to prioritize their
tasks and acknowledge that it is important to understand task completion preferences of workers when they
have the discretion to manage their tasks. Fake remarks, caught in the audit, result in scolding of the FE



and create a sense of fear of getting caught through customer complaints or audits. This might lead to the
FEs prioritizing the fake remarked delivery on the next day to cover up for their misconduct and to reduce
future customer complaints. For instance, the FE might make suboptimal decisions regarding her delivery
schedule the next day by putting extra time and effort into the fake remarked deliveries. This could result
in a cascading effect of the fake remarks on future productivity. Consequently, due to the suboptimal
allocation of effort, we predict that fake remarks will affect future productivity. Formally,

Hypothesis 1. Fake remarks will have a negative impact on future productivity.

Our data does not capture many behavioral traits (e.g., the willingness to enter a fake remark) of the FEs.
Therefore, we conducted interviews with FEs to understand their mindset. Our interviews provided key
insights stemming from comments such as

• “We have an idea that some customers will not accept the parcel. I have two such parcels today
as well – one from c197 and another from d116. These two customers never accept the parcels.
But they order frequently. I have to go all the way to their floor, I get tired, and my time is
wasted. They don’t even pick up the call”.

• “I know for which customer I can go directly to their address and for whom I need to call 5-10
minutes before I reach there”.

Therefore, based on our interaction with the FEs, we believe that an FE can anticipate the probability of
success of a parcel based on her past experiences with customers and the delivery area. This foresight might
help her strategically select which parcels to fake remark as she expects to successfully deliver them the
next day.

Beyond our interactions with the FEs, in the literature, both agency theory and transaction cost
economics have an underlying assumption that agents are opportunistic. In principal-agent theory, Arrow
(1985) explains, “Effort is a disutility to the agent, but it has a value to the principal in the sense that it
increases the likelihood of a favorable outcome”. Similarly, Williamson (1993) states that, “economic
agents be described as opportunistic, where this contemplates self-interest seeking with guile”. Also, Leider
(2018) explains how principal-agent theory is used in operations management to understand strategic
interactions. In our case, the FE is an agent and delivering a parcel requires effort from the FE that generates
value for the LMD firm. Therefore, connecting back to our FE interviews, we believe that opportunism in
our context is self-evident.

Fake remarked deliveries that are caught during the manual audit at the end of day will be prioritized the
next day. For these parcels, the FE has no choice but to deliver them diligently to avoid shame and scolding.
In addition, it is likely that the FE will put more effort into delivering fake remarked parcels that were not



identified through audits to avoid getting caught. As a result, the FE will undertake two strategies: i)
strategically select which parcels to fake remark as she expects to successfully deliver them the next day,
ii) exert disproportionate effort to deliver the fake remarked parcels. Hence, we hypothesize that fake
remarks will increase the success of fake remarked reattempt parcels on the subsequent day:

Hypothesis 2. An increase in fake remarks will have a positive impact on future fake remarked reattempt
success.

Also, since fake remarked reattempts are positively affected, the impact of fake remarks on the remaining
parcels should be negative and stronger for overall productivity to decrease. This is also due to the fact that
the FE has no information about the fresh parcels that will be assigned to her on the following day.
Therefore, we hypothesize that:

Hypothesis 3. An increase in fake remarks will have a negative impact on future first-time-right (FTR)
delivery success.
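The reasoning that links Hypotheses 1 to 3 can be written as a simple decomposition. The notation is introduced here only for illustration and is not taken from the paper: let F_t be the number of fake remarks an FE enters on day t, S_{t+1} her total successful deliveries on the next day, S^fake_{t+1} the successes among fake remarked reattempts, and S^rest_{t+1} the successes among the remaining parcels (fresh parcels and genuine reattempts).

```latex
S_{t+1} = S^{\mathrm{fake}}_{t+1} + S^{\mathrm{rest}}_{t+1}
\quad\Longrightarrow\quad
\frac{\partial S_{t+1}}{\partial F_t}
= \frac{\partial S^{\mathrm{fake}}_{t+1}}{\partial F_t}
+ \frac{\partial S^{\mathrm{rest}}_{t+1}}{\partial F_t},
\qquad
\frac{\partial S_{t+1}}{\partial F_t} < 0
\ \text{and}\
\frac{\partial S^{\mathrm{fake}}_{t+1}}{\partial F_t} > 0
\;\Longrightarrow\;
\frac{\partial S^{\mathrm{rest}}_{t+1}}{\partial F_t}
< -\frac{\partial S^{\mathrm{fake}}_{t+1}}{\partial F_t} < 0 .
```

Under this sketch, a negative overall effect (Hypothesis 1) together with a positive effect on fake remarked reattempts (Hypothesis 2) forces the effect on the remaining parcels to be negative and larger in magnitude, and Hypothesis 3 places that loss on first-time-right deliveries of fresh parcels.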

As defined by Luo and Meyer (2017), “Opportunism is a behavior that is motivated by self-interest and
takes advantage of relevant knowledge asymmetry to achieve own gains, regardless of the principles,” and
a low cost of switching jobs is associated with opportunism (Ping 1993, Schwartz and Hirschman 1972).
Furthermore, information asymmetry or a lack of monitoring capabilities facilitates shirking (Alchian and
Demsetz 1972). LMD firms suffer from both – low switching costs and modest monitoring capabilities.
Therefore, circumstances and environments that increase knowledge asymmetry will facilitate
opportunism.

We define opportunistic circumstances as the conditions in which knowledge asymmetry is higher and
opportunism is facilitated. In our context, opportunistic circumstances are those in which it is easier to enter
a fake remark and the probability of getting caught is lower. We believe that FEs are opportunistic agents,
and opportunistic circumstances will enhance their opportunism which, in turn, will hurt their productivity.
Overall, the opportunistic circumstances will negatively moderate (or worsen) the effect of fake remarks
on productivity losses. Therefore, we predict that:

Hypothesis 4. Opportunistic circumstances will negatively moderate the effect of fake remarks on future
productivity.
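One common way to express such a moderation is an interaction term. The equation below is an illustrative sketch under assumed notation, not the specification estimated in this paper: Y_{i,t+1} is the next-day productivity of FE i, F_{i,t} her fake remarks on day t, O_{i,t} an indicator of an opportunistic circumstance (for example, familiarity with the delivery area or a cash-on-delivery parcel), and \alpha_i an FE fixed effect.

```latex
Y_{i,t+1} = \beta_1 F_{i,t}
          + \beta_2 \left( F_{i,t} \times O_{i,t} \right)
          + \beta_3 O_{i,t}
          + \alpha_i + \varepsilon_{i,t+1}
```

In this form, Hypothesis 1 corresponds to \beta_1 < 0 and Hypothesis 4 to \beta_2 < 0, so that the marginal effect of a fake remark, \beta_1 + \beta_2 O_{i,t}, is more negative when the opportunistic circumstance is present.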

4. Data analysis

Our dataset consists of granular data for six months (Jun - Nov 2019) from nine different cities (both
metros and non-metros) in India for 867 FEs. During this period, FEs worked on more than 3.4 million
parcels across 71 delivery centers. Other than the delivery details, the company collects data on the


Electronic copy available at: https://ssrn.com/abstract=4151409


movement of the FE using their phone’s GPS location and keeps a record of parameters such as the number
of attempts and characteristics of the delivery location and parcel. Our dataset consists of more than 135
million location traces to precisely understand the movement of the FE for each parcel. To the best of our
knowledge, this is one of the biggest datasets that has been analyzed in the last mile logistics context.

Our dataset consists of the following information: i) dispatch details, ii) parcel traces, iii) FE details, and
iv) delivery center information. Dispatch data consists of high-level details like which FE is assigned a
particular dispatch, the number of parcels in the dispatch, and the aggregate number of parcels canceled,
delivered, pending, etc. Parcel traces provide more granular details like the promised delivery date for each
parcel, FE interactions with each consignee (call duration), whether the consignee’s address was served by
the company before, FE location traces with timestamps for the entire day, and the remark entered by the
FE for each parcel. Then, we have FE particulars such as joining date, work area, and shift in the FE details
table. Finally, the dispatch center location and related information are present in the delivery center table.

Using this data, we estimate the effect of fake remarks made on the previous day on the overall success
and FTR deliveries on the subsequent day using a fixed effect panel regression with instrumental variables.
We only consider FEs that have worked on more than 6 dispatches in the 6 months that we observe. In the
final panel, we have 856 FEs who worked on approximately 2.9 million parcels in these 6 months.

4.1. Definitions of key variables

We provide the definitions of key variables used in our analysis. We created a day-level panel by
aggregating dispatch-level data. Each row in the panel corresponds to a FE working on a particular day.

First, we will describe our dependent variables.

success: the number of parcels that were successfully delivered by the FE in a day. This is our main
dependent variable.

FTR: the number of parcels that were successfully delivered in the first attempt (first-time-right
deliveries) by the FE in a day.

re_success: the number of parcels that were unsuccessful earlier but were successfully delivered by the
FE in a day, i.e., the reattempts that were successfully delivered.

fr_success: the number of parcels that were fake remarked earlier but were successfully delivered by the
FE in a day, i.e., the total number of successful fake remarked reattempts.

non_fr_success: the number of parcels that were not delivered earlier because of genuine reasons but
were successfully delivered by the FE in a day, i.e., the total number of non-fake remarked successful
reattempts.



It is important to understand that success is the sum of FTR and re_success, where re_success is the sum
of fr_success and non_fr_success.

fake_remark: the number of fake remarks entered by the FE in a day. This is the primary treatment
variable of interest.

In addition to the above, we include several control variables that are likely to affect our dependent
variables. As workload can affect a worker’s behavior (KC et al. 2020), we control for the total number of
parcels assigned to the FE. Similarly, it is important to control for the number of co-workers working on
the same day as it may affect the focal worker’s behavior. Next, the types of parcels – fresh, reattempts, or
cash on delivery – might affect the FE’s behavior. Additionally, metrics like the FE’s idle time and the
number of parcels that the FE didn’t even attempt provide information about the FE’s behavior. We
describe all control variables below.

parcels: the number of parcels that are allocated to the FE in a day.

reattempts: the number of parcels that were attempted earlier but were not delivered and are assigned to
the FE in a day.

fr_reattempts: the number of parcels that were fake remarked earlier and are assigned to the FE in a day.

non_fr_reattempts: the number of parcels that were not delivered earlier for genuine reasons and are
assigned to the FE in a day.

fe_count: the number of FEs working at the same local delivery center as the focal FE in a day.

first_attempt: the time in minutes that the FE takes to make the first delivery attempt of the day, i.e., the
time to the first attempt.

unattempted: the number of parcels that the FE didn’t attempt in a day.

cod: the number of parcels assigned to the FE in a day that require the FE to collect cash on
delivery.

idle_time: total FE idle time in minutes in a day. Idle time is defined as the time during which a FE does
not maneuver on her vehicle between two deliveries. The firm allows up to 15 minutes of idle time per
delivery for reasons like the FE walking between two apartments in the same building, waiting at the door,
and so on. Therefore, no idle time is associated with such deliveries. In all other cases, idle time is calculated
using GPS traces. The duration for which the FE’s vehicle moves slower than a threshold speed is classified
as idle time. The firm uses an established threshold of 6 km/hr. The firm calculates idle time as the
proportion of a rolling 5-minute time window in which the FE is moving at a speed < 6km/hr based on the
distance traveled by the FE.11
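
The idle-time rule in footnote 11 can be illustrated with a minimal sketch. The function name, the example speeds, and the choice to sum over consecutive non-overlapping 5-minute windows are assumptions made for illustration; the text does not state how the firm aggregates rolling windows into a daily total, and the 15-minute per-delivery allowance is not modeled here.

```python
def idle_minutes(speed_kmh: float, window_min: float = 5.0,
                 threshold_kmh: float = 6.0) -> float:
    """Idle time credited to one time window under the footnote-11 rule."""
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    if speed_kmh < threshold_kmh:
        # The slower the FE moves, the larger the idle share of the window.
        return (1.0 - speed_kmh / threshold_kmh) * window_min
    # At or above the threshold speed, the window carries no idle time.
    return 0.0


# Hypothetical average speeds (km/hr) in four consecutive 5-minute windows.
window_speeds = [0.0, 3.0, 6.0, 12.0]
total_idle = sum(idle_minutes(s) for s in window_speeds)
print(total_idle)
```

Under this rule a stationary FE is credited the full window as idle time, while an FE moving at half the threshold speed is credited half of it.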

4.2. Summary statistics and correlations

Descriptive statistics for the main variables are provided in Table 1. The final panel is unbalanced and
contains 46,300 FE-day observations. On average, a FE is assigned 63 parcels per day (including both new
deliveries and reattempts), and nearly two-thirds (36) of them are delivered on a daily basis (90% of which
are delivered in the first attempt). We observe 2.7 fake remarks by an FE on average per day, but sometimes
this number can be as high as 41. FEs have a pre-defined work area and are assigned parcels for that area.
On average 88% of the assigned parcels belong to sub-areas that the FE has visited before. Also, 45% of
the parcels required the FE to collect cash on delivery from the customer.

Pairwise correlations are reported in Table A.1, and we do not find multi-collinearity issues in our data.
In Figure A.1, we provide the distributions of all variables.

11 $idle\_time = \begin{cases} \left(1 - \dfrac{\text{speed of FE in km/hr}}{6}\right) \times (\text{time window}), & \text{if speed} < 6\ \text{km/hr} \\ 0, & \text{otherwise} \end{cases}$



Table 1: Descriptive Statistics for Panel Data

Variable                Min     Mean   Median      Max       SD
success                6.00    38.88    36.00   129.00    19.13
FTR                    0.00    36.62    34.00   127.00    18.39
re_success             0.00     2.23     1.00    40.00     2.70
fr_success             0.00     0.31     0.00    24.00     0.89
non_fr_success         0.00     1.92     1.00    36.00     2.36
fake_remark            0.00     2.72     2.00    41.00     4.05
visited_before         1.00    55.41    53.00   147.00    24.20
parcels               14.00    62.62    59.00   149.00    24.92
reattempts             0.00     5.32     4.00    64.00     5.37
fr_reattempts          0.00     0.79     0.00    34.00     1.63
non_fr_reattempts      0.00     4.53     3.00    46.00     4.61
fe_count               1.00     6.18     6.00    17.00     2.79
first_attempt          2.40    59.19    49.51   204.70    41.37
unattempted            0.00     1.76     1.00    34.00     2.88
idle_time              0.00    69.54    64.71   189.77    41.94
cod                    0.00    28.30    27.00   113.00    14.67
Note: N=46,300

4.3. Econometric Model & Identification

4.3.1. Econometric Model

We identify the effect of fake remarked parcels on the next-day productivity of the FE using the fixed
effects linear regression model below.

$$DV_{it} = \beta_1\, fake\_remark_{i(t-1)} + \beta_2\, controls_{it} + fe_i + date_t + c_i w_t + \varepsilon_{it} \qquad (1)$$

In equation (1), index i represents an FE and t denotes the day. 𝐷𝑉𝑖𝑡 represents our dependent variables:
success, FTR, fr_success, and non_fr_success. Using the DV success, we estimate the impact of fake remarks
on next-day delivery success, whereas the other three DVs provide the effect on a specific component
of success.

𝑓𝑎𝑘𝑒_𝑟𝑒𝑚𝑎𝑟𝑘𝑖(𝑡−1) is our treatment variable, which is lagged by one day for each FE, meaning that
we estimate the effect of yesterday’s fake remarks on today’s productivity.
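
As a minimal sketch of how such a within-FE lag could be built, the snippet below attaches to each FE-day row the fake-remark count from the same FE's previous observed working day. The row layout, the FE identifiers, and the treatment of gaps in an unbalanced panel (lagging to the previous observed day rather than the previous calendar day) are assumptions for illustration, not details taken from our data pipeline.

```python
from collections import defaultdict

# Hypothetical FE-day rows: (fe_id, day index, fake remarks entered that day).
rows = [
    ("fe_a", 1, 2), ("fe_a", 2, 0), ("fe_a", 4, 5),
    ("fe_b", 1, 1), ("fe_b", 3, 3),
]


def add_lagged_fake_remark(rows):
    """Attach each FE's fake-remark count from her previous observed day."""
    by_fe = defaultdict(list)
    for fe_id, day, fake in rows:
        by_fe[fe_id].append((day, fake))
    out = []
    for fe_id, obs in by_fe.items():
        obs.sort()   # chronological order within each FE
        prev = None  # no lag exists for an FE's first observed day
        for day, fake in obs:
            out.append({"fe": fe_id, "day": day,
                        "fake_remark": fake, "lag_fake_remark": prev})
            prev = fake
    return out


panel = add_lagged_fake_remark(rows)
# Rows without a lagged value cannot enter the regression.
usable = [r for r in panel if r["lag_fake_remark"] is not None]
print(len(panel), len(usable))
```

The lag never crosses FEs, and the first observation of each FE drops out of the estimation sample because it has no lagged treatment.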



We control for several variables that can affect the delivery success rate as described in section 4.1.
Specifically, we control for parcels, fe_count, first_attempt, and cod. In addition to these contemporaneous
variables, we control for lagged variables lag.unattempted and lag.idle_time that could affect next-day
success.

We also control for time-invariant and time-variant fixed effects, i.e., FE fixed effects (𝑓𝑒𝑖 ) as well as
date fixed effects (𝑑𝑎𝑡𝑒𝑡 ). FE fixed effects take care of individual-level heterogeneity, and date fixed effects
control for factors such as the increase in gasoline prices or seasonality (festive seasons, weather, etc.).
Furthermore, after each weekly review, there could be center-level policy changes (e.g., changes in
reporting time, management change, etc.). Therefore, we control for center-week (𝑐𝑖 𝑤𝑡 ) fixed effects.

4.3.2. Endogeneity

The decision to enter a fake remark is likely to be endogenous because of the archival nature of the data.
There can be multiple threats to the identification of 𝛽1 like sample selection, unobserved heterogeneity,
reverse causality, omitted correlated variables (Antonakis et al. 2014), and interference. We will discuss
these in detail along with the solutions we adopted.

Selection Bias - One may argue that there might be an endogenous sorting of FEs across delivery
centers. Namely, FEs who are more likely to make fake remarks choose delivery centers in areas where
completing a delivery is more complicated. We can rule out this concern, given the operating style of the
LMD firm we collaborated with. The delivery centers are strategically placed in a city, and they hire FEs
who are in the service area of that delivery center and are well aware of the work area. In most cases, these
delivery areas are disjoint and do not serve overlapping regions. Hence, there are not many choices for an
FE to select a delivery center. We control for FE fixed effects, which absorb time-invariant factors such as
the center in which the FE works.

Reverse Causality - Productivity or success is dependent on fake remarks. At the same time, success
can also affect an FE’s actions. Therefore, we use a temporal lag and study the impact of the previous day’s
fake remarks on the next day’s productivity to rule out reverse causality (simultaneity) concerns. It is
common to use lagged variables in both management (Ghose and Han 2011) and economics (Bollinger et
al. 2020) research to address reverse causality. FEs are also not aware of the new parcels that they will be
assigned tomorrow, which prevents them from making fake remarks today in anticipation of tomorrow’s
load.

Omitted Variable Bias - We cannot observe all FE- and time-varying factors that are correlated with
both the DV and our treatment variable fake_remark. Consequently, our identification is threatened
by omitted variable bias. For instance, “strategically selecting deliveries for fake remarks based on
reattempt success foresight” is an omitted variable that can impact both the productivity of the FE and her
probability of entering a fake remark. In other words, a FE may enter a fake remark on a parcel based on
her anticipated probability of success in delivering that parcel the next day. This is also supported by our
interviews, where one FE mentioned, “Customers of reputed e-commerce merchants like Amazon are easily
available and accept easily. For them, I don’t feel the need to make a call. For parcels from other
merchants, it is always better to call the customer beforehand”. Similar sentiments were echoed by another
FE, “I know the customers who always take a lot of time, so now I call them before going to their address
and give them the information that they have a parcel today”. We provide some findings from our data in
section 4.1 to support this claim of strategic behavior.

Figure 3: Omitted Correlated Variable

Figure 3 represents the omitted strategic behavior of the FE that should have a positive effect on her
future productivity as well as her propensity to enter fake remarks. If we do not account for the FE’s
strategic behavior, the positive correlation with subsequent success is absorbed in 𝛽1 , resulting in an
underestimation of the effect of fake remarks on productivity.
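
The direction of this bias can be made explicit with the textbook omitted-variable formula. As a simplified sketch that ignores the controls and fixed effects, suppose the outcome also depends on an unobserved strategic-behavior term $s_{i(t-1)}$ with coefficient $\gamma > 0$; both symbols are introduced here only for illustration.

```latex
DV_{it} = \beta_1\, fake\_remark_{i(t-1)} + \gamma\, s_{i(t-1)} + \varepsilon_{it},
\qquad
\operatorname{plim}\, \hat{\beta}_1^{OLS}
  = \beta_1 + \gamma\,
    \frac{\operatorname{Cov}\!\left(fake\_remark_{i(t-1)},\, s_{i(t-1)}\right)}
         {\operatorname{Var}\!\left(fake\_remark_{i(t-1)}\right)}
```

When $\gamma$ and the covariance are both positive, the second term is positive, so a negative $\beta_1$ would be estimated as closer to zero than it truly is.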

An alternative source of omitted variable bias is that FEs are more likely to make fake remarks when
they are in conditions that would lead to less success in the future (e.g., if there is anticipated unobserved
traffic congestion in their delivery region). Following similar logic, this would lead to an overestimation of
the effects of fake remarks on productivity, although we do not find this to be the case in our analysis and
find more evidence in favor of the strategic behavior mechanism depicted in Figure 3.

To address the bias, we use instrumental variable (IV) regression to estimate the effect of fake remarks
on productivity using two-stage least squares (2SLS) regression. Details of the two stages and the instrument
used are provided in the next section.

Unobserved Heterogeneity - FEs may be heterogeneous in many ways, and the same is true for delivery
centers. Namely, FEs who have a lower work ethic may also be more prone to make fake remarks, leading
to a spurious relationship between fake remarks and failed deliveries. We address this concern by
controlling for two sets of fixed effects. We use FE fixed effects to account for the FE’s time invariant
characteristics like gender and education, as well as unobservables like work ethic. We use date fixed effects
to control for the day-specific heterogeneity such as weather or festive occasions. Since 87% of FEs worked
at only one center, the center is largely time-invariant, and hence, FE fixed effects take care of center-level
heterogeneity. However, it is important to account for time-variant heterogeneity, commonly known as
interactive fixed effects (Bai 2009). We include additional center-week fixed effects to capture time-varying
unobservables. We choose week-level interactive fixed effects because weekly reports and audits are in
place to analyze and course correct the performance of a center. In the next sub-section on instrumental
variables (section 4.3.3), we explain why this is needed for our regression estimates to be unbiased.

Interference – Because of the platform nature of our setting, decisions of an FE might affect the
decisions of other FEs. Therefore, such network effects, known as ‘interference between units (treated and
control)’, will threaten our identification (Manski 2013). For instance, if we have limited parcels for
delivery in one area with two FEs, then identification of 𝛽1 will be difficult. For unbiased estimates we
need the units to not interfere, and the outcome for one unit must not depend on the treatment of other units
(Rubin 2005). In our setting, FEs have expansive pre-defined work areas and an ample supply of parcels,
which eliminates competition among the FEs for successful deliveries. Therefore, we believe that
interference does not affect our estimates.

Serial Correlation – One might argue that the fake remarks of a FE exhibit serial correlation which
would require that we control for serially correlated unobservables to identify the effect. Bollinger and
Gillingham (2012) suggest that time variant fixed effects (center-week in our case) mitigate this problem
to some extent. Angrist and Pischke (2008), Leszczensky and Wolbring (2022), and Bollinger and
Gillingham (2012) discuss that adding a lagged dependent variable is a better method to account for serial
correlation in dynamic panel models. However, including a lagged dependent variable in a fixed effects
model results in bias in the estimates, particularly in small panels (Nickell 1981). It is a well-known issue
in the literature and Leszczensky and Wolbring (2022) talk about how the solutions to this problem have
evolved over time starting from the Arellano–Bond estimator to system generalized method of moments
and, more recently, the maximum likelihood – structural equation modeling (ML-SEM) method. But the
authors also highlight the shortcomings of all these methods and acknowledge that none of them can
completely remove the bias.

Therefore, we apply the estimation strategy proposed by Angrist and Pischke (2008), leveraging the
bracketing property of fixed effects and lagged dependent variables estimates. This strategy is also used in
some empirical research (e.g., Falk et al. 2018, Lee et al. 2014, Pylypchuk et al. 2022). The bracketing
property implies that the true effect of interest (say 𝛽1 ) is bounded by two estimates – a) estimate with FE
and without lagged dependent variable (𝛽𝐹𝐸 ) and b) estimate with lagged dependent variable without FE
(𝛽𝐿𝐷𝑉 ). Therefore, if we employ a fixed effects model with no lagged dependent variable (equation 2), then
the estimated effect will be too large, i.e., 𝛽1 < 𝛽𝐹𝐸 . However, if the fixed effects model captures the true
data generating process and we employ a lagged dependent variable model (equation 3), then the estimated
effect will be too small, i.e., 𝛽𝐿𝐷𝑉 < 𝛽1 . Consequently, 𝛽𝐿𝐷𝑉 is the lower bound and 𝛽𝐹𝐸 is the upper
bound, i.e., 𝛽𝐿𝐷𝑉 < 𝛽1 < 𝛽𝐹𝐸 .

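$$DV_{it} = \beta_{FE}\, fake\_remark_{i(t-1)} + \beta_2\, controls_{it} + fe_i + date_t + c_i w_t + \varepsilon_{it} \qquad (2)$$

$$DV_{it} = \beta_{LDV}\, fake\_remark_{i(t-1)} + \delta\, DV_{i(t-1)} + \beta_2\, controls_{it} + date_t + c_i w_t + \varepsilon_{it} \qquad (3)$$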

Since we expect our 𝛽1 to be negative, we will use the conservative estimate 𝛽𝐹𝐸 as our main effect and
provide the comparisons in the Appendix. Table A.5 and Table A.6 present the fixed effect and lagged
dependent variable estimation for 𝑠𝑢𝑐𝑐𝑒𝑠𝑠𝑖𝑡 and 𝐹𝑇𝑅𝑖𝑡 as DVs. We find that the interval [𝛽𝐿𝐷𝑉 , 𝛽𝐹𝐸 ] is
small and the difference between the two coefficients is statistically insignificant, which limits our concerns
about bias induced by serially correlated unobservables.

4.3.3. Instrumental variables

Although there is no definitive empirical test for endogeneity, we conduct the Wu-Hausman test, and its
result is consistent with the fake_remark variable being endogenous. We use the fake remarks
entered by the co-workers of our focal FE to instrument for the endogenous variable fake_remark. Norton
et al. (2003) suggest that the peers or co-workers (people whom we identify with) motivate behavior change.
Misconduct and peer effects are not extensively studied in OM but are a well-studied topic in different
contexts. For example, Gino et al. (2009) conducted laboratory experiments to study the effects of unethical
peer behavior, Dimmock et al. (2018) studied the peer effects of misconduct in finance, and Pierce and
Snyder (2008) showed the ethical spillover from an individual to the firm. Therefore, we borrow from the
peer effects literature to construct our instrument.

We identify co-workers as the FEs that work at the same delivery center as our focal FE. Since FEs have
different days off in the week, and they might take leave on some days, we calculate the co-worker set for
each day and include only those FEs that are working on that day. Let 𝐶𝑡 (𝑖) be the set of co-workers of FE
i on day t, not including the FE i. The number of co-workers of FE i on day t is represented by |𝐶𝑡 (𝑖)|. Next,
we define the variable 𝑐𝑜_𝑓𝑟_𝑎𝑣𝑔𝑖𝑡 (mean = 2.7, sd = 2.0) as the moving average, over the seven days ending on
day t, of the fake remarks per co-worker of our focal FE i, i.e.,

$$co\_fr\_avg_{it} = \frac{1}{7} \sum_{k=t-6}^{t} \left( \frac{1}{|C_k(i)|} \sum_{j \in C_k(i)} fake\_remark_{jk} \right)$$

We provide robustness checks in section 7 using different windows for our moving average such as 1,
3, and 10.
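
The construction of the instrument can be sketched as follows. The data layout (a mapping from day to the fake remarks of each FE working at the center that day), the FE identifiers, and the rule of returning no value when any day in the window has no co-worker are assumptions made for illustration; the text does not specify how incomplete windows are handled.

```python
def co_fr_avg(fake_by_day, focal, t, window=7):
    """Moving average of co-workers' mean fake remarks over days t-window+1..t.

    fake_by_day maps day -> {fe_id: fake remarks} for the FEs working at the
    focal FE's delivery center on that day. Returns None if any day in the
    window has no co-worker (an assumed rule, not taken from the paper).
    """
    daily_means = []
    for k in range(t - window + 1, t + 1):
        # C_k(i): FEs working on day k, excluding the focal FE herself.
        co = {fe: n for fe, n in fake_by_day.get(k, {}).items() if fe != focal}
        if not co:
            return None
        daily_means.append(sum(co.values()) / len(co))
    return sum(daily_means) / window


# Hypothetical center with three FEs over days 1..7; FE "c" is off on day 7.
fake_by_day = {k: {"a": 9, "b": 2, "c": 4} for k in range(1, 8)}
fake_by_day[7] = {"a": 1, "b": 2}

value_day7 = co_fr_avg(fake_by_day, "a", 7)  # six days at 3.0, one day at 2.0
value_day6 = co_fr_avg(fake_by_day, "a", 6)  # window reaches day 0, which is missing
print(value_day7, value_day6)
```

The focal FE's own fake remarks never enter the average, and the co-worker set is recomputed for every day in the window.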



FEs working at the same center tend to interact at the start and the end of their shifts. So, it is realistic to
believe that the focal FE’s fake remarks may be influenced by her co-workers’ fake remarks, and the effect
of this influence will be more prominently visible on the next-day behavior of the focal FE. At the same
time, past co-worker fake remarks should not directly affect the focal FE’s current-day delivery success.
Therefore, we use the lagged 𝑐𝑜_𝑓𝑟_𝑎𝑣𝑔𝑖𝑡 as an instrument. In addition, it helps us avoid reverse causality
issues. As shown in Figure 4, to estimate equation (1) we use 𝑐𝑜_𝑓𝑟_𝑎𝑣𝑔𝑖(𝑡−2) to instrument for the fake
remarks of the focal FE on day (t-1), 𝑓𝑎𝑘𝑒_𝑟𝑒𝑚𝑎𝑟𝑘𝑖(𝑡−1) , when studying productivity in day t, 𝑠𝑢𝑐𝑐𝑒𝑠𝑠𝑖𝑡 .

Figure 4: Co-worker Fake Remarks as an Instrumental Variable

We estimate the two stages of our 2SLS model separately to understand better how this instrument
works:

Stage 1: We estimate an OLS regression of the endogenous independent variable, 𝑓𝑎𝑘𝑒_𝑟𝑒𝑚𝑎𝑟𝑘𝑖(𝑡−1)
(fake remarks of the focal FE on day (t-1)), on the instrument 𝑐𝑜_𝑓𝑟_𝑎𝑣𝑔𝑖(𝑡−2) . We represent the estimated
value as $\widehat{fake\_remark}_{i(t-1)}$. This stage models the effect of co-workers’ fake remarks on fake remarks of
the focal worker.

Stage 2: Using the estimated value of fake remarks for the focal FE from Stage 1, we use OLS to estimate
the effect on her productivity with different dependent variables like success and FTR. We use clustered
standard errors for inference to account for any autocorrelation. Note that we use the same set of controls
and fixed effects in both stages as represented in equation 1.
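
The logic of the two stages can be illustrated on simulated data. The sketch below is not our estimation code: it uses a single instrument and no controls or fixed effects, and every number in it (the true effect of -0.6, the first-stage slope of -0.4, the sample size) is a hypothetical choice. It reports point estimates only and does not compute standard errors.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


random.seed(0)
n = 20000
TRUE_EFFECT = -0.6  # hypothetical value, not an estimate from the paper

z = [random.gauss(0, 1) for _ in range(n)]  # instrument (co-worker average)
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved strategic behavior
# Treatment depends on the instrument and on the omitted variable.
x = [-0.4 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
# Outcome depends on the treatment and, positively, on the omitted variable.
y = [TRUE_EFFECT * xi + 2.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

ols = slope(x, y)  # biased upward because u raises both x and y

# Stage 1: regress the treatment on the instrument and keep the fitted values.
b1 = slope(z, x)
a1 = mean(x) - b1 * mean(z)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress the outcome on the fitted treatment.
tsls = slope(x_hat, y)
print(round(ols, 2), round(b1, 2), round(tsls, 2))
```

In this simulation the naive OLS slope is pulled upward by the omitted variable, while the second-stage slope stays close to the assumed true effect because the fitted values vary only with the instrument.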

To make sure that our identification is robust, we need to take care of any relevant bias that can arise in
a peer effects setting (first stage). There are three main identification challenges in the study of peer effects
(Bollinger and Gillingham 2012, Bramoullé et al. 2009, Manski 1993) – selection bias (also known as
endogenous group formation), correlated unobservables, and reverse causality. Even though identification
of peer effects is not the main goal of this paper, we do need to make sure that our instrument does not have
a spurious correlation with the endogenous variable and that there is no reverse causality. It is sufficient if
we show that some peer action (which may be correlated with peer fake remarks) affects the focal FE’s
fake remarks, as long as that action does not affect the FE’s subsequent deliveries. Hence, our identification
requirements are less stringent than if we wanted to show evidence of a peer effect. Nevertheless, we use
standard techniques to ensure our first-stage estimate is as accurate as possible.

Reverse causality in peer behavior makes it difficult to determine whether peers affect the focal agent
or vice versa (Bramoullé et al. 2009). To avoid reverse causality, we take lagged co-worker fake remarks
as is common in the peer effects literature (similar to Bollinger et al. 2020 and Ghose and Han 2009). In
general, a lagged variable is a good proxy for current behavior (Hanushek et al. 2003).

To eliminate correlated unobservables or spurious correlations, we use a rich set of fixed effects (similar
to Bollinger and Gillingham 2012, Hanushek et al. 2003 and Oestreicher-Singer and Sundararajan 2012).
As mentioned earlier, we control for FE and date fixed effects to take care of any time-invariant unobserved
heterogeneity and time-specific correlated unobservables. For the time-variant characteristics, like changes
in center management or center-level policies based on weekly performance reports, we include the center-
week fixed effects. Our comprehensive set of fixed effects should account for correlated unobservables that
may affect both the focal FE’s and their peers’ behavior. Finally, selection or endogenous group formation
is less of a concern in this setting as FEs usually do not choose which center to sign up to as discussed
previously.

A valid instrument must satisfy three conditions. The first two, the exclusion restriction and relevance
condition, are explained in Wooldridge (2002). The exclusion restriction requires the IV to be uncorrelated
with the error term, and the relevance condition requires the IV to be correlated with the endogenous
variable. In simpler terms, the IV should explain the dependent variable only through its effect on the
endogenous variable. Lastly, the instrument needs to be strong for an unbiased estimate of the effect using
instrumental variable regression (Stock and Yogo 2005). The first stage of 2SLS confirms that our
instrument is relevant as the coefficient is significant (-0.401, p<0.01). The first stage F-stat is 276.3,
confirming that our instrument is strong (Stock and Yogo 2005).

Although there is no statistical test for exclusion restrictions, to support the exclusion condition (i.e.,
𝑐𝑜_𝑓𝑟_𝑎𝑣𝑔 does not directly impact 𝑠𝑢𝑐𝑐𝑒𝑠𝑠), it is reasonable to argue that past fake remarks of co-workers
cannot directly affect the ability of the focal FE to successfully deliver a parcel. Our interviews with FEs
also support this. As a result, any effect of co-worker fake remarks must be an indirect effect through the
focal FE’s fake remarks. Additionally, to mitigate any concerns regarding the correlated unobservables or
serial correlation threatening the exclusion restriction, we do the robustness analysis using lagged
dependent variable estimation and the bracketing property (as discussed in the previous section). Therefore,
we believe that our instrument satisfies all three conditions – the relevance condition, the exclusion
restriction, and instrument strength.



5. Results

We provide the results of our econometric model to estimate the impact of fake remarks on productivity
using success as the dependent variable in equation (1).

5.1. Effect of Misconduct

We present our results in Table 2. In column 1, we show the results using only control variables. We
observe that an increase in the total number of parcels assigned to the FE increases the number of parcels
successfully delivered (0.45, p<0.01). With more parcels, the FE has more scope to deliver. However, the
number of co-workers and the number of parcels not attempted on the previous day do not significantly
affect the FE’s productivity today. We also find that if more time is required to attempt the first parcel in
the morning, then productivity is lower on that day (-0.02, p<0.01). Idling or taking some rest on the
previous day positively affects productivity today (0.004, p<0.01). Finally, we observe that cash on delivery
has a positive impact on successful deliveries (0.203, p<0.01). We will discuss cash on delivery parcels in
detail in Section 6. In column 2, we include our treatment variable lag.fake_remark and estimate the model
using OLS. We find that previous day fake remarks do not affect productivity today. However, this result
does not seem logical and suffers from the identification challenges mentioned before.

In column 3, we estimate equation (1) using 2SLS. We provide the results of the first stage in Table A.2.
We discover that our IV, lag2.co_fr_avg, is relevant and negatively associated with lag.fake_remark
(-0.401, p<0.01), i.e., an increase in the average fake remarks by co-workers decreases the focal FE’s fake
remarks. A plausible rationale for this finding is that the focal FE is responding strategically. She might be
concerned about getting noticed if there are lots of fake remarks happening at the center, which can increase
the manager’s scrutiny of FE activity. Chan et al. (2021) find a similar effect in a misconduct (e.g.,
restaurant theft) context where workers are less likely to steal if their co-workers do so more on a given
day. The authors call it a strategic peer response because managers are more likely to intervene if they
observe a high level of theft, resulting in employees responding strategically. There are also several studies
in psychology and economics research that identify negative peer effects (Angrist 2014, Brady et al. 2017,
Rauhut 2013, Schreiner and Bremer 2013). For example, Emerson and Hill (2018) showed a negative peer
effect in marathon racing, and Pascual-Ezama et al. (2015) explained how unethical
behavior in the labor market can generate negative peer effects in the presence of supervision or due to the
risk of losing reputation.

We also validate the strength of our IV using the F-stat (276.3). In the second stage, we find that the
effect of the variable lag.fake_remark is significant and negative (-0.584, p<0.01). This negative coefficient
suggests that an increase in fake remarks on the previous day results in a decrease in the number of
successful deliveries today. This supports Hypothesis 1. The average number of parcels assigned to an FE
is 62.62 and 38.88 of them are successfully delivered. A decrease of 0.584 is equivalent to a 1.5%
(0.584/38.88) reduction in successful deliveries per day. During the period of study, the company was
delivering an average of 1.5 million parcels per day. Therefore, an additional fake remark can lead to 22,500
(0.015*1.5 million) fewer successful deliveries daily.
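
The arithmetic behind these figures can be reproduced with a short calculation. The inputs are the values reported in Table 1, Table 2, and the text; the variable names are illustrative. The extrapolation assumes that the percentage reduction estimated for one additional fake remark per FE in our sample applies uniformly to the firm's overall daily volume.

```python
mean_success = 38.88        # Table 1, mean of success
coef = -0.584               # Table 2, column 3, lag.fake_remark
daily_parcels = 1_500_000   # firm-wide daily average stated in the text

relative_drop = abs(coef) / mean_success       # share of successful deliveries lost
rounded_drop = round(relative_drop, 3)         # rounded to 1.5%, as in the text
fewer_deliveries = rounded_drop * daily_parcels
unrounded_fewer = relative_drop * daily_parcels

print(rounded_drop, fewer_deliveries, unrounded_fewer)
```

Using the unrounded ratio instead of 1.5% gives a figure of roughly 22,530, so the rounding does not materially change the conclusion.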

Table 2: Effect of Fake Remarks on Productivity

Dependent variable: success
                           (1)         (2)         (3)
lag.fake_remark                          0.014   -0.584***
                                       (0.017)     (0.206)
parcels               0.450***    0.450***    0.452***
                       (0.013)     (0.013)     (0.014)
fe_count                 0.003       0.003       0.002
                       (0.041)     (0.041)     (0.046)
first_attempt        -0.020***   -0.020***   -0.021***
                       (0.002)     (0.002)     (0.002)
lag.unattempted         -0.026      -0.029     0.126**
                       (0.022)     (0.022)     (0.054)
lag.idle_time         0.004***    0.004***    0.007***
                       (0.001)     (0.001)     (0.002)
cod                   0.203***    0.203***    0.215***
                       (0.021)     (0.021)     (0.023)
Fixed Effects:
fe                         Yes         Yes         Yes
date                       Yes         Yes         Yes
center-week                Yes         Yes         Yes
IV                          No          No         Yes
R²                       0.837       0.837       0.823
adj-R²                   0.827       0.827       0.811
N                       45,317      45,317      38,602
Note: *** p<0.01, ** p<0.05, * p<0.1; Standard errors are in parentheses.
Standard errors clustered at FE level.

5.2. Effect of Misconduct on Different Types of Deliveries

To test H2 and H3, we use data on the parcels that were fake remarked earlier and reattempted today as
well as first-time-right parcels. To do so, we identify the three different kinds of parcels (fresh parcels, fake
remarked reattempts, and genuine reattempts), meaning that we split success into its three components, one
for each parcel type. We estimate the effect of fake remarks on these specific types of deliveries, estimating
equation (1) with each of the three outcomes as dependent variables. We present our results in Table 3.

In column 1 of Table 3, we observe that lag.fake_remark has a positive coefficient of 0.101 (p<0.01) for
fr_success. That is, the parcels that were fake remarked earlier by the FE are more likely to be successfully
delivered on the next day, supporting our Hypothesis 2. Interestingly, the coefficient of lag.fake_remark for
fr_success (0.101) is positive while that for success is negative (-0.584). Since fr_success is just one
component of success, there must be a negative effect on the other components that is even larger in
magnitude than the overall effect on success and that drives the overall negative effect on productivity.
Some deliveries are getting negatively impacted by the increase in fake remarks.

Table 3: Effect of Fake Remarks on Different Delivery Types

                         Fake remarked    First time right   Non-fake remarked
                               success                                 success
                            fr_success                 FTR      non_fr_success
                                   (1)                 (2)                 (3)
lag.fake_remark               0.101***           -0.587***            -0.100**
                               (0.028)             (0.196)             (0.048)
Fixed Effects:
fe                                 Yes                 Yes                 Yes
date                               Yes                 Yes                 Yes
center-week                        Yes                 Yes                 Yes
controls                           Yes                 Yes                 Yes
R²                               0.301               0.812               0.296
adj-R²                           0.256               0.800               0.250
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in parentheses.
Standard errors clustered at FE level. Estimated with instrumental variable using 2SLS.

We use FTR as our dependent variable in equation (1) to find the missing piece of the puzzle. In column
2 of Table 3, we estimate the effect of an increase in fake remarks yesterday on the fresh parcels today.
Using the same IV, we find that the coefficient is significant and negative (-0.587, p<0.01), i.e., an increase
in fake remarks decreases FTR deliveries by 1.51% (or 0.587/38.88) on the subsequent day. Therefore, FTR
deliveries are negatively impacted by the increase in fake remarks, which supports our Hypothesis 3.

Column 3 of Table 3 presents the estimate for equation (1) with non_fr_success as the dependent
variable. This variable captures the delivery success of parcels that genuinely failed earlier. The
coefficient is significant and negative (-0.100, p<0.05), i.e., an increase in fake remarks reduces the
success of these parcels. These results suggest that FEs invest more effort in delivering fake remarked
reattempt parcels, thereby lowering their success rate for other parcels (fresh parcels and genuine
reattempts). In summary, out of the three parcel types, fr_success is positively affected while FTR and
non_fr_success are negatively affected by an increase in fake remarks on the previous day.
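
As a quick consistency check on Table 3, the three component coefficients should add up to roughly the
overall coefficient for success, and the 1.51% figure follows from dividing the FTR coefficient by 38.88.
The sketch below reproduces this arithmetic using only numbers reported in the text and tables. It assumes
that 38.88 is the sample mean of FTR; the text gives the ratio but does not label the denominator here.

```python
# Coefficients of lag.fake_remark copied from Table 3 (columns 1-3).
component_coefs = {"fr_success": 0.101, "FTR": -0.587, "non_fr_success": -0.100}
overall_coef = -0.584  # coefficient for success reported in the text

# If success is the sum of the three components, the component coefficients
# should add up to the overall coefficient, apart from rounding.
component_sum = sum(component_coefs.values())
rounding_gap = abs(component_sum - overall_coef)

# Relative effect on FTR. ASSUMPTION: 38.88 is the sample mean of FTR.
assumed_mean_ftr = 38.88
pct_drop_ftr = abs(component_coefs["FTR"]) / assumed_mean_ftr * 100

print(f"sum of component coefficients: {component_sum:.3f} (overall: {overall_coef})")
print(f"relative drop in FTR deliveries: {pct_drop_ftr:.2f}%")
```

The small gap between the component sum and the overall coefficient is what one would expect from
rounding in the reported values.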

There are two plausible explanations for these results. First, the pressure from the delivery center
manager to deliver the fake remarked parcels that are identified in the daily end-of-day audit may encourage
FEs to focus more on these parcels. Second, FEs may anticipate which of today’s parcels they are likely to
successfully deliver on the subsequent day and enter fake remarks for these parcels to reduce their workload
today. We believe that both forces are in action. However, we can’t separately test for an increase in
managerial oversight. As a result, we focus on FE strategic behavior, which is supported by the extant
literature (as discussed in section 3), our interviews with FEs and by patterns in the data (as discussed
below).

FEs have a sense of which parcels have a higher probability of successful delivery based on their
experience in the work area, and they choose to enter fake remarks for such parcels. The higher the
probability of success on the following day, the greater the propensity to enter a fake remark and the lower
the chance of the FE getting caught for this misbehavior, as she can easily complete the delivery on the
following day. Therefore, we see a positive impact of fake remarks on the success of parcels that were fake
remarked earlier (fr_success) in Table 3.

FEs have no information about which fresh parcels they will have to deliver on the subsequent day, and
hence no control over them. However, because FEs have tactically selected the fake remarked reattempts, the
extra effort they devote to these parcels results in a productivity loss for fresh parcel deliveries. Hence,
the majority of the overall productivity loss of an FE is driven by the impact of fake remarks on FTR
deliveries. The effect on genuine reattempt parcels is weaker: we believe that FEs are not concerned about
getting caught in these cases and do not put in additional effort for them. Nevertheless, genuine reattempt
success is also negatively affected by fake remarked reattempt parcels.

In summary, these results support our claim that the FE can anticipate the chances of reattempt success
for a fake remarked parcel. Table 3 shows that fake remarks are correlated with the FE’s knowledge of
next-day success probability: fake remarked reattempt success increases while both FTR and genuine
reattempt success decrease. As additional descriptive evidence of this mechanism, we find that the median
time to deliver a fake remarked reattempt parcel is 4.18 minutes, compared with 5.28 minutes for non-fake
remarked reattempt parcels and 6.58 minutes for fresh parcels. These values are statistically different from
one another and support our hypothesis that fake remarks are a strategic behavior by the FE.

6. Effects of Opportunistic Environments

In addition to the main effect, we examined situations that can exacerbate the effect of fake remarks. We
define opportunistic circumstances as the conditions when it is easier to enter a fake remark and the
probability of getting caught is lower. Although we cannot measure opportunism precisely, we can identify
variables that strongly signal opportunistic circumstances associated with parcel delivery and are supported
by the literature (Bryson and Forth 2007, Mao et al. 2019) and practice.12 In this section, we will analyze
our results using two such variables: visited_before and cod, and study the impact of their interaction effects
with fake remarks on FE productivity.

Table 4: Effect of Familiarity on Productivity

                                   success
                                   (1)         (2)
lag.fake_remark                    -0.500***   1.659***
                                   (0.100)     (0.299)
lag.fake_remark × visited_before               -0.028***
                                               (0.004)
visited_before                     0.868***    0.935***
                                   (0.009)     (0.010)
Fixed Effects:
fe                                 Yes         Yes
date                               Yes         Yes
center-week                        Yes         Yes
controls                           Yes         Yes
R²                                 0.930       0.926
adj-R²                             0.926       0.921
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in parentheses.
Standard errors clustered at FE level. Estimated with instrumental variable using 2SLS.

Each parcel has a consignee address that belongs to a pin code (postal code). In India, there are around
19,100 postal codes13 and the land area of India is 2,973,190 sq. km.14 This implies that, on average, each
postal code covers an area of about 155 sq. km (equivalent to a circle with a radius of about 7 km). If the
FE has visited the consignee’s pin code before (since the start of our data), we say that the FE has visited
the pin. We define the variable visited_before, which indicates the total number of pin codes in the
dispatch that have been visited by the FE until the previous day. We use this variable as a proxy for the
familiarity of the FE with the delivery area.

12 https://tedium.co/2020/06/12/cod-cash-on-delivery-history/
13 https://data.gov.in/catalog/all-india-pincode-directory
14 https://data.worldbank.org/indicator/AG.LND.TOTL.K2?locations=IN
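
To make the definition of visited_before concrete, the following is a minimal sketch of one way such a
variable could be computed from dispatch records. The function name, the record layout, and the toy pin
codes are hypothetical and are not taken from the firm's data. The sketch counts distinct pin codes in the
day's dispatch that the FE had already visited on an earlier day; counting parcels rather than distinct pin
codes would be an equally plausible reading of the definition. It also assumes one dispatch per FE per day.

```python
from collections import defaultdict

def visited_before_counts(dispatches):
    """Count, per FE-day, the pin codes in the dispatch already visited on earlier days.

    dispatches: iterable of (fe_id, day, list_of_pin_codes); day is an integer.
    ASSUMPTION: each FE has at most one dispatch per day.
    """
    seen = defaultdict(set)  # pin codes each FE has visited up to the previous day
    counts = {}
    for fe_id, day, pins in sorted(dispatches, key=lambda rec: rec[1]):
        todays_pins = set(pins)
        counts[(fe_id, day)] = len(todays_pins & seen[fe_id])
        seen[fe_id] |= todays_pins  # today's pins count as visited from tomorrow on
    return counts

# Hypothetical toy data: two FEs, three days.
toy_dispatches = [
    ("A", 1, ["110001", "110002"]),
    ("A", 2, ["110002", "110003"]),
    ("B", 2, ["110001"]),
    ("A", 3, ["110001", "110002", "110003"]),
]
toy_counts = visited_before_counts(toy_dispatches)
print(toy_counts)
```

Because the running set is updated only after the day's count is taken, the variable uses information up to
the previous day, as in the definition above.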

We study the moderating effect of visited_before using the interaction term, as reported in Table 4. We
find that the interaction effect is significant and negative (-0.028, p<0.01), i.e., the effect of fake remarks
is exacerbated as the FE becomes more acquainted with the delivery area. Note that the main effect of fake
remarks becomes positive (1.659). However, the overall effect is always negative within the range of our
data as the interaction effect dominates. Although the literature (Mao et al. 2019) suggests that familiarity
with the delivery area increases the efficiency of the delivery person, our results indicate that if the FE has
visited the area before, then this location familiarity leads to a lower success rate (in the presence of fake
remarks). When an FE becomes familiar with the delivery area, it becomes easier for her to shirk and justify
lower productivity with more authentic excuses. This gives her the opportunity to allocate more effort to
covering up for the previous day’s fake remarks. We also confirm that the propensity to enter fake remarks
has a positive association with visited_before (0.007, p<0.05), which can be verified from stage 1 of the
2SLS model, suggesting that FEs are more likely to enter fake remarks for parcels from a familiar area. We
do not use visited_before as a control in our main model because it is highly correlated with parcels (0.85).
Table A.4 provides the first stage of the model with visited_before as an additional control.

Familiarity is good if people are honest but can hurt if they are entering more fake remarks. In other
words, fake remarks reduce the benefits of experience in the delivery area because of an increase in the
likelihood of strategic behavior. This provides additional support to our mechanism that the observed results
are driven by the strategic behavior of the FEs as familiarity with the delivery area is not affected by the
manager’s pressure.

Next, we investigate the moderating effect of COD parcels. Table 5 provides both the direct effect and
the interaction model. Interestingly, we observe that the direct effect of cod is positive (0.215, p<0.01)
whereas the interaction effect is negative and significant (-0.047, p<0.01), i.e., an increase in the number
of COD parcels increases overall success but exacerbates the effect of fake remarks on FE productivity. To
understand this result, we need to examine it from two different perspectives. From the perspective of a
consignee, COD is still a preferred mode of payment because the consignee doesn’t want to pay upfront
and may fear online scams,15 which improves delivery success for COD parcels (main effect). For the
interaction effect, consider the FE’s point of view. In our interviews, one of the FEs mentioned, “I
deliver 10-11 prepaid parcels in an hour. But if you get COD parcels to deliver, customers take a lot of
time. Some customers take the parcel inside and then forget that I am waiting outside for the cash”. Another
FE said, “Some customers take the parcel inside; open it and then return the parcel to me and refuse to pay
for it also. I have faced this couple of times”. COD parcels are generally not preferred by FEs because they
involve more waiting time, the hassle of collecting cash and providing change to the customer, and
depositing the cash at the delivery center at the end of the day. In Table A.2, we observe that fake
remarks have a positive association with cod (0.013, p<0.01), meaning that FEs are more likely to fake
remark such parcels.

15 https://tedium.co/2020/06/12/cod-cash-on-delivery-history/

Table 5: Effect of Cash on Delivery (COD) on Productivity

                        success
                        (1)         (2)
lag.fake_remark         -0.584***   1.397***
                        (0.206)     (0.353)
lag.fake_remark × cod               -0.047***
                                    (0.006)
cod                     0.215***    0.379***
                        (0.023)     (0.029)
Fixed Effects:
fe                      Yes         Yes
date                    Yes         Yes
center-week             Yes         Yes
controls                Yes         Yes
R²                      0.823       0.820
adj-R²                  0.811       0.809
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in parentheses.
Standard errors clustered at FE level. Estimated with instrumental variable using 2SLS.

Moreover, since the customers have not paid upfront, there is a lower probability of complaints in the
case of delivery failure. Furthermore, more remarks seem genuine in the case of a COD parcel, such as
“cash not available with the consignee”, which allows the FE to strategically focus on her previous-day
deviance. Overall, it is reasonable to find that COD parcels aggravate the impact of deviance on
productivity. COD also increases the propensity to make fake remarks: regressing fake_remark on cod, we
find that the coefficient of cod is positive and significant, supporting our statement.
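
The interaction models in Tables 4 and 5 imply that the marginal effect of lag.fake_remark on success
depends on the moderator: it equals the main coefficient plus the interaction coefficient times the
moderator. The sketch below works through that arithmetic with the reported point estimates. It ignores
sampling uncertainty, and the moderator values passed in at the end (100 and 50) are hypothetical
illustrations rather than values from the data.

```python
def marginal_effect(main_coef, interaction_coef, moderator):
    """Marginal effect of lag.fake_remark on success at a given moderator value."""
    return main_coef + interaction_coef * moderator

def break_even(main_coef, interaction_coef):
    """Moderator value at which the marginal effect changes sign."""
    return -main_coef / interaction_coef

# Point estimates copied from column 2 of Table 4 and column 2 of Table 5.
familiarity_main, familiarity_inter = 1.659, -0.028
cod_main, cod_inter = 1.397, -0.047

visited_threshold = break_even(familiarity_main, familiarity_inter)
cod_threshold = break_even(cod_main, cod_inter)

# Hypothetical moderator values, chosen only for illustration.
effect_at_100_visited = marginal_effect(familiarity_main, familiarity_inter, 100)
effect_at_50_cod = marginal_effect(cod_main, cod_inter, 50)

print(f"break-even visited_before: {visited_threshold:.2f}")
print(f"break-even cod: {cod_threshold:.2f}")
print(f"effect at visited_before = 100: {effect_at_100_visited:.3f}")
print(f"effect at cod = 50: {effect_at_50_cod:.3f}")
```

Under these point estimates, the marginal effect is negative whenever the moderator exceeds the break-even
value, which is the condition behind the statement that the interaction effect dominates within the range
of the data.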

In a nutshell, the interaction models we studied in this section imply that opportunism does exist and
that opportunistic circumstances intensify the effect of fake remarks. It is easier for an FE to shirk in a
familiar geography. COD parcels also provide an opportunity to shirk, thereby exacerbating the effect of
aberrant behaviors. These results provide support for our Hypothesis 4 that the productivity loss due to
aberrant behavior is exacerbated in opportunistic circumstances.

7. Robustness Tests

Table 6: IV Robustness

success
(1) (2) (3) (4) (5)
lag.fake_remark -0.584*** -0.612** -0.508*** -0.546** -0.546**
(0.206) (0.254) (0.166) (0.261) (0.261)
Fixed Effects:
fe Yes Yes Yes Yes Yes
date Yes Yes Yes Yes Yes
center-week Yes Yes Yes Yes Yes
controls Yes Yes Yes Yes Yes
IV co_fr_avg co_fr_avg1 co_fr_avg3 co_fr_avg10 co_fr_shift_avg
R² 0.823 0.825 0.827 0.824 0.830
adj-R² 0.811 0.814 0.816 0.812 0.819
N 38,602 43,414 42,087 36,163 41,588
Note: *** p<0.01, ** p<0.05, * p<0.1; Standard errors are in parentheses. Standard errors clustered at FE
level.

In this section, we discuss the robustness of our estimates using alternative instrumental variables. We
calculate the historical moving average of coworkers’ fake remarks while varying the length of the time
window (1, 3, and 10 days) to validate the robustness of our main result, which uses the past week’s peer
fake remarks. To avoid any simultaneity issues, we start the window at day (t-2) because our
endogenous variable lag.fake_remark contains the fake remarks of our focal FE on day (t-1). Column 2 of
Table 6 reports our results using co_fr_avg1, i.e., the average fake remarks by coworkers on day (t-2), as
the IV. Comparing with column 1 of Table 6 (same as the main result from Table 2), we observe that the
results are robust. In columns 3 and 4 of Table 6, we use the time window lengths of 3 and 10 days,
respectively, and confirm the robustness of our results.
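
The construction of these instruments can be sketched as follows. This is an illustration under stated
assumptions, not the authors' code: the function name, the record layout, and the toy data are
hypothetical; coworkers are taken to be all other FEs at the same center on the same day; and the
instrument is left undefined when any day in the window has no coworker data.

```python
from collections import defaultdict

def coworker_iv(records, window):
    """Average coworker fake remarks over days t-2, ..., t-1-window for each FE-day.

    records: list of dicts with keys center, fe, day (int), fake_remark.
    Returns {(fe, day): instrument value}; FE-days with an incomplete window are omitted.
    """
    by_center_day = defaultdict(dict)
    for rec in records:
        by_center_day[(rec["center"], rec["day"])][rec["fe"]] = rec["fake_remark"]

    instrument = {}
    for rec in records:
        daily_peer_means = []
        for lag in range(2, 2 + window):  # the window starts at day t-2
            peers = by_center_day.get((rec["center"], rec["day"] - lag), {})
            others = [v for fe, v in peers.items() if fe != rec["fe"]]  # exclude focal FE
            if not others:
                break
            daily_peer_means.append(sum(others) / len(others))
        if len(daily_peer_means) == window:
            instrument[(rec["fe"], rec["day"])] = sum(daily_peer_means) / window
    return instrument

# Hypothetical toy panel: one center, three FEs, five days.
toy_fake_remarks = {"A": [1, 2, 3, 4, 5], "B": [0, 0, 0, 0, 0], "C": [2, 2, 2, 2, 2]}
toy_records = [
    {"center": "C1", "fe": fe, "day": day, "fake_remark": values[day - 1]}
    for fe, values in toy_fake_remarks.items()
    for day in range(1, 6)
]
iv_window_1 = coworker_iv(toy_records, window=1)
iv_window_3 = coworker_iv(toy_records, window=3)
print(iv_window_1[("B", 3)], iv_window_3[("B", 5)])
```

Excluding the focal FE from the average and starting the window at day (t-2) mirrors the reasoning in the
text: the instrument should not mechanically contain the focal FE's own fake remarks on day (t-1).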

We also verify the consistency of our results by using an alternative definition of coworkers. In this
alternative definition, we treat as coworkers only those FEs who have the same shift timings and work at
the same delivery center. We have seven different shift timings (07:00 - 16:00, 08:00 - 17:00, 09:00 - 18:00,
etc.) in our data. Based on this new definition, we calculate co_fr_shift_avg as the average number of fake
remarks that coworkers of the focal FE enter on day (t-2) and use it as the IV. Column 5 of Table 6 presents
the result using co_fr_shift_avg as the instrumental variable. Once again, the coefficient of lag.fake_remark
is negative and significant (-0.546, p<0.05). The first stage estimates for each IV are reported in Table
A.3. We observe that all of our alternative instruments are relevant and strong.

Finally, we ran the analysis with percent success (= success / parcels) as the DV instead of success to
address the concern that the effect is driven by higher workload. We find robust results for Table 2
(effect of misconduct on productivity), Table 4 (effect of familiarity on productivity), and Table 5 (effect
of cash on delivery on productivity). Results are available from the authors upon request.

8. Discussion and Concluding Remarks

Our paper provides one of the first analyses of last mile delivery operations that studies the impact of
misconduct on productivity. The results suggest that fake remarks on a given day not only affect
productivity on that day but also cause operational losses to spill over to the future. To establish our results,
we estimated a panel regression on a very large dataset of last mile delivery drivers. We use instrumental
variables to address endogeneity concerns, and our instrument is inspired by the peer effects literature.
Assuming a nominal commission for each successful delivery in a month, this result could imply
million-dollar losses per year for the LMD firm alone. The logic used to identify fake remarks is conservative
because false positives would be seen as accusations and would be costly to the firm. As a result, we believe
that our estimate of the losses is smaller than the actual damage.

The decrease in total successful deliveries is mostly driven by the reduction in first-time-right deliveries.
When workers strategically shirk on any given assignment, their productivity on new assignments is also
affected because it takes time and effort to make up for the shirking. This hurts the third-party logistics firm
even more because reattempts are costly in such a low-margin, highly competitive industry like LMD. The
consequences of aberrant behavior in last mile delivery go beyond the immediate productivity loss studied
in the extant literature.

This paper provides managerial insights that can be applied in practice as mitigation strategies.
Opportunistic environments facilitate aberrant behaviors and consequently exacerbate the impact of fake
remarks. In the current context, we observed that familiarity with the work area and cash on delivery parcels
provide favorable circumstances for the FEs to shirk. Center managers should anticipate aberrant behaviors
in these settings and allocate parcels accordingly.

As a side note, Ibanez et al. (2018) studied the impact of employees’ discretion in task completion
sequence on productivity in a healthcare setting. A similar study can be conducted in LMD to understand
whether an FE’s discretion over the parcel attempt sequence affects her productivity. Another interesting
question to pursue is how workers’ productivity varies with parcels originating from different
e-commerce platforms.

There are certain limitations to our work. In some rare cases, remarks that are classified as fake could
be legitimate, e.g., when the customer makes a false complaint. However, we believe that there will be very
few such instances and that not much can be done to identify them. Also, the archival nature of the data
prevents us from disentangling some mechanisms that might be driving the spillover effect we identify. These
might include pressure from the center manager, coordination among FEs, etc. Furthermore, the FE’s behavior
might also depend on her perception of the consignee and how consignees behave on a particular day
(Altman et al. 2021). Although this is outside the scope of our paper, it might be an interesting question to
study in future research. One might question the generalizability of our results to other demographics. As
mentioned earlier, this problem is not just limited to India, and the purpose of this paper is to highlight the
importance of workers’ behavior in last mile efficiency. We hope that this study will motivate future
research on individual behavioral biases in service operations.

9. References

Ackerman PL, Tatel CE, Lyndgaard SF (2020) Subjective (dis)utility of effort: mentally and physically
demanding tasks. Cogn. Res. Princ. Implic. 5(1):26.
Alchian AA, Demsetz H (1972) Production, Information Costs, and Economic Organization. Am. Econ.
Rev. 62(5):777–795.
Altenried M (2019) On the Last Mile: Logistical Urbanism and the Transformation of Labour. Work
Organ. Labour Glob. 13(1):114–129.
Altman D, Yom-Tov GB, Olivares M, Ashtar S, Rafaeli A (2021) Do Customer Emotions Affect Agent
Speed? An Empirical Study of Emotional Load in Online Customer Contact Centers. Manuf. Serv.
Oper. Manag. 23(4):854–875.
Angrist JD (2014) The Perils of Peer Effects. Labour Econ. 30:98–108.
Angrist JD, Pischke JS (2008) Chapter 5. Parallel Worlds: Fixed Effects, Differences-in-Differences, and
Panel Data. (Princeton University Press), 221–248.
Antonakis J, Bendahan S, Jacquart P, Lalive R (2014) Causality and Endogeneity (Oxford University
Press).
Armony M, Roels G, Song H (2021) Pooling Queues with Strategic Servers: The Effects of Customer
Ownership. Oper. Res. 69(1):13–29.
Arrow KJ (1985) The Economics of Agency. John W. Pratt and Richard J. Zeckhauser, ed. Princ. Agents
Struct. Bus. (Boston, Mass. : Harvard Business School Press), 37–51.
Bai J (2009) Panel Data Models With Interactive Fixed Effects. Econometrica 77(4):1229–1279.
Batt RJ, KC DS, Staats BR, Patterson BW (2019) The Effects of Discrete Work Shifts on a
Nonterminating Service System. Prod. Oper. Manag. 28(6):1528–1544.
Bollinger B, Burkhardt J, Gillingham KT (2020) Peer Effects in Residential Water Conservation:
Evidence from Migration. Am. Econ. J. Econ. Policy 12(3):107–133.
Bollinger B, Gillingham K (2012) Peer Effects in the Diffusion of Solar Photovoltaic Panels. Mark. Sci.
31(6):900–912.
Brady RR, Insler MA, Rahman AS (2017) Bad Company: Understanding Negative Peer Effects in
College Achievement. Eur. Econ. Rev. 98:144–168.
Bramoullé Y, Djebbari H, Fortin B (2009) Identification of Peer Effects Through Social Networks. J.
Econom. 150(1):41–55.
Bryson A, Forth J (2007) Productivity and Days of the Week [Discussion paper]. R. Soc. Encourag. Arts,
Manuf. Commer. (2007).
Burbano VC, Chiles B (2021) Mitigating Gig and Remote Worker Misconduct: Evidence from a Real
Effort Experiment. Organ. Sci.
Castillo VE, Mollenkopf DA, Bell JE, Esper TL (2022) Designing technology for on‐demand delivery:
The effect of customer tipping on crowdsourced driver behavior and last mile performance. J. Oper.
Manag.
Chan TY, Chen Y, Pierce L, Snow D (2021) The Influence of Peers in Worker Misconduct: Evidence
from Restaurant Theft. Manuf. Serv. Oper. Manag. 23(4):952–973.
Cho DD, Bretthauer KM, Cattani KD, Mills AF (2019) Behavior Aware Service Staffing. Prod. Oper.
Manag. 28(5):1285–1304.
Dimmock SG, Gerken WC, Graham NP (2018) Is Fraud Contagious? Coworker Influence on Misconduct
by Financial Advisors. J. Finance 73(3):1417–1450.
Eliyana A, Sridadi AR (2020) Workplace Spirituality and Job Satisfaction Toward Job Performance: The
Mediation Role of Workplace Deviant Behavior and Workplace Passion. Manag. Sci. Lett.
10(11):2507–2520.
Emerson J, Hill B (2018) Peer Effects in Marathon Racing: The Role of Pace Setters. Labour Econ.
52(April):74–82.
Fahimnia B, Pournader M, Siemsen E, Bendoly E, Wang C (2019) Behavioral Operations and Supply
Chain Management–A Review and Literature Mapping. Decis. Sci. 50(6):1127–1183.
Falk A, Kosse F, Menrath I, Verde PE, Siegrist J (2018) Unfair Pay and Health. Manage. Sci.
64(4):1477–1488.
Field JM, Victorino L, Buell RW, Dixon MJ, Meyer Goldstein S, Menor LJ, Pullman ME, Roth A V.,
Secchi E, Zhang JJ (2018) Service Operations: What’s Next? J. Serv. Manag. 29(1):55–97.
Gallien J, Graves SC, Scheller-Wolf A (2016) OM Forum—Practice-Based Research in Operations
Management: What It Is, Why Do It, Related Challenges, and How to Overcome Them. Manuf.
Serv. Oper. Manag. 18(1):5–14.
Gallino S, Karacaoglu N, Moreno A (2022) Need for Speed: The Impact of In-Process Delays on
Customer Behavior in Online Retail. Oper. Res. (May).
Ghose A, Han SP (2011) An Empirical Analysis of User Content Generation and Usage Behavior on the
Mobile Internet. Manage. Sci. 57(9):1671–1691.
Gino F, Ayal S, Ariely D (2009) Contagion and Differentiation in Unethical Behavior. Psychol. Sci.
20(3):393–398.
Gläser S, Jahnke H, Strassheim N (2021) Opportunities and Challenges of Crowd Logistics on the Last
Mile for Courier, Express and Parcel Service Providers – a Literature Review. Int. J. Logist. Res.
Appl. 0(0):1–29.
Hanushek EA, Kain JF, Markman JM, Rivkin SG (2003) Does Peer Ability Affect Student Achievement?
J. Appl. Econom. 18(5):527–544.
Harrington L (2019) Change at the Speed of the Consumer: How E-Commerce is Accelerating Logistics
Innovations. Dhl:1–12.
Haruvy E, Katok E, Pavlov V (2020) Bargaining Process and Channel Efficiency. Manage. Sci.
66(7):2845–2860.
Ibanez MR, Clark JR, Huckman RS, Staats BR (2018) Discretionary Task Ordering: Queue Management
in Radiological Services. Manage. Sci. 64(9):4389–4407.
Jiang ZZ, Kong G, Zhang Y (2021) Making the Most of Your Regret: Workers’ Relocation Decisions in
On-Demand Platforms. Manuf. Serv. Oper. Manag. 23(3):695–713.
Kamalahmadi M, Yu Q, Zhou YP (2021) Call to Duty: Just-in-Time Scheduling in a Restaurant Chain.
Manage. Sci. 67(11):6751–6781.
KC DS, Staats BR, Kouchaki M, Gino F (2020) Task Selection and Workload: A Focus on Completing
Easy Tasks Hurts Performance. Manage. Sci. 66(10):4397–4416.
Kiba-Janiak M, Marcinkowski J, Jagoda A, Skowrońska A (2021) Sustainable Last Mile Delivery on E-
Commerce Market in Cities from the Perspective of Various Stakeholders. Literature Review.
Sustain. Cities Soc. 71(December 2020):102984.
Kidwell RE, Bennett N (1993) Employee Propensity to Withhold Effort: A Conceptual Model to Intersect
Three Avenues of Research. Acad. Manag. Rev. 18(3):429.
Lee C, Rodríguez G, Glei DA, Weinstein M, Goldman N (2014) Increases in Blood Glucose in Older
Adults. J. Aging Health 26(6):952–968.
Leider S (2018) Behavioral Analysis of Strategic Interactions. Handb. Behav. Oper. (John Wiley & Sons,
Inc., Hoboken, NJ, USA), 237–285.
Leszczensky L, Wolbring T (2022) How to Deal With Reverse Causality Using Panel Data?
Recommendations for Researchers Based on a Simulation Study. Sociol. Methods Res. 51(2):837–
865.
Li J, Granados N, Netessine S (2014) Are Consumers Strategic? Structural Estimation from the Air-
Travel Industry. Manage. Sci. 60(9):2114–2137.
Liang C, Peng J, Hong Y, Gu B (2022) The Hidden Costs and Benefits of Monitoring in the Gig
Economy. Inf. Syst. Res.
Lim SFWT, Jin X, Srai JS (2018) Consumer-Driven E-Commerce. Int. J. Phys. Distrib. Logist. Manag.
48(3):308–332.
Luo J, Meyer JJ (2017) A formal account of opportunism based on the situation calculus. AI Soc.
32(4):527–542.
Mangiaracina R, Perego A, Seghezzi A, Tumino A (2019) Innovative Solutions to Increase Last-Mile
Delivery Efficiency in B2C E-Commerce: A Literature Review. Int. J. Phys. Distrib. Logist. Manag.
49(9):901–920.
Manski CF (1993) Identification of Endogenous Social Effects: The Reflection Problem. Rev. Econ. Stud.
60(3):531.
Manski CF (2013) Identification of Treatment Response with Social Interactions. Econom. J. 16(1):S1–
S23.
Mao W, Ming L, Rong Y, Tang CS, Zheng H (2019) Faster Deliveries and Smarter Order Assignments
for an On-Demand Meal Delivery Platform. SSRN Electron. J.
Nickell S (1981) Biases in Dynamic Models with Fixed Effects. Econometrica 49(6):1417.
Norton MI, Monin B, Cooper J, Hogg MA (2003) Vicarious Dissonance: Attitude Change from the
Inconsistency of Others. J. Pers. Soc. Psychol. 85(1):47–62.
Oestreicher-Singer G, Sundararajan A (2012) The Visible Hand? Demand Effects of Recommendation
Networks in Electronic Markets. Manage. Sci. 58(11):1963–1981.
Olsson J, Hellström D, Pålsson H (2019) Framework of Last Mile Logistics Research: A Systematic
Review of the Literature. Sustainability 11(24):7131.
Pascual-Ezama D, Dunfield D, Gil-Gómez de Liaño B, Prelec D (2015) Peer Effects in Unethical
Behavior: Standing or Reputation? Espinosa M, ed. PLoS One 10(4):e0122305.
Pierce L, Snyder J (2008) Ethical Spillovers in Firms: Evidence from Vehicle Emissions Testing.
Manage. Sci. 54(11):1891–1903.
Ping RA (1993) The Effects of Satisfaction and Structural Constraints on Retailer Exiting, Voice,
Loyalty, Opportunism, and Neglect. J. Retail. 69(3):320–352.
Pylypchuk Y, Parasrampuria S, Smiley C, Searcy T (2022) Impact of Electronic Prescribing of Controlled
Substances on Opioid Prescribing: Evidence From I-STOP Program in New York. Med. Care Res.
Rev. 79(1):114–124.
Rauhut H (2013) Beliefs about Lying and Spreading of Dishonesty: Undetected Lies and Their
Constructive and Destructive Social Dynamics in Dice Experiments. Perc M, ed. PLoS One
8(11):e77878.
Rautela H, Janjevic M, Winkenbach M (2021) Investigating the Financial Impact of Collection-and-
Delivery Points in Last-Mile E-Commerce Distribution. Res. Transp. Bus. Manag.:100681.
Robinson SL, Bennett RJ (1995) A Typology of Deviant Workplace Behaviors: A Multidimensional
Scaling Study. Acad. Manag. J. 38(2):555–572.
Roels G, Staats BR (2021) OM Forum—People-Centric Operations: Achievements and Future Research
Directions. Manuf. Serv. Oper. Manag. 23(4):745–757.
Rubin DB (2005) Causal Inference Using Potential Outcomes. J. Am. Stat. Assoc. 100(469):322–331.
Schreiner R, Bremer B (2013) From Natural Variation to Optimal Policy? The Importance of Endogenous
Peer Group Formation. Econometrica 81(3):855–882.
Schwartz LB, Hirschman AO (1972) Exit, Voice, and Loyalty: Responses to Decline in Firms,
Organizations, and States. Univ. PA. Law Rev. 120(6):1210.
Seghezzi A, Mangiaracina R (2022) Investigating Multi-Parcel Crowdsourcing Logistics for B2C E-
Commerce Last-Mile Deliveries. Int. J. Logist. Res. Appl. 25(3):260–277.
Simchi-Levi D (2014) OM Forum —OM Research: From Problem-Driven to Data-Driven Research.
Manuf. Serv. Oper. Manag. 16(1):2–10.
Stock J, Yogo M (2005) Testing for Weak Instruments in Linear IV Regression. Andrews DWK, ed.
Identif. Inference Econom. Model. (Cambridge University Press, New York), 80–108.
Tang CS (2016) OM Forum—Making OM Research More Relevant: “Why?” and “How?” Manuf. Serv.
Oper. Manag. 18(2):178–183.
Wang X, Yuen KF, Wong YD, Teo CC (2020) E-Consumer Adoption of Innovative Last-Mile Logistics
Services: A Comparison of Behavioural Models. Total Qual. Manag. Bus. Excell. 31(11–12):1381–
1407.
Williamson OE (1993) Opportunism and its Critics. Manag. Decis. Econ. 14(2):97–107.
Winkenbach M (2019) The Analytics Revolution in Last-Mile Delivery. Logistics Management
58(1) (January). https://www.proquest.com/trade-journals/analytics-revolution-last-mile-delivery/docview/2177046153/se-2
Wooldridge JM (2002) Econometric Analysis of Cross Section and Panel Data (MIT Press).
World Economic Forum (2020) The Future of the Last-Mile Ecosystem. World Econ. Forum (January):1–
26.
Xu Y, Lu B, Ghose A, Dai H, Zhou W (2020) How Do Ratings and Penalties Moderate Earnings on
Crowdsourced Delivery Platforms? SSRN Electron. J.:1–42.

A. Appendix

Table A.1 provides pairwise correlations for our main variables. We use Pearson correlation and do not
find any multi-collinearity issues in our data.

Table A.1: Pairwise Correlation Matrix of Variables

(1) (2) (3) (4) (5) (6) (7)


1) parcels 1.00
2) fe_count -0.09* 1.00
3) first_attempt -0.02* -0.08* 1.00
4) lag.unattempted 0.15* 0.01* 0.03* 1.00
5) lag.idle_time 0.01* -0.04* 0.05* 0.07* 1.00
6) cod 0.65* -0.06* -0.12* 0.00 0.00 1.00
7) lag.fake_remark 0.16* -0.03* -0.05* 0.26* 0.07* 0.18* 1.00
Note: *p < 0.05

Figure A.1: Distribution Plots of Variables

In Figure A.1, we plot histograms and density plots for our variables to ensure that we do not have any
extreme outliers in the data.

Table A.2 provides the first stage estimates for our 2SLS model. The coefficient of the coworkers’ average
fake remarks is negative and significant (-0.401, p<0.01). The F-statistic is 276.3, which
validates the strength of our instrument.
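
For readers less familiar with 2SLS, the sketch below illustrates the two stages and the first-stage
F-statistic on simulated data. It is not the paper's specification: it has one instrument, no controls, no
fixed effects, and uses a homoskedastic rather than a clustered F-statistic. All parameter values,
including the true coefficient of -0.6 and the first-stage coefficient of -0.4, are arbitrary assumptions
chosen for illustration.

```python
import random

random.seed(0)
n = 20000
true_beta = -0.6  # ASSUMED causal effect in the simulation

z = [random.gauss(0, 1) for _ in range(n)]  # instrument
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [-0.4 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]  # endogenous regressor
y = [true_beta * xi + 2.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

# Stage 1: regress the endogenous regressor on the instrument.
pi1 = cov(z, x) / cov(z, z)
pi0 = mean(x) - pi1 * mean(z)
x_hat = [pi0 + pi1 * zi for zi in z]
residuals = [xi - xh for xi, xh in zip(x, x_hat)]
sigma2 = sum(e * e for e in residuals) / (n - 2)
se_pi1 = (sigma2 / (cov(z, z) * (n - 1))) ** 0.5
first_stage_f = (pi1 / se_pi1) ** 2  # with one instrument, F equals the squared t-statistic

# Stage 2: regress the outcome on the fitted values from stage 1.
beta_2sls = cov(x_hat, y) / cov(x_hat, x_hat)
beta_ols = cov(x, y) / cov(x, x)  # biased here because u affects both x and y

print(f"first-stage coefficient: {pi1:.3f}, F-statistic: {first_stage_f:.1f}")
print(f"OLS estimate: {beta_ols:.3f}, 2SLS estimate: {beta_2sls:.3f}")
```

In this simulation the OLS estimate is pulled away from the true coefficient by the confounder, while the
2SLS estimate stays close to it, which is the motivation for instrumenting lag.fake_remark.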

Table A.2: Stage 1 of 2SLS Main Model

lag.fake_remark
lag2.co_fr_avg -0.401***
(0.048)
parcels -0.002
(0.002)
fe_count -0.001
(0.018)
first_attempt -0.002**
(0.001)
lag.unattempted 0.239***
(0.018)
lag.idle_time 0.004***
(0.001)
cod 0.013***
(0.003)
Fixed Effects:
fe Yes
date Yes
center-week Yes
F-test stat 276.3
R² 0.404
adj-R² 0.365
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in
parentheses. Standard errors clustered at FE level.

In Table A.3, we show the first stage of our 2SLS model with four different alternative IVs. We find
that all of the alternative IVs are relevant and strong. The F-statistic for each of them is above 150, which
is much larger than the conventional rule-of-thumb threshold of 10.

Table A.3: Stage 1 of Alternative IV Models

                        lag.fake_remark
                        (1)          (2)          (3)          (4)
lag2.co_fr_avg1         -0.098***
                        (0.019)
lag2.co_fr_avg3                      -0.274***
                                     (0.028)
lag2.co_fr_avg10                                  -0.454***
                                                  (0.060)
lag2.co_fr_shift_avg                                           -0.108***
                                                               (0.015)
Fixed Effects:
fe                      Yes          Yes          Yes          Yes
date                    Yes          Yes          Yes          Yes
center-week             Yes          Yes          Yes          Yes
Controls                Yes          Yes          Yes          Yes
F-test stat             154.1        361.1        228.0        182.4
R²                      0.387        0.398        0.404        0.387
adj-R²                  0.350        0.361        0.364        0.349
N                       43,414       42,087       36,163       41,588
Note: *** p<0.01, ** p<0.05, * p<0.1; Standard errors are in parentheses. Standard
errors clustered at FE level.


Table A.4 provides the estimates of stage 1 of our main model with an additional control variable
visited_before.

Table A.4: Stage 1 of the Main Model with visited_before Control

                      lag.fake_remark
lag2.co_fr_avg        -0.401***
                      (0.048)
parcels               -0.006*
                      (0.003)
fe_count               0.000
                      (0.018)
first_attempt         -0.002**
                      (0.001)
lag.unattempted        0.238***
                      (0.018)
lag.idle_time          0.004***
                      (0.001)
cod                    0.008*
                      (0.004)
visited_before         0.008*
                      (0.004)
Fixed Effects:
  fe                  Yes
  date                Yes
  center-week         Yes
F-test stat           276.8
R²                    0.404
adj-R²                0.365
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in parentheses.
Standard errors clustered at FE level.


Table A.5 reports estimates of our main model under the fixed effects (FE) and lagged dependent variable
(LDV) specifications. According to the bracketing property, the true value of the lag.fake_remark
coefficient, β1, lies in the interval [β_LDV, β_FE], i.e., [−0.705, −0.584]. This interval is narrow, which
indicates that our effect is robust. We also compute the 95% confidence intervals of β_LDV and β_FE to
check whether the two coefficients differ significantly. The 95% confidence interval for β_FE is
[−0.987, −0.180], whereas for β_LDV it is [−0.900, −0.500]. Each interval contains the other point
estimate, which suggests that the two coefficients are not statistically different and supports the
robustness of our estimates. Additionally, Table A.6 reports the FE and LDV models with FTR as the
dependent variable (DV), which reaffirm the robustness of our results.

Table A.5: Bracketing Property for Main Model

                      success
                      Fixed Effects    Lagged Dependent    FE+LDV
                      (FE)             Variable (LDV)
lag.fake_remark       -0.584**         -0.705***           -0.589**
                      (0.206)          (0.100)             (0.194)
Fixed Effects:
  fe                  Yes              No                  Yes
  date                Yes              Yes                 Yes
  center-week         Yes              Yes                 Yes
Controls              Yes              Yes                 Yes
R²                    0.823            0.781               0.825
adj-R²                0.811            0.771               0.813
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in parentheses.
Standard errors clustered at FE level. Estimated with an instrumental variable using 2SLS.
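
The interval check described before Table A.5 can be reproduced from the reported coefficients and standard errors. The sketch below assumes a normal approximation with a critical value of 1.96; the paper does not state which critical value was used, so the result should match the reported intervals only up to rounding. The names Z_95, ESTIMATES, conf_int, and contains are illustrative.

```python
Z_95 = 1.96  # assumed normal critical value for a 95% interval

# (coefficient, standard error) for lag.fake_remark, copied from Table A.5.
ESTIMATES = {
    "FE": (-0.584, 0.206),
    "LDV": (-0.705, 0.100),
}


def conf_int(coef, se, z=Z_95):
    """Symmetric confidence interval coef +/- z * se."""
    return (coef - z * se, coef + z * se)


def contains(interval, value):
    """True if value lies inside the closed interval."""
    lo, hi = interval
    return lo <= value <= hi


ci = {name: conf_int(coef, se) for name, (coef, se) in ESTIMATES.items()}
for name, (lo, hi) in ci.items():
    print(f"{name}: [{lo:.3f}, {hi:.3f}]")

fe_in_ldv = contains(ci["LDV"], ESTIMATES["FE"][0])
ldv_in_fe = contains(ci["FE"], ESTIMATES["LDV"][0])
print("LDV interval contains the FE estimate:", fe_in_ldv)
print("FE interval contains the LDV estimate:", ldv_in_fe)
```

Under this assumption the intervals come out at roughly [−0.988, −0.180] for FE and [−0.901, −0.509] for LDV, close to the reported values, and each contains the other point estimate.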


Table A.6: Bracketing Property for FTR

                      FTR
                      Fixed Effects    Lagged Dependent    FE+LDV
                      (FE)             Variable (LDV)
lag.fake_remark       -0.587**         -0.778***           -0.590**
                      (0.196)          (0.099)             (0.189)
Fixed Effects:
  fe                  Yes              No                  Yes
  date                Yes              Yes                 Yes
  center-week         Yes              Yes                 Yes
Controls              Yes              Yes                 Yes
R²                    0.812            0.761               0.814
adj-R²                0.800            0.751               0.802
Note: *** p<0.01, ** p<0.05, * p<0.1; N=38,602; Standard errors are in parentheses.
Standard errors clustered at FE level. Estimated with an instrumental variable using 2SLS.
