
So let's talk about the distribution of Stock
Keeping Units, and I'll use as an example a grocery store.


This is a project we did several years ago,
where we looked at a local grocery store which is located
just off of MIT's campus in Porter Square,
and we looked at it for a bunch of different reasons,
and we wanted to find out how they replenish
different types of items there.
And so we looked through, and we examined
the number of items that are in the store,
and there's about 20,000 SKUs, or Stock Keeping Units,
in this grocery store, and they generally
fall into three categories.
You have the dry goods, the frozen goods,
and the perishables, the meats and vegetables,
things that have to be kept refrigerated but not frozen.
And we focused in on the dry goods.
Of the 20,000 in the store, there
were about 8,000 SKUs that constituted the dry goods.
And we looked over one year, and we found that roughly
1.2 million items were sold across those 8,000 SKUs.
More precisely, about 1.156 million items were sold.
So if we look at the number of units
sold per SKU, what's interesting is
it's not quite what you'd think.
If I look at this, the average number,
if I just divide 1.156 million by 8,000, on average,
each SKU sold about 144 items over the course of a year.
If I look at the median-- remember,
the median is if I put them from the lowest to the highest,
it's the one in the middle for the frequency--
the median is 72.
So just seeing that they're not the same or close
to each other means I'm not looking
at a nice, symmetric distribution.
Then I look at the mode, and the mode
is the most frequent observation.
And the most frequent number of times a particular SKU was sold
is 0.
That's the most frequent number.
And then we can look at the standard deviation,
another metric.
That's the spread around the mean, and that's about 355.
So if my average is 144 and my standard deviation
is 355, that tells me right away it doesn't look very symmetric.
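
To make those four summary numbers concrete, here is a minimal sketch in Python. The ten values in `units_sold` are made up for illustration and are not the store's data; only the 1.156 million and 8,000 figures come from the lecture. The point is just that a few big sellers pull the mean far above the median, and the spread can exceed the mean.

```python
import statistics

# Figures quoted in the lecture.
total_units = 1_156_000
sku_count = 8_000
average_per_sku = total_units / sku_count  # 144.5, quoted as about 144

# Hypothetical yearly unit sales for ten SKUs (illustrative only,
# not the store's data): many slow sellers and one big seller.
units_sold = [0, 0, 0, 5, 20, 40, 75, 150, 400, 1200]

mean = statistics.mean(units_sold)
median = statistics.median(units_sold)
mode = statistics.mode(units_sold)
spread = statistics.pstdev(units_sold)  # population standard deviation

print(f"average per SKU in the lecture: {average_per_sku}")
print(f"mean={mean}, median={median}, mode={mode}, std dev={spread:.0f}")
```

In this toy sample the mean sits well above the median, the mode is 0, and the standard deviation is larger than the mean, which is the same pattern the store's numbers show.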
But let's think about it a little more,
and let's go through to flesh it out.
What do you think some of the biggest sellers
might have been?
Well, we went through and looked at this,
and what's interesting is because it's
right near a campus, you can guess it's a lot of water,
a lot of macaroni and cheese, and a lot of tuna fish.
I thought that was kind of interesting.
Then we also looked to see, well,
do sales happen constantly across the year?
And actually, there were three massive days
when sales were about two times what the normal daily shopping
experience is.
The sales increased by a factor of two or more,
and those were right before Thanksgiving, a large U.S.
holiday, right before Christmas, and right before the Super Bowl.
So it's kind of interesting.
The Super Bowl is a football game that's played here in the US.
It's the championship game.
So those were the big three days when sales really peaked.
But what we wanted to focus on is of all these 8,000 Stock
Keeping Units, how is the demand distributed?
Is it uniform?
Is it normal?
Is it something else?
How do they fall?
And so we looked at this, and there's
a couple different ways you can think about it.
One is, what if everything sold the same amount?
So on average, they sold 144 items each.
What if every item sold 144?
Then that means 50% of my SKUs, my Stock Keeping Units,
would constitute 50% of my sales volume.
And I'm just using volume here, not dollar value.
This is just units.
If every single one of my items sold the average amount,
then this is what that demand distribution would look like.
This next one is what it would look like if demand was normal.
So say demand is normally distributed around the expected amount.
The average is 144, but some sell a little less,
some sell a little more, but it follows
that nice, symmetric distribution.
Then my distribution chart of percent
of SKUs versus percent of sales volume
would look like that green line, kind of curved.
Now, this blue line is another distribution.
It's called the power law.
And this is the idea that a lot of items don't sell much,
but then I have some that sell a lot.
So it's very skewed in the percent that
actually sell different amounts.

And then finally, I've got this one that
follows a log normal distribution, where again, I
have this long right tail where I have some that sell a lot,
but then I have others that are selling less.
It's skewed to the right, with most items bunched up on the left.
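
The four curves on that chart can be sketched numerically. The snippet below is only an illustration under assumed parameters: the spread of 30 for the normal case, the log normal parameters 4.0 and 1.2, and the 1/rank form used for the power law are all choices made here, not values from the study. It builds 8,000 synthetic SKUs for each shape and reports what share of total volume the best-selling SKUs contribute.

```python
import math
from statistics import NormalDist

def share_of_top(units, fraction):
    """Share of total volume contributed by the top `fraction` of SKUs."""
    ranked = sorted(units, reverse=True)
    k = round(len(ranked) * fraction)
    return sum(ranked[:k]) / sum(ranked)

n = 8000
# Evenly spaced quantiles give deterministic synthetic samples.
z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

demand = {
    "uniform": [144.0] * n,                            # every SKU sells the average
    "normal": [144 + 30 * v for v in z],               # assumed std dev of 30
    "log normal": [math.exp(4.0 + 1.2 * v) for v in z],  # assumed parameters
    "power law": [1000 / rank for rank in range(1, n + 1)],  # assumed 1/rank form
}

for name, units in demand.items():
    top10 = share_of_top(units, 0.10)
    top50 = share_of_top(units, 0.50)
    print(f"{name:>10}: top 10% of SKUs = {top10:.0%} of volume, "
          f"top 50% = {top50:.0%}")
```

Under these assumptions the top 10% of SKUs carry exactly 10% of the volume in the uniform case, a little more in the normal case, and a much larger share in the log normal and power law cases.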
So you look at these four different types
of distribution, so which do you think
the products at the grocery store followed?
Well, they followed very closely to the power law.
The idea is that very few of the items
account for most of the items sold.
So this dark blue line is the actual plotted data--
let me just highlight it, this line--
and then this black line here
is a fitted function.
I just used regression to fit that in.
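
The lecture does not say which functional form or regression method was used for the fitted line. One common approach, shown here purely as an assumed sketch, is to rank the items from best seller to worst and fit a straight line to log(units) against log(rank) by ordinary least squares. The helper name `fit_power_law` and the sample data are hypothetical.

```python
import math

def fit_power_law(units):
    """Fit units ~ c / rank**s by least squares on the log-log values.

    Items with zero sales are dropped because log(0) is undefined.
    Returns the pair (c, s).
    """
    ranked = sorted((u for u in units if u > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(u) for u in ranked]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), -slope

# Synthetic check: data generated from an exact power law plus two
# zero sellers, so the fit should recover the parameters used.
sample = [5000 / rank ** 0.8 for rank in range(1, 201)] + [0, 0]
c, s = fit_power_law(sample)
print(f"c = {c:.1f}, s = {s:.3f}")
```

Since the most frequent sales count in the store's data was 0, dropping zero sellers before taking logs matters; how to handle them is a modeling choice this sketch does not settle.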
So you can see what's happening here.
What this shows is that 10% of the products--
let's see what that does-- constitute about 50%
of the items sold.
So one out of 10 items constituted
half of all the items sold.
That's pretty amazing because then I look up here,
and I go down here.
The bottom 50% of the products only constituted 10% of the sales,
the part of the curve from 90% to 100%.
So I've got a critical few and I've got a trivial many here.
And so you might ask yourself, is this unique?
Is this just for that one grocery
store that's in Porter Square?
And the answer would be no.
Let's look at another example, and this is from data
that I collect for the truckload market.
What a truckload carrier does is pick up from one location
and ship directly to another location.
So it's a full truckload move, and this
is a data set of about five million shipments
over the course of a year across 400,000 lanes.
A lane is simply an origin-destination
pairing-- from Chicago to Los Angeles, from Miami to Atlanta.
That's a lane, an origin-destination pair.
So if I look at this, on the horizontal axis,
I've got the percent of lanes that I've
included from 0% to 100%, and on the vertical axis,
I've got the percent of the volume included on those lanes.
And what this is saying is that 50%
of the volume, 50% of those truckload volumes,
that's 2.5 million shipments, is
handled on only 3% of the lanes.
That's amazing.
3% of 400,000 is about 12,000, so just 12,000 of those lanes
handle half the volume.
And then you could think of it the other way.
Only 3% of the volume is handled by almost half
of the lanes, 43%.
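
The arithmetic behind those percentages is worth writing out. This sketch uses only the figures quoted above; the per-lane averages are derived here and were not stated in the lecture.

```python
total_shipments = 5_000_000
total_lanes = 400_000

# The busy lanes: 3% of lanes carry 50% of the volume.
head_lanes = total_lanes * 3 // 100            # 12,000 lanes
head_shipments = total_shipments * 50 // 100   # 2,500,000 shipments

# The quiet lanes: 43% of lanes carry 3% of the volume.
tail_lanes = total_lanes * 43 // 100           # 172,000 lanes
tail_shipments = total_shipments * 3 // 100    # 150,000 shipments

head_per_lane = head_shipments / head_lanes    # about 208 per lane per year
tail_per_lane = tail_shipments / tail_lanes    # under 1 per lane per year

print(f"busy lanes:  {head_per_lane:.0f} shipments per lane per year")
print(f"quiet lanes: {tail_per_lane:.2f} shipments per lane per year")
```

So on average a busy lane sees roughly four shipments a week, while a quiet lane sees less than one shipment a year.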
So again, we have the case where very few traffic lanes-- origin-
destination pairs-- account for the vast majority
of the truckload shipments or the truckload movements.
So again, you have the critical few and the trivial many.
And why are we talking about this?
Because maybe you want to treat them differently.
You want to manage the high volume lanes differently
than the ones where you have very infrequent shipments.
So is this just for supply chain?
No.
Power laws are everywhere.
It's really interesting, and there's
a lot of new research going on in this, especially
on the internet and on different connections
with the social sciences.
But it's been around for a long time,
and it's very common in physical and social systems.
Severity of hurricanes and earthquakes
follow the power law.
Income.
We've all heard of Pareto's law.
Sometimes it's turned into a verb.
When we Pareto a function, we look
for the 20% that constitutes 80% of the activity.
Visits to websites and blogs follow a power law.
The frequency of digits, Benford's law, is interesting.
The most common leading digit in a table of numbers is 1,
and it shows up much more frequently
than you would expect.
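
Benford's law gives the expected share of numbers whose first digit is d as log10(1 + 1/d). A short sketch of what that implies, compared with the one-in-nine share you would get if every leading digit were equally likely:

```python
import math

# Benford's law: probability that the leading digit is d, for d = 1..9.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
equal_share = 1 / 9  # what you would expect if all digits were equally likely

for d, p in benford.items():
    print(f"leading digit {d}: {p:.1%}")
print(f"equal share would be {equal_share:.1%}")
```

A leading 1 is expected about 30% of the time, against roughly 11% under equal shares, and a leading 9 less than 5% of the time.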
Frequency of authors cited in literature follows it.
So the first person who found each of these power law
relationships gets to name a law after themselves, apparently.
There's a whole bunch of these, and if you're ever bored,
look up the power law and you can
see all these different examples where it's been applied.
For us, we'll see it in profitability
of customers and products.
We already saw it in demand, but we're
going to see this in everything.
Very few of your customers drive most of your revenue,
and you're going to have a lot of customers
that don't drive that much.
And my favorite is questions from students in class.
There will be very few students that ask most of the questions,
while the vast majority don't say anything,
don't ask any questions.
So you'll see this a lot.
The critical takeaway from this is that with a power law,
if you have a relationship that follows this power law where
you have something like this, you'll
have a critical few, a very important few,
and you'll have over here a trivial many.
And the question is, how do I treat these guys differently
from the ones in the middle, and differently
from the ones on the far right?
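
One simple way to split items into those three groups is by cumulative share of volume. The sketch below is an assumption-laden illustration, not a method from the lecture: the function name `segment_by_volume`, the 50% and 90% cutoffs (chosen to echo the grocery numbers), and the sample sales figures are all hypothetical.

```python
def segment_by_volume(units_by_item, head_cutoff=0.5, tail_cutoff=0.9):
    """Label each item by where it falls on the cumulative volume curve.

    Items are ranked from best seller to worst. An item is labeled by the
    share of volume already covered by the items ranked ahead of it.
    The cutoffs are assumed values, not figures from the study.
    """
    total = sum(units_by_item.values())
    if total == 0:
        return {item: "trivial many" for item in units_by_item}
    ranked = sorted(units_by_item.items(), key=lambda kv: kv[1], reverse=True)
    segments = {}
    running = 0
    for item, units in ranked:
        share_before = running / total
        if share_before < head_cutoff:
            segments[item] = "critical few"
        elif share_before < tail_cutoff:
            segments[item] = "middle"
        else:
            segments[item] = "trivial many"
        running += units
    return segments

# Hypothetical yearly unit sales (illustrative only).
sales = {
    "water": 500, "mac and cheese": 250, "tuna": 120, "rice": 60,
    "flour": 40, "spice A": 20, "spice B": 10, "spice C": 0,
}
segments = segment_by_volume(sales)
for item, label in segments.items():
    print(f"{item:>15}: {label}")
```

Where to put the cutoffs is a management decision; the useful part is that once items are labeled, each group can get its own replenishment policy.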
