Cumulative Probability P90 P50 P10 PDF
[Figure 1: continuous frequency distribution, with f(x) ≈ 0.18 marked at x = -1.28 (P90) and x = 0 marked as P50]
[Figure 2: number of leaves of each size, in five bins. Counts from the smallest size bin to the biggest: 50, 200, 500, 150, 50]
As you can see, the medium-sized leaves are most common while the very small and very
large leaves are least common. In nature, things tend to group around a central common size
or point.
What you have done is describe the uncertainty of the leaf sizes by a 5-bin frequency
distribution. If you look at Figure 2, you should be able to see that the shape at the top of the
green boxes is similar to the shape of the red line in Figure 1. Figure 1 is known as a
continuous distribution (the line flows continuously); think of it as a distribution with a very
large number of bins. Figure 2 is a discrete distribution (it has a discrete number of bins that
capture the number of leaves that fall within a certain size range).
We can do this exercise for every measurable thing and create a frequency distribution.
Other (non-normal) frequency distribution shapes are possible, but these do not need
to be discussed here. If you understand a normal frequency distribution then that's all you
need to know for the time being.
Now that you understand frequency distributions, what do P90, P50 and P10 mean?
With the frequency distribution that we just created we can add up all the numbers from one
end and create a CUMULATIVE frequency distribution. It's just another way of showing the
data. Using the leaf example, if we start adding up the leaves from the biggest end and work
our way to the smallest end we end up with the following:
[Figure 3: cumulative number of leaves from biggest to smallest. Values from the smallest size bin to the biggest: 950, 900, 700, 200, 50]
So how does that help us? Well, we can say things like: there are 900 leaves bigger than the
smallest leaves, or there are 200 leaves bigger than the medium-sized leaves.
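The adding-up described above can be sketched in a few lines of Python. This is only an illustration: the bin labels are made up, and the ordering of the counts by size is an assumption inferred from the cumulative totals quoted in the text.

```python
from itertools import accumulate

# Leaf counts per size bin, ordered smallest to largest.
# Assumption: labels are illustrative and the ordering is inferred
# from the cumulative totals (950, 900, 700, 200, 50).
leaf_counts = {
    "very small": 50,
    "small": 200,
    "medium": 500,
    "large": 150,
    "very large": 50,
}

sizes = list(leaf_counts)
counts = list(leaf_counts.values())

# Running total that starts at the biggest bin and works down to the smallest,
# then flipped back so it lines up with the smallest-to-largest bin order.
cumulative_from_biggest = list(accumulate(reversed(counts)))[::-1]

for size, total in zip(sizes, cumulative_from_biggest):
    print(f"{size:>10}: {total} leaves of this size or bigger")

# Dividing by the grand total normalises the scale to 100%.
total_leaves = cumulative_from_biggest[0]
cumulative_fraction = [c / total_leaves for c in cumulative_from_biggest]
```

The entry next to "small" is 900, which matches the statement that 900 leaves are bigger than the smallest leaves.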
We can do the same exercise with the continuous frequency distribution in Figure 1 and we
end up with the following continuous cumulative frequency distribution:
[Figure 4: continuous cumulative frequency distribution, with Q(x) = 0.9 at x = -1.28 (P90) and Q(x) = 0.5 at x = 0 (P50)]
Although this looks terribly mathematical, it's similar to the graph you have just produced
with the leaf example. The main difference is that the numbers on the Y-axis (or vertical axis)
have been divided by the biggest number at the end, thereby normalising the axis to 100%.
You should be able to see that the shape described by the top of the green boxes in Figure
3 looks very similar to the shape of the red line in Figure 4. Figure 4 looks smoother than
Figure 3 because Figure 4 was created from the smooth continuous distribution in Figure 1.
Since the Y-axis in Figure 4 has been normalised to 100% we can read off the estimates that
correspond to the 90%, 50% and 10% cumulative frequency. These estimates are usually
termed the P90, P50 and P10 confidence levels.
Using Figure 4, the estimate at the P90 confidence level is -1.28 and the estimate at the P50
confidence level is 0. The negative value is just a result of the way the scale is presented: it
has been normalised to zero at the middle.
As per the leaves example, P90 means that 90% of the estimates (or outcomes) are
expected to be bigger than this estimate. P50 means that 50% of the estimates (or
outcomes) are expected to be bigger than this estimate. This is NOT the same as the
chance of that estimate occurring.
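As a rough check of the values read off Figure 4, the sketch below uses Python's standard `statistics.NormalDist`. It assumes the curve is a standard normal distribution (mean 0, standard deviation 1), which is consistent with the -1.28 and 0 quoted above; the helper name `exceedance_estimate` is made up for this illustration.

```python
from statistics import NormalDist

# Assumption: Figure 4 is a standard normal curve (mean 0, standard deviation 1).
dist = NormalDist(mu=0.0, sigma=1.0)

def exceedance_estimate(confidence: float) -> float:
    """Value that a fraction `confidence` of outcomes is expected to exceed."""
    # "Exceeded by 90% of outcomes" means only 10% of outcomes fall below it.
    return dist.inv_cdf(1.0 - confidence)

p90 = exceedance_estimate(0.90)
p50 = exceedance_estimate(0.50)
p10 = exceedance_estimate(0.10)

print(f"P90 = {p90:.2f}, P50 = {p50:.2f}, P10 = {p10:.2f}")
```

Note that P90 is the smallest of the three values: the higher the confidence that an estimate will be exceeded, the lower that estimate has to be.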
The chance of a single estimate occurring can be read off Figure 1. Put the question a
different way: which occurs more frequently, P90 or P50? As a hint, you can't actually answer
this from the cumulative frequency distribution (Figure 4); you need to jump back from the
cumulative frequency curve to the frequency distribution (Figure 1).
An easier way to understand the question is to use the leaf example: assume P90 is
the same as the small leaves and P50 is the same as the medium leaves. The question
then becomes: which is more likely to occur, the small leaves or the medium leaves? The
medium leaves are more likely to occur, of course. So in a normal distribution, the P50 value
is more likely to occur than the P90 value.
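The jump from the cumulative curve back to the frequency curve can be sketched the same way. Again assuming a standard normal distribution, the code finds the P90 and P50 values on the cumulative curve and then looks up the height of the frequency curve at each.

```python
from statistics import NormalDist

# Assumption: standard normal distribution, as in Figures 1 and 4.
dist = NormalDist(mu=0.0, sigma=1.0)

# Step 1: read the estimates off the cumulative curve.
p90 = dist.inv_cdf(0.10)  # value exceeded by 90% of outcomes
p50 = dist.inv_cdf(0.50)  # value exceeded by 50% of outcomes

# Step 2: go back to the frequency curve and compare its height at each value.
density_at_p90 = dist.pdf(p90)
density_at_p50 = dist.pdf(p50)

print(f"frequency curve height at P90: {density_at_p90:.2f}")
print(f"frequency curve height at P50: {density_at_p50:.2f}")
```

The height at P90 rounds to 0.18, the value marked on Figure 1, and the height at P50 is more than twice that.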
In simple general terms, that is why P50 is sometimes also known as the best estimate:
it's the estimate that occurs most frequently.
So how does this help you to understand oil and gas estimates?
An oil or gas estimate is calculated by multiplying together a number of parameters, for
example:
Oil in place equals rock volume of the reservoir multiplied by porosity multiplied by oil
saturation (there are actually a lot more input variables but let us keep it simple for now).
Rock volume, porosity and oil saturation are measurable things. There is however
uncertainty surrounding the measurement of those parameters. To cater for this uncertainty
we describe the input parameters by continuous frequency distributions. If we then multiply
all the input frequency distributions together (a computer does this for us), the output, oil in
place, ends up as a frequency distribution. We can then take this oil in place frequency
distribution and create an oil in place cumulative frequency distribution. This is shown
diagrammatically as follows:
[Diagram: Rock volume × Porosity × Oil saturation = Oil in place]
Multiply the frequency distributions together to obtain a frequency distribution and then
create a cumulative frequency distribution.
From the cumulative frequency distribution we can read off the P90, P50 and P10
confidence levels.
Step 3: take the oil in place continuous frequency distribution and create an oil in place
continuous cumulative frequency distribution.
Step 4: From the oil in place continuous cumulative frequency distribution read off the
estimate sizes that correspond to the P90, P50 and P10 confidence levels.
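The source does not say how the computer multiplies the distributions together. One common approach is random sampling (Monte Carlo simulation), sketched below under that assumption. All the input ranges, the triangular shapes and the units are hypothetical values chosen purely for illustration, and `sample_oil_in_place` and `exceedance_estimate` are made-up helper names.

```python
import random

random.seed(42)  # fixed seed so repeated runs give the same numbers
N_TRIALS = 20_000

def sample_oil_in_place() -> float:
    """Draw one value from each input distribution and multiply them together."""
    # Hypothetical inputs, each as triangular(low, high, most likely).
    rock_volume = random.triangular(80e6, 150e6, 100e6)   # cubic metres of rock
    porosity = random.triangular(0.15, 0.30, 0.22)        # fraction of rock that is pore space
    oil_saturation = random.triangular(0.55, 0.85, 0.70)  # fraction of pore space holding oil
    return rock_volume * porosity * oil_saturation

def exceedance_estimate(sorted_values, confidence):
    """Value exceeded by roughly `confidence` of the sorted trial results."""
    index = int((1.0 - confidence) * (len(sorted_values) - 1))
    return sorted_values[index]

# Many trials build up the oil in place frequency distribution;
# sorting them gives the cumulative frequency distribution.
trials = sorted(sample_oil_in_place() for _ in range(N_TRIALS))

p90 = exceedance_estimate(trials, 0.90)
p50 = exceedance_estimate(trials, 0.50)
p10 = exceedance_estimate(trials, 0.10)

print(f"P90 = {p90:,.0f}  P50 = {p50:,.0f}  P10 = {p10:,.0f} cubic metres")
```

Whatever inputs are used, the P90 estimate comes out smallest and the P10 estimate largest, because 90% of the trial results must exceed P90 but only 10% exceed P10.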
So now that you understand frequency distributions, cumulative frequency distributions and
how we use them to create volumetric estimates, you should be able to answer a few
questions:
Question 1: What does P90 mean?
Answer: It means that 90% of the calculated estimates are bigger than the P90 estimate.
Question 2: Is the P90 estimate or the P50 estimate more likely to occur?
Answer: P50 is more likely to occur because the estimate is expected to occur more often
than the P90 estimate in the frequency distribution.
Question 3: What's the most important number: P90, P50 or P10?
Answer: P50 is the most important number because it's the best estimate. P90 and P10 just
show the range in the uncertainty of the estimate.
Question 4: Am I more confident in the P90 estimate or the P50 estimate?
Answer: You are more confident in the P90 estimate. As 90% of the estimates are greater
than the P90 estimate, you would be more confident that the final actual outcome will be
greater than the P90 estimate than greater than the P50 estimate. Recall that only 50% of the
estimates are greater than the P50 estimate. This doesn't mean that the P90 estimate has a
higher chance of occurring, as explained above; all it means is that you have a higher
confidence in that estimate being exceeded by the actual outcome. This can be a difficult
concept to grasp. An easier way to think about it may be to say "I'm confident that the actual
outcome will be greater than my P90 estimate, but overall I expect that the final outcome will
be closest to my P50 estimate."
Question 5: Does everybody do frequency distribution (or probabilistic) calculations?
Answer: No. Some people just multiply single values (deterministic best estimates) together
to calculate a single output estimate, not a frequency distribution.
Question 5a: So how can you tell the confidence of that single output estimate?
Answer: You can't; it's a single best estimate. You just have to use it as it is calculated.
Note that in the example above we only calculated the oil in place. We can go one step
further and calculate the recoverable oil. Recoverable oil equals oil in place multiplied by a
recovery factor. For the recovery factor we can create a frequency distribution like all other
input parameters. Multiplying the oil in place frequency distribution by the recovery factor
frequency distribution we end up with a recoverable oil frequency distribution and then we
can convert this to a cumulative frequency distribution and read off the P90, P50 and P10
estimates. All the same concepts as discussed above apply.
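The extra step can be sketched in the same sampling style. Here the oil in place distribution is replaced by a simple stand-in, and both it and the recovery factor range are hypothetical numbers used only to show the mechanics. The estimates are read off with `statistics.quantiles`, which splits the results into ten equal-sized groups.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so repeated runs give the same numbers

def sample_recoverable_oil() -> float:
    """Recoverable oil = oil in place multiplied by recovery factor."""
    # Hypothetical stand-ins, each as triangular(low, high, most likely).
    oil_in_place = rng.triangular(8e6, 30e6, 15e6)       # volume units are arbitrary
    recovery_factor = rng.triangular(0.20, 0.50, 0.35)   # fraction of oil in place recovered
    return oil_in_place * recovery_factor

recoverable = [sample_recoverable_oil() for _ in range(20_000)]

# Nine cut points: the 10th, 20th, ..., 90th percentiles of the results.
deciles = statistics.quantiles(recoverable, n=10)

p90 = deciles[0]  # exceeded by about 90% of the results
p50 = deciles[4]  # exceeded by about 50% of the results
p10 = deciles[8]  # exceeded by about 10% of the results

print(f"Recoverable oil: P90 = {p90:,.0f}  P50 = {p50:,.0f}  P10 = {p10:,.0f}")
```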
Question 6: How do you create frequency distributions for all the variables?
Answer: You go to university for 4-5 years and become a geophysicist, geologist,
petrophysicist or reservoir engineer, you get a good job with an oil and gas company, you
get trained over 5-10 years, you gain a lot of experience and knowledge in earth sciences
and the physics of oil and gas moving through rocks and then you get to work on interesting
things like estimating recoverable oil and gas. But seriously, that question is taking us
outside the scope of this document as it involves knowledge, experience and the
measurement and analysis of the data that make up each of the individual input parameters.
For further detailed reading, investors should consult the Recoverable Hydrocarbon
Guidelines in the policies section of Cooper Energy's website.
Cooper Energy Limited
For further information contact Cooper Energy via the website.