
16/04/2018 Project_F

In [1]:

from __future__ import print_function


import pandas as pd
import numpy as np

Set the file path and import RawConstructionData

In [121]:

RCD_path='C:\\Users\\Paco\\Desktop\\Code.Hub\\Assignment\\RawConstructionData.csv'
rcd=pd.read_csv(RCD_path,delimiter=';')

---Description of shape and demonstration of statistical attributes---

First, we check the column types to see whether any of them need to be changed

In [3]:

rcd.dtypes

Out[3]:

Scope object
Construction Element Type object
ID int64
Construction Element Family object
ConstructionElementPart object
BOQCategory object
BOQ object
BOQDescription object
Quantity float64
Unit object
UnitPrice float64
TotalCost float64
Length float64
Thickness float64
Height float64
X float64
Y float64
Z float64
Scope_ElementType_BOQ object
dtype: object

Every type is correct, so we proceed

In [4]:

rcd.shape

Out[4]:

(2295, 19)

http://localhost:8890/notebooks/Project_F.ipynb 1/40

In [5]:

rcd.describe()

Out[5]:

                 ID      Quantity     UnitPrice     TotalCost        Length     Thickness
count   2295.000000   2295.000000   2273.000000   2295.000000   1841.000000   1841.000000  1841
mean   42998.432244    146.455399     26.384954    454.171856      5.564373      0.286451  1
std     5598.073473   2144.250955     37.864857   4214.520529      6.358654      0.321794  1
min    31571.000000      0.027000      1.000000      0.140000      0.065000      0.020000  0
25%    40890.000000      1.700000      2.000000     31.185000      2.325000      0.100000  0
50%    42765.000000      6.143000     12.000000     92.367000      4.116000      0.300000  0
75%    47378.000000     42.230000     13.000000    171.509000      6.650000      0.350000  3
max    53694.000000  94882.837000     99.000000 134191.431000     49.900000      3.000000  4

----EDA for RawConstructionData df----

For each categorical column, we find the number of unique values and how many times each value appears

Scope:

In [6]:

i=0
u, c = np.unique(rcd['Scope'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

PRB_BO_STR: 18
PRS_BO_STR: 658
PRS_L0_STR: 608
PRS_L1_STR: 534
PRS_L2_STR: 457
PRS_RF_STR: 20
Number of unique values: 6
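
The same tally can also be produced with pandas alone. The sketch below is an alternative to the np.unique loop, not the notebook's own method. The small `demo` frame and the helper name `summarize_categorical` are made up for illustration.

```python
import pandas as pd


def summarize_categorical(df, column):
    """Return (counts sorted by label, number of unique values) for one column."""
    # value_counts() ignores missing values by default.
    counts = df[column].value_counts().sort_index()
    return counts, counts.size


# Hypothetical stand-in for rcd; the real data comes from RawConstructionData.csv.
demo = pd.DataFrame({"Scope": ["PRS_L0_STR", "PRS_BO_STR", "PRS_L0_STR"]})

counts, n_unique = summarize_categorical(demo, "Scope")
for label, count in counts.items():
    print("{}: {}".format(label, count))
print("Number of unique values: {}".format(n_unique))
```

Calling the helper once per categorical column would replace the repeated cells that follow, at the cost of hiding the loop that the notebook spells out.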

Construction Element Type:


In [9]:

i=0
u, c = np.unique(rcd['Construction Element Type'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Beam: 597
Columns: 222
ConcreteWall: 339
Earthwork: 7
Formwork: 146
Mat Foundation: 148
Parapet: 132
Protection Layer: 531
Ramp: 18
Retaining Wall: 48
Slab: 27
Stair: 80
Number of unique values: 12

Construction Element Family:

In [10]:

i=0
u, c = np.unique(rcd['Construction Element Family'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

ConcreteWork: 1611
EarthWork: 7
FormWork: 146
ProtectionWork: 531
Number of unique values: 4

Construction Element Part:


In [11]:

i=0
u, c = np.unique(rcd['ConstructionElementPart'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Basement Protection Layer FF: 42


Basement Protection Layer LF: 32
DOW: 132
DOW 3cm: 22
DOW 3cm LF: 74
Earthwork for Elevator Pits - FF: 3
Earthwork for Mat Foundation - FF: 1
Earthwork for Rainwater Tank - FF: 2
Earthwork for Septic Tank - FF: 1
PIT Protection Layer LF: 120
POS for Slabs: 146
Prot.Layer for Ramp Walls & Roof: 5
Prot.Layer for slab +0.25: 36
Protection Layer GF: 6
RC_Beam C-B: 33
RC_Beam C-C/C-CW: 540
RC_Beam C-RW: 24
RC_CW: 126
RC_CW (W): 6
RC_CW (W-W/W-C): 6
RC_CW (standalone): 57
RC_CW (standard): 144
RC_CoL_Int: 198
RC_CoL_eX: 24
RC_MF: 28
RC_MF -LF: 120
RC_Parapet: 132
RC_RW (CW/C-C): 16
RC_RW (RW-C): 32
RC_Ramp: 18
RC_Slab: 24
RC_Slab S15: 3
RC_Stair Filling: 20
RC_StairFlight: 30
RC_StairLanding: 30
Roof Protection Layer FF: 6
Roof Protection Layer LF - 2cm: 56
Number of unique values: 37

BOQCategory:


In [12]:

i=0
u, c = np.unique(rcd['BOQCategory'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Concrete: 551
Earth Moving: 7
Formwork: 640
Protection Layers: 595
Reinforcement: 502
Number of unique values: 5

BOQ:


In [13]:

i=0
u, c = np.unique(rcd['BOQ'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Bituminous (asphalt) WaterProof: 71


Bituminous WP Paint: 75
C20 for Blinding: 7
C20 for Leveling: 7
C20 for Shotcrete: 32
C20 for lightweight Concrete: 1
C30/37 for Beams: 196
C30/37 for Columns & Con. Walls: 179
C30/37 for Mat Foundation: 31
C30/37 for Parapets: 33
C30/37 for Retaining Walls: 18
C30/37 for Slabs: 15
C30/37 for Stairs: 30
Excavation: 7
Formwork for Beams: 196
Formwork for Columns,CW & RW: 200
Formwork for MatFoundation: 24
Formwork for Parapets: 33
Formwork for Slabs: 167
Formwork for Stairs: 20
Geotextile: 39
HDPE Drainage Membrane: 42
Nylon Vapor Barrier: 29
Polysterine Insulation(DOW)6cm: 133
Polystyrene Insulation(DOW) 3cm: 96
Reinforcement for Beams: 196
Reinforcement for Columns: 72
Reinforcement for ConcreteWalls: 113
Reinforcement for MatFoundation: 31
Reinforcement for Parapets: 33
Reinforcement for Ret.Walls: 12
Reinforcement for Slabs: 15
Reinforcement for Stairs: 30
Screed: 1
Wat.Pr. Admixtures for Ret.Wall: 49
Waterstop: 62
Number of unique values: 36

BOQ Description:


In [14]:

i=0
u, c = np.unique(rcd['BOQDescription'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Bituminous (asphalt) WaterProofing: 71


Bituminous WaterProofing Paint: 75
C20 Lightweight Concrete: 1
C20 for Blinding: 7
C20 for Leveling: 7
C20 for Shotcrete: 32
C30/37 for Beams: 196
C30/37 for Columns (internal) and concrete walls: 179
C30/37 for Mat Foundation: 31
C30/37 for Parapets: 33
C30/37 for Retaining Walls and external columns: 18
C30/37 for Slabs: 15
C30/37 for Stairs: 30
Excavation: 7
Formwork for Beams: 196
Formwork for Columns, concrete walls and retaining walls: 200
Formwork for MatFoundation: 24
Formwork for Parapets: 33
Formwork for Slabs: 167
Formwork for Stairs: 20
Geotextile: 39
HDPE Drainage Membrane: 42
Nylon Vapor Barrier: 29
Polysterine Insulation, DOW 6cm: 133
Polystyrene Insulation(DOW) 3cm: 96
Reinforcement for Beams - B500C: 196
Reinforcement for Columns-B500C: 72
Reinforcement for ConcreteWalls- B500C: 113
Reinforcement for MatFoundation - B500C: 31
Reinforcement for Parapets - B500C: 33
Reinforcement for Retaining Walls - B500C: 12
Reinforcement for Slabs - B500C: 15
Reinforcement for Stairs - B500C: 30
Screed: 1
Wat.Pr. Admixtures for Ret.Wall and external basement columns: 49
Waterstop: 62
Number of unique values: 36

Unit:


In [15]:

i=0
u, c = np.unique(rcd['Unit'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

kg: 551
m: 62
m2: 1173
m3: 509
Number of unique values: 4

Scope Element Type BOQ:

In [17]:

i=0
u, c = np.unique(rcd['Scope_ElementType_BOQ'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))
PRS_L1_STR_Beam_ReinforcementforBeams: 63
PRS_L1_STR_Columns_C30/37forColumns&Con.Walls: 22
PRS_L1_STR_Columns_FormworkforColumns,CW&RW: 22
PRS_L1_STR_Columns_ReinforcementforColumns: 22
PRS_L1_STR_ConcreteWall_C30/37forColumns&Con.Walls: 34
PRS_L1_STR_ConcreteWall_FormworkforColumns,CW&RW: 34
PRS_L1_STR_ConcreteWall_ReinforcementforConcreteWalls: 34
PRS_L1_STR_Formwork_FormworkforSlabs: 48
PRS_L1_STR_ProtectionLayer_PolysterineInsulation(DOW)6cm: 66
PRS_L1_STR_ProtectionLayer_PolystyreneInsulation(DOW)3cm: 38
PRS_L1_STR_Slab_C30/37forSlabs: 1
PRS_L1_STR_Slab_FormworkforSlabs: 1
PRS_L1_STR_Slab_ReinforcementforSlabs: 1
PRS_L1_STR_Stair_C30/37forStairs: 7
PRS_L1_STR_Stair_FormworkforStairs: 5
PRS_L1_STR_Stair_ReinforcementforStairs: 7
PRS_L2_STR_Beam_C30/37forBeams: 61
PRS_L2_STR_Beam_FormworkforBeams: 61
PRS_L2_STR_Beam_FormworkforColumns,CW&RW: 1
PRS_L2_STR_Beam_FormworkforSlabs: 2

For the numeric columns, we find the range of the values and check for missing values

Quantity:


In [86]:

x=round((rcd.Quantity.min()),2)
y=rcd.Quantity.max()
print("The range is {} - {}".format(x,y))
x=rcd.Quantity.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 0.03 - 94882.837


number of missing values is: 0

Unit Price:

In [21]:

x=rcd.UnitPrice.min()
y=rcd.UnitPrice.max()
print("The range is {} - {}".format(x,y))
x=rcd.UnitPrice.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 1.0 - 99.0


number of missing values is: 22

UnitPrice therefore has 22 missing values. We deal with filling them in Part 2.

Total Cost:

In [24]:

x=rcd.TotalCost.min()
y=round((rcd.TotalCost.max()),2)
print("The range is {} - {}".format(x,y))
x=rcd.TotalCost.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 0.14 - 134191.43


number of missing values is: 0

Length:

In [25]:

x=rcd.Length.min()
y=rcd.Length.max()
print("The range is {} - {}".format(x,y))
x=rcd.Length.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 0.065 - 49.9


number of missing values is: 454

Thickness:


In [26]:

x=rcd.Thickness.min()
y=rcd.Thickness.max()
print("The range is {} - {}".format(x,y))
x=rcd.Thickness.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 0.02 - 3.0


number of missing values is: 454

Height:

In [27]:

x=rcd.Height.min()
y=rcd.Height.max()
print("The range is {} - {}".format(x,y))
x=rcd.Height.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 0.15 - 4.35


number of missing values is: 454

Thickness, Height and Length therefore each have 454 missing values. We deal with filling them in Part 2.

X,Y,Z :

In [28]:

x=rcd.X.min()
y=rcd.X.max()
print("The range is {} - {}".format(x,y))
x=rcd.X.isnull().sum()
print("number of missing values is: {}".format(x))

The range is 0.05 - 49.95


number of missing values is: 0

In [29]:

x=rcd.Y.min()
y=rcd.Y.max()
print("The range is {} - {}".format(x,y))
x=rcd.Y.isnull().sum()
print("number of missing values is: {}".format(x))

The range is -0.775 - 39.95


number of missing values is: 0


In [30]:

x=rcd.Z.min()
y=rcd.Z.max()
print("The range is {} - {}".format(x,y))
x=rcd.Z.isnull().sum()
print("number of missing values is: {}".format(x))

The range is -6.35 - 11.81


number of missing values is: 0
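
The per-column range and missing-value checks above can be gathered into one table. This is a minimal sketch on a hypothetical `demo` frame, since the real rcd data is not reproduced here. The column names are borrowed from the notebook, but the values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the numeric columns of rcd.
demo = pd.DataFrame({
    "Quantity": [0.5, 2.0, 10.0],
    "UnitPrice": [1.0, np.nan, 99.0],
    "Length": [np.nan, np.nan, 4.0],
})

# min() and max() skip NaN by default; isnull().sum() counts the gaps.
summary = pd.DataFrame({
    "min": demo.min(),
    "max": demo.max(),
    "missing": demo.isnull().sum(),
})
print(summary)
```

On the real data this could be applied to rcd restricted to its float64 columns.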

The same process will be applied to the Schedule data:

Path:

In [31]:

S_path='C:\\Users\\Paco\\Desktop\\Code.Hub\\Assignment\\Schedule.csv'
s=pd.read_csv(S_path,delimiter=';')

---Description of shape and demonstration of statistical attributes---

First, we check the column types to see whether any of them need to be changed

In [32]:

s.dtypes

Out[32]:

Scope object
ConstructionElementType object
ID object
Act_Code object
Activity_Desc object
BOQ object
START object
FINISH object
Scope_ConstructionElementType object
Scope_ConstructionElementType_BOQ object
Cost Overrrun object
Delay object
dtype: object

START and FINISH hold dates but were read as object, so we convert them to datetime

In [34]:

s.START=pd.to_datetime(s.START)
s.FINISH=pd.to_datetime(s.FINISH)
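
When the date layout is known, passing it explicitly avoids any day/month ambiguity. The sketch below assumes a day/month/year layout purely for illustration. The actual format used in Schedule.csv is not shown in this notebook, and the `raw` values are invented.

```python
import pandas as pd

# Hypothetical raw values; the real format in Schedule.csv may differ.
raw = pd.Series(["20/01/2015", "11/06/2015", "not a date"])

# An explicit format removes day/month guessing; errors="coerce" turns
# unparseable entries into NaT instead of raising an exception.
parsed = pd.to_datetime(raw, format="%d/%m/%Y", errors="coerce")
print(parsed)
print("Unparseable rows: {}".format(parsed.isna().sum()))
```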


In [35]:

s.dtypes

Out[35]:

Scope object
ConstructionElementType object
ID object
Act_Code object
Activity_Desc object
BOQ object
START datetime64[ns]
FINISH datetime64[ns]
Scope_ConstructionElementType object
Scope_ConstructionElementType_BOQ object
Cost Overrrun object
Delay object
dtype: object

For each categorical column, we find the number of unique values and how many times each value appears

In [36]:

i=0
u, c = np.unique(s['Scope'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

PRS_BO_STR: 27
PRS_L0_STR: 22
PRS_L1_STR: 16
PRS_L2_STR: 22
PRS_RF_STR: 7
Number of unique values: 5

In [37]:

i=0
u, c = np.unique(s['ConstructionElementType'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Beam: 12
Columns: 9
ConcreteWall: 9
Earthwork: 1
Mat Foundation: 3
Parapet: 7
Protection Layer: 21
Ramp: 3
Retaining Wall: 3
Slab: 14
Stair: 12
Number of unique values: 11


In [38]:

i=0
u, c = np.unique(s['ID'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

PRS_BO_STR_CWIR01: 1
PRS_BO_STR_CWIR02: 1
PRS_BO_STR_CWIR03: 1
PRS_BO_STR_CWPC01: 1
PRS_BO_STR_CWPC02: 1
PRS_BO_STR_CWPC03: 1
PRS_BO_STR_CWPF01: 1
PRS_BO_STR_CWPF02: 1
PRS_BO_STR_EWEX01: 1
PRS_BO_STR_MFIR01: 1
PRS_BO_STR_MFIR02: 1
PRS_BO_STR_MFPC01: 1
PRS_BO_STR_MFPC02: 1
PRS_BO_STR_MFPF01: 1
PRS_BO_STR_PWMF02: 1
PRS_BO_STR_PWMF03: 1
PRS_BO_STR_PWMF04: 1
PRS_BO_STR_PWMF05: 1
PRS_BO_STR_PWMF06: 1
PRS_BO_STR_PWMF07: 1
PRS_BO_STR_PWMF08: 1
PRS_BO_STR_SRIR01: 1
PRS_BO_STR_SRIR02: 1
PRS_BO_STR_SRPC01: 1
PRS_BO_STR_SRPC02: 1
PRS_BO_STR_SRPF01: 1
PRS_BO_STR_SRPF02: 1
PRS_L0_STR_BSIR01: 1
PRS_L0_STR_BSIR05: 1
PRS_L0_STR_BSPC01: 1
PRS_L0_STR_BSPC05: 1
PRS_L0_STR_BSPF01: 1
PRS_L0_STR_BSPF05: 1
PRS_L0_STR_CPIR01: 1
PRS_L0_STR_CPPC01: 1
PRS_L0_STR_CPPF01: 1
PRS_L0_STR_CWIR01: 1
PRS_L0_STR_CWIR03: 1
PRS_L0_STR_CWPC01: 1
PRS_L0_STR_CWPC03: 1
PRS_L0_STR_CWPF01: 1
PRS_L0_STR_PWPL01: 1
PRS_L0_STR_PWPL02: 1
PRS_L0_STR_PWPL03: 1
PRS_L0_STR_PWPL04: 1
PRS_L0_STR_PWPL05: 1
PRS_L0_STR_STIR01: 1
PRS_L0_STR_STPC01: 1
PRS_L0_STR_STPF01: 1
PRS_L1_STR_BSIR01: 1
PRS_L1_STR_BSIR02: 1
PRS_L1_STR_BSPC01: 1
PRS_L1_STR_BSPC02: 1
PRS_L1_STR_BSPF01: 1
PRS_L1_STR_BSPF02: 1
PRS_L1_STR_CWIR01: 1
PRS_L1_STR_CWIR03: 1
PRS_L1_STR_CWPC01: 1
PRS_L1_STR_CWPC03: 1
PRS_L1_STR_CWPF01: 1
PRS_L1_STR_PWPL01: 1
PRS_L1_STR_PWPL02: 1
PRS_L1_STR_STIR01: 1
PRS_L1_STR_STPC01: 1
PRS_L1_STR_STPF01: 1
PRS_L2_STR_BSIR01: 1
PRS_L2_STR_BSIR02: 1
PRS_L2_STR_BSPC01: 1
PRS_L2_STR_BSPC02: 1
PRS_L2_STR_BSPF01: 1
PRS_L2_STR_BSPF02: 1
PRS_L2_STR_CPIR01: 1
PRS_L2_STR_CPPC01: 1
PRS_L2_STR_CPPF01: 1
PRS_L2_STR_CWIR01: 1
PRS_L2_STR_CWPC01: 1
PRS_L2_STR_CWPF01: 1
PRS_L2_STR_PWPL01: 1
PRS_L2_STR_PWPL02: 1
PRS_L2_STR_PWPL03: 1
PRS_L2_STR_PWPL04: 1
PRS_L2_STR_PWPL05: 1
PRS_L2_STR_PWPL06: 1
PRS_L2_STR_PWPL07: 1
PRS_L2_STR_STIR01: 1
PRS_L2_STR_STPC01: 1
PRS_L2_STR_STPF01: 1
PRS_RF_STR_BSIR01: 1
PRS_RF_STR_BSIR02: 1
PRS_RF_STR_BSPC01: 1
PRS_RF_STR_BSPC02: 1
PRS_RF_STR_BSPF01: 1
PRS_RF_STR_BSPF02: 1
PRS_RF_STR_PWPL01: 1
Number of unique values: 94


In [39]:

i=0
u, c = np.unique(s['Act_Code'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

BSIR01: 5
BSIR02: 3
BSPC01: 5
BSPC02: 3
BSPF01: 5
BSPF02: 3
CPIR01: 2
CPPC01: 2
CPPF01: 2
CWIR01: 4
CWIR02: 1
CWIR03: 3
CWPC01: 4
CWPC02: 1
CWPC03: 3
CWPF01: 4
CWPF02: 1
EWEX01: 1
MFIR01: 1
MFIR02: 1
MFPC01: 1
MFPC02: 1
MFPF01: 1
PWMF02: 1
PWMF03: 1
PWMF04: 1
PWMF05: 1
PWMF06: 1
PWMF07: 1
PWMF08: 1
PWPL01: 12
PWPL02: 1
PWPL03: 1
PWPL04: 1
SRIR01: 1
SRIR02: 1
SRPC01: 1
SRPC02: 1
SRPF01: 1
SRPF02: 1
STIR01: 3
STPC01: 3
STPF01: 3
Number of unique values: 43


In [40]:

i=0
u, c = np.unique(s['Activity_Desc'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Excavation AA: 1
Installing Reinf MF AA: 1
Installing Reinf. Beams L0: 1
Installing Reinf. Col. BO: 1
Installing Reinf. Columns L0: 1
Installing Reinf. Columns L1: 1
Installing Reinf. Con. Walls L0: 1
Installing Reinf. Con. Walls L1: 1
Installing Reinf. Conc.Walls BO: 1
Installing Reinf. MF AA: 1
Installing Reinf. Parapets L0: 1
Installing Reinf. Parapets L2: 1
Installing Reinf. Ramps BO: 1
Installing Reinf. Ret. Walls BO: 1
Installing Reinf. Slabs L0: 1
Installing Reinf. Slabs L1: 1
Installing Reinf. Slabs L2: 1
Installing Reinf. Slabs RF: 1
Installing Reinf. Stairs BO: 1
Installing Reinf. Stairs L0: 1
Installing Reinf. Stairs L1: 1
Installing Reinf. Stairs L2: 1
Installing Reinf.Beams L1: 1
Installing Reinf.Beams L2: 1
Installing Reinf.Beams RF: 1
Installing Reinf.Columns L2: 1
Placing FormWork Beams L0: 1
Placing FormWork Beams L1: 1
Placing FormWork Beams L2: 1
Placing FormWork Beams RF: 1
Placing FormWork Col. BO: 1
Placing FormWork Columns L0: 1
Placing FormWork Columns L1: 1
Placing FormWork Columns L2: 1
Placing FormWork MF AA: 1
Placing FormWork Parapets L0: 1
Placing FormWork Parapets L2: 1
Placing FormWork Ramps BO: 1
Placing FormWork Ret. Walls BO: 1
Placing FormWork Slabs L0: 1
Placing FormWork Slabs L1: 1
Placing FormWork Slabs L2: 1
Placing FormWork Slabs RF: 1
Placing FormWork Stairs BO: 1
Placing FormWork Stairs L0: 1
Placing FormWork Stairs L1: 1
Placing FormWork Stairs L2: 1
Pouring Concrete Beams L0: 1
Pouring Concrete Beams L1: 1
Pouring Concrete Beams L2: 1
Pouring Concrete Beams RF: 1
Pouring Concrete Col. BO: 1
Pouring Concrete Columns L0: 1
Pouring Concrete Columns L1: 1
Pouring Concrete Columns L2: 1
Pouring Concrete Con. Walls L0: 1
Pouring Concrete Con. Walls L1: 1
Pouring Concrete Conc.Walls BO: 1
Pouring Concrete MF AA: 2
Pouring Concrete Parapets L0: 1
Pouring Concrete Parapets L2: 1
Pouring Concrete Ramps BO: 1
Pouring Concrete Ret. Walls BO: 1
Pouring Concrete Slabs L0: 1
Pouring Concrete Slabs L1: 1
Pouring Concrete Slabs L2: 1
Pouring Concrete Slabs RF: 1
Pouring Concrete Stairs BO: 1
Pouring Concrete Stairs L0: 1
Pouring Concrete Stairs L1: 1
Pouring Concrete Stairs L2: 1
Pr.Mat Foundation & Retaining Walls AA: 7
Protection Layers L0 AA: 5
Protection Layers L1 AA: 2
Protection Layers L2 AA: 7
Protection Layers RF AA: 1
Number of unique values: 76


In [41]:

i=0
u, c = np.unique(s['BOQ'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

Bituminous (asphalt) WaterProof: 1


Bituminous WP Paint: 5
C20 for Blinding: 1
C20 for Leveling: 1
C20 for Shotcrete: 1
C20 for lightweight Concrete: 1
C30/37 for Beams: 4
C30/37 for Columns & Con. Walls: 7
C30/37 for Mat Foundation: 1
C30/37 for Parapets: 2
C30/37 for Retaining Walls: 1
C30/37 for Slabs: 6
C30/37 for Stairs: 4
Excavation: 1
Formwork for Beams: 4
Formwork for Columns,CW & RW: 5
Formwork for MatFoundation: 1
Formwork for Parapets: 2
Formwork for Slabs: 5
Formwork for Stairs: 4
Geotextile: 1
HDPE Drainage Membrane: 2
Nylon Vapor Barrier: 1
Polysterine Insulation(DOW)6cm: 3
Polystyrene Insulation(DOW) 3cm: 3
Reinforcement for Beams: 4
Reinforcement for Columns: 4
Reinforcement for ConcreteWalls: 3
Reinforcement for MatFoundation: 1
Reinforcement for Parapets: 2
Reinforcement for Ret.Walls: 1
Reinforcement for Slabs: 6
Reinforcement for Stairs: 4
Screed: 1
Waterstop: 1
Number of unique values: 35


In [42]:

i=0
u, c = np.unique(s['Scope_ConstructionElementType'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

PRS_BO_STR_Columns: 3
PRS_BO_STR_ConcreteWall: 2
PRS_BO_STR_Earthwork: 1
PRS_BO_STR_Mat Foundation: 3
PRS_BO_STR_Protection Layer: 7
PRS_BO_STR_Ramp: 3
PRS_BO_STR_Retaining Wall: 3
PRS_BO_STR_Slab: 2
PRS_BO_STR_Stair: 3
PRS_L0_STR_Beam: 3
PRS_L0_STR_Columns: 3
PRS_L0_STR_ConcreteWall: 2
PRS_L0_STR_Parapet: 4
PRS_L0_STR_Protection Layer: 4
PRS_L0_STR_Slab: 3
PRS_L0_STR_Stair: 3
PRS_L1_STR_Beam: 3
PRS_L1_STR_Columns: 3
PRS_L1_STR_ConcreteWall: 2
PRS_L1_STR_Protection Layer: 2
PRS_L1_STR_Slab: 3
PRS_L1_STR_Stair: 3
PRS_L2_STR_Beam: 3
PRS_L2_STR_ConcreteWall: 3
PRS_L2_STR_Parapet: 3
PRS_L2_STR_Protection Layer: 7
PRS_L2_STR_Slab: 3
PRS_L2_STR_Stair: 3
PRS_RF_STR_Beam: 3
PRS_RF_STR_Protection Layer: 1
PRS_RF_STR_Slab: 3
Number of unique values: 31


In [43]:

i=0
u, c = np.unique(s['Scope_ConstructionElementType_BOQ'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

PRS_BO_STR_Columns_C30/37forColumns&Con.Walls: 1
PRS_BO_STR_Columns_FormworkforColumns,CW&RW: 1
PRS_BO_STR_Columns_ReinforcementforColumns: 1
PRS_BO_STR_ConcreteWall_C30/37forColumns&Con.Walls: 1
PRS_BO_STR_ConcreteWall_ReinforcementforConcreteWalls: 1
PRS_BO_STR_Earthwork_Excavation: 1
PRS_BO_STR_MatFoundation_C30/37forMatFoundation: 1
PRS_BO_STR_MatFoundation_FormworkforMatFoundation: 1
PRS_BO_STR_MatFoundation_ReinforcementforMatFoundation: 1
PRS_BO_STR_ProtectionLayer_BituminousWPPaint: 1
PRS_BO_STR_ProtectionLayer_C20forBlinding: 1
PRS_BO_STR_ProtectionLayer_C20forLeveling: 1
PRS_BO_STR_ProtectionLayer_C20forShotcrete: 1
PRS_BO_STR_ProtectionLayer_Geotextile: 1
PRS_BO_STR_ProtectionLayer_HDPEDrainageMembrane: 1
PRS_BO_STR_ProtectionLayer_Waterstop: 1
PRS_BO_STR_Ramp_C30/37forSlabs: 1
PRS_BO_STR_Ramp_FormworkforSlabs: 1
PRS_BO_STR_Ramp_ReinforcementforSlabs: 1
PRS_BO_STR_RetainingWall_C30/37forRetainingWalls: 1
PRS_BO_STR_RetainingWall_FormworkforColumns,CW&RW: 1
PRS_BO_STR_RetainingWall_ReinforcementforRet.Walls: 1
PRS_BO_STR_Slab_C30/37forSlabs: 1
PRS_BO_STR_Slab_ReinforcementforSlabs: 1
PRS_BO_STR_Stair_C30/37forStairs: 1
PRS_BO_STR_Stair_FormworkforStairs: 1
PRS_BO_STR_Stair_ReinforcementforStairs: 1
PRS_L0_STR_Beam_C30/37forBeams: 1
PRS_L0_STR_Beam_FormworkforBeams: 1
PRS_L0_STR_Beam_ReinforcementforBeams: 1
PRS_L0_STR_Columns_C30/37forColumns&Con.Walls: 1
PRS_L0_STR_Columns_FormworkforColumns,CW&RW: 1
PRS_L0_STR_Columns_ReinforcementforColumns: 1
PRS_L0_STR_ConcreteWall_C30/37forColumns&Con.Walls: 1
PRS_L0_STR_ConcreteWall_ReinforcementforConcreteWalls: 1
PRS_L0_STR_Parapet_BituminousWPPaint: 1
PRS_L0_STR_Parapet_C30/37forParapets: 1
PRS_L0_STR_Parapet_FormworkforParapets: 1
PRS_L0_STR_Parapet_ReinforcementforParapets: 1
PRS_L0_STR_ProtectionLayer_BituminousWPPaint: 1
PRS_L0_STR_ProtectionLayer_HDPEDrainageMembrane: 1
PRS_L0_STR_ProtectionLayer_PolysterineInsulation(DOW)6cm: 1
PRS_L0_STR_ProtectionLayer_PolystyreneInsulation(DOW)3cm: 1
PRS_L0_STR_Slab_C30/37forSlabs: 1
PRS_L0_STR_Slab_FormworkforSlabs: 1
PRS_L0_STR_Slab_ReinforcementforSlabs: 1
PRS_L0_STR_Stair_C30/37forStairs: 1
PRS_L0_STR_Stair_FormworkforStairs: 1
PRS_L0_STR_Stair_ReinforcementforStairs: 1
PRS_L1_STR_Beam_C30/37forBeams: 1
PRS_L1_STR_Beam_FormworkforBeams: 1
PRS_L1_STR_Beam_ReinforcementforBeams: 1
PRS_L1_STR_Columns_C30/37forColumns&Con.Walls: 1
PRS_L1_STR_Columns_FormworkforColumns,CW&RW: 1
PRS_L1_STR_Columns_ReinforcementforColumns: 1
PRS_L1_STR_ConcreteWall_C30/37forColumns&Con.Walls: 1
PRS_L1_STR_ConcreteWall_ReinforcementforConcreteWalls: 1
PRS_L1_STR_ProtectionLayer_PolysterineInsulation(DOW)6cm: 1
PRS_L1_STR_ProtectionLayer_PolystyreneInsulation(DOW)3cm: 1
PRS_L1_STR_Slab_C30/37forSlabs: 1
PRS_L1_STR_Slab_FormworkforSlabs: 1
PRS_L1_STR_Slab_ReinforcementforSlabs: 1
PRS_L1_STR_Stair_C30/37forStairs: 1
PRS_L1_STR_Stair_FormworkforStairs: 1
PRS_L1_STR_Stair_ReinforcementforStairs: 1
PRS_L2_STR_Beam_C30/37forBeams: 1
PRS_L2_STR_Beam_FormworkforBeams: 1
PRS_L2_STR_Beam_ReinforcementforBeams: 1
PRS_L2_STR_ConcreteWall_C30/37forColumns&Con.Walls: 1
PRS_L2_STR_ConcreteWall_FormworkforColumns,CW&RW: 1
PRS_L2_STR_ConcreteWall_ReinforcementforColumns: 1
PRS_L2_STR_Parapet_C30/37forParapets: 1
PRS_L2_STR_Parapet_FormworkforParapets: 1
PRS_L2_STR_Parapet_ReinforcementforParapets: 1
PRS_L2_STR_ProtectionLayer_Bituminous(asphalt)WaterProof: 1
PRS_L2_STR_ProtectionLayer_BituminousWPPaint: 1
PRS_L2_STR_ProtectionLayer_C20forlightweightConcrete: 1
PRS_L2_STR_ProtectionLayer_NylonVaporBarrier: 1
PRS_L2_STR_ProtectionLayer_PolysterineInsulation(DOW)6cm: 1
PRS_L2_STR_ProtectionLayer_PolystyreneInsulation(DOW)3cm: 1
PRS_L2_STR_ProtectionLayer_Screed: 1
PRS_L2_STR_Slab_C30/37forSlabs: 1
PRS_L2_STR_Slab_FormworkforSlabs: 1
PRS_L2_STR_Slab_ReinforcementforSlabs: 1
PRS_L2_STR_Stair_C30/37forStairs: 1
PRS_L2_STR_Stair_FormworkforStairs: 1
PRS_L2_STR_Stair_ReinforcementforStairs: 1
PRS_RF_STR_Beam_C30/37forBeams: 1
PRS_RF_STR_Beam_FormworkforBeams: 1
PRS_RF_STR_Beam_ReinforcementforBeams: 1
PRS_RF_STR_ProtectionLayer_BituminousWPPaint: 1
PRS_RF_STR_Slab_C30/37forSlabs: 1
PRS_RF_STR_Slab_FormworkforSlabs: 1
PRS_RF_STR_Slab_ReinforcementforSlabs: 1
Number of unique values: 94


In [46]:

i=0
u, c = np.unique(s['Cost Overrrun'], return_counts=True)
for z in zip(list(u), list(c)):
    i+=1
    print('{}: {}'.format(z[0], z[1]))
print("Number of unique values: {}".format(i))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-46-ce44a94148b9> in <module>()
      1 i=0
----> 2 u, c = np.unique(s['Cost Overrrun'], return_counts=True)
      3 for z in zip(list(u), list(c)):
      4     i+=1
      5     print('{}: {}'.format(z[0], z[1]))

~\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
    208     ar = np.asanyarray(ar)
    209     if axis is None:
--> 210         return _unique1d(ar, return_index, return_inverse, return_counts)
    211     if not (-ar.ndim <= axis < ar.ndim):
    212         raise ValueError('Invalid axis kwarg specified for unique')

~\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
    275         aux = ar[perm]
    276     else:
--> 277         ar.sort()
    278         aux = ar
    279     flag = np.concatenate(([True], aux[1:] != aux[:-1]))

TypeError: '<' not supported between instances of 'str' and 'float'
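
np.unique sorts its input, and the message shows that the column mixes str and float values. The most likely cause is that missing entries were read as NaN, which is a float, although the notebook does not confirm this. The sketch below shows one way to count such a column without sorting mixed types. The `demo` values are invented, because the real contents of 'Cost Overrrun' are not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical values; the actual contents of 'Cost Overrrun' are not shown.
demo = pd.Series(["Yes", np.nan, "No", "Yes"], name="Cost Overrrun")

# dropna=False keeps the missing entries as their own row in the tally.
counts = demo.value_counts(dropna=False)
for label, count in counts.items():
    print("{}: {}".format(label, count))

# nunique() leaves missing values out of the count by default.
print("Number of unique values (excluding missing): {}".format(demo.nunique()))
```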

In [91]:

x=oldest_Date=s['START'].min()
y=newest_Date=s['START'].max()
x = str(x)
y = str(y)
date1 = x.split(" ")[0]
date2 = y.split(" ")[0]
print("Date range is: {} - {}".format(date1,date2))

Date range is: 2015-01-20 - 2015-06-11


In [92]:

x=oldest_Date=s['FINISH'].min()
y=newest_Date=s['FINISH'].max()
x = str(x)
y = str(y)
date1 = x.split(" ")[0]
date2 = y.split(" ")[0]
print("Date range is: {} - {}".format(date1,date2))

Date range is: 2015-02-17 - 2015-06-13
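
The string splitting used to drop the time part can be avoided by asking the timestamp for its date. A minimal sketch on a hypothetical `start` series follows. The values are illustrative and are not the full START column.

```python
import pandas as pd

# Hypothetical stand-in for the START column of s.
start = pd.Series(pd.to_datetime(["2015-03-02", "2015-01-20", "2015-06-11"]))

# Timestamp.date() returns a plain datetime.date, which prints as YYYY-MM-DD.
oldest = start.min().date()
newest = start.max().date()
print("Date range is: {} - {}".format(oldest, newest))
print("Span in days: {}".format((newest - oldest).days))
```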

Part 2: Handling the missing data in the dataframes

In [ ]:

Raw Construction Data:

We need to fill in the missing values of UnitPrice, Height, Length and Thickness:

We know that Quantity * UnitPrice = TotalCost, so a missing UnitPrice should equal TotalCost / Quantity

In [53]:

for r in range(len(rcd.UnitPrice)):
    if np.isnan(rcd.UnitPrice[r]):
        rcd.UnitPrice[r]=(float(rcd.TotalCost[r])/float(rcd.Quantity[r]))

C:\Users\Paco\Anaconda3\lib\site-packages\ipykernel_launcher.py:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  This is separate from the ipykernel package so we can avoid doing imports until

Now we need to check if the column is filled

In [55]:

any(rcd.UnitPrice.isna())

Out[55]:

False
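
The relation UnitPrice = TotalCost / Quantity can also be applied to all missing rows at once through `.loc`, which assigns on the frame itself and so does not rely on chained indexing. This is a sketch on a hypothetical `demo` frame with invented values, not a rerun of the cell above.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for rcd.
demo = pd.DataFrame({
    "Quantity": [2.0, 4.0, 10.0],
    "UnitPrice": [12.0, np.nan, np.nan],
    "TotalCost": [24.0, 52.0, 20.0],
})

# Boolean mask of the rows whose UnitPrice is missing.
missing = demo["UnitPrice"].isna()

# Fill only those rows, using TotalCost / Quantity from the same rows.
demo.loc[missing, "UnitPrice"] = (
    demo.loc[missing, "TotalCost"] / demo.loc[missing, "Quantity"]
)

print(demo)
print("Any missing left: {}".format(demo["UnitPrice"].isna().any()))
```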

Our approach, which is an assumption, is to fill the missing Height, Length and Thickness values with the
average of the existing values that share the same BOQCategory

Example: the missing Height values in the BOQ category "Concrete" are filled with the average of the
existing Height values in that category
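
The idea of a per-category average can be sketched with groupby and transform. The `demo` frame below is hypothetical, and the sketch does not round the averages, whereas the cells that follow round them to two decimals.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for rcd.
demo = pd.DataFrame({
    "BOQCategory": ["Concrete", "Concrete", "Concrete", "Formwork", "Formwork"],
    "Height": [1.0, 3.0, np.nan, 4.0, np.nan],
})

# Mean of the existing values in each category, aligned row by row with demo.
category_mean = demo.groupby("BOQCategory")["Height"].transform("mean")

# Only the missing entries are replaced; existing values are left untouched.
demo["Height"] = demo["Height"].fillna(category_mean)
print(demo)
```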


First, we list the BOQ categories of the rows whose Height is missing

In [57]:

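# collect the BOQCategory of every row whose Height is missing
missing_categories=[]
for r in range(len(rcd.Height)):
    if np.isnan(rcd.Height[r]):
        missing_categories.append(rcd.BOQCategory[r])
print(np.unique(missing_categories))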

['Concrete' 'Earth Moving' 'Formwork' 'Protection Layers' 'Reinforcement']

Next, we calculate the average Height, Length and Thickness for each category

In [58]:

#average of heights with the concrete boqcategory
sum1=0
j=0
for r in range(len(rcd.Height)):
    if rcd.BOQCategory[r]=='Concrete':
        if (np.isnan(rcd.Height[r])==False):
            j=j+1
            sum1=sum1+rcd.Height[r]
x=round((sum1/j),2)
print(x)

1.69

In [59]:

#average of heights with the earthmoving boqcategory
#j=0
sum1=0
j=0
for r in range(len(rcd.Height)):
    if rcd.BOQCategory[r]=='Earth Moving':
        if (np.isnan(rcd.Height[r])==False):
            j=j+1
            sum1=sum1+rcd.Height[r]
y=round((sum1/j),2)
print(y)

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-59-0b2560786b2d> in <module>()
      8             j=j+1
      9             sum1=sum1+rcd.Height[r]
---> 10 y=round((sum1/j),2)
     11 print(y)

ZeroDivisionError: division by zero

That means j is zero: there are no existing Height values in the Earth Moving category, so no average can be
computed for it
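
Categories with nothing to average can be spotted before dividing by counting the existing values per category. A minimal sketch on a hypothetical `demo` frame with invented values:

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for rcd.
demo = pd.DataFrame({
    "BOQCategory": ["Concrete", "Concrete", "Earth Moving", "Earth Moving"],
    "Height": [1.5, np.nan, np.nan, np.nan],
})

# count() ignores NaN, so a zero means there is no existing value to average.
existing = demo.groupby("BOQCategory")["Height"].count()
print(existing)

empty_categories = existing[existing == 0].index.tolist()
print("Categories with no existing Height: {}".format(empty_categories))
```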

In [60]:

#average of heights with the formwork boqcategory
sum1=0
j=0
for r in range(len(rcd.Height)):
    if rcd.BOQCategory[r]=='Formwork':
        if (np.isnan(rcd.Height[r])==False):
            j=j+1
            sum1=sum1+rcd.Height[r]
heb=round((sum1/j),2)
print(heb)

1.28

In [61]:

#average of heights with the protection layers boqcategory
sum1=0
j=0
for r in range(len(rcd.Height)):
    if rcd.BOQCategory[r]=='Protection Layers':
        if (np.isnan(rcd.Height[r])==False):
            j=j+1
            sum1=sum1+rcd.Height[r]
hplb=round((sum1/j),2)
print(hplb)

1.87

In [62]:

#average of heights with the reinforcement boqcategory
sum1=0
j=0
for r in range(len(rcd.Height)):
    if rcd.BOQCategory[r]=='Reinforcement':
        if (np.isnan(rcd.Height[r])==False):
            j=j+1
            sum1=sum1+rcd.Height[r]
hrb=round((sum1/j),2)
print(hrb)

1.68

Now we fill the missing Height values with these averages. The exception is the Earth Moving category, which
has no average, so we decided to fill its missing values with 0


In [63]:

for r in range(len(rcd)):
    if np.isnan(rcd.Height[r]):
        if rcd.BOQCategory[r]=='Concrete':
            rcd.Height[r] = x
        if rcd.BOQCategory[r]=='Formwork':
            rcd.Height[r] = heb
        if rcd.BOQCategory[r]=='Protection Layers':
            rcd.Height[r] = hplb
        if rcd.BOQCategory[r]=='Reinforcement':
            rcd.Height[r] = hrb
        if rcd.BOQCategory[r]=='Earth Moving':
            rcd.Height[r] = 0

C:\Users\Paco\Anaconda3\lib\site-packages\ipykernel_launcher.py:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

(The same warning is repeated for lines 4, 8, 10 and 12.)
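The SettingWithCopyWarning is raised by chained indexing such as `rcd.Height[r] = x`, which first selects a column and then assigns into the result. One possible alternative, sketched here on a toy frame rather than on the real `rcd`, fills each gap with its category mean through `groupby(...).transform('mean')` and sets the remaining Earth Moving gaps to 0 with a single `.loc` assignment.

```python
import numpy as np
import pandas as pd

# Toy stand-in for rcd (values are invented).
toy = pd.DataFrame({
    "BOQCategory": ["Concrete", "Concrete", "Formwork", "Formwork", "Earth Moving"],
    "Height": [2.0, np.nan, 1.0, np.nan, np.nan],
})

# Category mean aligned to every row, rounded to 2 decimals like the loops.
category_mean = toy.groupby("BOQCategory")["Height"].transform("mean").round(2)

# Fill gaps with the category mean; Earth Moving stays NaN (nothing to average).
toy["Height"] = toy["Height"].fillna(category_mean)

# Single explicit assignment for the category that gets the fallback value 0.
mask = (toy["BOQCategory"] == "Earth Moving") & toy["Height"].isna()
toy.loc[mask, "Height"] = 0

print(toy)
```

Because every assignment targets the frame itself (`toy[...] =` or `toy.loc[...] =`), pandas has no intermediate copy to warn about.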

Check that every missing Height value has been filled:

In [64]:

any(rcd.Height.isna())

Out[64]:

False

In [65]:

#average of lengths with the concrete boqcategory

sum1=0
j=0
for r in range(len(rcd.Length)):
    if rcd.BOQCategory[r]=='Concrete':
        if (np.isnan(rcd.Length[r])==False):
            j=j+1
            sum1=sum1+rcd.Length[r]
lcb=round((sum1/j),2)
print(lcb)

5.32

In [66]:

#average of lengths with the earthmoving boqcategory
#j=0
sum1=0
j=0
for r in range(len(rcd.Length)):
    if rcd.BOQCategory[r]=='Earth Moving':
        if (np.isnan(rcd.Length[r])==False):
            j=j+1
            sum1=sum1+rcd.Length[r]
leb=round((sum1/j),2)
print(leb)

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-66-62367c54e052> in <module>()
      8             j=j+1
      9             sum1=sum1+rcd.Length[r]
---> 10 leb=round((sum1/j),2)
     11 print(leb)

ZeroDivisionError: division by zero

In [67]:

#average of lengths with the formwork boqcategory

sum1=0
j=0
for r in range(len(rcd.Length)):
    if rcd.BOQCategory[r]=='Formwork':
        if (np.isnan(rcd.Length[r])==False):
            j=j+1
            sum1=sum1+rcd.Length[r]
lfb=round((sum1/j),2)
print(lfb)

5.23

In [68]:

#average of lengths with the protection layers boqcategory

sum1=0
j=0
for r in range(len(rcd.Length)):
    if rcd.BOQCategory[r]=='Protection Layers':
        if (np.isnan(rcd.Length[r])==False):
            j=j+1
            sum1=sum1+rcd.Length[r]
lplb=round((sum1/j),2)
print(lplb)

6.43

In [69]:

#average of lengths with the Reinforcement boqcategory

sum1=0
j=0
for r in range(len(rcd.Length)):
    if rcd.BOQCategory[r]=='Reinforcement':
        if (np.isnan(rcd.Length[r])==False):
            j=j+1
            sum1=sum1+rcd.Length[r]
lrb=round((sum1/j),2)
print(lrb)

5.15

In [70]:

for r in range(len(rcd)):
    if np.isnan(rcd.Length[r]):
        if rcd.BOQCategory[r]=='Concrete':
            rcd.Length[r] = lcb
        elif rcd.BOQCategory[r]=='Formwork':
            rcd.Length[r] = lfb
        elif rcd.BOQCategory[r]=='Protection Layers':
            rcd.Length[r] = lplb
        elif rcd.BOQCategory[r]=='Reinforcement':
            rcd.Length[r] = lrb
        elif rcd.BOQCategory[r]=='Earth Moving':
            rcd.Length[r] = 0

C:\Users\Paco\Anaconda3\lib\site-packages\ipykernel_launcher.py:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

(The same warning is repeated for lines 4, 8, 10 and 12.)

In [71]:

any(rcd.Length.isna())

Out[71]:

False

In [72]:

#average of thickness with the concrete boqcategory

sum1=0
j=0
for r in range(len(rcd.Thickness)):
    if rcd.BOQCategory[r]=='Concrete':
        if (np.isnan(rcd.Thickness[r])==False):
            j=j+1
            sum1=sum1+rcd.Thickness[r]
tcb=round((sum1/j),2)
print(tcb)

0.4

In [73]:

#average of thickness with the earth moving boqcategory
#j=0
sum1=0
j=0
for r in range(len(rcd.Thickness)):
    if rcd.BOQCategory[r]=='Earth Moving':
        if (np.isnan(rcd.Thickness[r])==False):
            j=j+1
            sum1=sum1+rcd.Thickness[r]
emb=round((sum1/j),2)
print(emb)

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-73-11c4aa3fe771> in <module>()
      8             j=j+1
      9             sum1=sum1+rcd.Thickness[r]
---> 10 emb=round((sum1/j),2)
     11 print(emb)

ZeroDivisionError: division by zero

In [74]:

#average of thickness with the formwork boqcategory

sum1=0
j=0
for r in range(len(rcd.Thickness)):
    if rcd.BOQCategory[r]=='Formwork':
        if (np.isnan(rcd.Thickness[r])==False):
            j=j+1
            sum1=sum1+rcd.Thickness[r]
teb=round((sum1/j),2)
print(teb)

0.28

In [75]:

#average of thickness with the protection layers boqcategory

sum1=0
j=0
for r in range(len(rcd.Thickness)):
    if rcd.BOQCategory[r]=='Protection Layers':
        if (np.isnan(rcd.Thickness[r])==False):
            j=j+1
            sum1=sum1+rcd.Thickness[r]
tplb=round((sum1/j),2)
print(tplb)

0.12

In [76]:

#average of thickness with the Reinforcement boqcategory

sum1=0
j=0
for r in range(len(rcd.Thickness)):
    if rcd.BOQCategory[r]=='Reinforcement':
        if (np.isnan(rcd.Thickness[r])==False):
            j=j+1
            sum1=sum1+rcd.Thickness[r]
trb=round((sum1/j),2)
print(trb)

0.39

In [77]:

for r in range(len(rcd)):
    if np.isnan(rcd.Thickness[r]):
        if rcd.BOQCategory[r]=='Concrete':
            rcd.Thickness[r] = tcb   # Concrete thickness average
        elif rcd.BOQCategory[r]=='Formwork':
            rcd.Thickness[r] = teb   # Formwork thickness average
        elif rcd.BOQCategory[r]=='Protection Layers':
            rcd.Thickness[r] = tplb  # Protection Layers thickness average
        elif rcd.BOQCategory[r]=='Reinforcement':
            rcd.Thickness[r] = trb   # Reinforcement thickness average
        elif rcd.BOQCategory[r]=='Earth Moving':
            rcd.Thickness[r] = 0

C:\Users\Paco\Anaconda3\lib\site-packages\ipykernel_launcher.py:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

(The same warning is repeated for lines 4, 8, 10 and 12.)

In [78]:

any(rcd.Thickness.isna())

Out[78]:

False
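The three dimension columns can also be checked in a single call, which makes it harder to inspect the wrong column by accident. This is a small sketch on a toy frame with invented values; only the column names are taken from `rcd`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for rcd; one Thickness gap is left in on purpose.
toy = pd.DataFrame({
    "Length": [5.0, 0.0],
    "Thickness": [0.4, np.nan],
    "Height": [1.3, 0.0],
})

# Number of missing values per dimension column.
missing = toy[["Length", "Thickness", "Height"]].isna().sum()
print(missing)
```

A column is completely filled when its count is 0.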

Now we will find out which construction element types are used for building walls in the schedule table.

In [79]:

def contains_walls(x):
    return 'walls' in x.lower()

In [80]:

list1=[]
for i in range(len(s.BOQ)):
    if contains_walls(s.BOQ[i]):
        list1.append(s.ConstructionElementType[i])
print(np.unique(list1))

['Columns' 'ConcreteWall' 'Retaining Wall']
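The same lookup can be written without an explicit loop by using the pandas string accessor. The sketch below assumes a toy frame `toy_s` in place of the schedule table `s`; its rows are invented for illustration and only the column names follow the notebook.

```python
import pandas as pd

# Toy stand-in for the schedule table `s` (rows are made up).
toy_s = pd.DataFrame({
    "BOQ": ["C30 Walls", "C30 Beams", "Formwork for walls"],
    "ConstructionElementType": ["ConcreteWall", "Beam", "Retaining Wall"],
})

# Case-insensitive substring match; missing BOQ entries count as no match.
mask = toy_s["BOQ"].str.contains("walls", case=False, na=False, regex=False)
wall_types = sorted(toy_s.loc[mask, "ConstructionElementType"].unique())
print(wall_types)
```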

We will split the variable Quantity into two groups: 'low' if the quantity is below 10 and 'high' if it is 10 or above. The labels are stored in a new column called BinQuantity.

In [81]:

BinQuantity=[]
for r in range(len(rcd.Quantity)):
    if rcd.Quantity[r]>=10.0:
        BinQuantity.append('high')
    else:
        BinQuantity.append('low')

rcd['BinQuantity'] = BinQuantity
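A vectorized version of the same binning is sketched below with `np.where`, on a toy frame with invented quantities. It uses the same threshold as the loop: values of 10 or more become 'high', everything else 'low'.

```python
import numpy as np
import pandas as pd

# Toy stand-in for rcd.Quantity (values are invented).
toy = pd.DataFrame({"Quantity": [0.5, 9.99, 10.0, 42.23]})

# Element-wise choice between the two labels.
toy["BinQuantity"] = np.where(toy["Quantity"] >= 10.0, "high", "low")
print(toy)
```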

In [82]:

rcd.head()

Out[82]:

Construction Construction
Scope Element ID Element ConstructionElementPart BOQCategory B
Type Family

C30
0 PRS_RF_STR Beam 47933 ConcreteWork RC_Beam C-C/C-CW Concrete
Bea

C30
1 PRS_RF_STR Beam 47951 ConcreteWork RC_Beam C-C/C-CW Concrete
Bea

C30
2 PRS_RF_STR Beam 47942 ConcreteWork RC_Beam C-C/C-CW Concrete
Bea

C30
3 PRS_RF_STR Beam 47960 ConcreteWork RC_Beam C-C/C-CW Concrete
Bea

C30
4 PRS_L2_STR Beam 46084 ConcreteWork RC_Beam C-C/C-CW Concrete
Bea

In [94]:

###### DATA VIS######

In [95]:

import matplotlib.pyplot as plt


import seaborn as sns
%matplotlib inline

In [96]:

# count the rows per BOQ category once and reuse the result
sizes = rcd.BOQCategory.value_counts()
labels = sizes.index

plt.pie(sizes, shadow=True, autopct='%1.1f%%')#, labels=labels)

plt.title('BOQ Category')
plt.legend(labels, bbox_to_anchor=(1, 1.05))

Out[96]:

<matplotlib.legend.Legend at 0x1449ec0fe10>

In [97]:

import seaborn as sns

plt.style.use('seaborn') # use different plot style
# we can view available styles by: print(plt.style.available)

rcd.Length = rcd.Length.astype(int) # cast column as integer

# Statistical information:
# print('Average age: {:.2f}%'.format(df.horsepower.mean()*100))
# print('Standard deviation: {:.2f}%'.format(df.horsepower.std()*100))
# print('Skewness: {:.2f}%'.format(df.horsepower.skew()*100))
# print('Kurtosis: {:.2f}%'.format(df.horsepower.kurtosis()*100))

# Distplot:
ax = sns.distplot(rcd.Length)

# Auxiliary information:
mn = rcd.Length.mean()
mx = ax.lines[0].get_ydata().max()

# Plot mean line:
ax.plot([mn]*2, [0, mx])

# Title:
ax.set_title('Length in the DataFrame')

# Annotation:
plt.annotate('mean', [mn, mx], xytext=[mn*1.1, mx*1.1], fontsize=10,
arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=.2", color=

Out[97]:

Text(5.6112,0.199828,'mean')

In [110]:

#plt.scatter(x=Q,y=T)
rcd[rcd.Quantity < 3000]
#[Q<5000,T<50000]
sns.regplot(rcd[(rcd.Quantity < 3000) & (rcd.TotalCost < 30000)].Quantity,rcd[(rcd.Q

Out[110]:

<matplotlib.axes._subplots.AxesSubplot at 0x144a25eb0b8>
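`sns.regplot` draws a scatter plot together with a fitted linear regression line. As a rough numeric counterpart, the sketch below builds the filtered frame once and fits a straight line with `np.polyfit`. It assumes that the truncated call applies the same filter to both axes; the toy frame and its values are invented.

```python
import numpy as np
import pandas as pd

# Toy stand-in for rcd; the last row plays the role of an outlier.
toy = pd.DataFrame({
    "Quantity": [1.0, 2.0, 3.0, 4.0, 5000.0],
    "TotalCost": [10.0, 20.0, 30.0, 40.0, 99999.0],
})

# Build the filtered frame once instead of repeating the mask for each axis.
subset = toy[(toy["Quantity"] < 3000) & (toy["TotalCost"] < 30000)]

# Degree-1 least-squares fit: TotalCost ~ slope * Quantity + intercept.
slope, intercept = np.polyfit(
    subset["Quantity"].to_numpy(), subset["TotalCost"].to_numpy(), 1
)
print(len(subset), slope, intercept)
```

Reusing `subset` for both axes also keeps the x and y values aligned on the same rows.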

In [ ]:

In [118]:

labels = ['ConcreteWall', 'Columns', 'Retaining Wall']
sizes = [640, 595, 551]
colors = ['green', 'yellowgreen', 'lightcoral']
explode = (0.1, 0.1, 0.1)
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        shadow=True, startangle=0, autopct='%1.1f%%')
plt.axis('equal')
plt.title('pie BOQ', size=20)

Out[118]:

Text(0.5,1,'pie BOQ')

In [126]:

rcd[rcd.TotalCost<1000]
sns.boxplot(rcd[rcd.TotalCost < 1000].BOQCategory,rcd[rcd.TotalCost < 1000].TotalCos
plt.title("TotalCost-BOQCategory")

rcd[rcd.TotalCost<1000]
rcd[rcd.TotalCost<1000]
sns.barplot(rcd[rcd.TotalCost < 1000].BOQCategory,rcd[rcd.TotalCost < 1000].TotalCos
plt.title("TotalCost-BOQCategory")

Out[126]:

Text(0.5,1,'TotalCost-BOQCategory')
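By default `sns.barplot` shows the mean of the y variable for each category, so the bar heights can be cross-checked numerically unless the truncated call passes a different estimator. The sketch below uses a toy frame with invented costs and the same `TotalCost < 1000` filter, computed once and reused.

```python
import pandas as pd

# Toy stand-in for rcd (values are invented).
toy = pd.DataFrame({
    "BOQCategory": ["Concrete", "Concrete", "Formwork", "Formwork"],
    "TotalCost": [100.0, 300.0, 50.0, 5000.0],
})

# Filter once, then reuse the result for any plot or summary.
cheap = toy[toy["TotalCost"] < 1000]

# Mean TotalCost per category among the filtered rows.
mean_cost = cheap.groupby("BOQCategory")["TotalCost"].mean()
print(mean_cost)
```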

In [106]:

sns.barplot(x=rcd[rcd.TotalCost < 1000].BOQCategory, y=rcd[rcd.TotalCost < 1000].Tot


plt.xticks(rotation='vertical')
plt.title('TotalCost vs BOQCategory') #low high

Out[106]:

Text(0.5,1,'TotalCost vs BOQCategory')

