Professional Documents
Culture Documents
[9]: df = pd.read_csv("kc_house_data.csv")
df.head()
sqft_lot15
0 5650
1 7639
2 8062
3 5000
4 7503
1
[5 rows x 21 columns]
[10]: df.describe()
2
df.shape
[12]: df[df.columns[df.isnull().sum()>0]].isnull().sum()
Collecting scikit-learn-extra
Downloading scikit_learn_extra-0.3.0-cp311-cp311-win_amd64.whl.metadata (3.7
kB)
Requirement already satisfied: numpy>=1.13.3 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn-extra) (1.26.0)
Requirement already satisfied: scipy>=0.19.1 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn-extra) (1.11.4)
Requirement already satisfied: scikit-learn>=0.23.0 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn-extra) (1.3.2)
Requirement already satisfied: joblib>=1.1.1 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn>=0.23.0->scikit-learn-extra) (1.3.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
scikit-learn>=0.23.0->scikit-learn-extra) (3.2.0)
Downloading scikit_learn_extra-0.3.0-cp311-cp311-win_amd64.whl (340 kB)
---------------------------------------- 0.0/340.5 kB ? eta -:--:--
- -------------------------------------- 10.2/340.5 kB ? eta -:--:--
- -------------------------------------- 10.2/340.5 kB ? eta -:--:--
--- ----------------------------------- 30.7/340.5 kB 325.1 kB/s eta 0:00:01
------------- ------------------------ 122.9/340.5 kB 798.9 kB/s eta 0:00:01
---------------------------------------- 340.5/340.5 kB 1.8 MB/s eta 0:00:00
Installing collected packages: scikit-learn-extra
Successfully installed scikit-learn-extra-0.3.0
Note: you may need to restart the kernel to use updated packages.
3
[-0.00567521, -0.40692359, 0.17558163, …, -0.74637458,
-0.43272969, -0.18787744],
[-0.98081575, -1.50829275, -1.44745951, …, -0.13569228,
1.07009338, -0.17238527],
…,
[-0.37584455, -1.50829275, -1.77206774, …, -0.60435544,
-1.41029422, -0.39414664],
[-0.38156737, -0.40692359, 0.50018986, …, 1.02886466,
-0.84126412, -0.42051628],
[-0.58585659, -1.50829275, -1.77206774, …, -0.60435544,
-1.41029422, -0.41795257]])
sqft_living15 sqft_lot15
0 1340 5650
1 1690 7639
2 2720 8062
3 1360 5000
4 1800 7503
4
[17]: X = df.loc[:, df.columns != 'kmedoids Cluster Labels']
X.head()
5
0.40959143, 1.19928763, -0.27223402, -0.19285837]])
[19]: 0 1
1 0
2 2
3 0
4 2
Name: kmedoids Cluster Labels, dtype: int64
Collecting pywaffleNote: you may need to restart the kernel to use updated
packages.
6
Requirement already satisfied: python-dateutil>=2.7 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
matplotlib->pywaffle) (2.8.2)
Requirement already satisfied: six>=1.5 in
c:\users\lenovo\appdata\local\programs\python\python311\lib\site-packages (from
python-dateutil>=2.7->matplotlib->pywaffle) (1.16.0)
Downloading pywaffle-1.1.0-py2.py3-none-any.whl (30 kB)
Downloading fontawesomefree-6.5.1-py3-none-any.whl (25.6 MB)
---------------------------------------- 0.0/25.6 MB ? eta -:--:--
---------------------------------------- 0.0/25.6 MB 991.0 kB/s eta 0:00:26
--------------------------------------- 0.3/25.6 MB 4.2 MB/s eta 0:00:07
- -------------------------------------- 0.8/25.6 MB 6.5 MB/s eta 0:00:04
-- ------------------------------------- 1.3/25.6 MB 7.7 MB/s eta 0:00:04
--- ------------------------------------ 2.4/25.6 MB 10.9 MB/s eta 0:00:03
---- ----------------------------------- 3.2/25.6 MB 11.9 MB/s eta 0:00:02
----- ---------------------------------- 3.8/25.6 MB 12.1 MB/s eta 0:00:02
------- -------------------------------- 4.6/25.6 MB 12.7 MB/s eta 0:00:02
------- -------------------------------- 5.1/25.6 MB 12.6 MB/s eta 0:00:02
-------- ------------------------------- 5.7/25.6 MB 12.6 MB/s eta 0:00:02
---------- ----------------------------- 6.4/25.6 MB 12.9 MB/s eta 0:00:02
----------- ---------------------------- 7.1/25.6 MB 12.6 MB/s eta 0:00:02
----------- ---------------------------- 7.7/25.6 MB 12.9 MB/s eta 0:00:02
------------- -------------------------- 8.5/25.6 MB 13.0 MB/s eta 0:00:02
-------------- ------------------------- 9.0/25.6 MB 13.1 MB/s eta 0:00:02
-------------- ------------------------- 9.0/25.6 MB 13.1 MB/s eta 0:00:02
-------------- ------------------------- 9.0/25.6 MB 13.1 MB/s eta 0:00:02
---------------- ----------------------- 10.7/25.6 MB 13.9 MB/s eta 0:00:02
----------------- ---------------------- 11.3/25.6 MB 14.2 MB/s eta 0:00:02
------------------ --------------------- 11.8/25.6 MB 13.6 MB/s eta 0:00:02
------------------- -------------------- 12.3/25.6 MB 13.1 MB/s eta 0:00:02
------------------- -------------------- 12.8/25.6 MB 13.1 MB/s eta 0:00:01
--------------------- ------------------ 13.5/25.6 MB 12.8 MB/s eta 0:00:01
--------------------- ------------------ 13.9/25.6 MB 12.9 MB/s eta 0:00:01
---------------------- ----------------- 14.6/25.6 MB 12.6 MB/s eta 0:00:01
----------------------- ---------------- 14.9/25.6 MB 12.6 MB/s eta 0:00:01
------------------------ --------------- 15.8/25.6 MB 12.6 MB/s eta 0:00:01
------------------------- -------------- 16.0/25.6 MB 12.1 MB/s eta 0:00:01
------------------------- -------------- 16.6/25.6 MB 12.4 MB/s eta 0:00:01
--------------------------- ------------ 17.4/25.6 MB 12.1 MB/s eta 0:00:01
--------------------------- ------------ 17.9/25.6 MB 12.1 MB/s eta 0:00:01
---------------------------- ----------- 18.5/25.6 MB 11.9 MB/s eta 0:00:01
----------------------------- ---------- 19.0/25.6 MB 11.7 MB/s eta 0:00:01
------------------------------ --------- 19.7/25.6 MB 13.1 MB/s eta 0:00:01
------------------------------- -------- 20.4/25.6 MB 12.6 MB/s eta 0:00:01
-------------------------------- ------- 21.0/25.6 MB 12.1 MB/s eta 0:00:01
--------------------------------- ------ 21.6/25.6 MB 12.4 MB/s eta 0:00:01
---------------------------------- ----- 22.2/25.6 MB 12.4 MB/s eta 0:00:01
----------------------------------- ---- 22.9/25.6 MB 12.4 MB/s eta 0:00:01
7
------------------------------------ --- 23.6/25.6 MB 12.6 MB/s eta 0:00:01
------------------------------------- -- 24.2/25.6 MB 12.8 MB/s eta 0:00:01
-------------------------------------- - 24.9/25.6 MB 13.1 MB/s eta 0:00:01
--------------------------------------- 25.6/25.6 MB 13.1 MB/s eta 0:00:01
--------------------------------------- 25.6/25.6 MB 13.1 MB/s eta 0:00:01
--------------------------------------- 25.6/25.6 MB 13.1 MB/s eta 0:00:01
--------------------------------------- 25.6/25.6 MB 13.1 MB/s eta 0:00:01
--------------------------------------- 25.6/25.6 MB 13.1 MB/s eta 0:00:01
--------------------------------------- 25.6/25.6 MB 13.1 MB/s eta 0:00:01
---------------------------------------- 25.6/25.6 MB 9.6 MB/s eta 0:00:00
Installing collected packages: fontawesomefree, pywaffle
Successfully installed fontawesomefree-6.5.1 pywaffle-1.1.0
8
9
[24]: labels = df.groupby(["kmedoids Cluster Labels"], as_index=False).
↪mean()[["kmedoids Cluster Labels", "price"]]
labels
[ ]:
10