PRACTICUM
BIG DATA PROGRAMMING
BY:
NAME : SASKYA LIDAYANI
STUDENT ID (NIM) : F1A220099
GROUP : I (ONE)
Output:
Index(['id', 'bedrooms', 'sqft_living', 'sqft_lot',
'waterfront', 'view', 'condition', 'grade', 'sqft_above',
'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode',
'sqft_living15', 'sqft_lot15'], dtype='object')
Output:
Program:
X = df[['lat', 'long', 'sqft_living15', 'sqft_lot15']]
y = df['price']
Output:
X_train (11223, 14)
X_test (2806, 14)
y_train (11223,)
y_test (2806,)
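The shapes above are consistent with an 80/20 split of 14,029 rows with 14 feature columns. The call that produced them is not shown, so the following is only a sketch on placeholder data: `test_size=0.2`, `random_state=0`, and the synthetic `X` and `y` are assumptions, not the report's actual settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same size as the reported split
# (11223 + 2806 rows, 14 columns); the values themselves are random.
n_rows, n_cols = 11223 + 2806, 14
rng = np.random.default_rng(0)
X = rng.normal(size=(n_rows, n_cols))
y = rng.integers(0, 2, size=n_rows)

# Assumed settings: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

for name, part in [("X_train", X_train), ("X_test", X_test),
                   ("y_train", y_train), ("y_test", y_test)]:
    print(name, part.shape)
```

Fixing `random_state` makes the split repeatable, so later accuracy figures can be reproduced.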
Program: Evaluating on the test data
# Compare the test labels with the predictions in a confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_predict)
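As a small illustration of how `confusion_matrix` lays out its result: rows are the true classes and columns are the predicted classes. The labels below are made-up toy values, not the report's data.

```python
from sklearn.metrics import confusion_matrix

# Toy labels (hypothetical), only to show the layout of the matrix.
y_true_toy = [0, 0, 0, 1, 1]
y_pred_toy = [0, 1, 0, 1, 0]

cm_toy = confusion_matrix(y_true_toy, y_pred_toy)

# Row i, column j counts samples of true class i predicted as class j.
# For a binary problem, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = cm_toy.ravel()
print(cm_toy)
print("TN, FP, FN, TP =", tn, fp, fn, tp)
```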
Output:
Program:
# Use the SVM library to create an SVM classifier
from sklearn import svm
classifier = svm.SVC(kernel = 'linear')
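The classifier still has to be fitted before it can predict. A minimal sketch of that pattern on made-up, linearly separable points follows; the toy arrays are assumptions that stand in for the report's training data.

```python
import numpy as np
from sklearn import svm

# Hypothetical toy data: two well-separated clusters.
X_toy = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

classifier = svm.SVC(kernel="linear")
classifier.fit(X_toy, y_toy)  # learn the separating hyperplane

# Predict the class of two new points, one near each cluster.
y_predict = classifier.predict(np.array([[0.5, 0.5], [5.5, 5.5]]))
print(y_predict)
```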
Output:
precision recall f1-score support
Output:
Program:
X = df[['lat', 'long', 'sqft_living15', 'sqft_lot15']]
y = df['price']
Program:
# Define X_train and X_test
from sklearn.model_selection import train_test_split
Output:
id int64
bedrooms int64
sqft_living int64
sqft_lot int64
view int64
condition int64
grade int64
sqft_above int64
sqft_basement int64
yr_built int64
yr_renovated int64
zipcode int64
sqft_living15 int64
sqft_lot15 int64
dtype: object
Program:
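import pandas as pd
from sklearn.preprocessing import RobustScaler
# Keep the column names; the scaler returns plain NumPy arrays
cols = X_train.columns
scaler = RobustScaler()
# Fit on the training set only, then apply the same scaling to the test set
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Pass cols directly; wrapping it as [cols] would create MultiIndex columns
X_train = pd.DataFrame(X_train, columns=cols)
X_test = pd.DataFrame(X_test, columns=cols)
X_train.head()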
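With its default settings, `RobustScaler` centers each column on its median and divides by the interquartile range, which makes it less sensitive to outliers than mean and standard-deviation scaling. The sketch below checks this on a made-up column with one extreme value; the data are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical single feature with one extreme value.
X_toy = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

scaled = RobustScaler().fit_transform(X_toy)

# Manual version: (x - median) / (75th percentile - 25th percentile), per column.
median = np.median(X_toy, axis=0)
iqr = np.percentile(X_toy, 75, axis=0) - np.percentile(X_toy, 25, axis=0)
manual = (X_toy - median) / iqr

print(scaled.ravel())
print(np.allclose(scaled, manual))
```

The outlier stays large after scaling, but it does not distort the scale of the other values, because the median and quartiles ignore it.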
Output:
Program:
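# Train a Gaussian Naive Bayes classifier on the training set
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()  # the name gnb is assumed
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
y_pred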
Output:
array([0, 0, 0, ..., 0, 0, 0])
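To make it clearer what a fitted Gaussian Naive Bayes model stores, the sketch below fits one on a made-up one-dimensional dataset and inspects the learned class priors and per-class means. The toy data and variable names are hypothetical.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: four samples of class 0, two of class 1.
X_toy = np.array([[1.0], [1.2], [0.8], [1.0], [5.0], [5.2]])
y_toy = np.array([0, 0, 0, 0, 1, 1])

gnb_toy = GaussianNB()
gnb_toy.fit(X_toy, y_toy)

# Priors are the class frequencies; theta_ holds the per-class feature means.
print("class priors:", gnb_toy.class_prior_)
print("class means:", gnb_toy.theta_.ravel())

# New points are assigned to the class with the higher posterior probability.
pred_toy = gnb_toy.predict(np.array([[1.1], [4.9]]))
print(pred_toy)
```

Because the priors follow the class frequencies, a heavily imbalanced target pulls predictions toward the majority class unless the features separate the classes well.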
Program:
from sklearn.metrics import accuracy_score
Output:
Training-set accuracy score: 0.9665
Training set score: 0.9665
Test set score: 0.9641
0 4183
1 26
Name: waterfront, dtype: int64
Program:
# Null accuracy: always predict the majority class (0), using the y_test counts above
null_accuracy = (4183/(4183+26))
Output:
[[4035  148]
 [   3   23]]
Program:
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
Output:
precision recall f1-score support
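The rows of a classification report can be derived by hand from the confusion matrix printed earlier, [[4035, 148], [3, 23]], assuming rows are true classes and columns are predicted classes (scikit-learn's convention). The sketch below works through that arithmetic. It reproduces the reported test score of 0.9641 and also shows that always predicting class 0 would score higher, so accuracy alone says little on data this imbalanced.

```python
# Confusion matrix from the report: rows = true class, columns = predicted class.
cm = [[4035, 148],
      [3, 23]]
tn, fp = cm[0]
fn, tp = cm[1]
total = tn + fp + fn + tp

accuracy = (tn + tp) / total
# Null accuracy: always predict the majority class.
null_accuracy = max(tn + fp, fn + tp) / total


def prf(tp_, fp_, fn_):
    """Precision, recall and F1 for one class."""
    precision = tp_ / (tp_ + fp_)
    recall = tp_ / (tp_ + fn_)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Class 1 treats waterfront = 1 as the positive class; class 0 swaps the roles.
p1, r1, f1_1 = prf(tp, fp, fn)
p0, r0, f1_0 = prf(tn, fn, fp)

print(f"accuracy      {accuracy:.4f}")
print(f"null accuracy {null_accuracy:.4f}")
print(f"class 0: precision {p0:.2f} recall {r0:.2f} f1 {f1_0:.2f} support {tn + fp}")
print(f"class 1: precision {p1:.2f} recall {r1:.2f} f1 {f1_1:.2f} support {fn + tp}")
```

The low precision for class 1 comes from the 148 false positives, which far outnumber the 23 true positives.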
Interpretation: