ML Activity 3
Participating Students:
BETB116 Sandhya Awari
BETB120 Harinakshi Kumbhare
Code:
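import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
# Load the dataset (columns: Label, EmailText) and print summary statistics
dataframe = pd.read_csv("spam.csv")
print(dataframe.describe())
x = dataframe["EmailText"]
y = dataframe["Label"]
# First 4457 of 5572 rows (about 80%) for training, the remainder for testing
x_train, y_train = x[0:4457], y[0:4457]
x_test, y_test = x[4457:], y[4457:]
# Bag-of-words counts; the vocabulary is learned from the training text only
cv = CountVectorizer()
features = cv.fit_transform(x_train)
# The original listing never defines `model`. The output implies a grid search over an SVC;
# this parameter grid is an assumption that merely contains the reported best parameters.
tuned_parameters = {"kernel": ["rbf", "linear"], "gamma": [1e-3, 1e-4], "C": [1, 10, 100, 1000]}
model = GridSearchCV(SVC(), tuned_parameters)
model.fit(features, y_train)
print(model.best_params_)
# Accuracy on the held-out rows, vectorized with the training vocabulary
print(model.score(cv.transform(x_test), y_test))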
Output:
        Label               EmailText
count    5572                    5572
unique      2                    5169
top       ham  Sorry, I'll call later
freq     4825                      30
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\model_selection\_split.py:1978: FutureWarning: The default value of cv will change from 3 to 5 in version 0.22. Specify it explicitly to silence this warning.
  warnings.warn(CV_WARNING, FutureWarning)
{'C': 1000, 'gamma': 0.0001, 'kernel': 'rbf'}
0.9865470852017937
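
Note on the output:
The FutureWarning appears because the grid search was run without an explicit number of cross-validation folds. The sketch below is one way to pass it explicitly; it is not the code that produced the output above. It assumes a small made-up set of messages (not taken from spam.csv), an assumed parameter grid, and cv=3, which is the old default named in the warning. The last lines only redo the arithmetic behind the numbers printed above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical toy messages for illustration only (not from spam.csv)
texts = [
    "win a free prize now", "free cash offer click now",
    "claim your free prize today", "urgent you win cash now",
    "free entry win a phone", "click now for a cash prize",
    "are we meeting for lunch", "call me when you get home",
    "see you at the lecture", "sorry i will call later",
    "can you send me the notes", "i am on my way home",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Learn the vocabulary and build the word-count matrix
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(texts)

# Assumed grid; it only needs to contain the values reported in the output
param_grid = {"kernel": ["rbf", "linear"], "gamma": [1e-3, 1e-4], "C": [1, 10, 100, 1000]}

# Passing cv explicitly avoids relying on the version-dependent default
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(features, labels)

best_params = search.best_params_
prediction = search.predict(vectorizer.transform(["free prize now"]))[0]
print(best_params, prediction)

# Arithmetic on the figures reported in the output above
test_rows = 5572 - 4457                            # rows held out for testing
correct = round(0.9865470852017937 * test_rows)    # correctly classified test rows
ham_share = 4825 / 5572                            # share of "ham" in the whole dataset
print(test_rows, correct, round(ham_share, 4))
```

The reported score therefore corresponds to 1100 correct predictions out of 1115 test rows. Because about 86.6% of all rows are labelled ham, always predicting ham would already be right for most messages, so the score is best read against that baseline.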