Muhd Fakhrullah 5B
1) It involves cleaning the data: removing records that are irrelevant, duplicated or
incomplete, and converting text-based data to numerical values. If we do not clean the
data, our model will learn bad patterns and produce wrong results.
3) Scikit-Learn – this library eases our work by providing ready-made implementations of common machine learning algorithms, so we do not have to program them ourselves
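The cleaning steps named in answer 1 can be sketched with pandas. This is only an illustration: the table below is made up, and the encoding male -> 1, female -> 0 is an assumption, not something given in the worksheet.

```python
import pandas as pd

# Hypothetical raw data: one duplicated row, one missing age, one text column
raw = pd.DataFrame({
    "age": [20, 20, 25, None, 31],
    "gender": ["male", "male", "female", "female", "male"],
    "genre": ["HipHop", "HipHop", "Dance", "Dance", "Classical"],
})

cleaned = raw.drop_duplicates()   # remove duplicated rows
cleaned = cleaned.dropna()        # remove incomplete rows
# Convert the text column to numbers (assumed encoding: male -> 1, female -> 0)
cleaned = cleaned.assign(gender=cleaned["gender"].map({"male": 1, "female": 0}))
cleaned = cleaned.reset_index(drop=True)
```

After these steps every remaining row is complete and unique, and the gender column holds numbers that a model can train on.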
1) import pandas as pd
df = pd.read_csv('vgsales.csv')
df
2) df.describe()
3) Input
import pandas as pd
music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
Output
import pandas as pd
music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
Y = music_data['genre']
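import pandas as pd
from sklearn.tree import DecisionTreeClassifier
music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
Y = music_data['genre']
model = DecisionTreeClassifier()
model.fit(X, Y)
# The samples to predict were not given; the training inputs are reused here as a placeholder
predictions = model.predict(X)
predictions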
7) import pandas as pd
import joblib
from sklearn.tree import DecisionTreeClassifier
music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
Y = music_data['genre']
model = DecisionTreeClassifier()
model.fit(X, Y)
joblib.dump(model, 'music-recommender.joblib')
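Once a model has been saved with joblib.dump, it can be loaded again with joblib.load instead of being retrained. A minimal sketch, assuming a small made-up table in place of music.csv (the rows and column names are hypothetical) and a temporary folder for the saved file:

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv (made-up rows, for illustration only)
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 30, 31],
    "gender": [1, 1, 0, 0, 1, 0],
    "genre": ["HipHop", "HipHop", "Dance", "Dance", "Classical", "Classical"],
})
X = music_data.drop(columns=["genre"])
Y = music_data["genre"]

model = DecisionTreeClassifier()
model.fit(X, Y)

# Save to a temporary folder so the sketch leaves no file behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "music-recommender.joblib")
    joblib.dump(model, path)
    loaded_model = joblib.load(path)

# The loaded model should give the same predictions as the original one
same = list(loaded_model.predict(X)) == list(model.predict(X))
```

Loading a saved model this way avoids training again every time a prediction is needed.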
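8) import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
Y = music_data['genre']
# Hold out part of the data for testing (test_size=0.2 is an assumption; the split was not stated)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)
model = DecisionTreeClassifier()
model.fit(X_train, Y_train)
predictions = model.predict(X_test)
# Fraction of test rows whose genre was predicted correctly
score = accuracy_score(Y_test, predictions)
score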
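Because the rows are split into training and test sets at random, the accuracy score can change from one run to the next. One way to get a steadier number is to repeat the split several times and average the scores. A minimal sketch, assuming a made-up table in place of music.csv and an assumed test_size of 0.2:

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for music.csv (made-up rows, for illustration only)
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 35, 37],
    "gender": [1, 1, 1, 0, 0, 1, 0, 0, 1, 0],
    "genre": ["HipHop", "HipHop", "HipHop", "Dance", "Dance",
              "Jazz", "Acoustic", "Acoustic", "Classical", "Classical"],
})
X = music_data.drop(columns=["genre"])
Y = music_data["genre"]

scores = []
for seed in range(5):
    # A different random_state gives a different train/test split each time
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.2, random_state=seed
    )
    model = DecisionTreeClassifier(random_state=seed)
    model.fit(X_train, Y_train)
    scores.append(accuracy_score(Y_test, model.predict(X_test)))

# Average accuracy over the five splits
mean_score = sum(scores) / len(scores)
```

Each score is a value between 0 and 1, and the mean gives a fairer picture than any single split.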