Stellar-Classifier

Introduction:
For our final project, we built a stellar classifier model. The objective is to predict the type of object being observed: star, galaxy or quasar. The project had its difficulties but was also very interesting, given that we were predicting stellar objects.

Selection of Data:
The data consists of 100,000 observations of space taken by the SDSS (Sloan Digital Sky Survey). Every observation is described by 17 feature columns and 1 class column, which identifies it as a star, a galaxy or a quasar.

The following features relate mainly to the scan/camera data rather than to the stellar object itself and were dropped: 'obj_ID', 'run_ID', 'rerun_ID', 'field_ID', 'fiber_ID', 'spec_obj_ID' and 'MJD'. These features were unnecessary for making our predictions.

The data has ten numeric features: 'alpha', 'delta', 'u', 'g', 'r', 'i', 'z', 'cam_col', 'redshift' and 'plate'. There is one categorical column, labeled 'class'. The categorical feature was encoded using OneHotEncoder and transformed so the model predictions could be more accurate.

Methods:
Tools:
- Numpy, Pandas, Matplotlib, Seaborn for analysis and visualization
- Anaconda for Jupyter Notebook
Models used with Scikit:
- KNeighborsClassifier
- KNeighborsRegressor
- RandomForestClassifier

Results:
After the study was completed, we found that the model we trained could predict with 96.3% accuracy. We consider the project successful: it answered the initial question, which was how to classify stellar objects.

Figure: Learning curve, KNN with n_neighbors=200, predictors 'g' and 'redshift'

Discussion:
After exploring the data, we found that KNN gave the best accuracy. This was achieved by increasing n_neighbors, the number of neighbors the model uses, to produce the best result. Random forest was also attempted but did not produce the result we hoped for.
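The preprocessing and n_neighbors search described above could be sketched roughly as follows. This is a minimal illustration, not the notebook we actually ran. It uses a small synthetic table in place of the SDSS file, and several things in it are assumptions: the label strings ('GALAXY', 'STAR', 'QSO'), the StandardScaler step, the candidate values of n_neighbors, and the helper names make_synthetic_frame, prepare and search_n_neighbors. Because the data is synthetic, the scores it prints say nothing about the 96.3% reported here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Column names taken from the report.
ID_COLUMNS = ["obj_ID", "run_ID", "rerun_ID", "field_ID",
              "fiber_ID", "spec_obj_ID", "MJD"]
FEATURES = ["alpha", "delta", "u", "g", "r", "i", "z",
            "cam_col", "redshift", "plate"]
# Assumption: which columns carry signal in the toy data.
INFORMATIVE = ["u", "g", "r", "i", "z", "redshift"]


def make_synthetic_frame(n_rows=600, seed=0):
    """Build a stand-in table with the report's column names (NOT real SDSS data)."""
    rng = np.random.default_rng(seed)
    labels = rng.choice(["GALAXY", "STAR", "QSO"], size=n_rows)
    shift = {"STAR": 0.0, "GALAXY": 2.0, "QSO": 4.0}
    offsets = np.array([shift[label] for label in labels])
    values = rng.normal(size=(n_rows, len(FEATURES)))
    # Shift the informative columns per class so the toy problem is learnable.
    columns = [FEATURES.index(name) for name in INFORMATIVE]
    values[:, columns] += offsets[:, None]
    frame = pd.DataFrame(values, columns=FEATURES)
    for name in ID_COLUMNS:
        frame[name] = rng.integers(0, 10_000, size=n_rows)
    frame["class"] = labels
    return frame


def prepare(frame):
    """Drop the scan/camera ID columns and split features from the target."""
    cleaned = frame.drop(columns=ID_COLUMNS)
    return cleaned[FEATURES], cleaned["class"]


def search_n_neighbors(X, y, candidates=(1, 5, 15, 50), seed=0):
    """Return {k: held-out accuracy} for each candidate n_neighbors."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    scores = {}
    for k in candidates:
        # Scaling is an added assumption: KNN is distance based, so features
        # on very different scales would otherwise dominate the distance.
        model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
        model.fit(X_train, y_train)
        scores[k] = model.score(X_test, y_test)
    return scores


frame = make_synthetic_frame()
X, y = prepare(frame)
scores = search_n_neighbors(X, y)
best_k = max(scores, key=scores.get)
print(scores, "best n_neighbors:", best_k)
```

Scoring on a held-out split rather than on the training rows matters here, because KNN with n_neighbors=1 scores perfectly on the data it was fitted on.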
KNN proved to be the best fit.

Looking ahead, future research may be able to classify more stellar objects, such as black holes and different types of stars, and maybe even gravitational pull.

Going through the top-rated notebooks on Kaggle, we saw that, apart from preprocessing the data a bit differently and using different predictors, they also used different algorithms. The popular ones are XGBClassifier, clustering and liblinear.

Summary:
This stellar prediction project used supervised classification models to predict stellar objects. The predictions are based on 10 features: 'alpha', 'delta', 'u', 'g', 'r', 'i', 'z', 'cam_col', 'redshift' and 'plate'. In conclusion, after trying several different types of models, the KNN model was the best, with 96.3% accuracy.
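One way the comparison between KNN and random forest could be made repeatable is k-fold cross-validation, sketched below. This is an illustration under stated assumptions, not the procedure used in this project: the data comes from scikit-learn's make_classification (3 classes, 10 features) rather than from SDSS, the hyperparameters are arbitrary, and compare_models is a hypothetical helper name. The printed accuracies therefore do not reproduce, or bear on, the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 3-class problem with 10 features (a stand-in, NOT SDSS data).
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=5,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    class_sep=2.0,
    random_state=0,
)

# Assumed hyperparameters, chosen only for illustration.
candidates = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def compare_models(models, X, y, folds=5):
    """Return {name: (mean accuracy, std)} over `folds` cross-validation folds."""
    results = {}
    for name, model in models.items():
        fold_scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        results[name] = (fold_scores.mean(), fold_scores.std())
    return results


results = compare_models(candidates, X, y)
for name, (mean, std) in results.items():
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Reporting the spread across folds alongside the mean helps show whether a gap between two models is larger than the fold-to-fold noise.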
