Introduction to Data Science
DataFrame - Pandas
Midterm Exam (UTS) Score
Good job!
DataFrame
A DataFrame can be thought of as a two-dimensional
table, much like a sheet of data in an Excel file.
It is essentially a collection of Series that share
the same index. It stores tabular data in which
each row is an observation and each column
represents a variable.
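As a minimal sketch of this idea, the snippet below builds a small DataFrame from a dictionary and checks that a single column, and a single row, comes back as a Series. The student names and values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data: each row is one observation (a student),
# each column is one variable.
df = pd.DataFrame(
    {
        "name": ["Ana", "Budi", "Citra"],
        "age": [20, 21, 19],
        "score": [85.5, 78.0, 92.0],
    }
)

print(df.shape)           # (number of rows, number of columns)
print(type(df["score"]))  # a single column is a pandas Series
print(df.iloc[0])         # a single row is also returned as a Series
```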
DataFrame in Pandas
Two-dimensional, size-mutable, potentially
heterogeneous tabular data.
The data structure also contains labeled axes (rows
and columns). Arithmetic operations align on
both row and column labels. It can be thought of
as a dict-like container for Series objects. It is the
primary pandas data structure.
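The properties above can be illustrated with a short sketch. The two small frames `left` and `right` and their labels are hypothetical; the point is that addition matches values by row and column label, not by position, and that a DataFrame can be queried like a dictionary of columns.

```python
import pandas as pd

left = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["a", "b"])
right = pd.DataFrame({"y": [10, 20], "z": [30, 40]}, index=["b", "c"])

# Arithmetic aligns on both row and column labels. Only the cell that
# exists in both frames (row "b", column "y") gets a value; every
# other cell in the union of labels becomes NaN.
total = left + right
print(total)

# Dict-like behaviour: column labels act as keys.
print("x" in left)        # membership test on column labels
print(list(left.keys()))  # column labels

# Size-mutable and potentially heterogeneous: add a string column
# next to the integer columns.
left["label"] = ["p", "q"]
print(left.dtypes)
```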
DataFrame Basic Utilization
1. Read data
2. View the data
3. Understand some basic information about the data
4. Data Selection – Indexing and Slicing data
5. Data Selection – Based on Conditional filtering
6. Groupby operations
7. Sorting operation
8. Dealing with missing values
9. Dropping columns and null values
10. apply() functions
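The ten steps above can be sketched in one short script. This is only an illustration under stated assumptions: the CSV content, the column names, and the variable names are all invented, and the data is inlined with `io.StringIO` so the code runs without a file on disk. In practice `pd.read_csv` would usually be given a file path.

```python
import io
import pandas as pd

# Hypothetical CSV content (made-up rows, not a real dataset).
csv_text = """title,type,release_year,duration_min,rating
Alpha,Movie,2019,95,PG
Beta,TV Show,2020,,TV-14
Gamma,Movie,2020,120,
Delta,Movie,2018,88,R
Epsilon,TV Show,2019,,TV-MA
"""

# 1. Read data
df = pd.read_csv(io.StringIO(csv_text))

# 2. View the data
print(df.head(3))

# 3. Basic information about the data
df.info()
print(df.describe())

# 4. Indexing and slicing
first_two = df.iloc[0:2]                 # by position, end exclusive
subset = df.loc[1:3, ["title", "type"]]  # by label, end inclusive

# 5. Conditional filtering
recent_movies = df[(df["type"] == "Movie") & (df["release_year"] >= 2019)]

# 6. Groupby operation
mean_duration = df.groupby("type")["duration_min"].mean()

# 7. Sorting operation
by_year = df.sort_values("release_year", ascending=False)

# 8. Dealing with missing values
missing_counts = df.isna().sum()
filled = df.fillna({"rating": "Unknown"})

# 9. Dropping columns and null values
no_rating = df.drop(columns=["rating"])
complete_rows = df.dropna()

# 10. apply() on a column
df["decade"] = df["release_year"].apply(lambda year: year // 10 * 10)
print(df)
```

Each step returns a new object and leaves `df` unchanged, except step 10, which adds a column in place.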
AoL - Group Project

01. In this assignment, students are expected to solve a case study in a group of 4 to 5 people.
02. Students must perform exploratory data analysis and feature engineering on the chosen dataset.
03. The assignment files (Python notebook file, dataset, and PPT) must be submitted as a .rar/.zip archive to the lecturer's email at most 1 week after session 13.
Group Project
• Kaggle dataset or UCI Machine Learning Repository
  • https://www.kaggle.com/datasets
  • https://archive.ics.uci.edu/ml/datasets.php
• Scoring aspects
  • Report
    • Background
    • Exploratory Data Analysis
    • Conclusion
  • Python Code
    • Coherence with curriculum
    • Tidiness
    • Efficiency
Sample
Dataset: Netflix
• Explain why you chose this dataset
• Describe the dataset
• Explain your objective
• Explain the benefits of your analysis
Sample
• Explore the data
• Explain the results of your exploration
Sample
• Create a model or analysis based on your exploration
• Write a conclusion
