Demo Class
Pandas Library in Python
CONTENT
1. Python and Its Features
2. Python Libraries
Summary
1. Python and Its Features
2. Python Libraries
Step-by-Step Guide to Creating Python Libraries
https://towardsdatascience.com/step-by-step-guide-to-creating-r-and-python-libraries-e81bbea87911
3. Pandas in Data Analysis
https://dataindependent.com/pandas/
A Boolean array of the same size as the index can be used for filtering: rows whose entry is True are selected.
- df.loc[~df.index.isin([11])] excludes the row whose index label is 11.
- df[df['A'] > 0] selects only the rows where column A is positive.
- df[(df['A'] > 2) & (df['B'] < 3)] combines two conditions; the parentheses around each comparison are required.
- df.query('a not in b') can be used instead of df[~df['a'].isin(df['b'])].
- df.query('a in b and c < d') can be used instead of df[df['a'].isin(df['b']) & (df['c'] < df['d'])].
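The filtering expressions above can be tried on a small example. This is only a sketch: the sample values, the index labels and the variable names are made up for illustration and do not come from the slides.

```python
import pandas as pd

# Hypothetical sample data; values and index labels are illustrative only.
df = pd.DataFrame(
    {"A": [-1, 3, 5, 0], "B": [4, 2, 1, 7]},
    index=[10, 11, 12, 13],
)

without_11 = df.loc[~df.index.isin([11])]      # every row except index label 11
positive_a = df[df["A"] > 0]                   # rows where A is positive
combined = df[(df["A"] > 2) & (df["B"] < 3)]   # each comparison needs its own parentheses

# A second hypothetical frame with lowercase columns for query().
df2 = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [2, 2, 4, 9], "c": [0, 5, 1, 1], "d": [1, 6, 2, 0]}
)

not_in_b = df2.query("a not in b")             # same rows as df2[~df2["a"].isin(df2["b"])]
in_b_and_c_lt_d = df2.query("a in b and c < d")

print(without_11, positive_a, combined, not_in_b, in_b_and_c_lt_d, sep="\n\n")
```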
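df.where(df.A>0) returns a DataFrame of the same shape as df, with the rows that fail the condition replaced by NaN, whereas df[df.A>0] returns only the rows that match the condition.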
To select only rows with null values, use df[df.isnull().any(axis=1)].
Use df[~df.isnull().any(axis=1)] to do the reverse.
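A short sketch of the difference between where() and Boolean indexing, and of selecting rows by null values. The data and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column B.
df = pd.DataFrame({"A": [1.0, -2.0, 3.0], "B": [np.nan, 5.0, 6.0]})

masked = df.where(df["A"] > 0)   # same shape as df; the failing row becomes all NaN
filtered = df[df["A"] > 0]       # only the matching rows are kept

rows_with_null = df[df.isnull().any(axis=1)]      # rows containing at least one null
rows_without_null = df[~df.isnull().any(axis=1)]  # the reverse; same rows as df.dropna()

print(masked, filtered, rows_with_null, rows_without_null, sep="\n\n")
```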
For a DataFrame with a MultiIndex, the index's isin() method can be used.
For example, df.loc[df.index.isin([('a','one'), ('b','two')])] matches complete index tuples, and df.loc[df.index.isin(['a','c'], level=0)] matches values of the outer level only.
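The two MultiIndex selections can be sketched as follows. The level names 'first' and 'second' and the sample values are assumptions made for this example.

```python
import pandas as pd

# Hypothetical two-level index; level names are illustrative.
index = pd.MultiIndex.from_tuples(
    [("a", "one"), ("a", "two"), ("b", "one"), ("b", "two"), ("c", "one")],
    names=["first", "second"],
)
df = pd.DataFrame({"value": [1, 2, 3, 4, 5]}, index=index)

# Match complete (first, second) tuples.
exact_pairs = df.loc[df.index.isin([("a", "one"), ("b", "two")])]

# Match on the outer level only.
outer_level = df.loc[df.index.isin(["a", "c"], level=0)]

print(exact_pairs, outer_level, sep="\n\n")
```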
df.groupby('A').agg(['sum', 'mean'])  # apply both aggregations to every non-key column
df.groupby('A').agg({'C': 'sum', 'D': lambda x: np.std(x)})  # a different aggregation per column
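A minimal sketch of both aggregation styles on made-up data. It uses string aggregation names, which pandas accepts in place of NumPy functions, and writes the standard deviation as x.std(ddof=0), the population standard deviation that np.std computes by default.

```python
import pandas as pd

# Hypothetical data: A is the grouping key, C and D are numeric columns.
df = pd.DataFrame(
    {"A": ["x", "x", "y", "y"], "C": [1, 2, 3, 4], "D": [1.0, 3.0, 2.0, 2.0]}
)

# Same list of aggregations applied to every non-key column;
# the result has two-level column labels such as ("C", "sum").
multi = df.groupby("A").agg(["sum", "mean"])

# A different aggregation per column, given as a dict.
per_column = df.groupby("A").agg({"C": "sum", "D": lambda x: x.std(ddof=0)})

print(multi, per_column, sep="\n\n")
```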
• df.pivot(index='foo', columns='bar', values='baz'): Column 'foo' becomes the index, the values of 'bar'
become the new columns, and the values of 'baz' become the values of the new DataFrame. A more
general API is df.pivot_table(), which allows duplicate values for an index/column pair.
• df.melt(id_vars=['A','B']): The two id columns are retained and all other columns are spread into rows. Two
new columns are introduced: 'variable' (holding the old column names) and 'value'. The original
index can be retained, but its values will be duplicated.
• df.stack(): The columns become a new inner-most index level. If the DataFrame has hierarchical
column labels, the level can be specified as an argument.
• df.unstack('second') or df.unstack(1): The values of index level 'second' are spread into new columns. If no
argument is supplied, the inner-most index level is spread.
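The four reshaping calls above can be sketched on small frames. The column names 'foo', 'bar' and 'baz' follow the pivot example; all values are invented, and unstack is given the level name 'bar' here because this sample index has no level called 'second'.

```python
import pandas as pd

# Hypothetical long-format data for pivot().
long_df = pd.DataFrame(
    {
        "foo": ["one", "one", "two", "two"],
        "bar": ["A", "B", "A", "B"],
        "baz": [1, 2, 3, 4],
    }
)
wide = long_df.pivot(index="foo", columns="bar", values="baz")  # rows: foo, columns: A and B

# Hypothetical wide-format data for melt().
df = pd.DataFrame({"A": [1, 2], "B": ["x", "y"], "C": [10, 20], "D": [30, 40]})
melted = df.melt(id_vars=["A", "B"])  # columns: A, B, variable, value

stacked = wide.stack()              # columns move into the inner-most index level
unstacked = stacked.unstack("bar")  # the 'bar' level is spread back into columns

print(wide, melted, stacked, unstacked, sep="\n\n")
```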
result = pd.merge(df1, df2, how="left",
on=["key1", "key2"])
or
result = pd.merge(df1, df2, left_on='county_ID',
right_on='countyid')
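Both merge calls can be tried on small frames. The key column names come from the slide; every other column name and all values are hypothetical.

```python
import pandas as pd

# Left join on two shared key columns.
df1 = pd.DataFrame({"key1": ["a", "a", "b"], "key2": [1, 2, 1], "left_val": [10, 20, 30]})
df2 = pd.DataFrame({"key1": ["a", "b"], "key2": [1, 1], "right_val": [100, 300]})
left_joined = pd.merge(df1, df2, how="left", on=["key1", "key2"])
# Every row of df1 is kept; right_val is NaN where df2 has no matching key pair.

# Join on key columns that are named differently in each frame.
counties = pd.DataFrame({"county_ID": [1, 2, 3], "name": ["North", "South", "East"]})
stats = pd.DataFrame({"countyid": [2, 3, 4], "population": [500, 700, 900]})
matched = pd.merge(counties, stats, left_on="county_ID", right_on="countyid")
# The default how="inner" keeps only keys present in both frames,
# and both key columns appear in the result.

print(left_joined, matched, sep="\n\n")
```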
Data Mining
Pipeline and Preprocessing
Exploration and Visualization
Feature Scaling & Engineering
Machine Learning Recommendation System
Regression and Classification
Unsupervised Learning
Ensemble Learning
Association Learning
RNN & LSTM & Deep Neural Network
Deep Learning
Computer Vision Segmentation
Transfer Learning
One and Two-Stage Object Detection
Natural Language Processing
Acoustic Modelling and Processing
Time Series Analysis
Recommendation System
Reinforcement Learning
5. VTCA AI SPECIALIST - DATA SCIENTIST
AI Specialist - Data Scientist
(141 hours - Online Course)
Data Mining
Pipeline and Preprocessing
Exploration and Visualization
Feature Scaling & Engineering
Machine Learning Recommendation System
Regression and Classification
Unsupervised Learning
Association & Ensemble Learning
RNN & LSTM & Deep Neural Network
SQL and NoSQL
Data Engineer Database Management
SQL, Execution Plan and Optimization
Database System Management
Big Data Analysis
Web Mining and Security
Data Scientist
Natural Language Processing
Time Series Analysis
Recommendation System
Reinforcement Learning
6. Analytics for BI/BA
Business Intelligence Analyst
(96 hours - Online Course)
BI/BA Life Cycle & Strategy
Data Warehouse Operations
Data Mining in BI/BA
Data Collection & Transformation
Exploratory Data Analysis
Data Analysis Expressions
Reporting and Dashboard
Integrating with Azure ML
Data Analytic Pipeline
Descriptive Analytics
Predictive Analytics
Diagnostic Analytics
Prescriptive Analytics
BI&BA Solution
Banking and Finance
HealthCare
Social Media
Transport & Logistics
THANK YOU