
Data analysis

Basic tasks of data analysis


• Get data
• Explore
• Clean
• Summarize
• Transform
• Model
• Deploy the model as an application
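
The tasks above can be sketched end to end on a tiny data set. This is only an illustration: the CSV content, the column names (city, area_m2, price_eur) and the choice of a straight-line fit as the "model" are assumptions made for the example, not part of the original material. Deployment is left out because it depends on the target platform.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical raw data; in practice it would come from a file, database or API.
RAW_CSV = """city,area_m2,price_eur
Helsinki,50,200000
Helsinki,75,
Tampere,60,180000
Tampere,60,180000
Turku,80,240000
"""

# Get data
df = pd.read_csv(io.StringIO(RAW_CSV))

# Explore: first rows, column types, missing values per column
print(df.head())
print(df.dtypes)
print(df.isna().sum())

# Clean: drop exact duplicate rows and rows without a price
clean = df.drop_duplicates().dropna(subset=["price_eur"]).copy()

# Summarize: mean price per city
summary = clean.groupby("city")["price_eur"].mean()
print(summary)

# Transform: derive a new column from existing ones
clean["price_per_m2"] = clean["price_eur"] / clean["area_m2"]

# Model: least-squares line price_eur ~ area_m2 (a stand-in for a real model)
slope, intercept = np.polyfit(clean["area_m2"], clean["price_eur"], deg=1)
print(f"price ≈ {slope:.0f} * area + {intercept:.0f}")
```

In a notebook each step would normally sit in its own cell, with a short Markdown cell explaining what the output shows.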
Python tools for data analysis
• JupyterLab, Jupyter Notebook and Google Colab are commonly used
for data analysis because of the communicative nature of the work
• JupyterLab can be installed as a standalone application or through
Anaconda
• The reason for using a Jupyter-like environment is that results must be explained in
plain language
• Markdown syntax can be used for text; formulas can be embedded using
LaTeX syntax (online editor)
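
As a small illustration of the last point, a Markdown cell might contain something like the following. The formulas themselves (sample mean and sample variance) are chosen only as examples; single dollar signs give an inline formula and double dollar signs give a displayed one.

```latex
The sample mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
$$
```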
Python libraries
• Pandas: creation of data frames, data summary, cleaning and
transformation
• Matplotlib, Seaborn, Plotly, Bokeh, Altair…: visualization
• Matplotlib is low level and produces static plots by default
• Matplotlib can plot almost anything, but complex plots need a lot of code
• The other libraries are higher level, and several (Plotly, Bokeh, Altair) are interactive
• Less coding, but more constraints on what you can plot
• Installation in JupyterLab uses %pip, for example %pip install plotly
• scikit-learn, NumPy, statsmodels, ...: modeling
• Dash: web application development
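
To make the "low level" remark about Matplotlib concrete, here is a minimal sketch of a labelled bar chart. The categories and counts are made up for the example. Every element (title, axis labels, value labels) is set by an explicit call, which is where the extra code comes from.

```python
import matplotlib

# Non-interactive backend so the script also runs without a display.
# In a notebook this line can be omitted; the figure is then shown inline.
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Hypothetical data for illustration
categories = ["A", "B", "C"]
counts = [12, 7, 15]

fig, ax = plt.subplots(figsize=(5, 3))
bars = ax.bar(categories, counts, color="steelblue")

# Each piece of decoration is a separate call
ax.set_title("Observations per category")
ax.set_xlabel("Category")
ax.set_ylabel("Count")

# Write the value above each bar
for bar, count in zip(bars, counts):
    ax.text(
        bar.get_x() + bar.get_width() / 2,  # horizontal centre of the bar
        bar.get_height(),                   # top of the bar
        str(count),
        ha="center",
        va="bottom",
    )

fig.tight_layout()
```

Higher-level libraries typically build a comparable chart from a data frame in one call and add hover and zoom on their own, at the cost of less control over individual elements.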
