Data analysis using the Python programming language
Python is an interpreted, high-level programming language and one of the most popular languages in use today. It is a multi-paradigm language that supports object-oriented, imperative, procedural, and functional programming. Although Python is often described as a scripting language, its capabilities are much broader: it is widely used for numerical computing, data analysis, statistics, and plotting. This course focuses on the Python tools used for data analysis.
Target Audience
Users who need to perform data analysis, from acquiring data from various sources, through processing and analysis, to creating reports.
Course Objective
To introduce participants to the technologies used for data collection and data analysis. The course focuses on the Python programming language and related tools: IPython, Jupyter Notebook, NumPy, and pandas.
Course Outline
Introduction to the IPython environment
- Online data analysis tools
- Jupyter Notebook
- Architecture
- Installation
- JupyterLab
- Exporting outputs to PDF and other formats
Overview of data structures in Python
- Variable
- Array
- Structure
- Object
- List
- Tuple
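As an illustration of the last two items, the following minimal sketch contrasts a mutable list with an immutable tuple. The variable names are chosen for this example only.

```python
# A list is mutable: elements can be added or changed in place.
measurements = [3.2, 4.1, 5.0]
measurements.append(6.3)

# A tuple is immutable: assigning to an element raises TypeError.
point = (10, 20)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(measurements)
print(point, tuple_changed)
```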
Pandas Library
- Displaying the contents of data frames
- Graph plotting and data validation
- Working with data series
NumPy Library
- Data types of elements
- Array constructors
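A short sketch of what this block covers, assuming NumPy is installed. It shows a few common array constructors and how the element type (dtype) can be inspected or set; the values are illustrative only.

```python
import numpy as np

zeros = np.zeros((2, 3))                    # 2x3 array filled with float64 zeros
steps = np.arange(0, 10, 2)                 # integers 0, 2, 4, 6, 8
grid = np.linspace(0.0, 1.0, 5)             # 5 evenly spaced values from 0 to 1
small = np.array([1, 2, 3], dtype=np.int8)  # explicitly chosen element type

print(zeros.dtype, steps.dtype, grid.dtype, small.dtype)
```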
Importing data from various sources
- Tabular formats and processors (Excel, CSV)
- Databases (SQL)
- TSV
- JSON
- HTML Scraping
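As a rough sketch of the import step, assuming pandas is installed, the same small table can be loaded from CSV, TSV, and JSON text. The data is made up for this example, and in-memory strings stand in for real files.

```python
import io

import pandas as pd

csv_text = "name,score\nAnna,90\nPetr,85\n"
tsv_text = "name\tscore\nAnna\t90\nPetr\t85\n"
json_text = '[{"name": "Anna", "score": 90}, {"name": "Petr", "score": 85}]'

from_csv = pd.read_csv(io.StringIO(csv_text))
from_tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")  # TSV is CSV with a tab separator
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
```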
Data Processing
- Transformation of tabular data
- Filling in missing values
- Tracking
- Data merging
Advanced Data Processing
- Merging data frames using append, concat, merge, and join
- Stacking, unstacking
- Melting
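A minimal sketch of two of these operations, assuming pandas is installed: melting a wide table into long form and then merging it with a lookup table. The column names and values are invented for illustration.

```python
import pandas as pd

wide = pd.DataFrame({
    "city": ["Brno", "Prague"],
    "2022": [10, 20],
    "2023": [12, 25],
})

# Melting: one row per (city, year) pair instead of one column per year.
long_df = wide.melt(id_vars="city", var_name="year", value_name="value")

regions = pd.DataFrame({
    "city": ["Brno", "Prague"],
    "region": ["South Moravia", "Prague"],
})

# Merging: a left join keeps every row of long_df and adds the region column.
merged = long_df.merge(regions, on="city", how="left")

print(merged)
```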
Data grouping and aggregation
- Splitting data into groups based on selected criteria
- Applying functions
- Combining results into a data structure
- Transformation
- Filtering
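The steps in this block correspond to the split-apply-combine pattern. A minimal sketch with pandas `groupby` follows, assuming pandas is installed; the `sales` table and its columns are hypothetical.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "amount": [100, 150, 80, 120, 100],
})

groups = sales.groupby("region")  # split the rows by region

# Aggregation: one value per group.
totals = groups["amount"].sum()

# Transformation: a result aligned with the original rows
# (each amount as a share of its region's total).
share = sales["amount"] / groups["amount"].transform("sum")

# Filtering: keep only the rows of groups whose total exceeds 260.
large = groups.filter(lambda g: g["amount"].sum() > 260)

print(totals)
```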
Visualization
- Generating graphs
- Scatter, bar, and line plots
- KDE (Kernel density estimation) plot
- Plotting values from a data series
- Grouping data into graphs
Additional Topics
Time series functions
- Overview
- Timestamps
- Time spans
- Timedelta
- DatetimeIndex
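To give a sense of these building blocks, here is a small sketch assuming pandas is installed. It treats "time spans" as pandas `Period` objects, which is an assumption about what the outline means; the dates and values are illustrative.

```python
import pandas as pd

start = pd.Timestamp("2024-01-01 08:00")  # a single point in time
shift = pd.Timedelta(hours=36)            # a duration
end = start + shift

month = pd.Period("2024-01")              # a time span (here: a whole month)

# A DatetimeIndex of four consecutive days, used to index a series.
index = pd.date_range("2024-01-01", periods=4, freq="D")
series = pd.Series([1, 2, 3, 4], index=index)

# Label-based slicing by date includes both endpoints.
window = series["2024-01-02":"2024-01-03"]

print(end, month, window.sum())
```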
Participant Prerequisites
Basic programming knowledge and at least a rudimentary understanding of Python, R, or the statistical and analytical functions in Excel.
Additional Requirements
- Computer with any operating system (Linux is preferred but not required)
- Web browser
- Terminal (console)