Software Development

Python for Data Science

Python for Data Science – Complex Data Engineering in Python

sd_pyds_a02_it_enus

Python for Data Science – Introduction to Python for Data Science

sd_pyds_a01_it_enus

### Python for Data Science – Complex Data Engineering in Python

**Lesson Objectives**

- start the course
- describe the basic and common functionality of pandas for data science
- describe the primary pandas data structures
- describe hierarchical indexing in pandas
- perform basic data query operations on a pandas DataFrame
- perform aggregation operations on a pandas DataFrame
- perform basic merge operations with pandas DataFrames
- describe the functionality and use of core packages and sub-packages in the SciPy stack
- use the scikit-learn library to perform basic data standardization
- use the scikit-learn library to perform basic data normalization
- use the scikit-learn library to perform simple linear regression analysis
- use the scikit-learn library to perform supervised learning for optical recognition of handwritten digits
- use the Python matplotlib library to plot and display a simple 2D line plot and set its line properties
- use the Python matplotlib library to create and customize multiple plots in a single figure
- use the Python matplotlib library to create and customize a box plot
- use the Python matplotlib library to create and display a heat map
- use the Python matplotlib library to place legends and annotations on a 2D line plot
- use pandas to create a scatter plot matrix
- use the Python matplotlib library to create a 3D plot
- create, slice, and resample time series data in Python
- use pandas to create and manipulate Timedeltas in Python
- identify key concepts in Python data cleansing
- perform data preprocessing and text mining in Python
- use pandas to access a MySQL database
- use the SciPy package to describe various forms of probability distributions
- manage other concepts and processes in data science

**Overview/Description**

A vast toolset with many moving parts is available to data scientists, especially those using Python. This course provides a map of that toolset and dives into data analysis with pandas, including machine learning using SciPy operations, working with prediction data, and an introduction to the scikit-learn toolset. The course then moves on to visualization with Python matplotlib, time series, and many more data engineering operations.

**Target Audience**

Individuals who want a deeper understanding of data science through more advanced techniques and operations

### Python for Data Science – Introduction to Python for Data Science

**Lesson Objectives**

- start the course
- describe elements of data science and datasets with various modeling and prediction relationships
- recognize the various pipelines in data science and the stages of the data science cycle
- define and describe the various libraries and packages for data analysis
- perform the key steps involved in installing Anaconda, including all the necessary packages for this course
- describe the various Python containers for data management
- create lists, tuples, and dictionaries with Python to drive data
- use Python list comprehensions to create lists
- describe the IPython shell and shell commands
- run the Jupyter Notebook and become familiar with the basics of its user interface
- capture Python code output in Jupyter Notebook
- run the Jupyter Qt Console and become familiar with the basics of its user interface
- use IPython to perform debugging and error management on Python code
- access and use the NumPy package in a Python development environment
- describe the various components of NumPy
- describe ndarray object attributes
- describe the various NumPy array operations applicable to data science
- describe different ways of creating NumPy arrays
- describe how the pandas library may be used to read and write data in various formats
- use the pandas library to read data from a CSV file and write data out to a CSV file
- use Python's standard json module to read JSON data
- use the pandas library to generate and parse date values
- perform data cleanup by handling missing and erroneous data
- download and load a sample dataset into Python from a URL
- load a large dataset as smaller chunks by obtaining an iterator for the dataset
- recognize the main concepts in data science using Python

**Overview/Description**

This course introduces the concepts of data science and provides a brief overview of the Python skills needed to work with data. In this course, you will learn about IPython components, the Jupyter Notebook, and the NumPy module. The course then guides you toward managing financial statistics with financial big data.

**Target Audience**

This path is targeted toward individuals wanting to expand their knowledge of Python while learning data science; IT specialists aspiring to learn a new skill set; statisticians; computer scientists; and IT analysts. Python knowledge is recommended.