### Python for Data Science: Advanced Data Visualization Using Seaborn

**Lesson Objectives**

- work with Seaborn to glean patterns in a dataset by visualizing the relationships between several pairs of variables
- define the aesthetic parameters for a plot and make use of Seaborn's built-in templates for creating shareable graphs
- recognize what a normal distribution is and what is defined as an outlier
- use boxplots and violin plots to visualize the distributions of data within specific categories of your dataset
- compare the use cases for swarm plots, bar plots, strip plots, and categorical plots
- create a FacetGrid to visualize distributions within a range of categories
- configure a FacetGrid to convey more information and to draw one's focus to specific plots
- describe what a color palette is and select from the built-in color palettes available
- identify the kinds of color palettes to use depending on the type of data they will represent
- recall different ways to visualize data within categories and identify use cases for specific aesthetic parameters

**Overview/Description**

Explore how to analyze continuous and categorical variables in a dataset using various plotting options in Seaborn. These include box and violin plots, FacetGrids, and aesthetic elements such as color palettes.
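As a rough illustration of the box and violin plots named above, the following sketch draws both for the same categorical data. The DataFrame, its column names (`group`, `value`), and the chosen style and palette are made up for this example and are not taken from the course material.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two categories, one of them with an obvious outlier (9).
df = pd.DataFrame({
    "group": ["a"] * 6 + ["b"] * 6,
    "value": [1, 2, 2, 3, 3, 9, 4, 5, 5, 6, 6, 7],
})

# Pick one of the built-in styles and color palettes.
sns.set_theme(style="whitegrid", palette="muted")

fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharey=True)

# The box plot summarizes quartiles and flags points beyond the whiskers.
sns.boxplot(data=df, x="group", y="value", ax=axes[0])
axes[0].set_title("Box plot")

# The violin plot adds a kernel density estimate of each category's distribution.
sns.violinplot(data=df, x="group", y="value", ax=axes[1])
axes[1].set_title("Violin plot")

fig.tight_layout()
```

Drawing both on a shared y-axis makes it easier to see that the two plot types describe the same distribution at different levels of detail.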

**Target**

**Prerequisites: none**

### Python for Data Science: Advanced Operations with NumPy Arrays

**Lesson Objectives**

- identify different ways in which arrays can be split up
- describe how grayscale and color images can be represented as multi-dimensional arrays
- perform some basic image manipulation after converting images to arrays
- create a view into a NumPy array and describe the relationship between views and their base arrays
- compare deep copies of arrays with views and know when to use each of them
- use fancy indexing with arrays using an index mask
- use fancy indexing to analyze real-world data
- apply boolean masks to access array elements which fulfil a specific condition
- use structured arrays in order to store heterogeneous data
- describe how operations can be performed between arrays of mismatched shapes using broadcasting
- perform operations between arrays of mismatched shapes by applying broadcasting rules
- utilize NumPy to perform multi-dimensional array operations

**Overview/Description**

NumPy is a Python library for working with arrays in scientific computing. Explore advanced array operations such as image manipulation, fancy indexing, views, and broadcasting.
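A minimal sketch of several of the operations listed above (views versus copies, boolean masks, fancy indexing, and broadcasting) might look like the following. The array contents and variable names are chosen only for illustration.

```python
import numpy as np

base = np.arange(12).reshape(3, 4)

# Basic slicing returns a view that shares memory with the original data.
view = base[:, :2]
view[0, 0] = 100            # the change is visible through base as well

# copy() returns an independent deep copy.
independent = base.copy()
independent[0, 1] = -1      # base is not affected

# Boolean mask: select the elements that satisfy a condition.
evens = base[base % 2 == 0]

# Fancy indexing: select rows 0 and 2 with a list of indices (returns a copy).
picked_rows = base[[0, 2]]

# Broadcasting: the shape (4,) array is stretched across each row of the (3, 4) array.
offsets = np.array([10, 20, 30, 40])
shifted = base + offsets
```

Writing through `view` changes `base`, while writing to `independent` does not. That difference is the practical reason to choose between a view and a copy.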

**Target**

**Prerequisites: none**

### Python for Data Science: Basic Data Visualization Using Seaborn

**Lesson Objectives**

- describe what Seaborn is and how it relates to other data science libraries in Python
- install Seaborn and load a dataset for analysis
- define and plot the distribution of a single variable using a histogram and kernel density estimate curve
- configure a univariate distribution's appearance, including color, size, and the components of the plot
- analyze the relationship between two variables by plotting a bivariate distribution
- distinguish between scatter plots, hexbin plots, and KDE plots
- use the Seaborn pair plot to generate a grid to plot the relationship between multiple pairs of variables in your dataset
- perform a regression analysis on a pair of variables in your dataset by using the Seaborn lmplot
- describe the basic aesthetic themes and styles available in Seaborn
- recall some of the use cases and features of Seaborn

**Overview/Description**

Seaborn is a data visualization library that provides a high-level interface for drawing graphs. These graphs can convey a lot of information while also being visually appealing. Explore Seaborn's basic plots and aesthetics.

**Target**

**Prerequisites: none**

### Python for Data Science: Introduction to NumPy for Multi-dimensional Data

**Lesson Objectives**

- identify the applications of NumPy
- install NumPy and learn how to create basic NumPy arrays
- create specialized NumPy arrays
- describe how arrays of different shapes and sizes can be displayed
- explore the different mathematical operations available when working with arrays
- work with functions which apply to each element of an array
- retrieve specific parts of an array using row and column indices
- describe the options available when iterating over 1-dimensional and multi-dimensional arrays
- perform reshape operations on arrays to visualize their contents in different ways
- utilize NumPy to perform basic array manipulation

**Overview/Description**

NumPy is a Python library for working with arrays in scientific computing. Explore how to initialize and load data into arrays and learn about basic array manipulation operations using NumPy.
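The basic operations mentioned above could be sketched as follows. The array values are arbitrary and serve only to show array creation, element-wise math, indexing, iteration, and reshaping.

```python
import numpy as np

# Basic and specialized array creation.
a = np.array([[1, 2, 3], [4, 5, 6]])
zeros = np.zeros((2, 3))
steps = np.linspace(0.0, 1.0, 5)      # 5 evenly spaced values from 0 to 1

# Arithmetic and universal functions apply to each element.
doubled = a * 2
roots = np.sqrt(a)

# Retrieve parts of an array with row and column indices.
first_row = a[0]
second_col = a[:, 1]

# Iterating over a 2-D array yields its rows; .flat yields every element.
row_sums = [int(row.sum()) for row in a]
flat_values = [int(v) for v in a.flat]

# Reshape the same six values into 3 rows and 2 columns.
reshaped = a.reshape(3, 2)
```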

**Target**

**Prerequisites: none**

### Python for Data Science: Introduction to Pandas

**Lesson Objectives**

- understand the various applications of Pandas and why it is a building block in the field of data science
- install Pandas and create a Pandas Series
- work with Pandas Series by accessing elements using the default and a custom index
- define a Pandas DataFrame and describe how data can be stored and accessed in these data structures
- initialize and populate a simple Pandas DataFrame
- load data into a DataFrame from a CSV file
- edit individual cells and entire rows and columns in a Pandas DataFrame
- access specific rows and columns of a Pandas DataFrame using the index and labels
- access parts of a Pandas DataFrame based on specific conditions
- describe the concept of a hierarchical index, or multi-index, and why it can be useful
- re-orient a DataFrame as a pivot table to better visualize data
- apply a multi-index to a DataFrame and reshape it using the stack and melt operations
- work with Pandas for basic tabular data manipulation

**Overview/Description**

Discover how to work with series and tabular data, including initialization, population, and manipulation of Pandas Series and DataFrames.
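As one possible illustration of these operations, the sketch below creates a Series with a custom index, populates a small DataFrame, edits and filters it, and then reshapes it with a multi-index, a pivot, and `melt`. All data values and column names (`city`, `year`, `sales`) are invented for the example.

```python
import pandas as pd

# A Series with a custom index, accessed by label and by position.
prices = pd.Series([1.5, 2.0, 0.75], index=["apple", "pear", "plum"])
by_label = prices["pear"]
by_position = prices.iloc[0]

# A small DataFrame populated from a dictionary of columns.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 12, 7, 9],
})

# Edit a single cell, then select rows that meet a condition.
df.loc[0, "sales"] = 11
rome = df[df["city"] == "Rome"]

# A hierarchical (multi-level) index built from two columns.
indexed = df.set_index(["city", "year"])
rome_2021 = indexed.loc[("Rome", 2021), "sales"]

# Re-orient as a pivot table: one row per city, one column per year.
wide = df.pivot(index="city", columns="year", values="sales")

# melt turns the wide table back into long form.
long_form = wide.reset_index().melt(
    id_vars="city", var_name="year", value_name="sales"
)
```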

**Target**

**Prerequisites: none**

### Python for Data Science: Manipulating and Analyzing Data in Pandas DataFrames

**Lesson Objectives**

- learn how to iterate over a DataFrame's rows and columns
- export the contents of a DataFrame into files of various formats
- describe and apply the different techniques involved in handling datasets where some information is missing
- implement a hierarchical index and access the DataFrame's contents based on that index
- combine two similar DataFrames using the concat operation
- apply a join operation on two related but dissimilar DataFrames using the merge function
- load data into a Pandas DataFrame from a table in a relational database
- use Pandas for advanced tabular data manipulation

**Overview/Description**

Explore different ways to iterate over and sort Pandas DataFrames. Discover how to handle missing data and perform grouping operations, as well as how to combine data from different DataFrames using join and concatenate operations.
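A brief sketch of these operations, under the assumption of two small hypothetical tables (`orders` and `customers`), could look like this. It covers missing-data handling, concatenation, a join via `merge`, grouping, sorting, and row iteration.

```python
import numpy as np
import pandas as pd

# Hypothetical tables: one order has a missing amount.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [20.0, np.nan, 15.0, 30.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 4],
    "name": ["Ana", "Ben", "Dee"],
})

# Handling missing data: count it, fill it, or drop the affected rows.
missing = int(orders["amount"].isna().sum())
filled = orders.fillna({"amount": 0.0})
dropped = orders.dropna(subset=["amount"])

# concat stacks DataFrames that share the same columns.
more_orders = pd.DataFrame({"customer_id": [4], "amount": [5.0]})
all_orders = pd.concat([orders, more_orders], ignore_index=True)

# merge joins related but differently shaped DataFrames on a key column.
joined = filled.merge(customers, on="customer_id", how="left")

# Grouping and sorting.
totals = joined.groupby("customer_id")["amount"].sum()
ranked = joined.sort_values("amount", ascending=False)

# Iterating over rows as named tuples.
amounts = [row.amount for row in joined.itertuples(index=False)]
```

The left join keeps every order, so an order whose customer is absent from `customers` ends up with a missing `name` instead of being dropped.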

**Target**

**Prerequisites: none**