CBT Campus' Online Skills Training Courses.

Data Science Essentials

Data Analysis Concepts

Course Number:
df_dses_a07_it_enus

Expected Duration (hours)
1.7

Lesson Objectives

start the course
perform basic math operations required by data scientists
perform basic vector math operations required by data scientists
perform basic matrix math operations required by data scientists
perform a matrix decomposition
identify different forms of data
describe probability in terms of events and sample space size
describe basic properties of outcomes
apply probability rules in calculation
identify common continuous probability distributions
identify common discrete probability distributions
apply Bayes' theorem and describe how it is used in email spam-filtering algorithms
apply random sampling to A/B tests
identify and describe various statistical measures
describe the difference between an unbiased and biased estimator
describe sampling distributions and recognize the central limit theorem
define confidence intervals and work with margins of error
carry out hypothesis tests and work with p-values
apply the chi-square test for categorical values
identify the given data set descriptions by their types
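
Example (illustrative, not taken from the course materials): the Bayes' theorem objective above can be sketched for a single-word spam check. The function name posterior_spam and all probabilities below are assumptions chosen for the illustration.

```python
def posterior_spam(p_word_given_spam, p_word_given_ham, p_spam):
    """Return P(spam | word) using Bayes' theorem."""
    p_ham = 1.0 - p_spam
    # Law of total probability: chance of seeing the word in any message.
    p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / p_word


# Hypothetical inputs: the word appears in 40% of spam and 1% of
# legitimate mail, and 20% of all mail is spam.
print(round(posterior_spam(0.40, 0.01, 0.20), 3))  # prints 0.909
```

A real spam filter would combine evidence from many words; this sketch covers only the single-word case.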

Overview/Description
Many software and programming tools are available to data scientists, but applying them effectively requires an understanding of the underlying concepts. In this course, you'll explore the data analysis concepts needed to use those tools effectively.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Classification and Machine Learning

Course Number:
df_dses_a08_it_enus

Expected Duration (hours)
1.3

Lesson Objectives

start the course
identify problems in which supervised learning techniques apply
identify problems in which unsupervised learning techniques apply
apply linear regression to machine learning problems
identify predictors in machine learning
apply logistic regression to machine learning problems
describe the use of dummy variables
use naive Bayes classification techniques
work with decision trees
describe K-means clustering
define cluster validation
define principal component analysis
describe machine learning errors
describe underfitting
describe overfitting
apply k-fold cross-validation
describe feed-forward and back-propagation in neural networks
describe SVMs and their use
choose the appropriate machine learning method for the given example problems
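
Example (illustrative, not taken from the course materials): the k-fold cross-validation objective above amounts to splitting the sample indices into k test folds and training on the rest each time. The helper name k_fold_indices is an assumption, and this sketch does not shuffle the data, which a practical implementation usually would.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Ten samples in three folds: every sample is tested exactly once.
for train, test in k_fold_indices(10, 3):
    print(len(train), test)
```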

Overview/Description
Machine learning is an area of data science that uses techniques to build models from data rather than from explicitly programmed rules. In this course, you'll explore the conceptual elements of various machine learning techniques.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Communication and Visualization

Course Number:
df_dses_a09_it_enus

Expected Duration (hours)
1.3

Lesson Objectives

start the course
choose appropriate visualization techniques
describe the difference between correlation and causation
define Simpson's paradox
communicate data science results informally
communicate data science results formally
implement strategies for effective data communication
use scatter plots
use line graphs
use bar charts
use histograms
use box plots
create a network visualization
create a bubble plot
create an interactive plot
find a data set that is well represented by a scatter plot and plot it
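
Example (illustrative, not taken from the course materials): Simpson's paradox, listed in the objectives above, appears when a trend that holds in every subgroup reverses once the subgroups are pooled. The counts below are (successes, trials) pairs chosen for the illustration, and the names groups, rate, and pooled_rate are assumptions.

```python
# (successes, trials) for two options, A and B, within two subgroups.
groups = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Return the success rate as a fraction."""
    return successes / trials


def pooled_rate(option):
    """Return the success rate of an option across all subgroups."""
    successes = sum(group[option][0] for group in groups.values())
    trials = sum(group[option][1] for group in groups.values())
    return rate(successes, trials)


# A has the higher rate within each subgroup...
for name, group in groups.items():
    print(name, round(rate(*group["A"]), 3), round(rate(*group["B"]), 3))

# ...yet B has the higher rate once the subgroups are pooled.
print("pooled", round(pooled_rate("A"), 3), round(pooled_rate("B"), 3))
```

The reversal comes from the unequal subgroup sizes: most of A's trials fall in the harder "large" subgroup.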

Overview/Description
The final step in the data science pipeline is to communicate the results or findings. In this course, you'll explore communication and visualization concepts needed by data scientists.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Exploration

Course Number:
df_dses_a05_it_enus

Expected Duration (hours)
1.0

Lesson Objectives

start the course
use csvgrep to search for matching rows in CSV data
use csvstat to explore values in CSV data
use csvsql to query CSV data like a SQL database
use gnuplot to quickly plot data on the command line
use wc to count words, characters, and lines within a text file
explore a subdirectory tree from the command line
use natural language processing to count word frequencies in a text document
take random samples from a list of records
find the top rows by value and percent in a data set
find repeated records in a data set
identify outliers using standard deviation
perform a word frequency count on a classic book from Project Gutenberg
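
Example (illustrative, not taken from the course materials): one common way to identify outliers using standard deviation, as listed in the objectives above, is to flag values that lie more than a chosen number of standard deviations from the mean. The function name find_outliers, the threshold of 2.0, and the sample readings are assumptions.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    if stdev == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mean) > threshold * stdev]


# Hypothetical readings with one obviously extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
print(find_outliers(readings))  # prints [95]
```

An extreme value inflates the mean and standard deviation it is measured against, so this rule can miss outliers in very small samples.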

Overview/Description
Once data is transformed into a usable format, the next step is to carry out preliminary exploration of it. In this course, you'll explore examples of practical tools and techniques for data exploration.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Filtering

Course Number:
df_dses_a03_it_enus

Expected Duration (hours)
1.0

Lesson Objectives

start the course
identify common filtering techniques and tools
extract date elements from common date formats
parse content types in HTTP headers
use csvcut to filter CSV data
use sed to replace values in a text data stream
drop duplicate records from data
extract headers from a JPEG image
use pdfgrep to extract data from searchable PDF files
detect invalid or impossible data combinations
parse robots.txt from a web site to decide what should and shouldn't be crawled or indexed
drop records from a CSV file based on date range
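
Example (illustrative, not taken from the course materials): dropping records from CSV data based on a date range, as in the last objective above, can be sketched with the Python standard library. The column name order_date, the function name drop_date_range, and the sample rows are assumptions, and the dates are assumed to already be in ISO 8601 (YYYY-MM-DD) form.

```python
import csv
import io
from datetime import date


def drop_date_range(rows, column, start, end):
    """Return the rows whose date in `column` falls outside [start, end]."""
    kept = []
    for row in rows:
        day = date.fromisoformat(row[column])
        if not (start <= day <= end):
            kept.append(row)
    return kept


# Hypothetical CSV content; a real script would open a file instead.
sample = io.StringIO(
    "id,order_date\n"
    "1,2016-01-15\n"
    "2,2016-02-10\n"
    "3,2016-03-05\n"
)
rows = list(csv.DictReader(sample))
kept = drop_date_range(rows, "order_date", date(2016, 2, 1), date(2016, 2, 29))
print([row["id"] for row in kept])  # prints ['1', '3']
```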

Overview/Description
Once data is gathered for data science, it is often in an unstructured or raw format and must be filtered for content and validity. In this course, you'll explore examples of practical tools and techniques for data filtering.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Gathering

Course Number:
df_dses_a02_it_enus

Expected Duration (hours)
1.2

Lesson Objectives

start the course
describe problems and software tools associated with data gathering
use curl to gather data from the Web
use in2csv to convert spreadsheet data to CSV format
use agate to extract data from spreadsheets
use agate to extract tabular data from DBF files
extract data from particular tags in an HTML document
distinguish between metadata and data
work with metadata in HTTP headers
work with Linux log files
work with metadata in email headers
perform a secure shell connection to a remote server
copy remote data using a secure copy
synchronize data from a remote server
download an HTML file and explore table data
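
Example (illustrative, not taken from the course materials): once an HTML file has been downloaded, its table data can be explored with the standard library's html.parser module. The class name TableCellParser and the sample markup are assumptions, and nested tables are not handled in this sketch.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical markup standing in for a downloaded page.
html = (
    "<table><tr><th>Name</th><th>Score</th></tr>"
    "<tr><td>alpha</td><td>1</td></tr></table>"
)
parser = TableCellParser()
parser.feed(html)
print(parser.rows)  # prints [['Name', 'Score'], ['alpha', '1']]
```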

Overview/Description
To carry out data science, you need to gather data. Extracting, parsing, and scraping data from various sources, both internal and external, is a critical first part in the data science pipeline. In this course, you'll explore examples of practical tools for data gathering.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Integration

Course Number:
df_dses_a06_it_enus

Expected Duration (hours)
0.7

Lesson Objectives

start the course
use csvjoin to concatenate CSV data
use the cat command to concatenate separate logs into a single file
sort lines in a text file
merge separate XML files into a single schema
aggregate data from a CSV file into a table of summarized values
normalize data from unstructured sources
denormalize data from a structured source
use pivot tables to cross tabulate data
insert missing values in a data set
use csvjoin to merge two compatible CSV documents into one

Overview/Description
Data integration is the last step in the data wrangling process, in which data is put into a usable, structured format for analysis. In this course, you'll explore examples of practical tools and techniques for data integration.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Science Overview

Course Number:
df_dses_a01_it_enus

Expected Duration (hours)
0.7

Lesson Objectives

start the course
define data science and what it is to be a data scientist
describe the data wrangling aspect of data science
describe the big data aspect of data science
describe the machine learning aspect of data science
use common data science terminology
recognize ways to communicate results of your data science
recall the steps in data science analysis
compare various tools and software libraries used for data science

Overview/Description
Data science differentiates itself from academic statistics and application programming by using what it needs from a variety of disciplines. In this course, you'll explore what it is to be a data scientist and study what sets data science apart from other disciplines. It prepares learners to navigate the foundational elements of data science.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work

Data Transformation

Course Number:
df_dses_a04_it_enus

Expected Duration (hours)
0.8

Lesson Objectives

start the course
convert CSV data to JSON format
convert XML data to JSON format
create SQL inserts from CSV data
extract CSV data from SQL
change delimiters in a CSV file from commas to tabs
convert basic date formats to standard ISO 8601 format
convert numeric formats within a CSV document
round floating point decimals to two places within a CSV document
use optical character recognition (OCR) to extract text from a JPEG image
use optical character recognition (OCR) to extract text from a PDF document
read various date formats and convert them to the standards-compliant ISO 8601 format
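
Example (illustrative, not taken from the course materials): converting several date formats to ISO 8601, as in the objectives above, can be sketched by trying a list of known patterns in turn. The function name to_iso8601 and the KNOWN_FORMATS tuple are assumptions; slash-separated dates are assumed to be month-first, and month names are assumed to be in English (the default C locale).

```python
from datetime import datetime

# Candidate input patterns, tried in order. Ambiguous inputs such as
# 03/04/2016 are read as month/day/year under this assumption.
KNOWN_FORMATS = ("%m/%d/%Y", "%d %B %Y", "%b %d, %Y", "%Y-%m-%d")


def to_iso8601(text):
    """Return the date in `text` as an ISO 8601 string (YYYY-MM-DD)."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # pattern did not match; try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


for raw in ("03/14/2016", "14 March 2016", "Mar 14, 2016"):
    print(to_iso8601(raw))  # prints 2016-03-14 each time
```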

Overview/Description
Once data is filtered, the next step is to transform it into a usable format. In this course, you'll explore examples of practical tools and techniques for data transformation.

Target Audience
Individuals with some programming and math experience working toward implementing data science in their everyday work