CBT Campus' Online Certification Training Courses.

DP-203 - Data Engineering on Microsoft Azure: Data Factory

discover the key concepts covered in this course
describe trigger concepts and how they relate to the Azure Data Factory
create an Azure Data Factory using the Azure portal
handle schema drift when mapping data flow
create Azure Data Factory pipelines and activities
trigger the pipeline manually or using a schedule
optimize pipelines for analytical or transactional purposes
insert data that does not currently exist using the Data Factory
troubleshoot Azure Stream Analytics using resource logs
design and implement incremental data loads
design and develop slowly changing dimensions
handle security and compliance requirements
summarize the key concepts covered in this course

Overview/Description
Once you have data in storage, you'll need a mechanism for transforming it into a usable format. Azure Data Factory is a data integration service for creating automated data pipelines that copy and transform data. In this course, you'll learn about the Azure Data Factory and the Integration Runtime. You'll explore the features of the Azure Data Factory such as linked services and datasets, pipelines and activities, and triggers. Finally, you'll learn how to create an Azure Data Factory using the Azure portal, create Azure Data Factory linked services and datasets, create Azure Data Factory pipelines and activities, and trigger the pipeline manually or using a schedule. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Flow Transformations

Course Number:
it_cldema_12_enus

Expected Duration (hours)
1.4

DP-203 - Data Engineering on Microsoft Azure: Data Flow Transformations

discover the key concepts covered in this course
describe the type of available Azure Data Flow transformations
transform data using Azure Data Mapping Data Flows
transform and split data using Azure Data Flow
transform and flatten data using Azure Data Flow
describe the types of expression functions available in Azure Data Flow
configure Azure Data Flow to perform error handling for data rows that would truncate data
transform and use derived columns to normalize data values
ingest and transform data using Azure Spark and Scala
handle duplicate data using Azure Data Flows
summarize the key concepts covered in this course

Overview/Description
One of the key components of the Azure Cloud platform is the ability to store and process large amounts of data. Azure Data Flow Transformations can be used to ingest and transform data. In this course, you'll learn about the types of Azure Data Flow transformations that are available. You'll explore how to transform, split, and flatten data, as well as handle duplicate data, using Azure Data Mapping Data Flows. Next, you'll examine the types of expression functions available in Azure Data Flow and how to perform error handling for data rows that would truncate data. Finally, you'll learn how to transform and use derived columns to normalize data values, and how to ingest and transform data using Azure Spark and Scala. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Lake Storage

Course Number:
it_cldema_11_enus

Expected Duration (hours)
1.8

DP-203 - Data Engineering on Microsoft Azure: Data Lake Storage

discover the key concepts covered in this course
describe Azure Data Lake Storage Gen2 features and when to use this storage type
create an Azure Data Lake Storage Gen2 storage account
manage directories, files, and Access Control Lists in Azure Data Lake Storage Gen2 using the .NET framework
perform extract, transform, and load operations using Azure Databricks from Azure Data Lake Storage Gen2
transform data using Apache Spark
transform data by using Data Factory and data flows
integrate pipelines using Synapse Studio
stream data into Azure Databricks by using Event Hubs
transform data using Azure Databricks
summarize the key concepts covered in this course

Overview/Description
Azure Data Lake Storage Gen2 provides features to work with big data analytics using Azure Blob Storage. Azure Blob Storage systems provide performance, management, and security functionality. In this course, you'll learn about the features of the Azure Data Lake Storage Gen2 and when to use this storage type. You'll explore features and methods for securing data for the Azure Data Lake Storage Gen2 service and data at rest. You'll examine methods for processing big data using the Azure Data Lake Storage Gen2 service and monitoring Azure Blob Storage. You'll learn how to manage directories, files, and Access Control Lists in Azure Data Lake Storage Gen2 using the .NET framework, as well as how to perform extract, transform, and load operations using Azure Databricks from Azure Data Lake Storage Gen2. Finally, you'll learn how to access Azure Data Lake Storage Gen2 data using Azure Databricks and Spark. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Partitioning

Course Number:
it_cldema_03_enus

DP-203 - Data Engineering on Microsoft Azure: Data Partitioning

discover the key concepts covered in this course
recognize data partitioning concepts
describe data partitioning strategies for different services
compare transactional and analytical workloads to determine which data store and partitioning strategy to implement
recognize criteria for determining how to partition files for efficient distribution and querying
describe how to partition a table to ensure efficient scalability for analytical workloads
describe how the index table and materialized view design patterns can increase efficiency and performance of queries
describe table partitions used by Azure Synapse Analytics and how to size them, and recognize the differences from SQL Server
describe when to implement partitioning at the storage layer
describe how data sharding distributes load over multiple datastores
summarize the key concepts covered in this course

Overview/Description
Partitioning data is key to ensuring efficient processing. In this course, you'll explore what data partitioning is and the strategies for implementation. You'll learn about transactional and analytical workloads and how to determine the best strategy for your files and table storage. Then, you'll examine design patterns for efficiency and performance. You'll learn about partitioning dedicated SQL pools in Azure Synapse Analytics and partitioning data lakes. Finally, you'll learn how data sharding across multiple data stores can be used for improving transaction performance. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Policies & Standards

Course Number:
it_cldema_08_enus

DP-203 - Data Engineering on Microsoft Azure: Data Policies & Standards

discover the key concepts covered in this course
describe scenarios for encrypting data and the best practices for utilizing disk encryption on Azure
describe how Transparent Data Encryption can be used to encrypt an Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics database at rest
describe the Always Encrypted feature that client applications can use to encrypt data stored in Azure SQL Database
describe how dynamic data masking and classification can identify sensitive data and mask sensitive data in Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics
describe best practices for defining a data retention policy
describe guidelines, processes, limitations, and considerations for performing a data purge in Azure Data Explorer
recognize the options for controlling access to Azure Data Lake
describe what should be audited for and how technologies on the Azure platform can enable an auditing strategy
describe how Row-Level Security in Azure SQL Database implements restrictions on access to data rows
summarize the key concepts covered in this course

Overview/Description
Data policies and standards help to ensure a repeatable security standard is maintained. In this course, you'll learn about data encryption scenarios and best practices. You'll explore how Transparent Data Encryption and Always Encrypted can be used to ensure data at rest is protected. Next, you'll examine how data classification and data masking can protect data being viewed. You'll learn to configure data retention and purging to ensure data is retained or removed. You'll also explore the various means of controlling access to Azure Data Lake Storage Gen2. Finally, you'll learn how to plan a data auditing strategy and how to limit access to data at the row level in a database. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Process Monitoring

Course Number:
it_cldema_19_enus

Expected Duration (hours)
1.5

DP-203 - Data Engineering on Microsoft Azure: Data Process Monitoring

discover the key concepts covered in this course
describe the features of the Azure Monitor tools and the concepts of continuous monitoring and visualization
create metric charts using the Azure Monitor
collect and analyze Azure resource log data
perform queries against the Azure Monitor logs
create and share dashboards that display data from Log Analytics
describe the features of Azure HDInsight for ingesting, processing, and analyzing big data
use the Azure Data Factory Analytics solution to monitor pipelines
describe how to monitor and tune query performance
query Azure Log Analytics and filter, sort, and group query results
describe how to implement version control in Azure Data Factory
describe how to monitor cluster performance
describe how to implement logging for custom monitoring
archive data from a Log Analytics workspace to Azure storage using Logic App
summarize the key concepts covered in this course

Overview/Description
Being able to monitor data processes to ensure they are operational and working correctly is a crucial part of running your business. Azure provides the Azure Monitor service and the Azure Log Analytics service to perform this function. In this course, you'll learn about the features of the Azure Monitor tools and the concepts of continuous monitoring and visualization. Next, you'll examine how to create metric charts using the Azure Monitor, as well as how to collect and analyze Azure resource log data and perform queries against the Azure Monitor logs. You'll explore how to create and share dashboards that display data from Log Analytics, create Azure Monitor alerts, and use the Azure Data Factory Analytics solution to monitor pipelines. You'll learn how to send the Azure Databricks logs to the Azure Monitor and use the dashboard to analyze the Azure Databricks metrics. Finally, you'll learn how to enable monitoring for Azure Stream Analytics and configure alerts, and also query Azure Log Analytics and filter, sort, and group query results. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Solution Optimization

Course Number:
it_cldema_20_enus

DP-203 - Data Engineering on Microsoft Azure: Data Solution Optimization

discover the key concepts covered in this course
describe the concept of cloud optimization and some best practices for optimizing data using data partitions, Azure Data Lake Storage tuning, Azure Synapse Analytics tuning, and Azure Databricks auto-optimizing
describe how to optimize Azure Stream Analytics jobs
describe various strategies for partitioning data using Azure-based storage solutions
describe how to optimize the Azure Data Lake Storage Gen2 for performance
describe methods for optimizing Azure Synapse Analytics
describe how to optimize the Azure Cosmos DB using indexing and partitioning
describe methods for monitoring Azure Databricks
describe methods for optimizing Azure Databricks
describe how to manage common data troubleshooting techniques
summarize the key concepts covered in this course

Overview/Description
Ensuring that data storage and processing systems are operating efficiently will allow your organization to save both time and money. There are several tips and tricks that can be used to optimize both Azure Data Storage services and processes. In this course, you'll learn about cloud optimization, as well as some best practices for optimizing data using data partitions, Azure Data Lake Storage tuning, Azure Synapse Analytics tuning, and Azure Databricks auto-optimizing. Next, you'll learn about strategies for partitioning data using Azure-based storage solutions. You'll learn about the stages of the Azure Blob lifecycle management and how to optimize Azure Data Lake Storage Gen2, Azure Stream Analytics, and Azure Synapse Analytics. Finally, you'll explore how to optimize Azure Data Storage services such as Azure Cosmos DB using indexing and partitioning, as well as Azure Blob Storage and Azure Databricks. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Data Storage Monitoring

Course Number:
it_cldema_18_enus

DP-203 - Data Engineering on Microsoft Azure: Data Storage Monitoring

discover the key concepts covered in this course
describe the features of the Azure Monitor service and how it can be used to monitor storage data
monitor Azure Blob storage
access diagnostic logs to monitor Data Lake Storage Gen2
monitor Azure Synapse Analytics jobs and the adaptive cache
monitor Azure Cosmos DB using the portal and resource logs
configure, manage, and view metric alerts using the Azure Monitor
describe the features and concepts of Azure Log Analytics
configure, manage, and view activity log alerts using the Azure Monitor
monitor Azure Stream Analytics jobs
describe types of Azure Cosmos DB indexes
summarize the key concepts covered in this course

Overview/Description
Being able to monitor data storage systems to ensure they are operational and working correctly is a crucial part of running your business. Azure provides the Azure Monitor service and the Azure Log Analytics service to perform this function. In this course, you'll learn about the features of Azure Log Analytics, as well as the Azure Monitor service and how it can be used to monitor storage data and Azure Blob storage. Next, you'll explore how to access diagnostic logs to monitor Data Lake Storage Gen2, monitor the Azure Synapse Analytics jobs and the adaptive cache, and monitor Azure Cosmos DB using the portal and resource logs. Finally, you'll examine how to configure, manage, and view metric alerts and activity log alerts using the Azure Monitor. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Databrick Processing

Course Number:
it_cldema_15_enus

Expected Duration (hours)
1.8

DP-203 - Data Engineering on Microsoft Azure: Databrick Processing

discover the key concepts covered in this course
describe the types of available processing when using Azure Databricks such as stream, batch, image, and parallel processing
create an Azure Databricks workspace using an Apache Spark cluster
run jobs in the Azure Databricks workspace using a service principal
query data in SQL Server using an Azure Databricks notebook
validate and handle failed batch loads
implement a Cosmos DB service endpoint for Azure Databricks
extract, transform, and load data using Azure Databricks
perform sentiment analysis on stream data using Azure Databricks
debug Spark Jobs running on HDInsight
summarize the key concepts covered in this course

Overview/Description
When working with big data, there needs to be a mechanism to process and transform this data quickly and efficiently. Azure Databricks is a service that provides the latest version of Apache Spark, along with functionality for processing data from Azure Storage. In this course, you'll learn about the types of processing that can be performed with Azure Databricks such as stream, batch, image, and parallel processing. Next, you'll learn how to create an Azure Databricks workspace using an Apache Spark cluster, run jobs in the Azure Databricks workspace using a service principal, and query data in SQL Server using an Azure Databricks notebook. You'll then learn how to retrieve data from Azure Blob Storage using Azure Databricks and the Azure Key Vault, implement a Cosmos DB service endpoint for Azure Databricks, and extract, transform, and load data using Azure Databricks. Finally, you'll learn how to stream data into Azure Databricks by using Event Hubs and perform sentiment analysis on stream data using Azure Databricks. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Databricks

Course Number:
it_cldema_14_enus

DP-203 - Data Engineering on Microsoft Azure: Databricks

discover the key concepts covered in this course
describe the features of Azure Databricks
describe the features and concepts of Azure Databricks clusters
capture stream data using Event Hub
describe notebooks and how they can be used with Azure Databricks
create, open, use, and delete notebooks
describe the features and concepts of Azure Databricks jobs
create, open, use, and delete Azure Databricks jobs
describe the concept of autoscaling local storage when configuring clusters
describe how to configure checkpoints and watermarking during stream processing
summarize the key concepts covered in this course

Overview/Description
When working with big data, there needs to be a mechanism to process and transform this data quickly and efficiently. Azure Databricks is a service that provides the latest version of Apache Spark, which provides functionality for machine learning and data warehousing. In this course, you'll learn about the features of Azure Databricks such as clusters, notebooks, and jobs. Next, you'll learn about autoscaling local storage when configuring clusters. You'll then explore how to create, manage, and configure Azure Databricks clusters, as well as how to create, open, use, and delete notebooks. Finally, you'll learn how to create, open, use, and delete jobs. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Designing Data Storage Structures

Course Number:
it_cldema_02_enus

DP-203 - Data Engineering on Microsoft Azure: Designing Data Storage Structures

discover the key concepts covered in this course
describe key considerations for designing a data lake
identify and evaluate criteria for selecting a file format for big data applications
recognize the defining characteristics of the supported file formats in Azure Data Lake
describe steps for efficient read operations for a table storage service
describe the dynamic data pruning feature in Databricks at the file and partition level
recognize an efficient folder structure design
define the zones within a data lake for organizing data distribution
describe the data access tiers in Azure Blob storage and how data can be moved between them for efficient and cost-effective storage
describe the steps to archive data in an Azure Blob storage container, rehydrate blob data, and automate access tiers using life cycle management
summarize the key concepts covered in this course

Overview/Description
Planning the structure for data storage is integral to performance in big data operations. In this course, you'll learn about key considerations for data lakes and how to determine which file type and file format are the most appropriate for your use case. Then, you'll explore how to design table storage for efficient querying and how data pruning can remove unnecessary data to accelerate transactions. You'll examine folder structures and data lake zones for organizing data effectively. Finally, you'll learn how to define storage tiers and how to manage the life cycle of data. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Designing the Serving Layer

Course Number:
it_cldema_04_enus

Expected Duration (hours)
1.2

DP-203 - Data Engineering on Microsoft Azure: Designing the Serving Layer

discover the key concepts covered in this course
recognize the concepts of dimensional data modeling
describe dimensional hierarchies in multidimensional data modeling
describe slowly changing dimensions used to capture changing data over time, as well as the various types and their applications
describe temporal databases and steps for designing a temporal database
describe the differences between the star and snowflake schemas for data modeling
recognize the rules and best practices to follow when designing a star schema
implement incremental data loading using Azure Data Factory
select the appropriate technology for analytical data storage
describe the options for storing metadata external to Azure Synapse Analytics and Azure Databricks
summarize the key concepts covered in this course

Overview/Description
The serving layer is where data is stored for consumption by processing services. In this course, you'll explore dimensional data modeling and hierarchies. You'll learn how to define slowly changing dimensions and temporal design within databases. Then, you'll learn about the differences between the star and snowflake schemas as well as how to design a star schema. Next, you'll examine incremental data loading for stream processing and the options for analytical data stores. Finally, you'll learn about options for creating metastores for use by Azure Databricks and Azure Synapse Analytics. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Logical Data Structures

Course Number:
it_cldema_06_enus

Expected Duration (hours)
1.5

DP-203 - Data Engineering on Microsoft Azure: Logical Data Structures

discover the key concepts covered in this course
define key concepts for maturing data lake storage structures
describe how system-versioned temporal tables are used for point-in-time analysis
create and manage system-versioned temporal tables in an Azure SQL Database
describe the different types of slowly changing dimensions
build a slowly changing dimension type 1 deployment
build a slowly changing dimension type 2 deployment
define an effective logical file and folder structure for efficient data ingestion and manipulation
use PolyBase to build an external table
describe best practices for accelerating queries against data in Azure Data Lake Storage Gen2
summarize the key concepts covered in this course

Overview/Description
Logical data structures, also called entity-relationship models, provide a high-level view of data and the relationships contained within it. In this course, you'll learn about the stages of data lake maturity. You'll explore temporal database tables and how to manage them. You'll also learn how to define slowly changing dimensions and how to implement them. You'll then move on to explore logical file and folder structures for data ingestion. You'll discover how PolyBase can be used to connect to external tables. Finally, you'll explore the best practices for accelerating queries. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Physical Data Storage Structures

Course Number:
it_cldema_05_enus

DP-203 - Data Engineering on Microsoft Azure: Physical Data Storage Structures

discover the key concepts covered in this course
define considerations for implementing data compression technologies at the database and file level
create a data partition in a SQL database
describe how a shard map manager is used by an application to connect to the required Azure SQL Databases
describe the key concepts for designing tables in an Azure Synapse Analytics dedicated SQL pool
deploy Azure SQL Database geo-replication
describe the options for redundancy in Azure Blob storage
determine the appropriate distribution scheme for an Azure Synapse Analytics database and build it into the table on creation
archive data in Azure storage and rehydrate it when necessary
configure a long-term retention policy for an Azure SQL Database
summarize the key concepts covered in this course

Overview/Description
An effective storage structure is critical to big data implementation success. In this course, you'll explore data compression in databases and file storage. Then, you'll discover how partitioning and sharding are implemented in the database. Next, you'll explore designing tables in an Azure Synapse Analytics dedicated SQL pool, and implement geo-replication for redundancy in both databases and Azure Blob storage. You'll also discover implementing distribution schemes in Azure Synapse Analytics. Finally, you'll discover data archiving and long-term retention policies for Azure Blob storage and Azure SQL Databases. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Securing Data

Course Number:
it_cldema_10_enus

Expected Duration (hours)
1.4

DP-203 - Data Engineering on Microsoft Azure: Securing Data

discover the key concepts covered in this course
describe the tools used by Azure to encrypt data across the platform
manage Transparent Data Encryption on Azure SQL Databases
describe how Always Encrypted is used by the database engine to process queries on encrypted data
enable Always Encrypted on an Azure SQL Database
encrypt single columns in an Azure SQL Database
use group membership to control access to Azure SQL Database rows
use DataFrames in Databricks to perform mixed functions
configure Advanced Threat Protection and dynamic data masking in an Azure SQL Database, Azure SQL Managed Instance, or Azure Synapse Analytics instance
utilize immutable storage on Azure Blob storage to store business data in a manner that cannot be erased or modified to meet time-based or legal holds
summarize the key concepts covered in this course

Overview/Description

The final line of defense for protecting against a data breach is securing the data itself. With today's cloud environments, data is often in transit, duplicated, and stored in various data centers around the world, making effective data protection a challenge.

In this course, you'll explore the various methods available for encrypting data stored in SQL databases. You'll examine how to use DataFrames in Databricks, as well as how to implement Advanced Threat Protection and dynamic data masking in Azure databases. Finally, you'll learn how immutable blobs can be used to manage sensitive information.

This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Securing Data Access

Course Number:
it_cldema_09_enus

DP-203 - Data Engineering on Microsoft Azure: Securing Data Access

discover the key concepts covered in this course
recognize how Azure Key Vault can be used to store and manage keys and secrets used by multiple sources
describe private endpoints used for ensuring data flows only within your private link and service endpoints used to provide secure direct connectivity
utilize managed virtual networks and managed private endpoints to secure traffic between Azure Synapse Analytics and other Azure resources
describe how resources and apps can utilize Azure managed identities to securely connect to Azure services
manage access control lists on Azure Data Lake Storage Gen2
manage access to resources using Azure role-based access control
describe how token-based authentication can be utilized to manage authentication to Azure Databricks
manage access to Azure Databricks workspaces using Azure Databricks token authentication
manage retention policies for temporal tables in Azure SQL Database
enable auditing on an Azure SQL Database
summarize the key concepts covered in this course

Overview/Description
Securing access to data is a fundamental part of any security strategy. In this course, you'll explore how Azure Key Vault can be used to store and manage keys and secrets for accessing data. You'll discover how to connect to Azure resources through private and service endpoints and managed virtual networks and how to use Azure managed identities for connections between Azure resources. Next, you'll learn how to utilize access control lists and Azure role-based access control to provide only the necessary permissions to users to access your data. You'll also learn how token-based authentication works in Azure Databricks. Finally, you'll examine how to audit an Azure SQL Database to monitor for unauthorized access. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Storage Accounts

Course Number:
it_cldema_01_enus

Expected Duration (hours)
1.2

DP-203 - Data Engineering on Microsoft Azure: Storage Accounts

discover the key concepts covered in this course
describe the storage capabilities Azure Blob storage provides
recognize how to architect a blob storage deployment to meet performance and scalability requirements
utilize the geo-redundancy features in Azure Storage to design highly available applications
describe the available options and design considerations for Azure Storage account disaster recovery
describe the role played by Azure Data Lake Storage Gen2 in managing big data used in analytics scenarios
recognize the scenarios where Azure Data Lake Storage Gen2 would be applied for big data processing
effectively plan an Azure Data Lake Storage Gen2 deployment
apply best practices when designing a solution incorporating Azure Data Lake Storage Gen2
deploy an Azure Data Lake Storage Gen2 storage account
summarize the key concepts covered in this course

Overview/Description
Microsoft Azure Blob storage is a container system for storing a variety of file types. In this course, you'll learn about the capabilities of blob storage and how to architect a deployment for optimal performance and scalability. Then, you'll explore the options for redundancy and how to recover from disasters. You'll discover where Azure Data Lake Storage Gen2, a feature set within blob storage, can be utilized for big data operations. You'll also learn how to plan for a data lake deployment, examine best practices, and explore how to deploy an Azure Data Lake Storage Gen2 account. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Stream Analytics

Course Number:
it_cldema_16_enus

Expected Duration (hours)
1.2

DP-203 - Data Engineering on Microsoft Azure: Stream Analytics

discover the key concepts covered in this course
describe how Azure Stream Analytics is used to process streaming data
describe the available inputs that can be used with Azure Stream Analytics
describe the available outputs that can be used with Azure Stream Analytics
describe how to create user-defined functions for Azure Stream Analytics
process data by using Spark structured streaming
describe how to use Stream Analytics windowing functions
set up and process time series data
create a Stream Analytics job by using the Azure portal
summarize the key concepts covered in this course

Overview/Description
Azure Stream Analytics is a serverless, highly scalable complex event-processing engine that can be used to perform real-time analytics on multiple data streams. Alerts can be configured to forecast trends, trigger workflows, and detect irregularities. In this course, you'll learn to use Azure Stream Analytics to process streaming data. You'll examine how to implement security, create user-defined functions, and optimize jobs for Azure Stream Analytics, as well as explore the available inputs and outputs. Finally, you'll learn how to create an Azure Stream Analytics job, create an Azure Stream Analytics dedicated cluster, run Azure Functions from Azure Stream Analytics jobs, and monitor Azure Stream Analytics jobs. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: Synapse Analytics

Course Number:
it_cldema_17_enus

Expected Duration (hours)
0.8

DP-203 - Data Engineering on Microsoft Azure: Synapse Analytics

discover the key concepts covered in this course
describe the Azure Synapse Analytics platform and how it is used for data warehousing and big data analytics
integrate pipelines using Synapse Studio
visualize data using a Power BI workspace that is linked to an Azure Synapse Workspace
monitor a Synapse Workspace
recognize features of the Synapse Knowledge Center
describe the features of Azure Synapse Analytics and PolyBase
summarize the key concepts covered in this course

Overview/Description
Azure Synapse Analytics is an analytics service that provides functionality for data integration, enterprise data warehousing, and big data analytics. Services provided include ingesting, exploring, preparing, managing, and serving data for BI and machine learning needs. In this course, you'll learn about the Azure Synapse Analytics platform and how it is used for data warehousing and big data analytics. Next, you'll learn how to create a Synapse Workspace, a dedicated SQL pool, and a serverless Apache Spark pool. You'll move on to explore how to analyze data using a dedicated SQL pool, Apache Spark for Azure Synapse, Serverless SQL Pools, and a Spark database, as well as how to analyze data that is in a storage account. You'll learn how to integrate pipelines using Synapse Studio, visualize data using a Power BI workspace, and monitor a Synapse Workspace. Finally, you'll learn about the Synapse Knowledge Center and the features of Azure Synapse Analytics and PolyBase. This course is one in a collection that prepares learners for the Microsoft Data Engineering on Microsoft Azure (DP-203) exam.

Target

Prerequisites: none

DP-203 - Data Engineering on Microsoft Azure: The Serving Layer

Course Number:
it_cldema_07_enus