DP-300 Practice Test Questions

You have an Azure Stream Analytics job.
You need to ensure that the job has enough streaming units provisioned.
You configure monitoring of the SU % Utilization metric.
Which two additional metrics should you monitor? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Late Input Events

B. Out of order Events

C. Backlogged Input Events

D. Watermark Delay

E. Function Events

C. Backlogged Input Events

D. Watermark Delay

Explanation:

To react to increased workloads and increase streaming units, consider setting an alert of 80% on the SU Utilization metric. Also, you can use watermark delay and backlogged events metrics to see if there is an impact.
Note: Backlogged Input Events: Number of input events that are backlogged. A non-zero value for this metric implies that your job isn't able to keep up with the number of incoming events. If this value is slowly increasing or consistently non-zero, you should scale out your job, by increasing the SUs.

You are designing a star schema for a dataset that contains records of online orders. Each record includes an order date, an order due date, and an order ship date.
You need to ensure that the design provides the fastest query times of the records when querying for arbitrary date ranges and aggregating by fiscal calendar attributes.
Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Create a date dimension table that has a DateTime key.

B. Create a date dimension table that has an integer key in the format of YYYYMMDD.

C. Use built-in SQL functions to extract date attributes.

D. Use integer columns for the date fields.

E. Use DateTime columns for the date fields.

B. Create a date dimension table that has an integer key in the format of YYYYMMDD.

D. Use integer columns for the date fields.

You are designing a date dimension table in an Azure Synapse Analytics dedicated SQL pool. The date dimension table will be used by all the fact tables.
Which distribution type should you recommend to minimize data movement?

A. HASH

B. REPLICATE

C. ROUND_ROBIN

B. REPLICATE

You are planning a solution that will use Azure SQL Database. Usage of the solution will peak from October 1 to January 1 each year.
During peak usage, the database will require the following:
24 cores
500 GB of storage
124 GB of memory
More than 50,000 IOPS
During periods of off-peak usage, the service tier of Azure SQL Database will be set to Standard.
Which service tier should you use during peak usage?

A. Business Critical

B. Premium

C. Hyperscale

A. Business Critical

You have an Azure Data Factory that contains 10 pipelines.
You need to label each pipeline with its main purpose of either ingest, transform, or load. The labels must be available for grouping and filtering when using the monitoring experience in Data Factory. What should you add to each pipeline?

A. an annotation

B. a resource tag

C. a run group ID

D. a user property

E. a correlation ID

A. an annotation

Explanation:

Azure Data Factory annotations help you easily filter different Azure Data Factory objects based on a tag. You can define tags so you can see their performance or find errors faster.

Note: This question is part of a series of questions that present the same scenario.
Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Data Lake Storage account that contains a staging zone.
You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.
Solution: You schedule an Azure Databricks job that executes an R notebook, and then inserts the data into the data warehouse.
Does this meet the goal?

A. Yes

B. No

Note: This question is part of a series of questions that present the same scenario.
Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.
You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.
You plan to insert data from the files into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.
You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.
Solution: In an Azure Synapse Analytics pipeline, you use a Get Metadata activity that retrieves the DateTime of the files.
Does this meet the goal?

A. Yes

B. No

You create five Azure SQL Database instances on the same logical server.
In each database, you create a user for an Azure Active Directory (Azure AD) user named User1.
User1 attempts to connect to the logical server by using Azure Data Studio and receives a login error.
You need to ensure that when User1 connects to the logical server by using Azure Data Studio, User1 can see all the databases.
What should you do?

A. Create User1 in the master database.

B. Assign User1 the db_datareader role for the master database.

C. Assign User1 the db_datareader role for the databases that Userl creates.

D. Grant select on sys.databases to public in the master database.

A. Create User1 in the master database.

You have an Azure SQL database named DB1. You need to display the estimated execution plan of a query by using the query editor in the Azure portal. What should you do first?

A. Run the set showplan_all Transact-SQL statement.

B. For DB1, set QUERY_CAPTURE_MODE of Query Store to All.

C. Run the set forceplan Transact-SQL statement.

D. Enable Query Store for DB1.

A. Run the set showplan_all Transact-SQL statement.

You plan to build a structured streaming solution in Azure Databricks. The solution will count new events in fiveminute intervals and report only events that arrive during the interval.The output will be sent to a Delta Lake table. Which output mode should you use?

A. complete

B. append

C. update

A. complete

You have the following Azure Data Factory pipelines:
Ingest Data from System1
Ingest Data from System2
Populate Dimensions
Populate Facts
Ingest Data from System1 and Ingest Data from System2 have no dependencies. Populate Dimensions must execute after Ingest Data from System1 and Ingest Data from System2.
Populate Facts must execute after the Populate Dimensions pipeline. All the pipelines must execute every eight hours.
What should you do to schedule the pipelines for execution?

A. Add a schedule trigger to all four pipelines.

B. Add an event trigger to all four pipelines.

C. Create a parent pipeline that contains the four pipelines and use an event trigger.

D. Create a parent pipeline that contains the four pipelines and use a schedule trigger

You are creating a new notebook in Azure Databricks that will support R as the primary language but will also support Scala and SQL.
Which switch should you use to switch between languages?

A. \\ [ < language > ]

B. %< language >

C. \\[< language >]

D. @ < language >

B. %< language >