DA0-001 Practice Test Questions

393 Questions


An analyst runs a report on a daily basis, and the number of datapoints must be validated before the data can be analyzed. The number of datapoints increases each day by approximately 20% of the total number from the day before. On a given day, the number of datapoints was 8,798. Which of the following should be the total number of datapoints on the next day?


A. 7,038


B. 9,600


C. 10,600


D. 10,800





C. 10,600
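The arithmetic behind this answer can be checked with a short sketch. It assumes "approximately 20%" means exactly 20% growth; the option values are copied from the question.

```python
# Assumption: treat "approximately 20%" as exactly 20% day-over-day growth.
current = 8_798
growth_rate = 0.20

projected = current * (1 + growth_rate)  # about 10,557.6

# Answer options copied from the question.
options = {"A": 7_038, "B": 9_600, "C": 10_600, "D": 10_800}

# Pick the option closest to the projected value.
closest = min(options, key=lambda k: abs(options[k] - projected))

print(round(projected, 1), closest)
```

The projected count is about 10,558, and the nearest option is C.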

A customer list from a financial services company is shown below:

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?


A. Recode the variables.


B. Calculate the percentiles of the variables.


C. Calculate the standard deviations of the variables.


D. Normalize the variables.





D. Normalize the variables.
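A minimal sketch of why normalization equalizes the weights, using min-max scaling as one common form of normalization. The customer values are hypothetical, since the original table is not reproduced here, and `min_max` is an illustrative helper rather than a library function.

```python
def min_max(values):
    """Rescale values to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical customers (not from the original table).
credit_cards = [1, 3, 5]
age = [25, 40, 65]
income = [30_000, 60_000, 150_000]

# After scaling, every variable lies in [0, 1], so income no longer
# dominates the average just because its raw numbers are larger.
scaled = [min_max(col) for col in (credit_cards, age, income)]

# Score per customer: average of the three scaled variables, times 100.
scores = [100 * sum(row) / len(row) for row in zip(*scaled)]
print(scores)
```

Without scaling, an average of the raw values would be driven almost entirely by income.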

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?


A. Median


B. Mean


C. Mode


D. Standard deviation





A. Median
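A small sketch of how a single outlier affects the mean but not the median. The income figures are hypothetical.

```python
import statistics

# Hypothetical household incomes with one extreme outlier.
incomes = [48_000, 52_000, 55_000, 61_000, 2_500_000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle value, unaffected

print(mean_income, median_income)
```

Here the mean is 543,200 while the median is 55,000, which is much closer to what a typical family in this sample earns.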

A data analyst needs to apply quality control concepts to a data set for accuracy. Which of the following is the best way to do this?


A. Standardization


B. Parameterization


C. Encryption


D. Cross-validation





D. Cross-validation

Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?


A. Missing data


B. Duplicate data


C. Redundant data


D. Invalid data





B. Duplicate data
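A sketch of the effect of DISTINCT, run against an in-memory SQLite database from Python. The `customers` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")

# Hypothetical rows: one exact duplicate and one missing city.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Austin"), ("Ana", "Austin"), ("Ben", "Boston"), ("Cy", None)],
)

all_rows = conn.execute("SELECT name, city FROM customers").fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT name, city FROM customers ORDER BY name"
).fetchall()

print(len(all_rows), len(distinct_rows))
print(distinct_rows)
conn.close()
```

DISTINCT collapses the repeated row, but the row with the missing city is still returned, so missing or invalid values need a separate fix.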

An analyst wants to determine whether a relationship between an individual's age and voting preferences exists. Which of the following is the best statistical method for the analyst to use?


A. P-value


B. Chi-squared


C. F-test


D. Z-score





B. Chi-squared
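A sketch of a chi-squared test of independence computed by hand. It assumes age has been binned into groups, because the test works on counts of categories. The counts are hypothetical, and `chi_squared` is an illustrative helper.

```python
def chi_squared(observed):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if age group and preference were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical counts: rows = age group (under 40, 40 and over),
# columns = preferred candidate (X, Y).
observed = [[30, 20],
            [20, 30]]

stat, dof = chi_squared(observed)
print(stat, dof)
```

For these counts the statistic is 4.0 with 1 degree of freedom, which exceeds the 0.05 critical value of 3.841, so independence would be rejected for this hypothetical sample.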

Which one of the following is not considered an aggregate function?


A. SUM


B. MIN


C. SELECT


D. MAX





C. SELECT
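A sketch of the distinction, using an in-memory SQLite database from Python: SELECT is the statement that retrieves rows, while SUM, MIN, and MAX each reduce a column to a single value. The `sales` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?)", [(10,), (25,), (40,)])

# SELECT retrieves data; the aggregate functions summarize the column.
totals = conn.execute(
    "SELECT SUM(amount), MIN(amount), MAX(amount) FROM sales"
).fetchone()

print(totals)
conn.close()
```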

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?


A. Fill in the missing cost where it is null.


B. Separate the table into two tables and create a primary key.


C. Replace the extended cost field with a calculated field.


D. Correct the dates so they have the same format.





D. Correct the dates so they have the same format.
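A sketch of putting mixed date strings into one format so they sort correctly for a trend analysis. The sample strings and the list of accepted formats are assumptions, since the original table is not reproduced here. Real data may contain ambiguous day/month orders that need a rule from the data owner.

```python
from datetime import datetime

# Assumed input formats, tried in order (hypothetical).
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%m-%Y")

def to_iso(text):
    """Parse a date string in any known format and return YYYY-MM-DD."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    raise ValueError(f"Unrecognized date format: {text!r}")

# Hypothetical raw values in three different formats.
raw_dates = ["2023-01-15", "01/16/2023", "17-01-2023"]
clean_dates = [to_iso(d) for d in raw_dates]

print(clean_dates)
```

Once every date uses the same format, the rows can be ordered chronologically and grouped by period.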

An analyst in a consumer bank department wants to showcase the concentration of accounts opened in the United States by ZIP Code to describe the effectiveness of the bank's marketing campaigns. Which of the following would be the best way to visualize the data?


A. A stacked chart


B. A tree map


C. A waterfall chart


D. A geographic map





D. A geographic map

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?


A. Determine the data needs and sources for analysis.


B. Initiate the analysis for exploratory data analysis.


C. Review the business questions to understand the scope.


D. Finalize the methodology to solve the problem.





C. Review the business questions to understand the scope.

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?


A. Data merge


B. Data append


C. Data blending


D. Data imputation





A. Data merge

Which of the following would be the best way to identify multicollinear attributes in a data set?


A. Correlation coefficient


B. Chi-squared test


C. Two-sample F-test


D. Two-way ANOVA





A. Correlation coefficient
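A sketch of screening for multicollinearity with pairwise Pearson correlation coefficients. The column names, the values, and the 0.9 cutoff are all assumptions chosen for illustration, and `pearson` is a hand-written helper.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical attributes; revenue is exactly 2 x units.
columns = {
    "units": [1, 2, 3, 4, 5],
    "revenue": [2, 4, 6, 8, 10],
    "returns": [5, 3, 6, 2, 4],
}

THRESHOLD = 0.9  # assumed cutoff for "highly correlated"

# Flag attribute pairs whose absolute correlation exceeds the cutoff.
flagged = [
    (a, b)
    for a, b in combinations(columns, 2)
    if abs(pearson(columns[a], columns[b])) > THRESHOLD
]
print(flagged)
```

Only the units and revenue pair is flagged here, which suggests one of the two could be dropped or combined before modeling.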

