Databricks-Certified-Professional-Data-Scientist Databricks Certified Professional Data Scientist Exam Questions and Answers

Questions 4

Select the correct option which applies to L2 regularization

Options:

A.

Computationally efficient due to having an analytical solution

B.

Non-sparse outputs

C.

No feature selection
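
The properties listed in the options can be illustrated with a minimal sketch, assuming a single feature, no intercept, and made-up data. Under those assumptions the L2-penalized least-squares weight has a closed form, and increasing the penalty shrinks the weight toward zero without ever making it exactly zero. The function name `ridge_weight` is hypothetical.

```python
# One-feature ridge (L2) regression without an intercept.
# Minimizing sum((y - w*x)^2) + lam * w^2 over w gives the closed form
#     w = sum(x*y) / (sum(x*x) + lam)

def ridge_weight(xs, ys, lam):
    """Closed-form L2-regularized weight for a single feature (illustrative helper)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# The weight shrinks as the penalty grows, but stays non-zero (non-sparse).
weights = {lam: ridge_weight(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, w in weights.items():
    print(f"lambda={lam:6.1f}  w={w:.4f}")
```

Because the weight is only shrunk and never set exactly to zero, the penalty by itself does not drop features from the model.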

Questions 5

You are creating a classification process whose inputs are a customer's income, education, and current debt. What could be a possible output of this process?

Options:

A.

Probability that the customer defaults on loan repayment

B.

Percentage expressing the customer's loan repayment capability

C.

Percentage indicating whether the customer should be given a loan or not

D.

The output might be a risk class, such as "good", "acceptable", "average", or "unacceptable".

Questions 6

You have data for 1000 patients with their height and age, where age is in years and height is in meters. You want to create clusters using these two attributes, and you want age and height to have nearly equal effect when the clusters are created. What can you do?

Options:

A.

Add the numeric value 100 to each height

B.

Convert each height value to centimeters

C.

Divide both age and height by their respective standard deviations

D.

Take the square root of height
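
Why scale matters for distance-based clustering can be seen in a minimal sketch, assuming made-up patient values. Dividing each attribute by its own standard deviation puts both on a comparable scale, so neither dominates a Euclidean distance. The helper `scale_by_std` is hypothetical, and the population standard deviation is an arbitrary choice here.

```python
import statistics

def scale_by_std(values):
    """Divide each value by the population standard deviation of its column."""
    sd = statistics.pstdev(values)
    return [v / sd for v in values]

# Hypothetical sample: ages in years, heights in meters.
ages_years = [25, 40, 58, 33, 71]
heights_m = [1.60, 1.75, 1.82, 1.68, 1.55]

scaled_ages = scale_by_std(ages_years)
scaled_heights = scale_by_std(heights_m)

# Before scaling, age varies on a much larger numeric scale than height.
print(statistics.pstdev(ages_years), statistics.pstdev(heights_m))
# After scaling, both attributes have a standard deviation of 1.
print(statistics.pstdev(scaled_ages), statistics.pstdev(scaled_heights))
```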

Questions 7

Under which circumstance do you need to implement N-fold cross-validation after creating a regression model?

Options:

A.

The data is unformatted.

B.

There is not enough data to create a test set.

C.

There are missing values in the data.

D.

There are categorical variables in the model.
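
The idea behind N-fold cross-validation can be sketched as follows, assuming a one-variable linear regression and a small synthetic data set. Every observation is held out exactly once, so all of the data contributes to both fitting and evaluation. The helpers `fit_line` and `k_fold_mse` are hypothetical names for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept for one input variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def k_fold_mse(xs, ys, k, seed=0):
    """Average held-out mean squared error over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        mse = sum((ys[i] - (slope * xs[i] + intercept)) ** 2
                  for i in held_out) / len(held_out)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Hypothetical small data set: y = 3x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(12)]
ys = [3.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

cv_mse = k_fold_mse(xs, ys, k=4)
print(f"4-fold cross-validated MSE: {cv_mse:.3f}")
```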

Questions 8

Which of the following statements is true with regard to the linear regression model?

Options:

A.

Ordinary least squares can be used to estimate the parameters of a linear model.

B.

A linear model tries to find multiple lines that approximate the relationship between the outcome and the input variables.

C.

Ordinary least squares is the sum of the individual distances between each point and the fitted line of the regression model.

D.

Ordinary least squares is the sum of the squared individual distances between each point and the fitted line of the regression model.
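
As a minimal sketch of the quantity these options describe, assuming one input variable and made-up data: ordinary least squares picks the single line that minimizes the sum of squared vertical distances (residuals) between the points and the line. The function names are hypothetical.

```python
def ols_fit(xs, ys):
    """Slope and intercept minimizing the sum of squared residuals."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.1, 7.9, 10.2]

slope, intercept = ols_fit(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)

# Perturbing either parameter increases the sum of squared residuals.
worse_slope = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
worse_intercept = sum_squared_residuals(xs, ys, slope, intercept + 0.1)
print(best, worse_slope, worse_intercept)
```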

Questions 9

A data scientist wants to predict the probability of death from heart disease based on three risk factors: age, gender, and blood cholesterol level. What is the most appropriate method for this project?

Options:

A.

Linear regression

B.

K-means clustering

C.

Logistic regression

D.

Apriori algorithm
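
How a model can turn risk factors into a probability is sketched below. A logistic model passes a linear combination of the inputs through the logistic function, so the output always lies between 0 and 1. All coefficients are hypothetical values chosen for illustration; they are not fitted to any data and carry no medical meaning.

```python
import math

def logistic(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only (not fitted to data).
INTERCEPT = -9.0
COEF_AGE = 0.08           # per year of age
COEF_MALE = 0.5           # 1 if male, 0 otherwise
COEF_CHOLESTEROL = 0.01   # per unit of blood cholesterol

def death_probability(age, is_male, cholesterol):
    """Probability produced by the assumed logistic model."""
    z = (INTERCEPT + COEF_AGE * age + COEF_MALE * is_male
         + COEF_CHOLESTEROL * cholesterol)
    return logistic(z)

p_low = death_probability(age=40, is_male=0, cholesterol=180)
p_high = death_probability(age=75, is_male=1, cholesterol=280)
print(f"{p_low:.3f} {p_high:.3f}")
```

A plain linear model has no such bound, which is why its raw output can fall below 0 or above 1.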

Questions 10

There are 5000 colored balls, of which 1200 are pink. What is the maximum likelihood estimate for the proportion of "pink" items in the test set of colored balls?

Options:

A.

2.4

B.

24.0

C.

.24

D.

.48

E.

4.8
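
The estimate can be worked through in a short sketch, assuming each ball is an independent draw that is pink with an unknown probability p. Under that assumption the maximum likelihood estimate is the observed fraction of pink balls. The helper names are hypothetical.

```python
import math

def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate of a proportion: observed fraction."""
    return successes / trials

def log_likelihood(p, successes, trials):
    """Log-likelihood of p, up to an additive constant not depending on p."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)

p_pink = bernoulli_mle(1200, 5000)
print(p_pink)  # 0.24

# The log-likelihood at the estimate exceeds that of nearby values.
ll_at_mle = log_likelihood(p_pink, 1200, 5000)
ll_below = log_likelihood(0.23, 1200, 5000)
ll_above = log_likelihood(0.25, 1200, 5000)
```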

Questions 11

A bio-scientist is analyzing cancer cells. To identify whether a cell is cancerous, hundreds of tests, each with small variations, have been run to answer that yes/no question. Given the test results for a sample of healthy and cancerous cells, which of the following techniques would you use to determine whether a cell is healthy?

Options:

A.

Linear regression

B.

Collaborative filtering

C.

Naive Bayes

D.

Identification Test

Questions 12

Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. Which of the following will you use to calculate the probability that it will rain on the day of Marie's wedding?

Options:

A.

Naive Bayes

B.

Logistic Regression

C.

Random Decision Forests

D.

All of the above
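
The numbers in the question can be combined with Bayes' theorem, as in the sketch below. It assumes a 365-day year, so the prior probability of rain is 5/365. The function name is hypothetical.

```python
def posterior_rain(p_rain, p_forecast_given_rain, p_forecast_given_dry):
    """P(rain | rain forecast) by Bayes' theorem."""
    numerator = p_forecast_given_rain * p_rain
    evidence = numerator + p_forecast_given_dry * (1 - p_rain)
    return numerator / evidence

p_rain = 5 / 365  # prior; assumes a 365-day year
p_rain_given_forecast = posterior_rain(p_rain, 0.9, 0.1)
print(f"{p_rain_given_forecast:.3f}")  # 0.111
```

Even with a rain forecast, the posterior probability of rain is only about 11%, because rain is so rare in the first place.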

Questions 13

Which of the following skills does a data scientist require?

Options:

A.

Web design skills, to present the results of an algorithm with the best visuals.

B.

Should be creative

C.

Should possess good programming skills

D.

Should be very good at mathematics and statistics

E.

Should possess database administration skills.

Questions 14

Suppose there are three events E1, E2, and E3. Which formula must always be equal to P(E1|E2,E3)?

Options:

A.

P(E1,E2,E3)P(E1)/P(E2,E3)

B.

P(E1,E2,E3)/P(E2,E3)

C.

P(E1,E2|E3)P(E2|E3)P(E3)

D.

P(E1,E2|E3)P(E3)

E.

P(E1,E2,E3)P(E2)P(E3)
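
The definition of conditional probability, P(E1|E2,E3) = P(E1,E2,E3) / P(E2,E3), can be checked on a small example. The sketch below uses a hypothetical joint distribution over three Boolean events, given as arbitrary integer weights, and compares the ratio with the value obtained by restricting attention to outcomes where E2 and E3 both hold.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over (E1, E2, E3) as unnormalized weights.
outcomes = list(product([True, False], repeat=3))
weights = dict(zip(outcomes, [3, 1, 4, 1, 5, 9, 2, 6]))
total = sum(weights.values())

def prob(predicate):
    """Exact probability of the set of outcomes satisfying the predicate."""
    return Fraction(sum(w for o, w in weights.items() if predicate(o)), total)

p_e1_e2_e3 = prob(lambda o: o[0] and o[1] and o[2])
p_e2_e3 = prob(lambda o: o[1] and o[2])
ratio = p_e1_e2_e3 / p_e2_e3

# Direct computation: keep only outcomes where E2 and E3 hold, then
# measure the share of that restricted weight where E1 also holds.
restricted = {o: w for o, w in weights.items() if o[1] and o[2]}
direct = Fraction(sum(w for o, w in restricted.items() if o[0]),
                  sum(restricted.values()))

print(ratio, direct)  # 3/8 3/8
```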

Questions 15

Suppose you have the following two cases for movie ratings:

1. You recommend a movie with four stars to a user, but he really doesn't like it and would rate it two stars.

2. You recommend a movie with three stars, but the user loves it (he'd rate it five stars).

Which statement correctly applies?

Options:

A.

In both cases, the contribution to the RMSE is the same

B.

The contribution to the RMSE is different in the two cases

C.

In both cases, the contribution to the RMSE could vary

D.

None of the above
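
The two cases can be compared numerically, assuming the RMSE is computed as the square root of the mean of squared differences between the recommended and the actual rating. Each case contributes its squared error, so the sign of the error does not matter.

```python
import math

def squared_error(predicted, actual):
    """Contribution of one rating to the sum inside the RMSE."""
    return (predicted - actual) ** 2

case_1 = squared_error(predicted=4, actual=2)  # over-estimate by two stars
case_2 = squared_error(predicted=3, actual=5)  # under-estimate by two stars

rmse = math.sqrt((case_1 + case_2) / 2)
print(case_1, case_2, rmse)  # 4 4 2.0
```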

Questions 16

Refer to the Exhibit.

[Exhibit image: Databricks-Certified-Professional-Data-Scientist Question 16]

In the Exhibit, the table shows the values for the input Boolean attributes "A", "B", and "C". It also shows the values for the output attribute "class". Which decision tree is valid for the data?

Options:

A.

Tree A

B.

Tree B

C.

Tree C

D.

Tree D

Questions 17

You are creating a regression model whose inputs are a customer's income, education, and current debt. What could be a possible output of this model?

Options:

A.

Customer classified as a good fit

B.

Customer classified in an acceptable or average category

C.

A probability, expressed as a percent, that the customer will default on a loan

D.

A and C are correct

E.

B and C are correct

Questions 18

Clustering is a type of unsupervised learning with which of the following goals?

Options:

A.

Maximize a utility function

B.

Find similarities in the training data

C.

Not to maximize a utility function

D.

A and B

E.

B and C

Questions 19

You are using a classification approach in which the agent is taught not by giving it explicit categorizations, but through some sort of reward system that indicates success: agents might be rewarded for certain actions and punished for others. Which kind of learning is this?

Options:

A.

Supervised

B.

Unsupervised

C.

Regression

D.

None of the above

Questions 20

If you are trying to predict or forecast a discrete target value, which is the correct option?

Options:

A.

Supervised Learning regression algorithms

B.

Supervised Learning classification algorithms

C.

Unsupervised Learning

D.

Density estimation algorithm

Exam Name: Databricks Certified Professional Data Scientist Exam
Last Update: Nov 22, 2024
Questions: 138
