Databricks-Generative-AI-Engineer-Associate Databricks Certified Generative AI Engineer Associate Questions and Answers

Questions 4

A Generative AI Engineer has created a RAG application which can help employees retrieve answers from an internal knowledge base, such as Confluence pages or Google Drive. The prototype application is now working with some positive feedback from internal company testers. Now the Generative Al Engineer wants to formally evaluate the system’s performance and understand where to focus their efforts to further improve the system.

How should the Generative AI Engineer evaluate the system?

Options:

Use cosine similarity score to comprehensively evaluate the quality of the final generated answers.

Curate a dataset that can test the retrieval and generation components of the system separately. Use MLflow’s built in evaluation metrics to perform the evaluation on the retrieval and generation components.

Benchmark multiple LLMs with the same data and pick the best LLM for the job.

Use an LLM-as-a-judge to evaluate the quality of the final answers generated.

Buy Now

Questions 5

A company has a typical RAG-enabled, customer-facing chatbot on its website.

Databricks-Generative-AI-Engineer-Associate Question 5

Select the correct sequence of components a user's questions will go through before the final output is returned. Use the diagram above for reference.

Options:

1.embedding model, 2.vector search, 3.context-augmented prompt, 4.response-generating LLM

1.context-augmented prompt, 2.vector search, 3.embedding model, 4.response-generating LLM

1.response-generating LLM, 2.vector search, 3.context-augmented prompt, 4.embedding model

1.response-generating LLM, 2.context-augmented prompt, 3.vector search, 4.embedding model

Buy Now

Questions 6

A Generative Al Engineer is building a system which will answer questions on latest stock news articles.

Which will NOT help with ensuring the outputs are relevant to financial news?

Options:

Implement a comprehensive guardrail framework that includes policies for content filters tailored to the finance sector.

Increase the compute to improve processing speed of questions to allow greater relevancy analysis

C Implement a profanity filter to screen out offensive language

Incorporate manual reviews to correct any problematic outputs prior to sending to the users

Buy Now

Questions 7

A Generative Al Engineer is building a RAG application that answers questions about internal documents for the company SnoPen AI.

The source documents may contain a significant amount of irrelevant content, such as advertisements, sports news, or entertainment news, or content about other companies.

Which approach is advisable when building a RAG application to achieve this goal of filtering irrelevant information?

Options:

Keep all articles because the RAG application needs to understand non-company content to avoid answering questions about them.

Include in the system prompt that any information it sees will be about SnoPenAI, even if no data filtering is performed.

Include in the system prompt that the application is not supposed to answer any questions unrelated to SnoPen Al.

Consolidate all SnoPen AI related documents into a single chunk in the vector database.

Buy Now

Questions 8

A Generative Al Engineer is ready to deploy an LLM application written using Foundation Model APIs. They want to follow security best practices for production scenarios

Which authentication method should they choose?

Options:

Use an access token belonging to service principals

Use a frequently rotated access token belonging to either a workspace user or a service principal

Use OAuth machine-to-machine authentication

Use an access token belonging to any workspace user

Buy Now

Answer:

Explanation:

The task is to deploy an LLM application using Foundation Model APIs in a production environment while adhering to security best practices. Authentication is critical for securing access to Databricks resources, such as the Foundation Model API. Let’s evaluate the options based on Databricks’ security guidelines for production scenarios.

Option A: Use an access token belonging to service principals

Service principals are non-human identities designed for automated workflows and applications in Databricks. Using an access token tied to a service principal ensures that the authentication is scoped to the application, follows least-privilege principles (via role-based access control), and avoids reliance on individual user credentials. This is a security best practice for production deployments.

Databricks Reference:"For production applications, use service principals with access tokens to authenticate securely, avoiding user-specific credentials"("Databricks Security Best Practices," 2023). Additionally, the "Foundation Model API Documentation" states:"Service principal tokens are recommended for programmatic access to Foundation Model APIs."

Option B: Use a frequently rotated access token belonging to either a workspace user or a service principal

Frequent rotation enhances security by limiting token exposure, but tying the token to a workspace user introduces risks (e.g., user account changes, broader permissions). Including both user and service principal options dilutes the focus on application-specific security, making this less ideal than a service-principal-only approach. It also adds operational overhead without clear benefits over Option A.

Databricks Reference:"While token rotation is a good practice, service principals are preferred over user accounts for application authentication"("Managing Tokens in Databricks," 2023).

Option C: Use OAuth machine-to-machine authentication

OAuth M2M (e.g., client credentials flow) is a secure method for application-to-service communication, often using service principals under the hood. However, Databricks’ Foundation Model API primarily supports personal access tokens (PATs) or service principal tokens over full OAuth flows for simplicity in production setups. OAuth M2M adds complexity (e.g., managing refresh tokens) without a clear advantage in this context.

Databricks Reference:"OAuth is supported in Databricks, but service principal tokens are simpler and sufficient for most API-based workloads"("Databricks Authentication Guide," 2023).

Option D: Use an access token belonging to any workspace user

Using a user’s access token ties the application to an individual’s identity, violating security best practices. It risks exposure if the user leaves, changes roles, or has overly broad permissions, and it’s not scalable or auditable for production.

Databricks Reference:"Avoid using personal user tokens for production applications due to security and governance concerns"("Databricks Security Best Practices," 2023).

Conclusion: Option A is the best choice, as it uses a service principal’s access token, aligning with Databricks’ security best practices for production LLM applications. It ensures secure, application-specific authentication with minimal complexity, as explicitly recommended for Foundation Model API deployments.

Questions 9

A Generative Al Engineer is using an LLM to classify species of edible mushrooms based on text descriptions of certain features. The model is returning accurate responses in testing and the Generative Al Engineer is confident they have the correct list of possible labels, but the output frequently contains additional reasoning in the answer when the Generative Al Engineer only wants to return the label with no additional text.

Which action should they take to elicit the desired behavior from this LLM?

Options:

Use few snot prompting to instruct the model on expected output format

Use zero shot prompting to instruct the model on expected output format

Use zero shot chain-of-thought prompting to prevent a verbose output format

Use a system prompt to instruct the model to be succinct in its answer

Buy Now

Answer:

Explanation:

The LLM classifies mushroom species accurately but includes unwanted reasoning text, and the engineer wants only the label. Let’s assess how to control output format effectively.

Option A: Use few shot prompting to instruct the model on expected output format

Few-shot prompting provides examples (e.g., input: description, output: label). It can work but requires crafting multiple examples, which is effort-intensive and less direct than a clear instruction.

Databricks Reference:"Few-shot prompting guides LLMs via examples, effective for format control but requires careful design"("Generative AI Cookbook").

Option B: Use zero shot prompting to instruct the model on expected output format

Zero-shot prompting relies on a single instruction (e.g., “Return only the label”) without examples. It’s simpler than few-shot but may not consistently enforce succinctness if the LLM’s default behavior is verbose.

Databricks Reference:"Zero-shot prompting can specify output but may lack precision without examples"("Building LLM Applications with Databricks").

Option C: Use zero shot chain-of-thought prompting to prevent a verbose output format

Chain-of-Thought (CoT) encourages step-by-step reasoning, which increases verbosity—opposite to the desired outcome. This contradicts the goal of label-only output.

Databricks Reference:"CoT prompting enhances reasoning but often results in detailed responses"("Databricks Generative AI Engineer Guide").

Option D: Use a system prompt to instruct the model to be succinct in its answer

A system prompt (e.g., “Respond with only the species label, no additional text”) sets a global instruction for the LLM’s behavior. It’s direct, reusable, and effective for controlling output style across queries.

Databricks Reference:"System prompts define LLM behavior consistently, ideal for enforcing concise outputs"("Generative AI Cookbook," 2023).

Conclusion: Option D is the most effective and straightforward action, using a system prompt to enforce succinct, label-only responses, aligning with Databricks’ best practices for output control.

Questions 10

A Generative Al Engineer has successfully ingested unstructured documents and chunked them by document sections. They would like to store the chunks in a Vector Search index. The current format of the dataframe has two columns: (i) original document file name (ii) an array of text chunks for each document.

What is the most performant way to store this dataframe?

Options:

Split the data into train and test set, create a unique identifier for each document, then save to a Delta table

Flatten the dataframe to one chunk per row, create a unique identifier for each row, and save to a Delta table

First create a unique identifier for each document, then save to a Delta table

Store each chunk as an independent JSON file in Unity Catalog Volume. For each JSON file, the key is the document section name and the value is the array of text chunks for that section

Buy Now

Questions 11

A Generative AI Engineer wants to build an LLM-based solution to help a restaurant improve its online customer experience with bookings by automatically handling common customer inquiries. The goal of the solution is to minimize escalations to human intervention and phone calls while maintaining a personalized interaction. To design the solution, the Generative AI Engineer needs to define the input data to the LLM and the task it should perform.

Which input/output pair will support their goal?

Options:

Input: Online chat logs; Output: Group the chat logs by users, followed by summarizing each user’s interactions

Input: Online chat logs; Output: Buttons that represent choices for booking details

Input: Customer reviews; Output: Classify review sentiment

Input: Online chat logs; Output: Cancellation options

Buy Now

Questions 12

A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in PDF format. These PDFs can contain both text and images. They want to develop a solution using the least amount of lines of code.

Which Python package should be used to extract the text from the source documents?

Options:

flask

beautifulsoup

unstructured

numpy

Buy Now

Questions 13

A Generative Al Engineer is tasked with improving the RAG quality by addressing its inflammatory outputs.

Which action would be most effective in mitigating the problem of offensive text outputs?

Options:

Increase the frequency of upstream data updates

Inform the user of the expected RAG behavior

Restrict access to the data sources to a limited number of users

Curate upstream data properly that includes manual review before it is fed into the RAG system

Buy Now

Questions 14

A Generative Al Engineer is tasked with developing a RAG application that will help a small internal group of experts at their company answer specific questions, augmented by an internal knowledge base. They want the best possible quality in the answers, and neither latency nor throughput is a huge concern given that the user group is small and they’re willing to wait for the best answer. The topics are sensitive in nature and the data is highly confidential and so, due to regulatory requirements, none of the information is allowed to be transmitted to third parties.

Which model meets all the Generative Al Engineer’s needs in this situation?

Options:

Dolly 1.5B

OpenAI GPT-4

BGE-large

Llama2-70B

Buy Now

Questions 15

A Generative Al Engineer has built an LLM-based system that will automatically translate user text between two languages. They now want to benchmark multiple LLM's on this task and pick the best one. They have an evaluation set with known high quality translation examples. They want to evaluate each LLM using the evaluation set with a performant metric.

Which metric should they choose for this evaluation?

Options:

ROUGE metric

BLEU metric

NDCG metric

RECALL metric

Buy Now

Questions 16

A Generative AI Engineer is designing a chatbot for a gaming company that aims to engage users on its platform while its users play online video games.

Which metric would help them increase user engagement and retention for their platform?

Options:

Randomness

Diversity of responses

Lack of relevance

Repetition of responses

Buy Now

Questions 17

A Generative Al Engineer needs to design an LLM pipeline to conduct multi-stage reasoning that leverages external tools. To be effective at this, the LLM will need to plan and adapt actions while performing complex reasoning tasks.

Which approach will do this?

Options:

Tram the LLM to generate a single, comprehensive response without interacting with any external tools, relying solely on its pre-trained knowledge.

Implement a framework like ReAct which allows the LLM to generate reasoning traces and perform task-specific actions that leverage external tools if necessary.

Encourage the LLM to make multiple API calls in sequence without planning or structuring the calls, allowing the LLM to decide when and how to use external tools spontaneously.

Use a Chain-of-Thought (CoT) prompting technique to guide the LLM through a series of reasoning steps, then manually input the results from external tools for the final answer.

Buy Now

Answer:

Explanation:

The task requires an LLM pipeline for multi-stage reasoning with external tools, necessitating planning, adaptability, and complex reasoning. Let’s evaluate the options based on Databricks’ recommendations for advanced LLM workflows.

Option A: Train the LLM to generate a single, comprehensive response without interacting with any external tools, relying solely on its pre-trained knowledge

This approach limits the LLM to its static knowledge base, excluding external tools and multi-stage reasoning. It can’t adapt or plan actions dynamically, failing the requirements.

Databricks Reference:"External tools enhance LLM capabilities beyond pre-trained knowledge"("Building LLM Applications with Databricks," 2023).

Option B: Implement a framework like ReAct which allows the LLM to generate reasoning traces and perform task-specific actions that leverage external tools if necessary

ReAct (Reasoning + Acting) combines reasoning traces (step-by-step logic) with actions (e.g., tool calls), enabling the LLM to plan, adapt, and execute complex tasks iteratively. This meets all requirements: multi-stage reasoning, tool use, and adaptability.

Databricks Reference:"Frameworks like ReAct enable LLMs to interleave reasoning and external tool interactions for complex problem-solving"("Generative AI Cookbook," 2023).

Option C: Encourage the LLM to make multiple API calls in sequence without planning or structuring the calls, allowing the LLM to decide when and how to use external tools spontaneously

Unstructured, spontaneous API calls lack planning and may lead to inefficient or incorrect tool usage. This doesn’t ensure effective multi-stage reasoning or adaptability.

Databricks Reference: Structured frameworks are preferred:"Ad-hoc tool calls can reduce reliability in complex tasks"("Building LLM-Powered Applications").

Option D: Use a Chain-of-Thought (CoT) prompting technique to guide the LLM through a series of reasoning steps, then manually input the results from external tools for the final answer

CoT improves reasoning but relies on manual tool interaction, breaking automation and adaptability. It’s not a scalable pipeline solution.

Databricks Reference:"Manual intervention is impractical for production LLM pipelines"("Databricks Generative AI Engineer Guide").

Conclusion: Option B (ReAct) is the best approach, as it integrates reasoning and tool use in a structured, adaptive framework, aligning with Databricks’ guidance for complex LLM workflows.

Questions 18

What is an effective method to preprocess prompts using custom code before sending them to an LLM?

Options:

Directly modify the LLM’s internal architecture to include preprocessing steps

It is better not to introduce custom code to preprocess prompts as the LLM has not been trained with examples of the preprocessed prompts

Rather than preprocessing prompts, it’s more effective to postprocess the LLM outputs to align the outputs to desired outcomes

Write a MLflow PyFunc model that has a separate function to process the prompts

Buy Now

Exam Code: Databricks-Generative-AI-Engineer-Associate

Exam Name: Databricks Certified Generative AI Engineer Associate

Last Update: Jul 9, 2025

Questions: 61

PDF + Testing Engine

$72.6 ~~$181.49~~

Testing Engine

$57.8 ~~$144.49~~

PDF (Q&A)

$49.8 ~~$124.49~~

buy now Databricks-Generative-AI-Engineer-Associate pdf

Databricks-Generative-AI-Engineer-Associate Databricks Certified Generative AI Engineer Associate Questions and Answers

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

PDF + Testing Engine

Testing Engine

PDF (Q&A)

Quick Links

Why Us

Unlimited Packages

Marks4sure

Site Secure