Databricks Genie – How to get started

Databricks Genie is a cutting-edge feature designed to transform how you interact with your data. Built within the Databricks Lakehouse ecosystem, Genie enables conversational queries, making complex data analysis intuitive and accessible for all users. In other words, it empowers business users to mine and gain insights from data just by having a chat… pretty cool, right?


What are the prerequisites for Genie? 

  • Unity Catalog – The data you want to use needs to be registered in Unity Catalog.

Unity Catalog is Databricks’ unified data governance solution for all data and AI assets. It provides centralized access controls, auditing, lineage, and data discovery across all Databricks workspaces.

  • Compute – You must have a Pro or Serverless SQL warehouse.
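As a rough illustration of the compute prerequisite, the sketch below checks whether a SQL warehouse description looks Genie-compatible. It assumes the warehouse is given as a dictionary with the `warehouse_type` and `enable_serverless_compute` fields used by the Databricks SQL Warehouses REST API; the helper name `supports_genie` is hypothetical, and the field names should be checked against the current API reference.

```python
def supports_genie(warehouse: dict) -> bool:
    """Return True if the warehouse looks like a Pro or Serverless SQL warehouse.

    Assumption: `warehouse` mirrors one entry from the SQL Warehouses API,
    where `warehouse_type` is "PRO" or "CLASSIC" and serverless warehouses
    set `enable_serverless_compute` to true.
    """
    is_pro = warehouse.get("warehouse_type") == "PRO"
    is_serverless = bool(warehouse.get("enable_serverless_compute", False))
    return is_pro or is_serverless


# Example with hand-written (hypothetical) warehouse descriptions.
for wh in [
    {"name": "classic-wh", "warehouse_type": "CLASSIC"},
    {"name": "pro-wh", "warehouse_type": "PRO"},
]:
    print(wh["name"], supports_genie(wh))
```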

Permissions

  • Compute – Users, Editors, or Creators require at least “CAN USE” access to the default SQL warehouse of the space. Creators also require “CAN RUN” permission on the warehouse they intend to select as the default.
  • Data Access – Users of the space need at least “SELECT” access on the underlying data that the space uses as its source.
  • Genie Space – Users need “CAN RUN” on the Genie Space to interact with Genie. More information on Genie ACLs can be found here.
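For the Data Access requirement above, a minimal sketch of how the grants could be generated is shown here. It assumes tables are addressed by three-level Unity Catalog names (`catalog.schema.table`) and that reading a table also needs `USE CATALOG` and `USE SCHEMA` on its parents. The function names and the `analysts` group are hypothetical. Warehouse and Genie Space permissions are set separately and are not covered.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in backticks, escaping any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grants_for_genie_user(principal: str, tables: list[str]) -> list[str]:
    """Build GRANT statements giving `principal` read access to each table.

    Each entry in `tables` must be a three-level name: catalog.schema.table.
    Duplicate statements (e.g. the same catalog twice) are emitted only once.
    """
    who = quote_ident(principal)
    statements: list[str] = []
    seen: set[str] = set()
    for full_name in tables:
        catalog, schema, table = full_name.split(".")
        c, s, t = quote_ident(catalog), quote_ident(schema), quote_ident(table)
        for stmt in (
            f"GRANT USE CATALOG ON CATALOG {c} TO {who}",
            f"GRANT USE SCHEMA ON SCHEMA {c}.{s} TO {who}",
            f"GRANT SELECT ON TABLE {c}.{s}.{t} TO {who}",
        ):
            if stmt not in seen:
                seen.add(stmt)
                statements.append(stmt)
    return statements


# Hypothetical group and table names, for illustration only.
for stmt in grants_for_genie_user("analysts", ["main.sales.orders", "main.sales.customers"]):
    print(stmt)
```

The statements are only printed here; they would still need to be run by someone with the privileges to grant access.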

How to Set It Up

  1. Create a Genie Space
  • Log in to your Databricks workspace.
  • Select Create Genie Space and define its parameters.
  2. Connect Your Data
  • Add the Unity Catalog tables the space should query, and confirm that its users have “SELECT” access to them.
  3. Start Asking Questions
  • Once configured, you can type natural language questions like:
    “What is the name of the person with the most Databricks certs?”
  • Genie turns each question into a SQL query, runs it on the space’s default SQL warehouse, and returns the results.
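Questions can also be sent programmatically. The sketch below only builds the HTTP request and does not send it. It assumes the Genie Conversation API's `start-conversation` endpoint and a personal access token for authentication; the host, token, and space ID are placeholders, and the endpoint path and payload should be confirmed against the current Databricks REST API reference.

```python
import json
import urllib.request


def build_start_conversation_request(
    host: str, token: str, space_id: str, question: str
) -> urllib.request.Request:
    """Build (but do not send) a request that starts a Genie conversation.

    Assumption: the endpoint path and the {"content": ...} body follow the
    Genie Conversation API; verify both before relying on this.
    """
    url = f"{host.rstrip('/')}/api/2.0/genie/spaces/{space_id}/start-conversation"
    body = json.dumps({"content": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace with your workspace URL, token, and space ID.
request = build_start_conversation_request(
    host="https://example.cloud.databricks.com",
    token="<personal-access-token>",
    space_id="<space-id>",
    question="What is the name of the person with the most Databricks certs?",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```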

Key Benefits

  • Ease of Use – No SQL expertise is needed; Genie uses natural language processing.
  • Efficiency – Quickly analyse data by asking the right questions.
  • Collaboration – Share insights across teams directly from your Genie space.

Security

Genie uses an Azure OpenAI model to generate its responses. To do so, it sends the following to the model:

  • The natural language prompt submitted by the user
  • Table names and descriptions
  • Column titles and descriptions
  • General instructions
  • Example SQL queries
  • SQL functions
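To make the list above concrete, the sketch below models those items as a plain data structure. This is not Genie's actual request format: every class and field name is hypothetical and is only meant to show what kind of information is involved, which can help when reviewing table and column descriptions for sensitive wording.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ColumnInfo:
    name: str              # column title
    description: str = ""  # column description


@dataclass
class TableInfo:
    name: str              # table name
    description: str = ""  # table description
    columns: list[ColumnInfo] = field(default_factory=list)


@dataclass
class ModelContext:
    """Hypothetical container for the items listed above."""
    prompt: str                                    # the user's natural language question
    tables: list[TableInfo]                        # table and column names and descriptions
    general_instructions: str = ""                 # instructions defined on the space
    example_sql_queries: list[str] = field(default_factory=list)
    sql_functions: list[str] = field(default_factory=list)


context = ModelContext(
    prompt="What is the name of the person with the most Databricks certs?",
    tables=[
        TableInfo(
            name="certifications",
            description="One row per certification earned",
            columns=[ColumnInfo("person_name", "Full name of the certificate holder")],
        )
    ],
)
print(sorted(asdict(context)))
```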

Figure: Flowchart of the Azure OpenAI model used by Genie (source: AI/BI Genie | Databricks)

Databricks has also opted into the exemption from abuse monitoring and human review program, which means prompts and completions sent to and from the Azure OpenAI service are not stored by Microsoft.


Final Questions

Who is Genie for?

Genie is great for non-technical users, data analysis consultants, and engineers looking to streamline data interactions.

What data can Genie use?

Genie can query only the datasets in your Databricks Lakehouse that are connected to the space and that the user has permission to access. Because Unity Catalog manages those permissions, your existing access controls carry over to Genie.

When should I use Genie?

Genie can be used for a multitude of tasks, for example:

  • Exploration
  • Summary creation
  • Visualisation
  • Reporting

Summary

With Databricks Genie, data-driven decisions become faster, easier, and more collaborative. By leveraging conversational AI directly within your Lakehouse, you unlock a new level of insight and efficiency.
