Unlocking Database Insights: A Guide to Vanna SQL Framework
Written on
Chapter 1: Introduction to Vanna
Are you looking for a swift way to extract information from your database? The Vanna framework is designed to make this task effortless. It is an open-source application that enables users to generate SQL code, allowing data to reveal its insights.
Vanna offers three distinct modes to meet varying user requirements:
- Open-Source Mode: This mode allows integration with any preferred language model, providing full customization. Think of it as building with blocks—freedom to create in any way you like.
- Free Mode: This option includes a daily limit on usage and free storage services. Utilizing GPT-3.5, you can delve into your data without incurring any costs.
- Paid Mode: For those needing more extensive capabilities, this mode provides unlimited usage, GPT-4 support, and service guarantees.
Imagine using Vanna, whether as a developer or analyst, to uncover valuable insights from your data. Let’s delve deeper into Vanna’s functionality and implementation.
Section 1.1: How Vanna Operates
Picture a tool that lets you interact with your database as effortlessly as a casual chat. Vanna is a Python package that employs RAG (Retrieval-Augmented Generation) technology to swiftly create SQL queries tailored to your needs.
Using Vanna involves a straightforward two-step process:
- Train a model.
- Pose your questions, and it will respond with SQL.
RAG technology finds relevant information based on your inquiries and generates appropriate answers. You don't need to grasp all the technicalities; just know that it can effectively address your needs.
Being open-source, Vanna can be hosted on your server, and its data processing capabilities improve with more training data. On the security side, Vanna prioritizes your data's safety, ensuring it remains secure unless a specific feature is activated. The more you utilize Vanna, the better it adapts to your preferences, continually enhancing its understanding.
Vanna supports several databases, including Snowflake, BigQuery, and Postgres, and has the flexibility to create connectors for other systems. You can start using it via a Jupyter Notebook, Slackbot, web applications, or Streamlit.
Section 1.2: Getting Started with Vanna's API
To begin, visit Vanna’s official website and select “Get Started for Free.”
Next, you'll need to create a model name, similar to naming a new companion. After that, a key will be provided, which grants you daily access to 100k free credits.
After this, you can run the code to set up the platform. A helpful tip: use the email you registered with and enter the verification code.
import vanna
from vanna.remote import VannaDefault
vn = VannaDefault(model='chinook', api_key=vanna.get_api_key('[email protected]'))
vn.ask("What are the top 10 albums by sales?")
from vanna.flask import VannaFlaskApp
VannaFlaskApp(vn).run()
Vanna provides default training data for you, but feel free to add your own if desired.
When you input a prompt, Vanna generates results tailored to your specifications, akin to conversing with an intelligent assistant.
Chapter 2: Running Vanna Locally
This method allows you to utilize various models for free and is user-friendly. It supports models from OpenAI, Anthropic, and Ollama, and even allows for customization of additional models. You can also leverage different vector databases, including the online VectorDB.
Using ChromaDB’s open-source vector database is completely free and local, requiring no extra setup. All database files are automatically created and stored on your machine. Alternatively, you can choose Marqo, which is also free for local use but requires some setup. You may also explore their hosted options.
Here’s a code example to illustrate how to use the open-source Ollama framework alongside Vanna to implement an SQL model. The entire procedure is free, highly convenient, and quick.
from vanna.ollama import Ollama
from vanna.chromadb.chromadb_vector import ChromaDB_VectorStore
class MyVanna(ChromaDB_VectorStore, Ollama):
def __init__(self, config=None):
ChromaDB_VectorStore.__init__(self, config=config)
Ollama.__init__(self, config=config)
vn = MyVanna(config={'model': 'gemma:7b'})
vn.connect_to_sqlite('my-database.sqlite')
df_ddl = vn.run_sql("SELECT type, sql FROM sqlite_master WHERE sql is not null")
for ddl in df_ddl['sql'].to_list():
vn.train(ddl=ddl)
vn.train(ddl="""
CREATE TABLE IF NOT EXISTS my-table (
id INT PRIMARY KEY,
name VARCHAR(100),
age INT
)
""")
vn.train(documentation="Our business defines OTIF score as the percentage of orders that are delivered on time and in full")
vn.train(sql="SELECT * FROM my-table WHERE name = 'John Doe'")
training_data = vn.get_training_data()
from vanna.flask import VannaFlaskApp
VannaFlaskApp(vn).run()
Bear in mind that this process may require some time, as your model needs to be trained on the SQL data.
Conclusion
In conclusion, we have explored the capabilities of Vanna, from its easy registration and setup to the ability to interact with databases as simply as chatting. Vanna demonstrates remarkable flexibility and power.
You can opt for the open-source mode to customize it as per your requirements, or start with the free mode without any financial commitment. For those with greater demands, the paid mode offers unlimited usage and enhanced support.
Visualize a scenario where you can interact with your data using just a few lines of code. It may sound futuristic, but this technology is already at your fingertips, ready for exploration.
If you appreciated this article, consider subscribing to Medium for notifications on new posts and access to a wealth of articles from other authors.
I am Li Meng, an independent open-source software developer and the creator of SolidUI. I am deeply interested in new technologies, especially in the AI and data sectors. If my content resonates with you, please follow, like, and share. Thank you!