DB-GPT is an open-source framework for applying large models to the database domain. Its purpose is to build infrastructure for large-model development, making it easier and more convenient to build applications around databases. It does this by developing a set of technical capabilities, including:
  1. SMMF (Service-oriented Multi-model Management Framework)

  2. Text2SQL fine-tuning

  3. RAG (Retrieval Augmented Generation) framework and optimization

  4. Data-driven agents collaboration framework

  5. GBI (Generative Business Intelligence)

Together with other capabilities, these simplify the construction of large-model applications based on databases.
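
To make the Text2SQL idea in the list above more concrete, here is a minimal sketch of the general flow: read the schema, build a prompt, obtain SQL from a model, and run it. It is not DB-GPT's API. The helper names (`describe_schema`, `build_prompt`, `ask`) and the `fake_model` function, which returns a canned query instead of calling a real model, are assumptions made for illustration.

```python
import sqlite3


def describe_schema(conn):
    """Return the CREATE TABLE statements, used as context for the model."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(schema, question):
    """Combine the schema and the user's question into one prompt string."""
    return f"Given the schema:\n{schema}\nWrite one SQL query that answers: {question}"


def fake_model(prompt):
    """Hypothetical stand-in for an LLM call; always returns a canned query."""
    return (
        "SELECT city, SUM(amount) AS total FROM orders "
        "GROUP BY city ORDER BY total DESC"
    )


def ask(conn, question, model=fake_model):
    """Generate SQL for the question and execute it against the database."""
    sql = model(build_prompt(describe_schema(conn), question))
    # Guard: this sketch only executes read-only statements.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("refusing to run non-SELECT statement")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (city, amount) VALUES (?, ?)",
    [("Berlin", 10.0), ("Paris", 25.0), ("Berlin", 5.0)],
)
rows = ask(conn, "Which city has the highest total order amount?")
print(rows)
```

In a real setup the `model` argument would wrap an actual model call. The read-only guard is a design choice of this sketch and not something the text above prescribes.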

In the era of Data 3.0, enterprises and developers can build their own customized applications with less code, leveraging models and databases.


1. Private Domain Q&A & Data Processing
Supports building custom knowledge bases from built-in sources, uploads in multiple file formats, and plugin-based web scraping. Enables unified vector storage and retrieval of large volumes of structured and unstructured data.
2. Multi-Data Source & GBI (Generative Business Intelligence)
Supports natural-language interaction with various data sources such as Excel, databases, and data warehouses. Also supports analysis reporting.
3. SMMF (Service-oriented Multi-model Management Framework)
Supports dozens of large language models, covering both open-source models and models accessed through API proxies. Examples include LLaMA/LLaMA2, Baichuan, ChatGLM, Wenxin, Tongyi, Zhipu, Xinghuo, etc.
4. Automated Fine-tuning
A lightweight framework for automated fine-tuning built around large language models, Text2SQL datasets, and methods like LoRA/QLoRA/P-tuning. Makes Text2SQL fine-tuning as convenient as a production line.
5. Data-Driven Multi-Agents & Plugins
Supports executing tasks through custom plugins and natively supports the Auto-GPT plugin model. The agents follow the Agent Protocol standard.
6. Privacy and Security
Ensures data privacy and security through techniques such as private deployment of large models and proxy de-identification.
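
As an illustration of proxy de-identification (item 6 above), the sketch below masks sensitive values before text is sent to a remote model and restores them in the reply. It is a simplified example and not DB-GPT's implementation. The regular expressions, the placeholder format, and the `remote_model` stub are assumptions chosen for brevity.

```python
import re

# Illustrative patterns only; real de-identification needs broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def deidentify(text):
    """Replace e-mail addresses and phone numbers with placeholder tokens."""
    mapping = {}

    def _swap(kind):
        def repl(match):
            token = f"<{kind}_{len(mapping)}>"
            mapping[token] = match.group(0)
            return token
        return repl

    masked = EMAIL.sub(_swap("EMAIL"), text)
    masked = PHONE.sub(_swap("PHONE"), masked)
    return masked, mapping


def reidentify(text, mapping):
    """Put the original values back in place of the placeholder tokens."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def remote_model(prompt):
    """Hypothetical stand-in for a remote model API; it simply echoes."""
    return f"Received: {prompt}"


masked, mapping = deidentify("Contact alice@example.com or 555-123-4567.")
reply = reidentify(remote_model(masked), mapping)
print(masked)
print(reply)
```

The remote side only ever sees the placeholder tokens, while the mapping stays local to the proxy.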

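The multi-model management idea behind SMMF (item 3 above) can be pictured as a registry that exposes different models through one calling convention. The sketch below is a hypothetical illustration and not DB-GPT's API. The `ModelRegistry` class, its methods, and the two stub models are invented for this example.

```python
from typing import Callable, Dict, List


class ModelRegistry:
    """Hypothetical registry: maps a model name to a text-generation callable."""

    def __init__(self) -> None:
        self._models: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, generate: Callable[[str], str]) -> None:
        """Add a model under a unique name."""
        if name in self._models:
            raise ValueError(f"model already registered: {name}")
        self._models[name] = generate

    def names(self) -> List[str]:
        """List registered model names in alphabetical order."""
        return sorted(self._models)

    def generate(self, name: str, prompt: str) -> str:
        """Route the prompt to the named model."""
        if name not in self._models:
            raise KeyError(f"unknown model: {name}")
        return self._models[name](prompt)


registry = ModelRegistry()
# Stub models: one plays a locally hosted model, the other an API proxy.
registry.register("local-stub", lambda prompt: f"[local] {prompt}")
registry.register("proxy-stub", lambda prompt: f"[proxy] {prompt}")

print(registry.names())
print(registry.generate("proxy-stub", "hello"))
```

Callers depend only on the registry interface, so swapping a local model for a proxied one does not change application code.
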
Getting Started

Concepts and terminology


These modules are the core abstractions through which we interact smoothly with data and the environment. They are central to DB-GPT, which also provides standard, extensible interfaces for them.
The docs for each module contain quickstart examples, how-to guides, reference docs, and conceptual guides.
The modules are as follows:
  • LLMs: Management and integration of multiple models.

  • Prompts: Prompt management, optimization, and serialization for multiple databases.

  • Plugins: Plugin management and scheduling.

  • Knowledge: Knowledge management, embedding, and search.

  • Connections: Connections to multiple databases, with connection management and interaction.

  • Vector: Support for multiple vector databases.
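
To illustrate what the Knowledge and Vector modules are responsible for (embedding text and searching it by similarity), here is a toy in-memory sketch. It does not use DB-GPT's interfaces. The `embed` function is a bag-of-words stand-in for a real embedding model, and `InMemoryVectorStore` is a hypothetical class standing in for a vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self._items = []

    def add(self, text):
        """Embed the text and keep it alongside its vector."""
        self._items.append((text, embed(text)))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = InMemoryVectorStore()
store.add("Orders are stored in the orders table with city and amount columns.")
store.add("The vacation policy allows twenty days of leave per year.")
top = store.search("orders table columns", k=1)
print(top)
```

The retrieved text would then be passed to a model as context, which is the retrieval step of RAG.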


Additional resources we think may be useful as you develop your application:
  • Discord: if you have a problem or an idea, you can discuss it on Discord.