KBQA

DB-GPT supports a knowledge question-answering module, which aims to create an intelligent expert in the field of databases and provide professional knowledge-based answers to database practitioners.

KBQA abilities

  • Knowledge Space

  • Multi-source knowledge embedding

  • Embedding argument adjustment

  • Chat with knowledge

  • Multiple vector databases

Note

If your DB type is SQLite, you do not need to build the KBQA service database schema.

If your DB type is MySQL or another DB type, you need to build the KBQA service database schema.

MySQL

$ mysql -h127.0.0.1 -uroot -paa12345678 < ./assets/schema/knowledge_management.sql

Alternatively, execute the script DBGPT/assets/schema/knowledge_management.sql.
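The note above can be read as a small decision rule. The following is a minimal Python sketch of it, not part of DB-GPT: the helper names `needs_schema_build`, `mysql_schema_argv`, and `build_mysql_schema` are hypothetical, and the host, user, and password are the sample values from the command above.

```python
import subprocess
from typing import List

# Path of the schema script, as given in the note above.
SCHEMA_SQL = "./assets/schema/knowledge_management.sql"


def needs_schema_build(db_type: str) -> bool:
    """Return False for SQLite, True for MySQL or any other DB type."""
    return db_type.strip().lower() != "sqlite"


def mysql_schema_argv(host: str = "127.0.0.1", user: str = "root",
                      password: str = "aa12345678") -> List[str]:
    """Build the mysql client argv (sample credentials, replace with yours)."""
    return ["mysql", f"-h{host}", f"-u{user}", f"-p{password}"]


def build_mysql_schema(sql_path: str = SCHEMA_SQL) -> None:
    """Run the mysql client with the SQL script on stdin.

    Passing the open file as stdin is the equivalent of `< file` in a shell.
    """
    with open(sql_path, "rb") as sql_file:
        subprocess.run(mysql_schema_argv(), stdin=sql_file, check=True)
```

Calling `build_mysql_schema()` only makes sense when `needs_schema_build(...)` returns True and the database is MySQL; other database types would need their own client.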

Steps to KBQA in DB-GPT

1. Create Knowledge Space

If you are using Knowledge Space for the first time, you need to create a Knowledge Space and set its name, owner, and description.

2. Create Knowledge Document

DB-GPT now supports multiple knowledge sources, including Text, WebUrl, and Document (PDF, Markdown, Word, PPT, HTML, and CSV). After a document is uploaded successfully, the backend automatically reads it, splits it into chunks, and imports the chunks into the vector database. Alternatively, you can synchronize the document manually. You can also click on details to view the resulting document chunks.
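To illustrate what splitting a document into chunks means, here is a minimal Python sketch of fixed-size chunking with overlap. It is an assumption-laden illustration, not DB-GPT's actual text splitter: the function `split_into_chunks` is hypothetical, sizes are counted in characters, and the parameter names are borrowed from the chunk_size and chunk_overlap arguments described elsewhere in this document.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_into_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk would then be embedded and stored in the vector database; a larger overlap keeps more shared context between neighbouring chunks at the cost of more stored vectors.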

2.1 Choose Knowledge Type

2.2 Upload Document

3. Chat With Knowledge

4. Adjust Space Arguments

Each knowledge space supports argument customization, including the relevant arguments for vector retrieval and the arguments for knowledge question-answering prompts.

4.1 Embedding

Tip

Embedding arguments

  • topk: the number of top-scoring vectors to return, ranked by similarity score.

  • recall_score: the threshold score for retrieving similar vectors.

  • recall_type: the recall type.

  • model: the model used to create vector representations of text or other data.

  • chunk_size: the size of the data chunks used in processing.

  • chunk_overlap: the amount of overlap between adjacent data chunks.
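The interaction between topk and recall_score can be sketched as a filter followed by a ranking. The Python below is a minimal illustration under stated assumptions: cosine similarity is chosen here for the example, recall_score is treated as a lower bound on similarity, and the function names are hypothetical. The metric and threshold direction used by a real vector store may differ.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: Sequence[float],
             chunks: List[Tuple[str, Sequence[float]]],
             topk: int,
             recall_score: float) -> List[Tuple[float, str]]:
    """Keep chunks scoring at least recall_score, then return the best topk."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= recall_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:topk]


# Toy two-dimensional "embeddings" for three chunks.
store = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
hits = retrieve([1.0, 0.0], store, topk=2, recall_score=0.5)
print([text for _, text in hits])  # ['a', 'c']
```

Raising recall_score trades recall for precision, while topk caps how many chunks are passed on to the prompt.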

4.2 Prompt

Tip

Prompt arguments

  • scene: a contextual parameter that defines the setting in which the prompt is used.

  • template: a predefined structure for the prompt, which helps keep generated responses consistent with the desired style or tone.

  • max_token: the maximum number of tokens allowed in a prompt.
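As a rough illustration of how a template and max_token could work together, the Python sketch below fills a template with retrieved context and a question, trimming the context so the prompt stays within the limit. Everything here is assumed for the example: the `{context}` and `{question}` placeholders, the function names, and the use of whitespace-separated words as a stand-in for model tokens.

```python
def count_tokens(text: str) -> int:
    """Approximate token count: whitespace-separated words (an assumption)."""
    return len(text.split())


def build_prompt(template: str, context: str, question: str, max_token: int) -> str:
    """Fill the template, trimming the context to respect max_token."""
    # Tokens used by everything except the context.
    fixed = count_tokens(template.format(context="", question=question))
    budget = max(max_token - fixed, 0)
    # Keep only as many context words as the remaining budget allows.
    trimmed = " ".join(context.split()[:budget])
    return template.format(context=trimmed, question=question)


TEMPLATE = "Context: {context}\nQuestion: {question}"
prompt = build_prompt(TEMPLATE, "alpha beta gamma delta epsilon",
                      "What is an index?", max_token=9)
print(prompt)
```

Trimming the context rather than the end of the prompt keeps the question intact, which is one reasonable design choice among several.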

5. Change Vector Database

Vector store settings

Chroma

  • VECTOR_STORE_TYPE=Chroma

Milvus

  • VECTOR_STORE_TYPE=Milvus

  • MILVUS_URL=127.0.0.1

  • MILVUS_PORT=19530

  • MILVUS_USERNAME

  • MILVUS_PASSWORD

  • MILVUS_SECURE=

Weaviate

  • WEAVIATE_URL=https://kt-region-m8hcy0wc.weaviate.network
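One way to picture how these settings select a vector store is a small loader that reads them from environment variables. This Python sketch is an assumption, not DB-GPT's configuration code: the function `vector_store_settings` is hypothetical, the fallback to Chroma mirrors the command-line default stated later in this document, and MILVUS_SECURE is left out because its value format is not given here.

```python
import os
from typing import Dict, Mapping, Optional


def vector_store_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Collect the settings relevant to the selected vector store."""
    env = os.environ if env is None else env
    store_type = env.get("VECTOR_STORE_TYPE", "Chroma")
    settings: Dict[str, object] = {"type": store_type}
    if store_type.lower() == "milvus":
        settings["url"] = env.get("MILVUS_URL", "127.0.0.1")
        settings["port"] = int(env.get("MILVUS_PORT", "19530"))
        settings["username"] = env.get("MILVUS_USERNAME")
        settings["password"] = env.get("MILVUS_PASSWORD")
    elif store_type.lower() == "weaviate":
        settings["url"] = env.get("WEAVIATE_URL")
    return settings


print(vector_store_settings({"VECTOR_STORE_TYPE": "Milvus", "MILVUS_PORT": "19530"}))
```

Passing a plain dictionary instead of the real environment makes the function easy to try out without changing any settings.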

KBQA command line

Load your local documents to DB-GPT

dbgpt knowledge load --space_name my_kbqa_space --local_doc_path ./pilot/datasets --vector_store_type Chroma

  • --space_name: Your knowledge space name, default: default

  • --local_doc_path: Your document directory or document file path, default: ./pilot/datasets

  • --vector_store_type: Vector store type, default: Chroma

To view the help for dbgpt knowledge load, run:

dbgpt knowledge load --help

The available parameters are:

Usage: dbgpt knowledge load [OPTIONS]

  Load your local knowledge to DB-GPT

Options:
  --space_name TEXT         Your knowledge space name  [default: default]
  --vector_store_type TEXT  Vector store type.  [default: Chroma]
  --local_doc_path TEXT     Your document directory or document file path.
                            [default: ./pilot/datasets]
  --skip_wrong_doc          Skip wrong document.
  --overwrite               Overwrite existing document(they has same name).
  --max_workers INTEGER     The maximum number of threads that can be used to
                            upload document.
  --pre_separator TEXT      Preseparator, this separator is used for pre-
                            splitting before the document is actually split by
                            the text splitter. Preseparator are not included
                            in the vectorized text.
  --separator TEXT          This is the document separator. Currently, only
                            one separator is supported.
  --chunk_size INTEGER      Maximum size of chunks to split.
  --chunk_overlap INTEGER   Overlap in characters between chunks.
  --help                    Show this message and exit.
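When the load command is driven from a script, it can help to build the argument list programmatically. The Python sketch below does that using only options that appear in the help output above; the wrapper functions `build_load_command` and `load_documents` are hypothetical names introduced for this example.

```python
import subprocess
from typing import List, Optional


def build_load_command(space_name: str = "default",
                       local_doc_path: str = "./pilot/datasets",
                       vector_store_type: str = "Chroma",
                       chunk_size: Optional[int] = None,
                       chunk_overlap: Optional[int] = None,
                       skip_wrong_doc: bool = False,
                       overwrite: bool = False) -> List[str]:
    """Build the argv for `dbgpt knowledge load` from keyword options."""
    argv = ["dbgpt", "knowledge", "load",
            "--space_name", space_name,
            "--local_doc_path", local_doc_path,
            "--vector_store_type", vector_store_type]
    # Optional values are only passed when explicitly set.
    if chunk_size is not None:
        argv += ["--chunk_size", str(chunk_size)]
    if chunk_overlap is not None:
        argv += ["--chunk_overlap", str(chunk_overlap)]
    if skip_wrong_doc:
        argv.append("--skip_wrong_doc")
    if overwrite:
        argv.append("--overwrite")
    return argv


def load_documents(**options) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_load_command(**options), check=True)


print(" ".join(build_load_command(space_name="my_kbqa_space", chunk_size=500)))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.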

List knowledge space

dbgpt knowledge list

Output should look something like the following:

+------------------------------------------------------------------+
|                       All knowledge spaces                       |
+----------+-------------+-------------+-------------+-------------+
| Space ID |  Space Name | Vector Type |    Owner    | Description |
+----------+-------------+-------------+-------------+-------------+
|    6     |      n1     |    Chroma   |    DB-GPT   |  DB-GPT cli |
|    5     |  default_2  |    Chroma   |    DB-GPT   |  DB-GPT cli |
|    4     |  default_1  |    Chroma   |    DB-GPT   |  DB-GPT cli |
|    3     |   default   |    Chroma   |    DB-GPT   |  DB-GPT cli |
+----------+-------------+-------------+-------------+-------------+

List documents in knowledge space

dbgpt knowledge list --space_name default

Output should look something like the following:

+------------------------------------------------------------------------+
|                       Space default description                        |
+------------+-----------------+--------------+--------------+-----------+
| Space Name | Total Documents | Current Page | Current Size | Page Size |
+------------+-----------------+--------------+--------------+-----------+
|  default   |        1        |      1       |      1       |     20    |
+------------+-----------------+--------------+--------------+-----------+

+-----------------------------------------------------------------------------------------------------------------------------------+
|                                                     Documents of space default                                                    |
+------------+-------------+---------------+----------+--------+----------------------------+----------+----------------------------+
| Space Name | Document ID | Document Name |   Type   | Chunks |         Last Sync          |  Status  |           Result           |
+------------+-------------+---------------+----------+--------+----------------------------+----------+----------------------------+
|  default   |      61     | Knowledge.pdf | DOCUMENT |  745   | 2023-09-28T03:25:39.065762 | FINISHED | document embedding success |
+------------+-------------+---------------+----------+--------+----------------------------+----------+----------------------------+

List chunks of document in space default

dbgpt knowledge list --space_name default --doc_id 61 --page_size 5

Output should look something like the following:

+-----------------------------------------------------------------------------------+
|                         Document 61 in default description                        |
+------------+-------------+--------------+--------------+--------------+-----------+
| Space Name | Document ID | Total Chunks | Current Page | Current Size | Page Size |
+------------+-------------+--------------+--------------+--------------+-----------+
|  default   |      61     |      745     |      1       |      5       |     5     |
+------------+-------------+--------------+--------------+--------------+-----------+

+-----------------------------------------------------------------------------------------------------------------------+
|                                       chunks of document id 61 in space default                                       |
+------------+-------------+---------------+----------+-----------------------------------------------------------------+
| Space Name | Document ID | Document Name | Content  |                            Meta Data                            |
+------------+-------------+---------------+----------+-----------------------------------------------------------------+
|  default   |      61     | Knowledge.pdf | [Hidden] | {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 10} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 9} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 9} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 8} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 8} |
+------------+-------------+---------------+----------+-----------------------------------------------------------------+

More list usage

dbgpt knowledge list --help
Usage: dbgpt knowledge list [OPTIONS]

  List knowledge space

Options:
  --space_name TEXT               Your knowledge space name. If None, list all
                                  spaces
  --doc_id INTEGER                Your document id in knowledge space. If Not
                                  None, list all chunks in current document
  --page INTEGER                  The page for every query  [default: 1]
  --page_size INTEGER             The page size for every query  [default: 20]
  --show_content                  Query the document content of chunks
  --output [text|html|csv|latex|json]
                                  The output format
  --help                          Show this message and exit.
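The --page and --page_size options determine how many requests are needed to walk through all chunks of a document. For the document shown earlier (745 chunks, page size 5) that is 149 pages. The Python sketch below works this out and builds one command per page; the helper names are hypothetical, and only options listed in the help output above are used.

```python
from typing import List


def page_count(total_items: int, page_size: int) -> int:
    """Number of pages needed to show total_items (ceiling division)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (total_items + page_size - 1) // page_size


def list_chunk_commands(space_name: str, doc_id: int,
                        total_chunks: int, page_size: int) -> List[List[str]]:
    """Build one `dbgpt knowledge list` argv per page of chunks."""
    return [
        ["dbgpt", "knowledge", "list",
         "--space_name", space_name,
         "--doc_id", str(doc_id),
         "--page", str(page),
         "--page_size", str(page_size)]
        for page in range(1, page_count(total_chunks, page_size) + 1)
    ]


print(page_count(745, 5))   # 149
print(page_count(745, 20))  # 38
```

With the default page size of 20, the same document needs 38 pages, the last of which holds only 5 chunks.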

Delete your knowledge space or document in space

Delete your knowledge space

dbgpt knowledge delete --space_name default

Delete your document in space

dbgpt knowledge delete --space_name default --doc_name Knowledge.pdf

More delete usage

dbgpt knowledge delete --help
Usage: dbgpt knowledge delete [OPTIONS]

  Delete your knowledge space or document in space

Options:
  --space_name TEXT  Your knowledge space name  [default: default]
  --doc_name TEXT    The document name you want to delete. If doc_name is
                     None, this command will delete the whole space.
  -y                 Confirm your choice
  --help             Show this message and exit.
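Because omitting --doc_name deletes the whole space, a script that calls the delete command may want to make the target explicit before running it. The following Python sketch shows one way to do that; the helper names `build_delete_command` and `describe_target` are hypothetical, and only the options from the help output above are used.

```python
from typing import List, Optional


def build_delete_command(space_name: str = "default",
                         doc_name: Optional[str] = None,
                         assume_yes: bool = False) -> List[str]:
    """Build the argv for `dbgpt knowledge delete`."""
    argv = ["dbgpt", "knowledge", "delete", "--space_name", space_name]
    if doc_name is not None:
        argv += ["--doc_name", doc_name]
    if assume_yes:
        argv.append("-y")  # confirm the choice without prompting
    return argv


def describe_target(space_name: str, doc_name: Optional[str] = None) -> str:
    """Say in plain words what the command would delete."""
    if doc_name is None:
        return f"the whole knowledge space '{space_name}'"
    return f"document '{doc_name}' in knowledge space '{space_name}'"


print(describe_target("default"))
print(build_delete_command("default", doc_name="Knowledge.pdf"))
```

Printing the description before running the command gives a last chance to notice a missing document name.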

More knowledge usage

dbgpt knowledge --help
Usage: dbgpt knowledge [OPTIONS] COMMAND [ARGS]...

  Knowledge command line tool

Options:
  --address TEXT  Address of the Api server(If not set, try to read from
                  environment variable: API_ADDRESS).  [default:
                  http://127.0.0.1:5000]
  --help          Show this message and exit.

Commands:
  delete  Delete your knowledge space or document in space
  list    List knowledge space
  load    Load your local documents to DB-GPT
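The help text for --address describes a lookup order: the option value, then the API_ADDRESS environment variable, then the default. The Python sketch below mirrors that order as read from the help text; the function `resolve_api_address` is a hypothetical illustration and not the CLI's own code.

```python
import os
from typing import Mapping, Optional

# Default taken from the help output above.
DEFAULT_ADDRESS = "http://127.0.0.1:5000"


def resolve_api_address(cli_value: Optional[str] = None,
                        env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the API server address: option first, then env var, then default."""
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    return env.get("API_ADDRESS") or DEFAULT_ADDRESS


print(resolve_api_address(env={}))  # http://127.0.0.1:5000
```

Setting API_ADDRESS once in the shell environment saves repeating --address on every command.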