Cluster Deployment
Model cluster deployment
Installing the Command-Line Tool
All operations below are performed with the `dbgpt` command. To use the `dbgpt` command, install the DB-GPT project with `pip install -e ".[default]"`. Alternatively, you can run `python pilot/scripts/cli_scripts.py` as a substitute for the `dbgpt` command.
Launch Model Controller
dbgpt start controller
By default, the Model Controller starts on port 8000.
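Once the controller is up, you can verify that it is reachable by pointing the `dbgpt model` client at it. The instance list will be empty until workers register; the address below assumes the default port:

```shell
# Sketch: query the controller's registry; assumes it is listening
# on the default address http://127.0.0.1:8000
dbgpt model list --address http://127.0.0.1:8000
```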
Launch LLM Model Worker
If you are starting `chatglm2-6b`:
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--port 8001 \
--controller_addr http://127.0.0.1:8000
If you are starting `vicuna-13b-v1.5`:
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000
Note: Be sure to use your own model name and model path.
Launch Embedding Model Worker
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
Note: Be sure to use your own model name and model path.
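If you prefer workers to keep running in the background, the worker command also accepts a `--daemon` flag (listed in `dbgpt start worker --help`). A sketch, reusing the embedding example above:

```shell
# Sketch: run the embedding worker in the background with --daemon;
# the model name and path are the same examples used above
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000 \
--daemon
```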
Check your model:
dbgpt model list
You will see the following output:
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name | Model Type | Host | Port | Healthy | Enabled | Prompt Template | Last Heartbeat |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| chatglm2-6b | llm | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.287654 |
| WorkerManager | service | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.286668 |
| WorkerManager | service | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.845617 |
| WorkerManager | service | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.598439 |
| text2vec | text2vec | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.597775 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
Connect to the model service in the webserver (dbgpt_server)
First, modify the `.env` file to change the model name and the Model Controller connection address.
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
Start the webserver
dbgpt start webserver --light
The `--light` flag indicates that the embedded model service should not be started.
Alternatively, you can prefix the command with `LLM_MODEL=chatglm2-6b` to start with a different model:
LLM_MODEL=chatglm2-6b dbgpt start webserver --light
More Command-Line Usages
You can view more command-line usages through the help command.
View the `dbgpt` help:
dbgpt --help
You will see the basic command parameters and usage:
Usage: dbgpt [OPTIONS] COMMAND [ARGS]...
Options:
--log-level TEXT Log level
--version Show the version and exit.
--help Show this message and exit.
Commands:
install Install dependencies, plugins, etc.
knowledge Knowledge command line tool
model Clients that manage model serving
start Start specific server.
stop Stop specific server.
View the `dbgpt start` help:
dbgpt start --help
Here you can see the related commands and usage for start:
Usage: dbgpt start [OPTIONS] COMMAND [ARGS]...
Start specific server.
Options:
--help Show this message and exit.
Commands:
apiserver Start apiserver(TODO)
controller Start model controller
webserver Start webserver(dbgpt_server.py)
worker Start model worker
View the `dbgpt start worker` help:
dbgpt start worker --help
Here you can see the parameters to start Model Worker:
Usage: dbgpt start worker [OPTIONS]
Start model worker
Options:
--model_name TEXT Model name [required]
--model_path TEXT Model path [required]
--worker_type TEXT Worker type
--worker_class TEXT Model worker class,
pilot.model.cluster.DefaultModelWorker
--host TEXT Model worker deploy host [default: 0.0.0.0]
--port INTEGER Model worker deploy port [default: 8001]
--daemon Run Model Worker in background
--limit_model_concurrency INTEGER
Model concurrency limit [default: 5]
--standalone Standalone mode. If True, run an embedded
ModelController
--register Register current worker to model controller
[default: True]
--worker_register_host TEXT The ip address of current worker to register
to ModelController. If None, the address is
automatically determined
--controller_addr TEXT The Model controller address to register
--send_heartbeat Send heartbeat to model controller
[default: True]
--heartbeat_interval INTEGER The interval for sending heartbeats
(seconds) [default: 20]
--device TEXT Device to run model. If None, the device is
automatically determined
--model_type TEXT Model type, huggingface, llama.cpp and proxy
[default: huggingface]
--prompt_template TEXT Prompt template. If None, the prompt
template is automatically determined from
model path, supported template: zero_shot,vi
cuna_v1.1,llama-2,alpaca,baichuan-chat
--max_context_size INTEGER Maximum context size [default: 4096]
--num_gpus INTEGER The number of gpus you expect to use, if it
is empty, use all of them as much as
possible
--max_gpu_memory TEXT The maximum memory limit of each GPU, only
valid in multi-GPU configuration
--cpu_offloading CPU offloading
--load_8bit 8-bit quantization
--load_4bit 4-bit quantization
--quant_type TEXT Quantization datatypes, `fp4` (four bit
float) and `nf4` (normal four bit float),
only valid when load_4bit=True [default:
nf4]
--use_double_quant Nested quantization, only valid when
load_4bit=True [default: True]
--compute_dtype TEXT Model compute type
--trust_remote_code Trust remote code [default: True]
--verbose Show verbose output.
--help Show this message and exit.
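Combining some of the options above, a memory-constrained deployment could, as a sketch, start a worker with 4-bit quantization (the model name and path here are illustrative, not recommendations):

```shell
# Sketch: 4-bit (nf4) quantized worker to reduce GPU memory usage;
# adjust --model_name and --model_path to your own model
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000 \
--load_4bit \
--quant_type nf4
```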
View the `dbgpt model` help:
dbgpt model --help
The `dbgpt model` command connects to the Model Controller via the controller address and can then manage the remote models:
Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...
Clients that manage model serving
Options:
--address TEXT Address of the Model Controller to connect to. Just support
light deploy model, If the environment variable
CONTROLLER_ADDRESS is configured, read from the environment
variable
--help Show this message and exit.
Commands:
chat Interact with your bot from the command line
list List model instances
restart Restart model instances
start Start model instances
stop Stop model instances
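As the help text above notes, the controller address can also be supplied through the `CONTROLLER_ADDRESS` environment variable instead of `--address`:

```shell
# Read the controller address from the environment (per the help text)
export CONTROLLER_ADDRESS=http://127.0.0.1:8000
dbgpt model list
```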