docs(model-spec): update gguf model naming for multiple file models (#3356)

Wei Zhang 2024-11-02 06:39:59 +08:00 committed by GitHub
parent 8ffec60ee5
commit d12b6d47bc


# Tabby Model Specification
Tabby organizes the models within a directory.
This document provides an explanation of the necessary contents for supporting model serving.
A minimal Tabby model directory should include the following contents:
```
tabby.json
ggml/model-00001-of-00001.gguf
```
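As a sketch of what this layout implies, a small validation helper might look like the following. This is illustrative only, assuming the minimal layout above; `is_valid_model_dir` is a hypothetical function, not part of Tabby:

```python
from pathlib import Path

def is_valid_model_dir(model_dir: str) -> bool:
    """Check that a directory matches the minimal Tabby model layout.

    Illustrative sketch: requires tabby.json plus at least one
    ggml/model-*-of-*.gguf shard.
    """
    root = Path(model_dir)
    if not (root / "tabby.json").is_file():
        return False
    # At least one GGUF shard must be present under ggml/.
    shards = list((root / "ggml").glob("model-*-of-*.gguf"))
    return len(shards) > 0
```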
### tabby.json
The **chat_template** field is optional.
### ggml/
This directory contains binary files used by the [llama.cpp](https://github.com/ggerganov/llama.cpp) inference engine.
Tabby utilizes GGML for inference on `cpu`, `cuda` and `metal` devices.
Tabby saves GGUF model files in the format `model-{index}-of-{count}.gguf`, following the llama.cpp naming convention.
Note that the index is 1-based; by default, Tabby names a single-file model `model-00001-of-00001.gguf`.
For more details about GGUF models, please refer to the instructions in llama.cpp.
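The naming convention above can be sketched in a few lines. `shard_names` is a hypothetical helper for illustration, not part of Tabby or llama.cpp; the 5-digit zero padding is inferred from the `model-00001-of-00001.gguf` example:

```python
def shard_names(count: int) -> list[str]:
    """Return GGUF shard file names following the llama.cpp
    model-{index}-of-{count}.gguf convention (1-based index,
    zero-padded to five digits)."""
    return [f"model-{i:05d}-of-{count:05d}.gguf" for i in range(1, count + 1)]
```

For example, `shard_names(1)` yields `["model-00001-of-00001.gguf"]`, matching the single-file default name above.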