mirror of
https://github.com/TabbyML/tabby
synced 2024-11-22 00:08:06 +00:00
docs(models-http-api): add llamafile example
This commit is contained in:
parent
bf031513d5
commit
502bb410f6
35
website/docs/references/models-http-api/llamafile.md
vendored
Normal file
@@ -0,0 +1,35 @@
# llamafile
[llamafile](https://github.com/Mozilla-Ocho/llamafile) is a Mozilla Builders project that allows you to distribute and run LLMs with a single file.
llamafile provides OpenAI-API-compatible chat-completions and embeddings endpoints, so Tabby can use the OpenAI kinds (`openai/chat` and `openai/embedding`) for chat and embeddings. The completion endpoint, however, differs from the standard implementation in certain ways, and support for it is still in progress.
llamafile uses port `8080` by default, which is also the port used by Tabby. It is therefore recommended to run llamafile with the `--port` option to serve on a different port, such as `8081`.
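As a sketch, the two llamafile instances used in the examples on this page could be started like this. The model file names are hypothetical placeholders; only `--port` and `--embedding` are flags referenced by this guide, and available options may vary by llamafile version:

```shell
# Hypothetical file name; serve the chat model on 8081 so that
# Tabby can keep its own default port 8080.
./your-chat-model.llamafile --port 8081

# A second instance for embeddings, with the embedding endpoint
# enabled via --embedding, on yet another port.
./your-embedding-model.llamafile --embedding --port 8082
```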
Below is an example configuration for chat:
```toml title="~/.tabby/config.toml"
# Chat model
[model.chat.http]
kind = "openai/chat"
model_name = "your_model"
api_endpoint = "http://localhost:8081/v1"
api_key = ""
```
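Under this configuration, Tabby appends the usual OpenAI routes to `api_endpoint`. A minimal Python sketch of the request shape that `http://localhost:8081/v1/chat/completions` accepts (model name hypothetical, standard library only; the request is built but not sent):

```python
import json
from urllib.request import Request

# Payload in the OpenAI chat-completions format served by llamafile.
# "your_model" is a placeholder, matching model_name in config.toml.
payload = {
    "model": "your_model",
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a hello-world program in Rust."},
    ],
}

# api_endpoint from the Tabby config, plus the chat-completions route.
req = Request(
    "http://localhost:8081/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```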
For embeddings, the standard llamafile server no longer serves the embedding endpoint by default, so you have to run llamafile with the `--embedding` option (the example below assumes it serves on port `8082`) and set the Tabby config to:
```toml title="~/.tabby/config.toml"
# Embedding model
[model.embedding.http]
kind = "openai/embedding"
model_name = "your_model"
api_endpoint = "http://localhost:8082/v1"
api_key = ""
```
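The embedding side works the same way: Tabby posts OpenAI-style embeddings requests to `api_endpoint` plus the embeddings route. A sketch of that request shape (model name hypothetical, standard library only; the request is built but not sent):

```python
import json
from urllib.request import Request

# OpenAI-style embeddings payload for llamafile's /v1/embeddings route.
# Requires the llamafile instance to have been started with --embedding.
payload = {"model": "your_model", "input": "fn main() {}"}

# api_endpoint from the embedding section of config.toml, plus the route.
req = Request(
    "http://localhost:8082/v1/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```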