Mirror of https://github.com/TabbyML/tabby (synced 2024-11-22 00:08:06 +00:00)
v0.13.1 (2024-07-10)
Fixes and Improvements
- Bump llama.cpp to version b3334, adding support for the Deepseek V2 series of models.
- Turn on flash attention for the Qwen2-1.5B model to fix a quantization error.
- Properly set the number of GPU layers to zero when the device is CPU.
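
The GPU-layer fix above can be sketched as follows. This is a minimal illustrative example, not Tabby's actual implementation: the `Device` enum and `gpu_layers_for` function are hypothetical names; the real code passes the resulting count through to llama.cpp's `n_gpu_layers` option.

```rust
// Hypothetical sketch of the fix: when the selected device is CPU,
// offloading layers to a GPU is meaningless, so the layer count handed
// to llama.cpp must be forced to zero rather than left at its default.

#[derive(Debug, PartialEq)]
enum Device {
    Cpu,
    Cuda,
    Metal,
}

/// Decide how many layers to offload to the GPU for a given device.
fn gpu_layers_for(device: &Device, requested: u32) -> u32 {
    match device {
        // On CPU, offload zero layers regardless of what was requested.
        Device::Cpu => 0,
        // On GPU backends, honor the requested layer count.
        Device::Cuda | Device::Metal => requested,
    }
}

fn main() {
    // A CPU device always yields zero offloaded layers.
    assert_eq!(gpu_layers_for(&Device::Cpu, 9999), 0);
    // GPU devices keep the requested count.
    assert_eq!(gpu_layers_for(&Device::Cuda, 32), 32);
    assert_eq!(gpu_layers_for(&Device::Metal, 1), 1);
    println!("gpu layer selection ok");
}
```

Centralizing the decision in one function means every caller gets the CPU special case for free, instead of each call site remembering to zero the count.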