Mirror of https://github.com/TabbyML/tabby, synced 2024-11-21 07:50:13 +00:00
docs: document the 0.13.1 release
This commit is contained in:
parent
cfdf70fe36
commit
a14efb5ce8
.changes/v0.13.1.md (new file, +7 lines)
@@ -0,0 +1,7 @@
## v0.13.1 (2024-07-10)
### Fixed and Improvements
* Bump llama.cpp version to b3334, supporting Deepseek V2 series models.
* Turn on fast attention for Qwen2-1.5B model to fix the quantization error.
* Properly set number of GPU layers (to zero) when device is CPU.
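The last bullet describes a CPU fallback: when inference runs on the CPU, the number of model layers offloaded to the GPU must be forced to zero rather than left at the requested value. A minimal sketch of that guard, with illustrative names (this is not Tabby's actual API):

```rust
// Sketch of the fix in the last bullet: on a CPU device, the layer count
// passed to llama.cpp's GPU offload option must be zero.
// The enum and function names here are hypothetical.

enum Device {
    Cpu,
    Cuda,
    Metal,
}

/// Return the GPU layer count to use for a given device.
/// Before the fix, the requested count leaked through even on CPU.
fn gpu_layers_for(device: &Device, requested: u16) -> u16 {
    match device {
        Device::Cpu => 0,
        _ => requested,
    }
}

fn main() {
    assert_eq!(gpu_layers_for(&Device::Cpu, 32), 0);
    assert_eq!(gpu_layers_for(&Device::Cuda, 32), 32);
    assert_eq!(gpu_layers_for(&Device::Metal, 16), 16);
}
```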
@@ -5,6 +5,13 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html),
and is generated by [Changie](https://github.com/miniscruff/changie).
## v0.13.1 (2024-07-10)
### Fixed and Improvements
* Bump llama.cpp version to b3334, supporting Deepseek V2 series models.
* Turn on fast attention for Qwen2-1.5B model to fix the quantization error.
* Properly set number of GPU layers (to zero) when device is CPU.
## v0.13.0 (2024-06-28)
### Features