## v0.13.1 (2024-07-10)
### Fixes and Improvements
* Bump llama.cpp to version b3334, adding support for the Deepseek V2 series of models.
* Enable flash attention for the Qwen2-1.5B model to fix a quantization error.
* Properly set the number of GPU layers to zero when the device is CPU.
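
The GPU-layer fix can be sketched as a simple guard on the device type. This is an illustrative sketch only; the enum and function names are hypothetical and do not reflect Tabby's actual internals.

```rust
// Hypothetical sketch of the GPU-layer fix; names are illustrative,
// not Tabby's actual API.
#[derive(PartialEq)]
enum Device {
    Cpu,
    Cuda,
}

/// Returns the number of model layers to offload to the GPU.
/// On CPU, no layers can be offloaded, so the count must be zero
/// regardless of what was requested.
fn num_gpu_layers(device: &Device, requested: u16) -> u16 {
    if *device == Device::Cpu {
        0
    } else {
        requested
    }
}

fn main() {
    assert_eq!(num_gpu_layers(&Device::Cpu, 9999), 0);
    assert_eq!(num_gpu_layers(&Device::Cuda, 32), 32);
}
```

Without this guard, passing a nonzero layer count to the llama.cpp backend on a CPU-only host can trigger GPU code paths that have no device to run on.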