<div align="center">

# 🐾 Tabby

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![Docker build status](https://img.shields.io/github/actions/workflow/status/TabbyML/tabby/docker.yml?label=docker%20image%20build)

![architecture](https://user-images.githubusercontent.com/388154/228543840-bff32fac-0802-4dd3-b0d9-2151647dfa6d.png)

</div>

> **Warning**
> Tabby is still in the alpha phase

An open-source / on-premises alternative to GitHub Copilot.

## Features
* Self-contained, with no need for a DBMS or cloud service.
* Web UI for visualizing and configuring models and MLOps.
* OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
* Consumer-level GPU support (FP16 weight loading with various optimizations).

## Get started
### Docker
The easiest way to get started is with the official Docker image:
```bash
# Pull and run the Tabby server, persisting models and data under ./data
docker run \
  -it --rm \
  -v ./data:/data \
  -v ./data/hf_cache:/root/.cache/huggingface \
  -p 5000:5000 \
  -p 8501:8501 \
  -p 8080:8080 \
  -e MODEL_NAME=TabbyML/J-350M \
  tabbyml/tabby
```
You can then query the server using the `/v1/completions` endpoint:
```bash
curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
"prompt": "def binarySearch(arr, left, right, x):\n mid = (left +"
}'
```
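If you have `jq` installed (an optional external tool, not part of Tabby), you can pretty-print the JSON response:

```bash
# The same kind of request as above, piped through jq for readable output
curl -s -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
    "prompt": "def binarySearch(arr, left, right, x):\n    mid = (left +"
}' | jq .
```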
To use the GPU backend (Triton) for faster inference, use `deployment/docker-compose.yml`:
```bash
docker-compose up
```
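If you are running the command from the repository root rather than from inside `deployment/`, you can point Compose at the file explicitly (this should be equivalent, since relative paths in a compose file resolve against the file's own directory):

```bash
# Bring up the GPU-backed stack using the compose file under deployment/
docker-compose -f deployment/docker-compose.yml up
```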
Note: To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). We also recommend NVIDIA drivers that support CUDA 11.8 or higher.
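To confirm that Docker can see your GPU, you can run a throwaway CUDA container (a generic check, not specific to Tabby; the image tag below is only an example):

```bash
# Prints your GPU(s) via nvidia-smi if the NVIDIA Container Toolkit is set up correctly
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```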
We also provide an interactive playground in the admin panel at [localhost:8501](http://localhost:8501):
![image](https://user-images.githubusercontent.com/388154/227792390-ec19e9b9-ebbb-4a94-99ca-8a142ffb5e46.png)
### API documentation
Tabby runs a FastAPI server at [localhost:5000](http://localhost:5000), which embeds OpenAPI documentation for the HTTP API.
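With FastAPI's defaults, the interactive docs are served at `/docs` and the raw schema at `/openapi.json`, so you can fetch the schema directly (assuming the default port mapping above):

```bash
# Download the OpenAPI schema for inspection or client code generation
curl http://localhost:5000/openapi.json -o tabby-openapi.json
```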
## Development
Go to the `development` directory and run:
```bash
make dev
```
or
```bash
make dev-python  # turn off the Triton backend (for developers without a CUDA environment)
```
## TODOs
* [ ] Fine-tuning models on private code repositories. [#23](https://github.com/TabbyML/tabby/issues/23)
* [ ] Production readiness (OpenTelemetry, Prometheus metrics).