mirror of https://github.com/OneUptime/oneuptime synced 2024-11-22 23:30:10 +00:00

Llama

Prepare

Download the models from Meta.
Once the models are downloaded, place them in the Llama/Models folder. Make sure you also place tokenizer.model and tokenizer_checklist.chk in the same folder.
Edit the Dockerfile to set the MODEL_NAME variable to the model name.
Build the Docker image:

docker build -t llama . -f ./Llama/Dockerfile
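Before building, it can help to confirm that the tokenizer files named above are actually in the Models folder. The following is a minimal sketch, not part of the repository: the function name `missing_model_files` and the constant `REQUIRED_FILES` are hypothetical, and the check covers only the two tokenizer files the steps mention, not the model weights themselves.

```python
from pathlib import Path
from typing import Iterable, List

# Tokenizer files the preparation steps ask for alongside the model weights.
REQUIRED_FILES = ("tokenizer.model", "tokenizer_checklist.chk")


def missing_model_files(models_dir: str,
                        required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required file names that are not present in models_dir."""
    root = Path(models_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Path is relative to the repository root, as in the docker commands.
    missing = missing_model_files("./Llama/Models")
    if missing:
        print("Missing from Llama/Models:", ", ".join(missing))
    else:
        print("Tokenizer files found.")
```

Run it from the repository root so that the relative path matches the one used in the docker commands.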

Run with GPU support:

docker run --gpus all -p 8547:8547 -it -v ./Llama/Models:/app/Models llama

Run without GPU support:

docker run -p 8547:8547 -it -v ./Llama/Models:/app/Models llama
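The two run commands differ only in the `--gpus all` flag. If you launch the container from a script, that choice can be a single switch. The sketch below is illustrative only: `docker_run_args` is a hypothetical helper, and its defaults simply mirror the port, volume path, and image name used in the commands above.

```python
import shlex
from typing import List


def docker_run_args(use_gpu: bool,
                    models_dir: str = "./Llama/Models",
                    port: int = 8547,
                    image: str = "llama") -> List[str]:
    """Build the docker run argument list, with or without GPU access."""
    args = ["docker", "run"]
    if use_gpu:
        # Expose all host GPUs to the container.
        args += ["--gpus", "all"]
    args += [
        "-p", f"{port}:{port}",            # publish the server port
        "-it",
        "-v", f"{models_dir}:/app/Models",  # mount the downloaded models
        image,
    ]
    return args


if __name__ == "__main__":
    # Print both variants as shell-ready command lines.
    print(shlex.join(docker_run_args(use_gpu=True)))
    print(shlex.join(docker_run_args(use_gpu=False)))
```

The returned list can be passed directly to `subprocess.run` if you want to start the container from Python rather than print the command.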

Run the server directly, without Docker:

uvicorn app:app --host 0.0.0.0 --port 8547
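Whichever way the server is started, a quick way to confirm it is listening is to open a TCP connection to port 8547. This is a minimal sketch using only the standard library. `is_port_open` is a hypothetical helper, and it checks only that something accepts connections on the port. It makes no assumption about the HTTP routes that app.py exposes.

```python
import socket


def is_port_open(host: str = "127.0.0.1",
                 port: int = 8547,
                 timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 8547.")
    else:
        print("Nothing is reachable on port 8547 yet.")
```

Model loading may take a while after the process starts, so a failed check right after launch does not necessarily mean the server is broken.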