
Llama

Prepare

  • Download the models from Meta
  • Once the models are downloaded, place them in the Llama/Models folder. Make sure you also place tokenizer.model and tokenizer_checklist.chk in the same folder.
  • Edit the Dockerfile and set the MODEL_NAME variable to the name of the downloaded model.
  • Docker build
docker build -t llama . -f ./Llama/Dockerfile 
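After the steps above, the Llama/Models folder should contain the model weights alongside the tokenizer files. A sketch of the expected layout (the weight and config filenames shown are illustrative examples from Meta's download format; your model's files may be named differently):

```
Llama/Models/
├── consolidated.00.pth       # model weights (example filename)
├── params.json               # model config (example filename)
├── tokenizer.model
└── tokenizer_checklist.chk
```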

Run

For Linux

docker run --gpus all -p 8547:8547 -it -v ./Llama/Models:/app/Models llama 

For macOS

docker run -p 8547:8547 -it -v ./Llama/Models:/app/Models llama 
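Once the container is up, you can check that the server is responding on port 8547. This is a minimal sketch assuming app.py is a FastAPI app (suggested by the uvicorn command below), which serves interactive API docs at /docs by default; the actual routes the app defines may differ:

```shell
# Print the HTTP status code returned by the server; a 200 means it is up.
# The /docs path assumes FastAPI defaults and is an assumption, not confirmed by the repo.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8547/docs
```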

Run without a Docker container

uvicorn app:app --host 0.0.0.0 --port 8547
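Before running uvicorn directly, install the Python dependencies. A sketch assuming you are in the Llama folder with its requirements.txt next to app.py:

```shell
# Create an isolated virtual environment and install the app's dependencies
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
```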