This commit updates Dockerfile.tpl to build from the huggingface/transformers-pytorch-gpu image instead of continuumio/anaconda3, allowing the Llama app to use GPU resources for faster AI processing. The explicit installation of the transformers and accelerate libraries is removed, since both already ship with the huggingface/transformers-pytorch-gpu image.
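A minimal sketch of what the updated template looks like, assuming a plain Python entrypoint (the real Dockerfile.tpl defines its own copy steps and command):

    FROM huggingface/transformers-pytorch-gpu:latest

    # transformers and accelerate already ship with this base image,
    # so the previous pip install step is dropped.
    WORKDIR /app
    COPY . /app

    # Hypothetical entrypoint for illustration only.
    CMD ["python", "app.py"]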
This commit adds GPU support for the Llama app in docker-compose.ai.yml: a new deploy section reserves GPU devices for the service, specifying the driver, count, and capabilities, so the container can actually reach the host GPU at runtime.
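The relevant compose fragment looks roughly like the following; the service name here is an assumption, and count can also be set to "all":

    services:
      llama:
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: 1
                  capabilities: [gpu]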
This commit updates the Llama app to load the model from a local path instead of a model ID. The model path is set to "/app/Models/Meta-Llama-3-8B-Instruct". Referencing the local model directory directly means the app no longer has to resolve and download the model from an external source at startup, which improves the reliability and startup performance of the app.
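For illustration, a sketch of how the path would typically be consumed if the app loads the model through Hugging Face transformers (the actual loading code is not part of this diff, and the Hub ID shown in the comment is only an example):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Local model directory inside the container, instead of a Hub model ID
    # such as "meta-llama/Meta-Llama-3-8B-Instruct".
    MODEL_PATH = "/app/Models/Meta-Llama-3-8B-Instruct"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map="auto")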