# Llama
### Development Guide
#### Step 1: Downloading Model from Hugging Face
Make sure you have Git LFS installed before cloning the model.
```bash
git lfs install
```
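If you are unsure whether Git LFS is already set up, a quick check:

```bash
# Prints the installed Git LFS version if it is on PATH
git lfs version
```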
```bash
cd ./Llama/Models
# Here we are downloading the Meta-Llama-3-8B-Instruct model
git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
```
You will be asked for a username and password. Use your Hugging Face username as the username and a Hugging Face API token as the password.
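If you would rather enter the token only once, one option (standard Git behavior, not specific to this project) is to enable a credential helper before cloning:

```bash
# Cache HTTPS credentials so the token is only entered once.
# Note: 'store' saves the token in plain text in ~/.git-credentials.
git config --global credential.helper store
```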
#### Step 2: Install Docker
Install Docker and Docker Compose
```bash
sudo apt-get update
# Docker's convenience script installs the engine and the compose plugin
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
```
2024-06-27 17:48:41 +00:00
Install Rootless Docker
```bash
sudo apt-get install -y uidmap
dockerd-rootless-setuptool.sh install
```
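The setup tool prints environment variables to add to your shell profile; the exact values vary per system, but a typical example looks like:

```bash
# Point the Docker CLI at the rootless daemon (use the values the setup tool prints)
export PATH=/usr/bin:$PATH
export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock

# Start the rootless daemon now and keep it running across logouts
systemctl --user start docker
systemctl --user enable docker
sudo loginctl enable-linger $(whoami)
```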
Check that the installation works
```bash
docker --version
docker ps
# You should see no containers running, but you should not see any errors.
```
#### Step 3: Install NVIDIA drivers on the machine to use the GPU
- Install the NVIDIA Container Toolkit (then wire it into Docker, as shown in the sketch after this list): https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-the-nvidia-container-toolkit
- Install CUDA: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network
- Restart the machine
- You should now see the GPU listed when you run `nvidia-smi`
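After installing the Container Toolkit, the linked guide has you register the NVIDIA runtime with Docker and restart the daemon:

```bash
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# For rootless Docker, the guide uses a per-user config and daemon instead:
#   nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
#   systemctl --user restart docker
```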
#### Step 4: Run a test workload to verify the GPU is reachable from Docker
```bash
docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
```
The machine is now configured to use the GPU with Docker.
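For reference, when a container is started through Docker Compose, GPU access is requested in the compose file. A minimal sketch (the service name and layout here are hypothetical; this project's actual compose file may differ):

```yaml
services:
  llama:                        # hypothetical service name
    build: .
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all        # expose all GPUs to the container
              capabilities: [gpu]
```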
### Build
- Download the models from Meta
- Once the models are downloaded, place them in the `Llama/Models` folder. Make sure `tokenizer.model` and `tokenizer_checklist.chk` are in the same folder.
- Edit the `Dockerfile` to set the model name in the `MODEL_NAME` variable (see the sketch after this list)
- Build the Docker image:
```bash
npm run build-ai
```
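As mentioned in the list above, the `MODEL_NAME` edit is a one-line change. A hypothetical excerpt (the variable name comes from the repo's `Dockerfile`; the value shown is just an example):

```dockerfile
# Name of the model directory placed under Llama/Models (example value)
ENV MODEL_NAME=Meta-Llama-3-8B-Instruct
```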
### Run
```bash
npm run start-ai
```