This is a quick hello world for running a local model with Ollama.
Ollama Docker
The simplest way is running the Ollama Docker image. To create the container, we just need to fire up two commands.
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama  # start the Ollama server container
docker exec -it ollama ollama run llama2                                          # open an interactive llama2 chat inside it
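The docker run command maps port 11434, so once the container is up you can also hit the Ollama REST API directly. Here is a minimal sanity check in Python; just a sketch, assuming pip install requests and that llama2 has already been pulled by the command above.
# Quick check that the local Ollama REST API answers (assumes `pip install requests`).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])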
You can also open a shell in that container with
docker exec -it ollama bash
Side note: sometimes the containers can linger around, so we need to clean them up before starting a new one.
docker ps               # list running containers
docker container prune  # remove all stopped containers
Ollama build
Another option is building Ollama from source (clone github.com/ollama/ollama first). This needs a fairly recent version of Go (1.21 or newer, I think), plus CMake for the native libraries.
cd ollama
cmake -B build       # configure the native libraries
cmake --build build  # and build them
go run . serve       # run the server directly from source, or:
go build .           # build the ollama binary
go install .         # and install it into your Go bin directory
In one terminal, let’s start the server.
ollama serve
In another terminal, let’s run the ollama shell with llama3.2. This will take a while the first time, as it needs to download the model.
ollama run llama3.2
Ollama with the ollama Python package
An easy way to access the Ollama endpoints is to use the ollama Python package (pip install ollama).
import ollama

# Chat with a local model; it must already be pulled (e.g. via `ollama run llama2`).
response = ollama.chat(model='llama2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
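The package also supports streaming, which is nicer for long answers. Here is a small sketch, assuming the same local llama2 model, that prints the reply as it arrives.
import ollama

# Stream the reply chunk by chunk instead of waiting for the full response.
stream = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
print()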
Ollama with LlamaIndex
Or, if you are a LlamaIndex fan, you can use the Ollama integration via these two packages.
!pip install llama-index
!pip install llama-index-embeddings-ollama llama-index-llms-ollama
from llama_index.llms.ollama import Ollama

# request_timeout is in seconds, so bump it up if the model is slow to load.
llm = Ollama(model="llama2", request_timeout=30000.0)
resp = llm.complete("Who is Paul Graham?")
print(resp)
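The embeddings package installed above also gives you local embeddings through Ollama. A minimal sketch, assuming an embedding model such as nomic-embed-text has been pulled first (ollama pull nomic-embed-text):
from llama_index.embeddings.ollama import OllamaEmbedding

# Embed a piece of text with a local embedding model served by Ollama.
embed_model = OllamaEmbedding(model_name="nomic-embed-text")
embedding = embed_model.get_text_embedding("Why is the sky blue?")
print(len(embedding))  # dimensionality of the embedding vector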
That’s it.