How to Integrate Local LLMs With Ollama and Python

Integrating local large language models (LLMs) into your Python projects using Ollama is a great strategy for improving privacy, reducing costs, and building offline-capable AI-powered apps.

Ollama is an open-source platform that makes it straightforward to run modern LLMs locally on your machine. Once you’ve set up Ollama and pulled the models you want to use, you can connect to them from Python using the ollama library.

Here’s a quick demo of what the finished setup looks like in code. The following example is a minimal sketch: it assumes you’ve already installed Ollama, pulled the llama3.2 model, and installed Ollama’s Python SDK, all of which you’ll do in this tutorial:
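
Python
import ollama

# Send a single chat message to the locally running model.
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue? Answer briefly."}],
)

# The reply's text lives in the message's content field.
print(response["message"]["content"])

Under the hood, the SDK sends this request to the local Ollama server, which listens on http://localhost:11434 by default.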

In this tutorial, you’ll integrate local LLMs into your Python projects using the Ollama platform and its Python SDK.

You’ll first set up Ollama and pull a couple of LLMs. Then, you’ll learn how to use chat, text generation, and tool calling from your Python code. These skills will enable you to build AI-powered apps that run locally, improving privacy and cost efficiency.

Prerequisites

To work through this tutorial, you’ll need the following resources and setup:

  • Ollama installed and running: You’ll need Ollama to run local LLMs. You’ll install and set it up in the next section.
  • Python 3.8 or higher: You’ll be using Ollama’s Python software development kit (SDK), which requires Python 3.8 or higher. If you haven’t already, install Python on your system to fulfill this requirement.
  • Models to use: You’ll use llama3.2:latest and codellama:latest in this tutorial. You’ll download them in the next section.
  • Capable hardware: You need relatively powerful hardware to run Ollama’s models locally, as they may require considerable resources, including memory, disk space, and CPU power. You may not need a GPU for this tutorial, but local models will run much faster if you have one.

With these prerequisites in place, you’re ready to connect local models to your Python code using Ollama.

Step 1: Set Up Ollama, Models, and the Python SDK

Before you can talk to a local model from Python, you need Ollama running and at least one model downloaded. In this step, you’ll install Ollama, start its background service, and pull the models you’ll use throughout the tutorial.

Get Ollama Running

To get started, navigate to Ollama’s download page and grab the installer for your current operating system. You’ll find installers for Windows 10 or newer and macOS 14 Sonoma or newer. Run the appropriate installer and follow the on-screen instructions. For Linux users, the installation process differs slightly, as you’ll learn soon.

On Windows, Ollama runs in the background after installation, and its CLI becomes available in your terminal. If this doesn’t happen automatically, then go to the Start menu, search for Ollama, and run the app.

On macOS, the app manages the CLI and setup details, so you just need to launch Ollama.app.

If you’re on Linux, install Ollama with the following command:

Shell
$ curl -fsSL https://ollama.com/install.sh | sh

Once the process is complete, you can verify the installation by running:

Shell
$ ollama -v

If this command prints a version number, then the installation was successful. Next, start Ollama’s service by running the command below:

Shell
$ ollama serve

That’s it! You’re now ready to start using Ollama on your local machine. In some Linux distributions, such as Ubuntu, this final command may not be necessary because Ollama starts automatically once the installation completes. In that case, running the command above will result in an error, since the server is already listening on its default port, 11434.
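
With the Ollama service up and running, you can download the models for this tutorial. The ollama pull command fetches a model from Ollama’s registry, and these are the exact tags listed in the prerequisites:

Shell
$ ollama pull llama3.2:latest
$ ollama pull codellama:latest

You’ll also need Ollama’s Python SDK so that your code can talk to the local server. The package is published on PyPI under the name ollama, so you can install it with pip:

Shell
$ python -m pip install ollama

As a quick smoke test, you can confirm that the SDK reaches the server by listing your installed models from Python. This minimal check just prints the raw response:

Python
import ollama

# Ask the local Ollama server which models are currently installed.
print(ollama.list())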
