How to Install Ollama on Any Platform

Updated May 2026

Installing Ollama takes one command on macOS and Linux, or a standard installer on Windows. The process sets up the Ollama binary, starts the background service, configures GPU acceleration automatically, and makes the API available at localhost:11434. You can be running your first model within minutes of starting the installation.

The sections below cover macOS, Linux, and Windows in turn. Each installer handles dependencies, service configuration, and GPU detection for you, so start with the section for your operating system.

Install on macOS

On macOS, you have two installation options. The recommended approach is downloading the Ollama application from ollama.com, which installs as a menu bar application that manages the background service and provides convenient access to settings and logs. Simply download the .dmg file, drag Ollama to your Applications folder, and launch it.

Alternatively, install through Homebrew with brew install ollama. This installs the command line tool and sets up the service. After installation, start the service with brew services start ollama or simply run ollama serve in a terminal window.

On Apple Silicon Macs (M1, M2, M3, M4 series), GPU acceleration through the Metal framework is enabled automatically with no additional setup. On Intel Macs, Ollama runs in CPU-only mode, because its Metal acceleration targets Apple Silicon rather than the GPUs found in Intel-based Macs.

Install on Linux

The recommended Linux installation method is the official installer script: curl -fsSL https://ollama.com/install.sh | sh. This script detects your Linux distribution, downloads the appropriate binary, installs it to /usr/local/bin, creates an ollama system user, and sets up a systemd service that starts automatically on boot.
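Piping a remote script straight into sh runs it unread. A more cautious variant, sketched below, downloads the same script to a temporary file so you can inspect it first. The function name install_ollama_reviewed is made up for this example; only the URL comes from the text above.

```shell
# Hypothetical helper: fetch the official installer, show it, then run it.
# Only the URL comes from the article; the function name is illustrative.
install_ollama_reviewed() {
  script="$(mktemp)" || return 1
  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  if ! curl -fsSL https://ollama.com/install.sh -o "$script"; then
    echo "download failed" >&2
    rm -f "$script"
    return 1
  fi
  ${PAGER:-less} "$script"   # read the script before trusting it
  sh "$script"               # run the installer you just reviewed
  status=$?
  rm -f "$script"
  return "$status"
}
```

Call install_ollama_reviewed from an interactive shell. On a systemd distribution, systemctl status ollama afterwards shows whether the service started.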

The installer also detects NVIDIA GPUs and verifies that compatible drivers are installed. If you have an NVIDIA GPU, make sure the NVIDIA driver is installed before running the Ollama installer. On Ubuntu, run sudo apt install nvidia-driver-550 (or the latest version) and reboot before installing Ollama.

For distributions that do not use systemd, or for manual installations, download the binary directly from the Ollama GitHub releases page. Place it in your PATH, then run ollama serve to start the server manually. You can create your own service file or startup script to run Ollama as a background process.
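On a system without systemd, a startup script can be as small as the sketch below, which launches ollama serve in the background and records its log and process ID. The function name, the OLLAMA_BIN override, and the file locations are assumptions for illustration, not Ollama conventions.

```shell
# Hypothetical startup helper for systems without systemd.
# OLLAMA_BIN, the log path, and the PID file path are illustrative choices.
start_ollama_bg() {
  bin="${OLLAMA_BIN:-ollama}"
  log_file="${1:-$HOME/ollama.log}"
  pid_file="${2:-$HOME/ollama.pid}"
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "$bin not found on PATH" >&2
    return 1
  fi
  # nohup keeps the server alive after the terminal closes;
  # stdout and stderr are appended to the log file.
  nohup "$bin" serve >>"$log_file" 2>&1 &
  echo "$!" >"$pid_file"
  echo "started $bin serve (PID $(cat "$pid_file")), logging to $log_file"
}
```

Run start_ollama_bg from a login script or by hand, and stop the server later with kill "$(cat ~/ollama.pid)".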

AMD GPU users need the ROCm framework installed before Ollama will use GPU acceleration. Follow AMD's ROCm installation guide for your distribution, then install Ollama. The installer detects ROCm automatically and configures GPU acceleration for compatible AMD GPUs.

Install on Windows

Download the Windows installer from ollama.com. The installer is a standard .exe that sets up Ollama, adds it to your PATH, and configures it to run as a background process. After installation, Ollama appears in your system tray and starts automatically when you log in.

NVIDIA GPU acceleration works automatically on Windows as long as you have a recent NVIDIA driver installed. Download the latest driver from nvidia.com or through GeForce Experience. After installing both the driver and Ollama, the tool automatically detects and uses your GPU.

You can interact with Ollama through PowerShell, Command Prompt, or Windows Terminal. The commands are identical across all platforms: ollama run llama3.2 to start a chat session, ollama list to see installed models, and ollama pull followed by a model name to download a new model.

Verify the Installation

After installation on any platform, verify everything is working by running ollama --version in your terminal. This should display the installed Ollama version number. If the command is not found, ensure Ollama's installation directory is in your system PATH.
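The check described above can be scripted so that a missing command produces a useful hint instead of a bare error. This is a minimal sketch; on_path is a helper name invented for the example.

```shell
# Report whether a command is reachable through PATH (POSIX `command -v`).
on_path() {
  command -v "$1" >/dev/null 2>&1
}

if on_path ollama; then
  ollama --version
else
  echo "ollama is not on PATH. Current PATH entries:"
  # Print one PATH entry per line to make a missing directory easy to spot.
  printf '%s\n' "$PATH" | tr ':' '\n'
fi
```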

Next, run ollama run llama3.2 to download and test a small model. Llama 3.2 3B is a good first test because it downloads quickly (about 2GB) and runs on virtually any hardware. If the model downloads, loads, and responds to your messages, the installation is complete and working correctly.

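Check GPU acceleration by observing the generation speed; running a model with the --verbose flag prints the rate after each reply. As a rough guide, a GPU should deliver about 30 to 80 tokens per second, while CPU-only inference lands around 3 to 10. The exact figures depend on the model and the hardware. If GPU acceleration is not working, verify your GPU drivers and check the Ollama logs for GPU detection messages.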
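One way to put a number on generation speed is to read the statistics that ollama run prints with the --verbose flag. The sketch below assumes those statistics include a line starting with "eval rate:"; that layout may vary between Ollama versions, and extract_eval_rate is a helper invented for this example.

```shell
# Pull the generation speed out of `ollama run --verbose` statistics.
# Assumes a line of the form "eval rate:   42.50 tokens/s"; the exact
# layout may differ between Ollama versions.
extract_eval_rate() {
  awk '/^eval rate:/ { print $3 }'
}

if command -v ollama >/dev/null 2>&1; then
  # --verbose prints timing statistics after the reply; 2>&1 captures them
  # whether they go to stdout or stderr.
  ollama run --verbose llama3.2 "Say hello in one sentence." 2>&1 | extract_eval_rate
  # While the model is still loaded, `ollama ps` lists it along with
  # the processor it is running on.
  ollama ps
else
  echo "ollama not found; skipping GPU check"
fi
```

The processor column of ollama ps is usually the quicker check: a model reported as running on the CPU points to a driver or detection problem.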

The API server should be accessible at http://localhost:11434. Test it by opening that URL in a browser or running curl http://localhost:11434 from a terminal. You should see a response confirming that Ollama is running.
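For scripts that need to wait for the server or fail early, the same check can be wrapped in a small function. This is a sketch under the assumption that the server listens on the default address; ollama_api_up is an illustrative name, while /api/version is an endpoint of the Ollama REST API.

```shell
# Return success if an Ollama server answers at the given base URL.
# The default URL is the one from the article; the function name is illustrative.
ollama_api_up() {
  base_url="${1:-http://localhost:11434}"
  # -f: non-zero exit on HTTP errors; --max-time: give up after 5 seconds
  curl -fsS --max-time 5 "$base_url" >/dev/null 2>&1
}

if ollama_api_up; then
  echo "API reachable"
  curl -fsS http://localhost:11434/api/version   # JSON with the server version
  echo
else
  echo "no response from http://localhost:11434; is the service running?"
fi
```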

Post-Installation Configuration

The default Ollama configuration works well for most users, but a few environment variables are worth knowing about. OLLAMA_MODELS changes where model files are stored, useful if your default drive has limited space. OLLAMA_HOST changes the API listen address, necessary if you want to accept connections from other machines on your network.

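On Linux, set environment variables for the systemd service with Environment= directives, either in /etc/systemd/system/ollama.service or in a drop-in created with systemctl edit ollama.service, then run systemctl daemon-reload and restart the service. On macOS, use launchctl setenv and restart the Ollama app; variables in your shell profile only reach an ollama serve process started from that shell. On Windows, set them through System Properties or PowerShell's [Environment]::SetEnvironmentVariable method, then restart Ollama.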
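For the Linux case, a systemd drop-in file is an alternative to editing the unit itself, and it keeps your settings separate from the file the installer writes. The sketch below only prints such a fragment; the function name, the listen address, and the model directory are example values, not defaults.

```shell
# Print a systemd drop-in that sets Ollama environment variables.
# Arguments: listen address, model directory. Both values are examples.
ollama_dropin() {
  printf '[Service]\n'
  printf 'Environment="OLLAMA_HOST=%s"\n' "$1"
  printf 'Environment="OLLAMA_MODELS=%s"\n' "$2"
}

# Preview the fragment; nothing is written to disk here.
ollama_dropin "0.0.0.0:11434" "/data/ollama/models"
```

To apply it, save the output as /etc/systemd/system/ollama.service.d/override.conf (the file that sudo systemctl edit ollama.service opens), then run sudo systemctl daemon-reload and sudo systemctl restart ollama. The model directory must be writable by the user the service runs as.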

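Model storage can consume significant disk space, with individual models ranging from 2GB to 45GB. The default storage location is ~/.ollama/models on macOS and %USERPROFILE%\.ollama\models on Windows. On Linux it depends on how the server runs: the systemd service created by the installer runs as the ollama user and stores models in /usr/share/ollama/.ollama/models, while ollama serve started under your own account uses ~/.ollama/models. Plan your disk usage accordingly, especially if you intend to install multiple large models.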
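To see how much space the model directory currently takes, a short helper around du is enough. This sketch assumes the directory is either passed as an argument, set in OLLAMA_MODELS, or at ~/.ollama/models; models_disk_usage is a name invented for the example.

```shell
# Print the disk space used by a model directory.
# Defaults to $OLLAMA_MODELS if set, else ~/.ollama/models.
models_disk_usage() {
  dir="${1:-${OLLAMA_MODELS:-$HOME/.ollama/models}}"
  if [ -d "$dir" ]; then
    du -sh "$dir"
  else
    echo "no model directory at $dir" >&2
    return 1
  fi
}

models_disk_usage || echo "pass the directory explicitly, e.g. models_disk_usage /path/to/models"
```

For a per-model breakdown, ollama list shows the size of each installed model.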

Updating Ollama

Keeping Ollama updated ensures you have the latest model support, performance optimizations, and bug fixes. The macOS menu bar app downloads updates on its own and offers a restart to apply them. With a Homebrew installation, run brew upgrade ollama. On Linux, re-run the installer script, which detects the existing installation and upgrades in place. On Windows, download and run the latest installer, which replaces the existing installation while preserving your models and settings.

Your downloaded models are not affected by Ollama updates. They remain in the model storage directory and continue to work with the new version. Occasionally, a major Ollama update may re-download model layers to take advantage of improved quantization or format changes, but this happens transparently.

Key Takeaway

Ollama installs with a single command on macOS and Linux, or a simple installer on Windows. GPU acceleration is detected automatically on supported hardware. After installation, test with ollama run llama3.2 to verify everything is working.
