How to Create Custom Ollama Models
A Modelfile is a plain text file that describes how to build a custom model. It follows a simple declarative syntax inspired by Dockerfiles, where each line specifies one aspect of the model's configuration. The system reads the Modelfile, creates a new named model based on your specifications, and makes it available for use through the CLI and API.
Write the Modelfile
Every Modelfile starts with a FROM directive that specifies the base model. This can be any model in the Ollama library or any model you have already pulled. For example, FROM qwen3:14b uses Qwen3 14B as the foundation. The base model provides the weights and capabilities that your custom model inherits.
Add PARAMETER directives to customize inference settings. PARAMETER temperature 0.2 sets low randomness for deterministic output. PARAMETER num_ctx 16384 sets a large context window. PARAMETER repeat_penalty 1.2 reduces repetitive output. PARAMETER top_k 40 and PARAMETER top_p 0.9 control sampling diversity. Each parameter overrides the base model's default value.
The SYSTEM directive defines the system prompt that guides the model's behavior for every interaction. Write a clear, specific system prompt that describes the model's role, expertise, communication style, and any constraints. A well-crafted system prompt is the most impactful customization you can make, often producing more noticeable behavioral changes than parameter tuning.
Build the Custom Model
Save your Modelfile to a text file with any name (for example, CodingAssistant) and run ollama create my-coder -f ./CodingAssistant. Ollama reads the Modelfile, applies your configurations to the base model, and creates a new model called my-coder that appears in ollama list.
The build process is fast because it does not copy or modify the base model's weights. Instead, it creates a reference to the base model and layers your parameter overrides and system prompt on top. This means custom models consume minimal additional disk space beyond the base model.
If the base model is not already downloaded, Ollama pulls it automatically during the build process. You can also reference local GGUF files as the base model by using the file path in the FROM directive, which is useful for models not available in the Ollama library.
Test and Iterate
Run ollama run my-coder to test your custom model. Evaluate whether the system prompt produces the expected behavior, whether the temperature and sampling parameters give you the right balance of creativity and consistency, and whether the context window size meets your needs. Note specific responses that do not match your expectations.
Iterate by editing the Modelfile and rebuilding with the same ollama create command. Each rebuild replaces the previous version of the custom model. The base model weights are not re-downloaded, so rebuilds are fast. This rapid iteration cycle lets you refine your model's behavior through systematic prompt engineering and parameter tuning.
Keep notes on what parameter changes produce which behavioral changes. Small temperature adjustments (0.1 to 0.3 increments) produce noticeable differences in output variety. Context window changes affect the model's ability to reference earlier parts of long conversations. Repeat penalty adjustments between 1.0 and 1.3 control how much the model avoids repeating words and phrases.
Deploy and Use
Custom models are available through the API just like standard models. Specify the custom model name in API requests to /api/chat, /api/generate, or the OpenAI-compatible endpoint. Applications that work with standard Ollama models work identically with custom models, with no code changes needed.
Share Modelfiles with your team by committing them to version control. Team members can build the same custom model on their machines by running ollama create with the shared Modelfile. This provides a reproducible, version-controlled way to distribute model configurations across a development team.
Modelfile Directives Reference
FROM specifies the base model (required). PARAMETER sets inference parameters. SYSTEM defines the system prompt. TEMPLATE customizes the prompt template format for the model. ADAPTER applies a LoRA or QLoRA fine-tuning adapter. LICENSE specifies the license for the custom model. MESSAGE adds example conversation turns that prime the model's behavior.
The TEMPLATE directive is advanced and usually unnecessary unless you are working with a model that uses a non-standard prompt format. The default template is determined by the base model and works correctly in most cases. Changing the template without understanding the base model's expected format can break generation quality.
The ADAPTER directive applies fine-tuning adapters to the base model. If you have trained a LoRA adapter on custom data using tools like Hugging Face's PEFT library, you can apply it through the Modelfile to create a fine-tuned local model. The adapter file must be in GGUF format compatible with the base model's architecture.
Practical Modelfile Examples
A coding assistant Modelfile might use Qwen3 14B as the base with temperature 0.1, context window 16384, and a system prompt that instructs the model to produce clean code with inline comments, follow language-specific conventions, and explain its reasoning before presenting code solutions. This configuration prioritizes accuracy and consistency over creative variety.
A creative writing Modelfile might use Llama 4 Scout as the base with temperature 0.9, top_p 0.95, repeat_penalty 1.1, and a system prompt that describes the model as a skilled fiction author who creates vivid descriptions, develops complex characters, and maintains consistent narrative voice. This configuration encourages diverse, engaging output.
A data analysis Modelfile might use DeepSeek-R1 14B as the base with temperature 0.3 and a system prompt that instructs the model to think step by step, show calculations, cite data sources, and present findings in structured formats with clear conclusions. This leverages DeepSeek-R1's reasoning strengths for analytical tasks.
Modelfiles let you create unlimited custom model configurations by combining base models with specific parameters and system prompts. The system prompt is the most impactful customization, and rapid iteration through edit-build-test cycles lets you refine model behavior precisely for your needs.