- Features
- Installation
- Prerequisites
- Usage
- API Documentation
- Spring Boot Integration
- Development
- Contributing
- License
- FAQ
- Examples
- Timeout Configuration
- Agents
A Java client for the Ollama API, providing a simple and efficient way to interact with Ollama's language models.
## Features

- Text Generation
  - Support for streaming responses
  - Customizable generation parameters
  - System prompt and template support
- Chat Functionality
  - Multi-turn conversation support
  - Role-based message handling
  - Streaming chat responses
- Text Embedding
  - Vector embeddings for text
  - Support for multiple models
- Model Management
  - List available models
  - Delete models
  - Model information retrieval
- Advanced Features
  - Full support for Ollama API parameters
  - Comprehensive error handling
  - Detailed logging with SLF4J
  - Spring Boot integration
  - Configurable HTTP client
  - Customizable timeouts
## Installation

Download the latest release from the releases page and add it to your project's dependencies.
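If you prefer wiring the downloaded JAR in through your build instead, a Gradle declaration along these lines works (a minimal sketch; the `libs/` path and file name are placeholders, not fixed by this project):

```groovy
// build.gradle -- assumes the release JAR was copied into libs/
dependencies {
    implementation files('libs/ollama-java.jar')
}
```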
## Prerequisites

- Java 8 or higher
- Ollama server running locally or accessible via network

### Installing Ollama

**macOS**

Download the installer from Ollama's official website, or install with Homebrew:

```bash
brew install ollama
```

**Linux**

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

**Windows**

- Download the installer from Ollama's official website
- Run the installer and follow the setup wizard
After installation, start the Ollama service:
```bash
ollama serve
```
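Optionally, verify that the server is reachable; a running Ollama instance replies with a short status message:

```bash
# A running server responds with "Ollama is running"
curl http://localhost:11434
```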
In a new terminal, you can pull the recommended model:

```bash
ollama pull qwen2.5:7b
```

## Usage

```java
import com.matrixhero.ollama.client.OllamaClient;
import com.matrixhero.ollama.client.model.*;

import java.util.Arrays;

// Create client
OllamaClient client = new OllamaClient();

// Generate text
GenerateRequest request = new GenerateRequest();
request.setModel("qwen2.5:7b"); // Recommended model
request.setPrompt("Tell me a story");
GenerateResponse response = client.generate(request);
System.out.println(response.getResponse());

// Chat
ChatRequest chatRequest = new ChatRequest();
chatRequest.setModel("qwen2.5:7b"); // Recommended model
chatRequest.setMessages(Arrays.asList(
    new Message(Message.Role.USER, "Hello, how are you?")
));
ChatResponse chatResponse = client.chat(chatRequest);
System.out.println(chatResponse.getMessage().getContent());

// Stream generation
request.setStream(true);
client.generateStream(request)
    .forEach(r -> System.out.print(r.getResponse()));
```

We recommend the Qwen model (qwen2.5:7b) for the best performance and quality. Qwen is a powerful language model developed by Alibaba Cloud, offering excellent performance in both English and Chinese tasks.
For more information about Qwen, please visit:
- GitHub: QwenLM/Qwen
- Model Card: Qwen/Qwen2.5-7B
## API Documentation

### Text Generation

```java
GenerateRequest request = new GenerateRequest();
request.setModel("qwen2.5:7b");
request.setPrompt("Your prompt here");
request.setSystem("Optional system prompt");
request.setTemplate("Optional template");
request.setContext(new long[]{...}); // Optional context
request.setStream(true); // Enable streaming
request.setOptions(new Options()); // Optional generation options

GenerateResponse response = client.generate(request);
```

### Chat

```java
ChatRequest request = new ChatRequest();
request.setModel("qwen2.5:7b");
request.setMessages(Arrays.asList(
    new Message(Message.Role.USER, "User message"),
    new Message(Message.Role.ASSISTANT, "Assistant message")
));
request.setStream(true); // Enable streaming
request.setOptions(new Options()); // Optional generation options

ChatResponse response = client.chat(request);
```

### Text Embedding

```java
EmbedRequest request = new EmbedRequest();
request.setModel("qwen2.5:7b");
request.setInput("Text to embed");

EmbedResponse response = client.embed(request);
float[] embedding = response.getEmbedding();
```

### List Models

```java
ListResponse response = client.list();
List<Model> models = response.getModels();
```

### Delete a Model

```java
DeleteRequest request = new DeleteRequest();
request.setName("model-name");
client.delete(request);
```
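As a note on the Text Embedding call above, the returned `float[]` vectors are typically compared with cosine similarity. A minimal helper in plain Java (assuming nothing beyond the `getEmbedding()` result shown earlier):

```java
// Cosine similarity between two embedding vectors of equal length;
// values close to 1.0 indicate semantically similar texts.
static double cosineSimilarity(float[] a, float[] b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (int i = 0; i < a.length; i++) {
        dot   += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```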
### Error Handling

The client includes comprehensive error handling:

```java
try {
    GenerateResponse response = client.generate(request);
} catch (IOException e) {
    // Handle network errors
    log.error("Error generating text", e);
} catch (Exception e) {
    // Handle other errors
    log.error("Unexpected error", e);
}
```

### Logging

The client uses SLF4J for logging. Configure your logging framework to see detailed logs:
```xml
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.2.11</version>
</dependency>
```
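As an illustration, a minimal `logback.xml` that raises the client's log level to DEBUG could look like the following (the logger name is an assumption based on the `com.matrixhero.ollama.client` package used in the examples):

```xml
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- Assumed logger name; adjust to the actual package if it differs -->
  <logger name="com.matrixhero.ollama" level="DEBUG"/>

  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```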
## Spring Boot Integration

If you're using Spring Boot, you can easily integrate the Ollama client:

```java
@Configuration
public class OllamaConfig {

    @Bean
    public OllamaClient ollamaClient() {
        return new OllamaClient();
    }
}
```

## Development

```bash
git clone https://github.com/yourusername/ollama-java.git
cd ollama-java
./gradlew build
```

Run the tests with:

```bash
./gradlew test
```

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. Before submitting, please:
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## FAQ

**Q: Which server does the client connect to by default?**

A: The client connects to http://localhost:11434 by default. You can configure a different URL using the OllamaClient constructor.
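For example (a sketch; the constructor overload is assumed to take a base URL string):

```java
// Point the client at an Ollama server on another host (assumed String-URL constructor)
OllamaClient remoteClient = new OllamaClient("http://192.168.1.50:11434");
```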
**Q: How does the client handle rate limiting?**

A: The client includes built-in retry mechanisms for rate limiting. You can configure retry behavior through the client options.
**Q: Can I use models other than Qwen?**

A: Yes, the client supports all Ollama models. We recommend Qwen for best performance, but you can use any model available in your Ollama installation.
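For instance, after pulling another model with `ollama pull`, pass its name to `setModel` (the model name below is only a placeholder):

```java
// Use any locally available model; "llama3" is an example placeholder
OllamaClient client = new OllamaClient();
ChatRequest request = new ChatRequest();
request.setModel("llama3");
request.setMessages(Arrays.asList(
    new Message(Message.Role.USER, "Hello!")
));
ChatResponse response = client.chat(request);
```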
## Examples

Check out our examples directory for more detailed usage examples.
## Timeout Configuration

You can configure various timeout settings for the client using the builder pattern:
```java
// Create a client with custom timeouts (in seconds, matching the defaults below)
OllamaClient client = new OllamaClient()
    .withConnectTimeout(60)
    .withReadTimeout(120)
    .withWriteTimeout(60);
```

Default timeout values:
- Connection timeout: 10 seconds
- Read timeout: 30 seconds
- Write timeout: 10 seconds
When a timeout occurs, the client will throw an OllamaTimeoutException with a descriptive message indicating which operation timed out. You can catch this exception to handle timeout scenarios:
```java
try {
    ChatResponse response = client.chat(request);
} catch (OllamaTimeoutException e) {
    // Handle timeout
    System.err.println("Request timed out: " + e.getMessage());
} catch (IOException e) {
    // Handle other IO errors
    System.err.println("IO error: " + e.getMessage());
}
```

## Agents

The client supports integration with external APIs through agents. Here's how to use them:
```java
// Create a weather agent (the API key is loaded from the environment or system properties)
WeatherAgent weatherAgent = new WeatherAgent();

// Add the agent to the client
OllamaClient client = new OllamaClient()
    .withAgent(weatherAgent);

// Chat with the model; it will automatically use the weather agent when appropriate
ChatRequest request = new ChatRequest();
request.setModel("qwen2.5:7b");
request.setMessages(Arrays.asList(
    new Message(Message.Role.USER, "What's the weather in New York?")
));

ChatResponse response = client.chat(request);
System.out.println(response.getMessage().getContent());
```

For agents that require API keys (like WeatherAgent), you can configure them in two ways:
- Environment variable:

```bash
export OPENWEATHERMAP_API_KEY=your_api_key_here
```

- System property:

```bash
java -Dopenweathermap.api.key=your_api_key_here -jar your-application.jar
```

The agent will first check for the environment variable, then fall back to the system property. If neither is found, it will throw an IllegalStateException with a helpful message.
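The lookup order can be pictured with a small sketch (illustrative only; `resolveApiKey` is a hypothetical helper, not the library's actual source):

```java
// Hypothetical sketch of the key-resolution order described above
static String resolveApiKey() {
    String key = System.getenv("OPENWEATHERMAP_API_KEY");   // 1. environment variable
    if (key == null || key.isEmpty()) {
        key = System.getProperty("openweathermap.api.key"); // 2. system property fallback
    }
    if (key == null || key.isEmpty()) {
        throw new IllegalStateException(
                "OpenWeatherMap API key not found. Set the OPENWEATHERMAP_API_KEY environment "
                        + "variable or the openweathermap.api.key system property.");
    }
    return key;
}
```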