
LangChain4J: A Comprehensive Technical Guide

流苏
2026-03-11

The LLM application development framework for the Java ecosystem



1. LangChain4J Overview

1.1 What Is LangChain4J

LangChain4J is a Java/Kotlin implementation of the LangChain framework's ideas, giving Java developers full-stack capabilities for building LLM applications.

Core positioning

  • General-purpose LLM application framework: covers everything from simple prompts to complex agents
  • Java-native: type safety, IDE friendliness, enterprise-grade features
  • Modular design: pull in only the dependencies you need, no monolithic bundle
  • Spring ecosystem fit: deep integration with Spring Boot

1.2 Core Value Proposition

  1. Developer-friendly: Java's type system, compile-time checks, rich IDE support
  2. Enterprise-ready: security, monitoring, testability, maintainability
  3. Flexible composition: modular design; freely combine chains, tools, and agents
  4. Multi-model support: OpenAI, Azure, Anthropic, HuggingFace, local models, and more
  5. Production-grade: async, streaming, retries, circuit breaking, caching

1.3 Relationship to Related Frameworks

LangChain (Python) → LangChain4J (Java port)
    ↓
Spring AI (higher-level abstraction, simpler)
    ↓
LlamaIndex (RAG-focused; can be paired with LangChain4J)

Recommendations

  • Simple RAG: Spring AI or LlamaIndex
  • General LLM applications: LangChain4J
  • Complex agents: LangChain4J (stronger agent support)
  • Existing Spring stack: prefer Spring AI

1.4 Typical Use Cases

  • Intelligent customer service: conversational agents, tool calling (order lookup, returns)
  • Data analytics: natural-language database queries (SQL generation)
  • Coding assistants: code generation, review, explanation
  • Document Q&A: RAG applications (production-ready)
  • Workflow automation: multi-step reasoning, chained API calls
  • Content generation: reports, summaries, translation
1.5 Ecosystem Modules

  • Core module: langchain4j-core
  • Model integrations: langchain4j-open-ai, langchain4j-azure-open-ai, langchain4j-anthropic, langchain4j-ollama
  • Tool integrations: langchain4j-tools (search, calculators, databases, etc.)
  • Vector stores: langchain4j-vector-store-* (Milvus, Qdrant, Pinecone, etc.)
  • Spring integration: langchain4j-spring
  • Evaluation: langchain4j-evaluation
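
To illustrate the modular approach, a build that pulls in just the core plus one model integration might look like this (the version number is a placeholder; check Maven Central for the current release):

```xml
<dependencies>
    <!-- Core abstractions (Prompt, Chain, Agent, Memory, ...) -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-core</artifactId>
        <version>0.30.0</version>
    </dependency>
    <!-- One model integration, added only when needed -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-open-ai</artifactId>
        <version>0.30.0</version>
    </dependency>
</dependencies>
```

Other integrations (vector stores, tools, Spring) are added the same way, one artifact at a time.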

2. Core Architecture

2.1 Design Philosophy

The "LCEL" (LangChain Expression Language) idea

// Declarative chaining, similar to function composition
Chain chain = prompt.then(llm).then(outputParser);

Key principles

  • Composability: every component is independent and can be combined freely
  • Configurability: all parameters are exposed for tuning
  • Testability: every component can be tested in isolation
  • Async-first: all operations support asynchronous execution (CompletableFuture)

2.2 Architecture Layers

┌─────────────────────────────────────────────────────────────┐
│                   Applications                              │
│      (Spring Boot/Quarkus/Plain Java/Microservices)       │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────┴─────────────────────────────────┐
│                   LangChain4J Core                         │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐  │
│  │  Prompt  │ │  LLM     │ │  Chain    │ │   Agent     │  │
│  │ Templates│ │          │ │           │ │             │  │
│  └──────────┘ └──────────┘ └──────────┘ └─────────────┘  │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐  │
│  │  Tools   │ │  Memory  │ │ Retriever │ │  Callbacks  │  │
│  └──────────┘ └──────────┘ └──────────┘ └─────────────┘  │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────┴─────────────────────────────────┐
│                   Model Providers                          │
│  (OpenAI/Azure/Anthropic/Ollama/HuggingFace/local models) │
└───────────────────────────────────────────────────────────┘

2.3 Core Abstractions

  • ChatLanguageModel: chat model interface (multi-turn conversation, tool calling)
  • LanguageModel: plain text-completion model (single turn)
  • Prompt: prompt template with variable interpolation
  • Chain: chained invocation that strings components together
  • Tool: a callable function (external APIs, database queries, etc.)
  • Agent: autonomous decision-making that selects tools to complete a task
  • Memory: conversation history management
  • Retriever: document retrieval (RAG scenarios)
  • OutputParser: parses LLM output into structured data
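
These abstractions compose like ordinary functions. As a library-free sketch of the Prompt → Model → Parser pipeline (all names below are illustrative stand-ins, not LangChain4J's API):

```java
import java.util.Map;
import java.util.function.Function;

public class PipelineSketch {

    // Chain = function composition: the idea behind prompt.then(llm).then(parser)
    public static Function<Map<String, String>, String> buildChain() {
        // Prompt: fills a template from variables (illustrative stand-in)
        Function<Map<String, String>, String> prompt =
            vars -> "Translate '" + vars.get("text") + "' to " + vars.get("language");

        // Model: a stubbed LLM; a real one would call an API here
        Function<String, String> model =
            p -> p.contains("'Hello'") ? "  Bonjour  " : "?";

        // Parser: turns raw model text into clean structured output
        Function<String, String> parser = String::trim;

        return prompt.andThen(model).andThen(parser);
    }

    public static void main(String[] args) {
        String result = buildChain().apply(Map.of("text", "Hello", "language", "French"));
        System.out.println(result); // Bonjour
    }
}
```

Swapping any stage (a different model, a stricter parser) leaves the rest of the chain untouched, which is the composability principle above.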

3. Core Components in Detail

3.1 Models (LLMs)

3.1.1 ChatLanguageModel

The primary chat model interface

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.openai.OpenAiChatModel;

ChatLanguageModel model = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4-turbo-preview")
    .temperature(0.7)
    .maxTokens(1000)
    .build();

ChatResponse response = model.chat(
    ChatRequest.builder()
        .messages(UserMessage.from("Hello, how are you?"))
        .build()
);

String answer = response.aiMessage().text();

3.1.2 LanguageModel

Text-completion model (no conversation)

import dev.langchain4j.model.language.LanguageModel;
import dev.langchain4j.model.openai.OpenAiLanguageModel;

LanguageModel lm = OpenAiLanguageModel.builder()
    .apiKey("...")
    .modelName("gpt-3.5-turbo-instruct")
    .build();

String completion = lm.generate("Once upon a time").content();

3.1.3 Streaming Responses

import dev.langchain4j.model.chat.StreamingChatLanguageModel;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;

StreamingChatLanguageModel streamingModel = new OpenAiStreamingChatModel(...);

streamingModel.chat(
    ChatRequest.builder()
        .messages(UserMessage.from("Write a story"))
        .build(),
    new StreamingChatResponseHandler() {
        @Override
        public void onPartialResponse(String token) {
            System.out.print(token);  // emit tokens as they arrive
        }

        @Override
        public void onCompleteResponse(ChatResponse response) { }

        @Override
        public void onError(Throwable error) {
            error.printStackTrace();
        }
    }
);

3.2 Prompts and Templates

3.2.1 Basic PromptTemplate

import dev.langchain4j.prompt.PromptTemplate;

PromptTemplate promptTemplate = PromptTemplate.from(
    "You are a {{role}}. Answer the following question: {{question}}"
);

Prompt prompt = promptTemplate.apply(
    Map.of(
        "role", "financial advisor",
        "question", "What is compound interest?"
    )
);

String text = prompt.text();  // the fully rendered prompt

3.2.2 ChatPromptTemplate

Multi-turn conversation templates

import dev.langchain4j.prompt.ChatPromptTemplate;
import dev.langchain4j.prompt.chat.ChatMessage;
import dev.langchain4j.prompt.chat.MessageType;

ChatPromptTemplate chatPrompt = ChatPromptTemplate.builder()
    .message(MessageType.SYSTEM, "You are a helpful assistant.")
    .message(MessageType.HUMAN, "Hello!")
    .message(MessageType.AI, "Hi there!")
    .message(MessageType.HUMAN, "{{question}}")  // variable placeholder
    .build();

Prompt prompt = chatPrompt.apply(
    Map.of("question", "What is AI?")
);

Predefined roles

  • MessageType.SYSTEM: system instructions
  • MessageType.HUMAN / USER: user messages
  • MessageType.AI / ASSISTANT: assistant messages

3.2.3 Few-Shot Prompting

ChatPromptTemplate fewShot = ChatPromptTemplate.builder()
    .message(MessageType.HUMAN, "Translate: Hello")
    .message(MessageType.AI, "你好")
    .message(MessageType.HUMAN, "Translate: {{text}}")
    .build();

3.2.4 Prompt String Interpolation

Mustache-style syntax

You are a {{role}}. 
Context: {{context}}
Question: {{question}}
Answer:

3.3 Output Parsers

Parse LLM text output into structured data

3.3.1 A Simple Parser

import dev.langchain4j.output.parsers.AbstractOutputParser;

OutputParser<String> parser = new AbstractOutputParser<>() {
    @Override
    public String parse(String text) {
        return text.trim();  // custom logic
    }
};

3.3.2 JsonOutputParser

Parses JSON output

import dev.langchain4j.output.parsers.JsonOutputParser;
import com.fasterxml.jackson.annotation.JsonProperty;

public class Answer {
    @JsonProperty
    public String answer;
    
    @JsonProperty
    public List<String> sources;
}

JsonOutputParser<Answer> parser = JsonOutputParser.from(Answer.class);

Prompt prompt = PromptTemplate.from(
    "Answer the question. Respond in JSON format: {{format_instructions}}\n\nQuestion: {{question}}"
).apply(Map.of(
    "format_instructions", parser.getFormatInstructions(),
    "question", "What is RAG?"
));

String jsonResponse = model.generate(prompt.text());
Answer answer = parser.parse(jsonResponse);

3.3.3 RegexOutputParser

import dev.langchain4j.output.parsers.RegexOutputParser;

RegexOutputParser<Answer> parser = RegexOutputParser.from(
    Answer.class,
    ".*Answer: (.*?)\\. Sources: (.*?)(?:\\n|$).*"
);

3.3.4 EnumOutputParser

import dev.langchain4j.output.parsers.EnumOutputParser;

enum Category { SPORTS, TECHNOLOGY, FINANCE, HEALTH }

EnumOutputParser<Category> parser = EnumOutputParser.from(Category.class);

3.4 Chains

3.4.1 A Simple Chain

PromptTemplate promptTemplate = PromptTemplate.from(
    "Translate '{{text}}' to {{language}}"
);

Chain chain = promptTemplate.then(model).then(new StringTrimmerOutputParser());

String result = chain.execute(
    Map.of("text", "Hello", "language", "French")
);
// Output: "Bonjour"

3.4.2 LLMChain

The most common pattern: Prompt + LLM + Parser

import dev.langchain4j.chain.LLMChain;

LLMChain chain = LLMChain.builder()
    .prompt(promptTemplate)
    .llm(model)
    .outputParser(parser)
    .build();

Answer answer = chain.execute(
    Map.of("question", "What is RAG?")
).orElseThrow();

3.4.3 Functional Chains (LCEL Style)

Chain chain = PromptTemplate.from("{{input}}")
    .then(model)
    .then(outputParser);

String result = chain.run("Hello");

3.4.4 Composite Chains

Multiple chains executed in sequence

Chain firstChain = PromptTemplate.from("Q: {{q}}\nA:").then(model);
Chain secondChain = PromptTemplate.from("Summarize: {{text}}").then(model);

Chain combinedChain = firstChain.andThen(secondChain);

String finalOutput = combinedChain.run("What is Java?");
// Steps: firstChain produces an answer → secondChain summarizes it

3.4.5 Parallel Chains

List<Chain> parallelChains = List.of(chain1, chain2, chain3);
List<String> results = ChainExecutionUtils.executeAll(parallelChains);

3.5 Retrievers

3.5.1 VectorStoreRetriever

import dev.langchain4j.retriever.DocumentRetriever;
import dev.langchain4j.store.embedding.VectorStore;

VectorStore vectorStore = ...;  // initialize the vector store (Milvus, Pinecone, etc.)

DocumentRetriever retriever = VectorStoreRetriever.builder()
    .vectorStore(vectorStore)
    .similaritySearchRequest(
        SimilaritySearchRequest.builder()
            .topK(5)
            .build()
    )
    .build();

List<Content> relevantDocuments = retriever.retrieve("What is RAG?");

3.5.2 HybridRetriever

Hybrid vector + keyword retrieval

import dev.langchain4j.retriever.HybridRetriever;

HybridRetriever retriever = HybridRetriever.builder()
    .vectorRetriever(vectorRetriever)
    .keywordRetriever(keywordRetriever)
    .rankFusionStrategy(RerankingStrategy.RECIPROCAL_RANK_FUSION)
    .build();

3.5.3 Custom Retrievers

import dev.langchain4j.retriever.DocumentRetriever;

public class MyRetriever implements DocumentRetriever {
    private final VectorStore vectorStore;
    
    @Override
    public List<Content> retrieve(String query) {
        // custom retrieval logic
        return vectorStore.similaritySearch(query);
    }
}

4. Tools and Agents

4.1 Tools

4.1.1 Built-in Tools

  • CalculatorTool: math
  • SearchTool: search engines (Tavily, Google, Bing)
  • HttpRequestTool: HTTP requests
  • SqlQueryTool: SQL queries
  • FileSystemTool: file operations
  • WeatherTool: weather lookups
  • and more

4.1.2 Custom Tools

import dev.langchain4j.agent.tool.Tool;

public class OrderTool implements Tool {
    
    @Override
    public String name() {
        return "getOrderStatus";
    }
    
    @Override
    public String description() {
        return "Get the status of an order by order ID";
    }
    
    @Override
    public String execute(@JsonProperty("orderId") String orderId) {
        // call the business API
        Order order = orderService.getOrder(orderId);
        return order.getStatus();
    }
    
    @Override
    public ToolParameter[] parameters() {
        return new ToolParameter[] {
            ToolParameter.builder()
                .name("orderId")
                .description("The order ID")
                .type(ToolParameterType.STRING)
                .required(true)
                .build()
        };
    }
}

4.1.3 ToolRegistry

import dev.langchain4j.agent.tool.ToolRegistry;

ToolRegistry registry = ToolRegistry.builder()
    .add(new CalculatorTool())
    .add(new WeatherTool())
    .add(new OrderTool())
    .build();

4.2 Agents

4.2.1 ZeroShotAgent

Single-step decision: pick a tool and execute it

import dev.langchain4j.agent.Agent;
import dev.langchain4j.agent.tool.ToolExecutor;

Agent agent = Agent.builder()
    .chatLanguageModel(model)
    .tools(tools)
    .build();

String result = agent.run("What's the weather in Paris and multiply by 2?");
// Automatically calls: WeatherTool(Paris) → Calculator(...)

4.2.2 ReAct Agent

Multi-step reasoning (Reason + Act)

import dev.langchain4j.agent.ChainOfThoughtsThoughtGenerator;
import dev.langchain4j.agent.ChainOfThoughtsThoughtExtractor;
import dev.langchain4j.agent.ReActAgent;

ReActAgent agent = ReActAgent.builder()
    .chatLanguageModel(model)
    .tools(tools)
    .maxIterations(10)  // maximum number of steps
    .thoughtExtractor(new ChainOfThoughtsThoughtExtractor())
    .build();

AgentExecutionResult result = agent.execute("Complex question...");

The Thought loop

  1. Thought: reason about the next step
  2. Action: choose a tool and its arguments
  3. Observation: read the tool's result
  4. Repeat until the task is complete
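
The loop can be sketched without any framework; the scripted replies below stand in for real LLM calls, and the tool executor is a stub (both are illustrative, not LangChain4J's API):

```java
import java.util.Iterator;
import java.util.List;

public class ReActLoopSketch {

    // Stubbed tool executor; a real one would parse the action and dispatch
    static String callTool(String action) {
        return "Observation: 18C";
    }

    public static String run(List<String> scriptedReplies, int maxIterations) {
        Iterator<String> model = scriptedReplies.iterator();
        for (int i = 0; i < maxIterations && model.hasNext(); i++) {
            String reply = model.next();              // 1. Thought (+ Action)
            int idx = reply.indexOf("Final Answer:");
            if (idx >= 0) {                           // stop condition
                return reply.substring(idx + "Final Answer:".length()).trim();
            }
            String observation = callTool(reply);     // 2-3. Act and observe
            // 4. In a real agent the observation is appended to the prompt
            //    and the loop continues; here the script simply advances.
        }
        throw new IllegalStateException("maxIterations exceeded");
    }

    public static void main(String[] args) {
        String answer = run(List.of(
            "Thought: I need the weather. Action: weather[Paris]",
            "Thought: I have what I need. Final Answer: 18C"
        ), 10);
        System.out.println(answer); // 18C
    }
}
```

The maxIterations bound plays the same role as the builder setting above: it stops an agent that never reaches a final answer.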

4.2.3 ConversationalAgent

A conversational agent with memory

import dev.langchain4j.agent.ConversationalAgent;
import dev.langchain4j.memory.ChatMemory;

ConversationalAgent agent = ConversationalAgent.builder()
    .chatLanguageModel(model)
    .tools(tools)
    .chatMemory(memory)  // memory component
    .build();

agent.execute("Hi, I'm John");
agent.execute("What's my name?");  // remembers the name

4.2.4 Controlling Agent Execution

AgentExecutionResult result = agent.execute(
    AgentExecutorRequest.builder()
        .input("Calculate 123 * 456")
        .interrupt(Runnable::run)  // interruption callback
        .callback(new AgentCallback() {
            @Override
            public void onTool(String toolName, String input, String output) {
                log.info("Tool {} called with {}, returned {}", toolName, input, output);
            }
            
            @Override
            public void onThought(String thought) {
                log.debug("Thought: {}", thought);
            }
        })
        .build()
);

5. Memory

5.1 ChatMemory

Manages conversation history
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);
// keeps the 10 most recent messages

memory.add(UserMessage.from("Hello"));
memory.add(AiMessage.from("Hi! How can I help?"));
memory.add(UserMessage.from("How are you?"));

// The stored history is included in the next call
ChatResponse response = model.chat(
    ChatRequest.builder()
        .messages(memory.messages())
        .build()
);

5.2 Memory Implementations

5.2.1 MessageWindowChatMemory

A fixed-size message window

ChatMemory memory = MessageWindowChatMemory.withMaxMessages(20);

5.2.2 TokenWindowChatMemory

A token-count window

ChatMemory memory = TokenWindowChatMemory.withMaxTokens(2000);

5.2.3 ChatMemoryStore (Custom Storage)

import dev.langchain4j.store.memory.chat.ChatMemoryStore;
import dev.langchain4j.memory.ChatMemory;

ChatMemoryStore store = new ChatMemoryStore() {
    @Override
    public List<ChatMessage> getMessages(ChatMemoryId id) {
        // load from a database/Redis
        return loadFromDB(id);
    }
    
    @Override
    public void updateMessages(ChatMemoryId id, List<ChatMessage> messages) {
        // persist to a database/Redis
        saveToDB(id, messages);
    }
    
    @Override
    public void deleteMessages(ChatMemoryId id) {
        deleteFromDB(id);
    }
};

ChatMemory memory = MessageWindowChatMemory.builder()
    .chatMemoryStore(store)
    .maxMessages(50)
    .build();

5.3 Conversation ID Management

ChatMemoryId memoryId = ChatMemoryId.of("user-123", "conversation-456");
memory.add(memoryId, UserMessage.from("Hello"));

5.4 Summarizing Memory

Prevents unbounded history growth

import dev.langchain4j.memory.chat.SummarizingChatMemory;

ChatMemory memory = SummarizingChatMemory.with(
    model,  // the LLM used for summarization
    TokenWindowChatMemory.withMaxTokens(2000),
    SummarizationStrategy.WITHIN_TIME_WINDOW  // e.g., summarize every 10 minutes
);

6. RAG Implementation

6.1 The Complete RAG Pipeline

// 1. Load documents
Document document = Document.from("path/to/file.pdf");
List<Document> documents = List.of(document);

// 2. Split into chunks
TextSplitter splitter = RecursiveCharacterTextSplitter.builder()
    .chunkSize(512)
    .chunkOverlap(50)
    .build();
List<TextSegment> segments = splitter.split(documents);

// 3. Embed and store
EmbeddingModel embeddingModel = new OpenAiEmbeddingModel(...);
VectorStore vectorStore = new PineconeVectorStore(...);
EmbeddingStore<TextSegment> embeddingStore = new EmbeddingStoreAdapter<>(vectorStore);

embeddingModel.embedAll(segments, embeddingStore);

// 4. Retrieve
RetrievalQuery query = RetrievalQuery.query("What is RAG?");
List<Content> relevant = embeddingStore.findRelevant(query, 5);

// 5. Build the prompt
PromptTemplate promptTemplate = PromptTemplate.from(
    "Context:\n{{context}}\n\nQuestion: {{question}}\nAnswer:"
);
Prompt prompt = promptTemplate.apply(
    Map.of(
        "context", relevant.stream().map(Content::text).collect(Collectors.joining("\n---\n")),
        "question", "What is RAG?"
    )
);

// 6. Generate the answer
String answer = model.generate(prompt.text());

// 7. Return (answer + cited sources)

6.2 Using ConversationalRetrievalChain (the packaged approach)

import dev.langchain4j.chain.qa.ConversationalRetrievalChain;

ConversationalRetrievalChain chain = ConversationalRetrievalChain.builder()
    .embeddingStore(embeddingStore)
    .embeddingModel(embeddingModel)
    .chatLanguageModel(model)
    .chatMemory(memory)
    .build();

String answer = chain.execute("What is RAG?");

6.3 Reranking

import dev.langchain4j.retriever.Reranker;

Reranker reranker = new CrossEncoderReranker(
    "BAAI/bge-reranker-base"
);

List<Content> retrieved = retriever.retrieve(query);
List<ScoredContent<Content>> reranked = reranker.rerank(query, retrieved);

7. Agents in Depth

7.1 Tool Calling (Function Calling)

Supports OpenAI-style function calling

ChatLanguageModel model = new OpenAiChatModel(
    OpenAiChatModelOptions.builder()
        .modelName("gpt-4-turbo-preview")
        .functionRegistry(registry)  // register tools
        .build()
);

// The model decides on its own when to call a tool
ChatResponse response = model.chat(
    ChatRequest.builder()
        .messages(UserMessage.from("What's the weather in Tokyo?"))
        .build()
);

if (response.aiMessage().hasToolExecutionRequests()) {
    List<ToolExecutionRequest> requests = response.aiMessage().toolExecutionRequests();
    for (ToolExecutionRequest request : requests) {
        String result = registry.execute(request);  // run the tool
        // pass the result back to the model
    }
}

7.2 Planning

Decomposing complex tasks

import dev.langchain4j.agent.planner.Planner;
import dev.langchain4j.agent.planner.SimplePlanner;

Planner planner = SimplePlanner.builder()
    .chatLanguageModel(model)
    .tools(tools)
    .maxIterations(10)
    .build();

Plan plan = planner.plan("Analyze quarterly sales and send report via email");

for (Step step : plan.steps()) {
    // execute the step
}

7.3 Multi-Agent Collaboration

import dev.langchain4j.chain.MultiPromptRouterChain;
import dev.langchain4j.chain.MultiRouteChain;

// Route to different expert agents
MultiPromptRouterChain router = MultiPromptRouterChain.builder()
    .llm(model)
    .promptTemplates(Map.of(
        "legal", legalPromptTemplate,
        "finance", financePromptTemplate,
        "tech", techPromptTemplate
    ))
    .build();

String result = router.execute(question);

8. Callbacks and Monitoring

8.1 The Callback Mechanism

Intercept key events

import dev.langchain4j.callback.*;

CallbackManager callbackManager = new CallbackManager.Builder()
    .addListener(new StdOutCallbackListener())  // print to the console
    .addListener(new LoggingCallbackListener())
    .addListener(new MetricsCallbackListener())
    .build();

// Pass it to every component
ChatLanguageModel model = new OpenAiChatModel(..., callbackManager);
Chain chain = new LLMChain(..., callbackManager);
Agent agent = new Agent(..., callbackManager);

// Callbacks fire during execution
chain.execute(...);

8.2 A Custom Listener

public class MetricsListener extends AbstractCallbackListener {
    
    private final MeterRegistry meterRegistry;
    private Timer.Sample sample;
    
    public MetricsListener(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }
    
    @Override
    public void onLLMStart(LLMStartData data) {
        sample = Timer.start(meterRegistry);
    }
    
    @Override
    public void onLLMEnd(LLMEndData data) {
        sample.stop(meterRegistry.timer("llm.request"));
        meterRegistry.counter("llm.tokens").increment(data.tokenUsage().total());
    }
    
    @Override
    public void onChainStart(ChainStartData data) {
        // record chain start
    }
    
    @Override
    public void onChainEnd(ChainEndData data) {
        // record chain end and duration
    }
}

8.3 Distributed Tracing

OpenTelemetry integration

import dev.langchain4j.callback.opus.OpusCallback;

OpusCallback opus = OpusCallback.builder()
    .applicationName("my-rag-app")
    .environment("production")
    .build();

CallbackManager callbackManager = CallbackManager.builder()
    .addListener(opus)
    .build();

9. Spring Integration

9.1 Spring Boot Starter

Minimal configuration

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>0.30.0</version>
</dependency>

application.yml:

langchain4j:
  open-ai:
    api-key: ${OPENAI_API_KEY}
    model-name: gpt-4-turbo-preview
    temperature: 0.7
  embedding:
    model-name: text-embedding-ada-002
  vector-store:
    type: pinecone
    api-key: ${PINECONE_API_KEY}
    index-name: my-index

9.2 Auto-configured Beans

@SpringBootApplication
public class MyApp {
    
    @Bean
    public ChatLanguageModel chatLanguageModel() {
        return OpenAiChatModel.builder()
            .apiKey(env.get("OPENAI_API_KEY"))
            .modelName("gpt-4")
            .build();
    }
    
    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
            .apiKey(env.get("OPENAI_API_KEY"))
            .build();
    }
    
    @Bean
    public VectorStore vectorStore(EmbeddingModel embeddingModel) {
        return PineconeVectorStore.builder()
            .apiKey(env.get("PINECONE_API_KEY"))
            .indexName("my-index")
            .embeddingModel(embeddingModel)
            .build();
    }
}

9.3 Configuration Approaches Compared

Java config vs. Spring Boot auto-configuration

// Traditional approach
ChatLanguageModel model = OpenAiChatModel.builder().build();

// Spring Boot approach (auto-injection)
@Autowired
private ChatLanguageModel chatLanguageModel;

// Multiple models side by side
@Bean
@Qualifier("gpt4")
public ChatLanguageModel gpt4() { ... }

@Bean
@Qualifier("claude")
public ChatLanguageModel claude() { ... }

9.4 Comparison with Spring AI

LangChain4J vs. Spring AI

Feature               LangChain4J   Spring AI
Feature completeness  ★★★★★         ★★★☆☆
Spring integration    ★★★★☆         ★★★★★
Agent capabilities    ★★★★★         ★★☆☆☆
Enterprise features   ★★★★★         ★★★★☆
Learning curve        steep         gentle

  • Choose Spring AI for: simple RAG, fast time-to-value, deep Spring integration
  • Choose LangChain4J for: complex agents, advanced capabilities, production-grade monitoring

9.5 @ConfigurationProperties

@ConfigurationProperties(prefix = "langchain4j")
public class LangChain4JProperties {
    private OpenAi openAi = new OpenAi();
    private VectorStore vectorStore = new VectorStore();
    
    // getters & setters
    
    public static class OpenAi {
        private String apiKey;
        private String modelName;
        // ...
    }
}

10. Model Support

10.1 Cloud APIs

10.1.1 OpenAI

ChatLanguageModel model = new OpenAiChatModel(
    OpenAiChatModelOptions.builder()
        .apiKey("sk-...")
        .modelName("gpt-4-turbo-preview")
        .temperature(0.7)
        .maxTokens(2000)
        .organizationId("org-...")  // optional
        .build()
);

  • Supported: gpt-4, gpt-4-turbo, gpt-3.5-turbo, and embedding models
  • Streaming: OpenAiStreamingChatModel
  • Function calling: automatic Tool registration

10.1.2 Azure OpenAI

ChatLanguageModel model = new AzureOpenAiChatModel(
    AzureOpenAiChatModelOptions.builder()
        .endpoint("https://my-resource.openai.azure.com")
        .apiKey("...")
        .deploymentName("gpt-4-deployment")
        .apiVersion("2024-02-15-preview")
        .build()
);

10.1.3 Anthropic (Claude)

ChatLanguageModel model = new AnthropicChatModel(
    AnthropicChatModelOptions.builder()
        .apiKey("sk-ant-...")
        .modelName("claude-3-opus-20240229")
        .maxTokens(4000)
        .temperature(0.7)
        .build()
);

  • Supported: claude-3-opus / sonnet / haiku

10.1.4 Google PaLM / Gemini

ChatLanguageModel model = new GoogleAiChatModel(
    GoogleAiChatModelOptions.builder()
        .apiKey("...")
        .modelName("gemini-pro")
        .temperature(0.8)
        .build()
);

10.2 Local Inference

10.2.1 Ollama

Run open-source models locally

ChatLanguageModel model = new OllamaChatModel(
    OllamaChatModelOptions.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama2:13b")  // or codellama, mistral
        .temperature(0.7)
        .build()
);

Advantages

  • No API fees
  • Data never leaves your network
  • Low latency (local network)

Requirements: a GPU (recommended) or sufficient RAM

10.2.2 HuggingFace

Using Transformers models

ChatLanguageModel model = new HuggingFaceChatModel(
    HuggingFaceChatModelOptions.builder()
        .accessToken("hf_...")
        .modelId("meta-llama/Llama-2-13b-chat-hf")
        .temperature(0.7)
        .build()
);

Note: requires ample memory in the Java environment (>32 GB)

10.2.3 LocalAI

Local models behind an OpenAI-compatible API

ChatLanguageModel model = new OpenAiChatModel(
    OpenAiChatModelOptions.builder()
        .baseUrl("http://localhost:8080/v1")  // LocalAI endpoint
        .apiKey("not-needed")
        .modelName("llama2:13b")
        .build()
);

10.3 Embedding Models

EmbeddingModel embeddingModel = new OpenAiEmbeddingModel(
    OpenAiEmbeddingModelOptions.builder()
        .apiKey("...")
        .modelName("text-embedding-3-small")
        .build()
);

// single text
List<Float> embedding = embeddingModel.embed("Hello world");

// batch
List<List<Float>> embeddings = embeddingModel.embedAll(List.of("text1", "text2"));

Local embedding models

EmbeddingModel embeddingModel = new HuggingFaceEmbeddingModel(
    HuggingFaceEmbeddingModelOptions.builder()
        .modelId("BAAI/bge-small-en-v1.5")
        .build()
);

11. Vector Stores

11.1 Supported Vector Databases

Full list

  • Pinecone (managed cloud)
  • Weaviate (open source + cloud)
  • Qdrant (open source + cloud)
  • Milvus (open source + cloud)
  • Elasticsearch (hybrid full-text + vector)
  • Azure AI Search
  • OpenSearch
  • Cassandra/ScyllaDB
  • Redis (RediSearch module)
  • Neo4j (graph + vector)
  • Chroma (lightweight)
  • Vespa
  • Custom: implement the VectorStore interface

11.2 Vector Store Usage

import dev.langchain4j.store.embedding.VectorStore;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.RetrievedEmbedding;

// common interface
EmbeddingStore<TextSegment> store = new PineconeVectorStore(...);

// add
store.add(Embeddable.from(segment, embedding));

// similarity search
List<RetrievedEmbedding<TextSegment>> results = store.findRelevant(
    query, 5
);

11.3 Vector Search with Metadata Filters

import dev.langchain4j.store.embedding.MetadataFilter;
import dev.langchain4j.store.embedding.MetadataFilterBuilder;

List<RetrievedEmbedding<TextSegment>> results = store.findRelevant(
    query,
    5,
    MetadataFilterBuilder
        .metadata("author", "John Doe")
        .and(MetadataFilterBuilder.metadata("date", ">2024-01-01"))
        .build()
);

11.4 Vector Store Configuration Examples

11.4.1 Pinecone

VectorStore store = PineconeVectorStore.builder()
    .apiKey(System.getenv("PINECONE_API_KEY"))
    .indexName("my-index")
    .environment("us-east-1")
    .embeddingModel(embeddingModel)
    .build();

11.4.2 Milvus

VectorStore store = MilvusVectorStore.builder()
    .host("localhost")
    .port(19530)
    .database("default")
    .collectionName("my_collection")
    .embeddingModel(embeddingModel)
    .build();

11.4.3 PostgreSQL + pgvector

VectorStore store = PgVectorVectorStore.builder()
    .host("localhost")
    .port(5432)
    .database("mydb")
    .user("user")
    .password("pass")
    .tableName("embeddings")
    .dimension(1536)
    .build();

12. Performance Optimization

12.1 Batch Embedding

List<String> texts = List.of("text1", "text2", ...);
List<List<Float>> embeddings = embeddingModel.embedAll(texts);  // one batched API call

for (int i = 0; i < texts.size(); i++) {
    store.add(Embeddable.from(texts.get(i), embeddings.get(i)));
}

12.2 Caching Strategies

12.2.1 Embedding Cache

EmbeddingModel cachingEmbeddingModel = EmbeddingModelAdapters.cache(
    embeddingModel,
    Caffeine.newBuilder()
        .maximumSize(10_000)
        .expireAfterAccess(1, TimeUnit.HOURS)
        .build()
);

12.2.2 LLM Response Cache

ChatLanguageModel cachingModel = ChatLanguageModelAdapters.cache(
    model,
    Caffeine.newBuilder()
        .maximumSize(1_000)
        .expireAfterWrite(10, TimeUnit.MINUTES)
        .build()
);

Note: cache keys are derived from a hash of the prompt, so any change to the prompt invalidates the entry.
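
A minimal sketch of prompt-hash keying in plain Java, independent of Caffeine or any LangChain4J adapter (all names are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class PromptHashCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> model;  // the (expensive) LLM call

    public PromptHashCache(Function<String, String> model) {
        this.model = model;
    }

    // Key = SHA-256 of the exact prompt text
    static String key(String prompt) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(prompt.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Call the model only on a cache miss
    public String generate(String prompt) {
        return cache.computeIfAbsent(key(prompt), k -> model.apply(prompt));
    }
}
```

Because the key is a digest of the exact prompt text, even a one-character change produces a new key, so stale entries are never served for a modified prompt.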

12.3 Async and Non-blocking

CompletableFuture<String> future = model.generateAsync("Hello");
future.thenAccept(answer -> {
    // handle the answer
});

12.4 Connection Pool Configuration

// HTTP client configuration
HttpClientBuilderCustomizer customizer = httpClientBuilder -> 
    httpClientBuilder
        .maxConnections(100)
        .maxConnectionsPerRoute(50)
        .connectionTimeout(10, TimeUnit.SECONDS)
        .responseTimeout(60, TimeUnit.SECONDS);

12.5 Timeouts and Retries

import dev.langchain4j.retry.RetryHelper;

RetryHelper retryHelper = RetryHelper.builder()
    .maxAttempts(3)
    .initialInterval(100, TimeUnit.MILLISECONDS)
    .maxInterval(5, TimeUnit.SECONDS)
    .multiplier(2.0)
    .build();

String result = retryHelper.execute(() -> model.generate(prompt));

12.6 Cost Control

// Track token usage
TokenUsage tokenUsage = response.tokenUsage();
log.info("Tokens: prompt={}, completion={}, total={}",
    tokenUsage.inputTokenCount(),
    tokenUsage.outputTokenCount(),
    tokenUsage.totalTokenCount()
);

// budget cap
if (tokenUsage.totalTokenCount() > MAX_DAILY_TOKENS) {
    throw new BudgetExceededException();
}

13. Testing and Evaluation

13.1 Unit Tests

@Test
void testChain() {
    // Mock LLM
    ChatLanguageModel mockModel = mock(ChatLanguageModel.class);
    when(mockModel.generate(any()))
        .thenReturn(ChatResponse.from(AiMessage.from("mocked answer")));
    
    // test the chain
    LLMChain chain = LLMChain.builder()
        .prompt(promptTemplate)
        .llm(mockModel)
        .build();
    
    String result = chain.execute(Map.of("question", "test")).orElseThrow();
    
    assertEquals("mocked answer", result);
}

13.2 Evaluation Framework

import dev.langchain4j.evaluation.Evaluator;
import dev.langchain4j.evaluation.qa.QAEvaluator;

QAEvaluator evaluator = QAEvaluator.builder()
    .chatLanguageModel(model)  // use GPT-4 as the judge
    .build();

EvaluationResult result = evaluator.evaluate(
    EvaluationInput.builder()
        .referenceAnswer("Expected answer")
        .actualAnswer("Generated answer")
        .query("Question")
        .build()
);

result.criteria().forEach(criteria -> {
    log.info("{}: {}", criteria.name(), criteria.score());
});

Evaluation dimensions

  • Correctness: is the answer right
  • Faithfulness: grounded in the context, no hallucination
  • Relevance: is the answer on topic
  • Coherence: does it read coherently
  • Toxicity: harmful content
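
For dashboards, per-dimension scores are often collapsed into a single number; a trivial weighted-average sketch (dimension names and weights here are illustrative, not part of any evaluator API):

```java
import java.util.Map;

public class EvalScore {

    // Weighted average of per-dimension scores in [0, 1]
    public static double aggregate(Map<String, Double> scores, Map<String, Double> weights) {
        double total = 0, weightSum = 0;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            double w = weights.getOrDefault(e.getKey(), 1.0);  // unlisted dims weigh 1
            total += w * e.getValue();
            weightSum += w;
        }
        return weightSum == 0 ? 0 : total / weightSum;
    }

    public static void main(String[] args) {
        double s = aggregate(
            Map.of("correctness", 1.0, "faithfulness", 0.5),
            Map.of("correctness", 2.0, "faithfulness", 1.0));
        System.out.println(s);  // (2 * 1.0 + 1 * 0.5) / 3
    }
}
```

Weighting lets you emphasize whichever dimension matters most for the application, e.g. faithfulness for RAG.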

14. Production Best Practices

14.1 Configuration Management

@Configuration
public class LangChainConfig {
    
    @Autowired
    private Environment env;
    
    @Value("${langchain4j.openai.api-key}")
    private String apiKey;
    @Bean
    public ChatLanguageModel chatLanguageModel() {
        return OpenAiChatModel.builder()
            .apiKey(apiKey)
            .modelName(env.getProperty("langchain4j.openai.model", "gpt-4"))
            .temperature(
                env.getProperty("langchain4j.openai.temperature", Double.class, 0.7)
            )
            .callTimeout(30, TimeUnit.SECONDS)
            .maxRetries(3)
            .build();
    }
}

14.2 Multi-environment Configuration

# application-dev.yml
langchain4j:
  open-ai:
    api-key: ${OPENAI_API_KEY_DEV}
    model-name: gpt-3.5-turbo

# application-prod.yml
langchain4j:
  open-ai:
    api-key: ${OPENAI_API_KEY_PROD}
    model-name: gpt-4-turbo
    max-retries: 5

14.3 Health Checks

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;

@Component
public class LangChainHealthIndicator implements HealthIndicator {
    
    @Autowired
    private ChatLanguageModel model;
    
    @Override
    public Health health() {
        try {
            // simple probe call
            model.generate("test");
            return Health.up().build();
        } catch (Exception e) {
            return Health.down(e).build();
        }
    }
}

14.4 Exposing Metrics (Micrometer)

@Bean
public MeterRegistryCustomizer<MeterRegistry> langchainMetrics() {
    return registry -> {
        Counter.builder("langchain.requests.total")
            .description("Total LLM requests")
            .register(registry);
        
        Timer.builder("langchain.request.duration")
            .description("LLM request duration")
            .register(registry);
    };
}

15. FAQ

Q1: How do I choose a model?

  • Highest quality: GPT-4, Claude 3 Opus
  • Cost-effective: GPT-4o mini, Claude 3 Haiku
  • Chinese-optimized: domestic Chinese models (Tongyi, Wenxin, ChatGLM) via OpenAI-compatible APIs
  • Free self-hosting: Llama 3, Mistral (locally via Ollama)

Q2: How do I handle long contexts?

  • Use a long-context model (GPT-4-32k, Claude 200k)
  • Chunked retrieval (RAG)
  • Cap memory length with TokenWindowChatMemory
  • Compress history with summarization

Q3: How do I reduce token consumption?

  • Trim prompts (cut redundancy)
  • Cache frequent results
  • Use smaller models for simple tasks
  • Cap response length (maxTokens)
  • Return only the top relevant chunks during retrieval

Q4: How do I debug prompts?

import dev.langchain4j.prompt.PromptTemplate;

PromptTemplate prompt = PromptTemplate.from("{{input}}");
String rendered = prompt.apply(Map.of("input", "test")).text();
log.debug("Prompt: {}", rendered);  // inspect the prompt actually sent

Q5: How do I stream responses over WebSocket?

@ServerEndpoint("/chat/stream")
public class ChatWebSocket {
    
    @OnMessage
    public void onMessage(String message, Session session) {
        streamingModel.chat(
            ChatRequest.builder()
                .messages(UserMessage.from(message))
                .build(),
            new StreamingChatResponseHandler() {
                @Override
                public void onPartialResponse(String token) {
                    session.getAsyncRemote().sendText(token);  // non-blocking send
                }

                @Override
                public void onCompleteResponse(ChatResponse response) { }

                @Override
                public void onError(Throwable error) { }
            }
        );
    }
}

Q6: How do I handle exceptions?

try {
    String response = model.generate(prompt);
} catch (Exception e) {
    if (e instanceof RateLimitException) {
        // rate-limited: back off and retry
    } else if (e instanceof InvalidRequestException) {
        // bad request: log it and degrade gracefully
        return fallbackResponse();
    } else {
        // network timeouts and other runtime failures
        throw new ServiceUnavailableException("LLM service down", e);
    }
}

Q7: How do I ensure thread safety?

  • ChatLanguageModel is stateless and thread-safe
  • ChatMemory is NOT thread-safe; use a separate instance per session
  • PromptTemplate is immutable and thread-safe
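
A common way to honor the per-session rule is to keep one memory instance per conversation ID; a library-free sketch using a synchronized message list as a stand-in for ChatMemory (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SessionMemories {
    // One memory (here: a plain message list) per conversation ID
    private final Map<String, List<String>> memories = new ConcurrentHashMap<>();

    public List<String> forSession(String sessionId) {
        // computeIfAbsent is atomic, so concurrent callers get the same instance
        return memories.computeIfAbsent(
            sessionId, id -> Collections.synchronizedList(new ArrayList<>()));
    }

    public void add(String sessionId, String message) {
        forSession(sessionId).add(message);
    }
}
```

Two requests racing on the same session still share one list, while different sessions never see each other's history.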

Q8: How do I support multi-tenancy?

public class TenantAwareChatService {
    
    public String chat(String tenantId, String message) {
        // 1. Pick the model for this tenant (different quotas)
        ChatLanguageModel model = modelRegistry.getModelForTenant(tenantId);
        
        // 2. Tenant-scoped memory (include the tenant in the Redis key)
        ChatMemory memory = getMemoryForTenant(tenantId);
        
        // 3. Tenant permission checks (tool filtering)
        List<Tool> allowedTools = getToolsForTenant(tenantId);
        
        // 4. Audit logging
        auditLog(tenantId, message);
        
        return model.chat(...);
    }
}

Q9: How do I evaluate RAG quality?

  • Human evaluation: score a random sample
  • Automated evaluation: GPT-4 as judge (relevancy, faithfulness)
  • A/B testing: compare chunking strategies
  • User feedback: collect 👍/👎 signals

Q10: What should I monitor in production?

  • Business: QPS, latency (P50/P95/P99), error rate, user satisfaction
  • Model: token usage, cost, per-model traffic share, rate-limit hits
  • Vector store: query latency, QPS, memory/CPU, hit rate
  • Application: JVM memory, GC counts, thread pool state, database connections

16. Summary

LangChain4J's core strengths

  1. Feature-complete: covers the full LLM application spectrum (prompts, chains, agents, memory, RAG)
  2. Java-native: type-safe, compile-time checked, enterprise-grade
  3. Production-ready: async, streaming, monitoring, retries, caching
  4. Multi-model: OpenAI, Azure, Anthropic, local Ollama, and more
  5. Spring integration: Spring Boot starter with auto-configuration

Compared with alternatives

  • LangChain (Python): same feature set with a more mature ecosystem; the Java port follows its lead
  • Spring AI: simpler, deeply Spring-integrated, good for quick RAG
  • LlamaIndex: RAG-focused with stronger retrieval; can serve as a retriever component alongside LangChain4J

Suggested learning path

  1. Getting started (1-2 days): PromptTemplate + LLMChain + simple RAG
  2. Intermediate (1 week): agents + tools + memory + complex chains
  3. Advanced (1 month): custom components, production deployment, performance tuning
  4. Expert (months): extending the framework, reading the source

A good fit for teams with

  • A Java stack: backend services, microservices, enterprise applications
  • Existing Spring Boot apps: seamless integration
  • Complex agent logic: tool calling, multi-step reasoning
  • Strict production requirements: monitoring, stability, maintainability

When not to use it

  • Simple RAG: Spring AI or LlamaIndex is simpler
  • Python teams: use LangChain for Python directly
  • Hard real-time stream processing: use a dedicated streaming framework (Flink)

Document version: v1.0
Last updated: March 2025
Reference: https://langchain4j.dev
