AI 框架层出不穷，前面 instructor 还没有研究明白，这边 Pydantic 官方团队又出了一个 pydantic-ai。

使用框架有好有坏，比如框架一哥 LangChain，工具链一套一套的，对话管理、链式调用、外部数据源交互也都是应有尽有。如果用的熟练，开发一个 LLM 应用也就是分分钟的事。

但可惜我用的不熟，之前我在《LangChain 小记：LLM 多任务并行处理》也夸过 LangChain 在多任务并行上处理如何如何方便，而这两天就踩坑了，在一个 Python-3.13 环境里，框架的依赖库始终无法安装成功，最后只能移除了框架，手动撸了一个责任链。

鉴于我的 LLM 应用主要场景是数据获取 + 信息提取 + 结构化输出，所以我对框架的主要需求是：

依赖少，不能依赖太多第三方库。一个 lib 夹带 numpy, pandas, matplotlib 等若干重量级 libs，这种包是万万要不得的。
结构化输出简单易配置。这一点 instructor 及 pydantic-ai 做得都不错，巧妙地利用了 Pydantic 的 schemas，可以方便的引导 LLM 输出结构化数据。
调用简单，设计直观。这一点同样有利有弊，封装的过于彻底，就不方便自定义配置，但过于松散，又不利于上手。

PydanticAI 在这几点上做得都不错，虽然框架刚推出不久，且在持续迭代中，但已经可以满足大部分需求了。当然也有一些坑需要注意，比如下面提到的工具调用时，行为不一致的问题。

模型支持

在模型支持上，主流模型基本上都覆盖到了，OpenAI、Anthropic、Gemini、Ollama、Groq 等服务商的模型都可以直接使用，其他的服务商，比如 Cohere，开发团队也在积极适配中。

根据官方文档给出的示例，pydantic_ai.models 模块中提供了特定模型类，可以方便的配置 base_url、api_key 等参数，比如：

from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel(
    'anthropic/claude-3.5-sonnet',
    base_url='https://openrouter.ai/api/v1',
    api_key='your-api-key',
)
agent = Agent(model)

但我实际测试时，发现 OpenAIModel 的 __init__ 方法并没有 base_url 参数，不知道是哪里出了问题。

另一种自定义 base_url 的方法

另一个稍微复杂一点的方法是，先借用 openai.AsyncOpenAI 来创建一个 client，然后传入到 OpenAIModel 中，最后再传给 Agent。

from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai import Agent, Tool

clint = AsyncOpenAI(
    base_url='https://api.deepseek.com/v1',
    api_key='sk-7bxxx',
)

model = OpenAIModel(
    'deepseek-chat',
    openai_client=clint,
)

agent = Agent(
    model,
    result_type=ResponseType,
    system_prompt='You are a web page analyser. You will be given a web page and you will need to analyse it and return the result in the format of ResponseType.',
    tools=[Tool(get_content)]
)

如果已经配置了环境变量，也可以忽略掉 model 对象的创建，直接使用 Agent 类。比如:

agent = Agent('gpt-4o')
agent2 = Agent('anthropic/claude-3.5-sonnet')

工具调用

作用相当于 RAG 中的 'R'，但对比传统的 vector search，这里的函数调用更通用，能提供任意格式的输入（文本格式）。

通过 Agent 类中的 tools 参数，可以方便的配置工具函数。

async def get_context() -> str:
    url='https://www.paulgraham.com/'
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

agent = Agent(
    'openai:gpt-4o-mini',
    tools=[Tool(get_context)]
)

当然也可以通过 @agent.tool 装饰器来定义函数，比如：

@agent.tool
async def get_context() -> str:
    url='https://www.paulgraham.com/'
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

Note: 0.0.12 版本中的 tools 有个 bug，如果使用的是非 OpenAI 模型，则有可能不会执行函数。参见 issue ¹。

如果上下文信息的获取需要通过配置实现怎么办？比如上例中的网址不是 https://www.paulgraham.com/，而是 https://www.bbc.com/，再或者我不想用 httpx 而是 aiohttp，又该如何操作呢？

这个问题也好解决，通过依赖注入，可以方便的配置函数。

依赖注入

实际上是配置了一个上下文环境，把相关信息添加到上下文环境里，然后通过依赖注入，在调用函数时使用。

@dataclass
class ExtraDeps:
    url: str
    client: httpx.AsyncClient

agent = Agent(
    'openai:gpt-4o-mini',
    deps_type=ExtraDeps,
)

@agent.tool
async def get_context(ctx: RunContext[ExtraDeps]) -> str:
    response = await ctx.deps.client.get(ctx.deps.url)
    return response.text

async def main():
    deps = ExtraDeps(url='https://www.bbc.com/', client=httpx.AsyncClient())
    result = await agent.run(deps, deps=deps)
    print(result)

结构化输出

这个也简单，在 Agent 中指定 result_type（BaseModel）即可。

注意：如果文本中里没有包含相关信息，最好在模型里设置好默认值，并在下游妥善处理，不然 LLM 返回的结果可能是随意编造的，或者是完全不相关的内容。

from pydantic_ai import Agent
import pydantic
import asyncio

class UserInfo(pydantic.BaseModel):
    name: str
    age: int = pydantic.Field(description='age in years', default=25)
    email: str

agent = Agent(
    'gemini-1.5-flash',
    result_type=UserInfo,
)

async def main():
    text = 'Tom is tall, with last name is Li, his email is [email protected].'
    result = await agent.run(text)
    print(result.data)

PydanticAI 是 model schema 不知道是如何传入到 LLM 的，但 instructor 是通过隐式 prompt 注入来实现的，查看源码，就可以发现是通过注入了下面的 prompt 来实现的，感觉就非常机智。

message = dedent(
    f'''You are a genius assistant, your task is to understand the content and provide the parsed objects in json that matches the following json_schema: \n

    {json.dumps(response_model.model_json_schema())}

    Make sure to return an instance of the JSON, not the schema itself.'''
).strip()

题外话

instructor 的源码非常值得学习，比 pydantic-ai 要简洁清晰的多。很多函数都是寥寥数行，可见设计的非常精简。

当然，PydanticAI 的优势是异步支持比较好，其实代码是完全异步的，同步函数比如 run_sync 只是对异步函数进行了封装，因此对 fastapi 的适配就天然支持了。

文档

Footnotes

后来检验了一下，这个问题其实不是框架的 bug，而是部分模型不支持 function calling，groq 里支持 function calling 的模型有两个，llama3-groq-70b-8192-tool-use-preview, llama3-groq-8b-8192-tool-use-preview。但话说出来，如果函数调用不支持，那框架应该在调用时给出提示，而不是直接忽略。 ↩

Pydantic AI 初探

模型支持

另一种自定义 base_url 的方法

工具调用

依赖注入

结构化输出

题外话

文档

Footnotes

Comments

理想拖

Previous Article

Next Article