LangChain Testing & Evaluation

AgentCI integrates with your CI/CD pipeline to automatically test and evaluate LangChain agents. Whether you build with LangGraph, the classic AgentExecutor pattern, or custom tools, AgentCI discovers and evaluates your LangChain code without requiring any modifications to it.

LangChain evals that integrate with your existing development workflow

AgentCI automatically discovers and evaluates LangChain agents, including:

  • Agent discovery: create_react_agent() calls, custom agents, and tool definitions
  • Evaluation types: Accuracy, safety, performance, and consistency testing
  • CI/CD integration: Automated testing on pull requests via GitHub
  • Zero code changes: No modifications to your Python code required
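
To make the "consistency" evaluation type more concrete, here is a minimal sketch of one way such a check could be computed: run the same prompt several times and measure how often the agent gives its most common answer. This is an illustration only, not AgentCI's actual scoring method; `consistency_score` and `fake_agent` are hypothetical names, and a real check would call your agent instead of the stand-in.

```python
from collections import Counter
from typing import Callable


def consistency_score(agent_fn: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Return the fraction of runs that produce the most common answer.

    Answers are compared after trimming whitespace and lowercasing.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    answers = [agent_fn(prompt).strip().lower() for _ in range(runs)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / runs


# Hypothetical stand-in for a real agent call (assumption, for illustration)
def fake_agent(prompt: str) -> str:
    return "Paris"


score = consistency_score(fake_agent, "What is the capital of France?")
print(f"consistency: {score:.2f}")
```

A score of 1.0 means every run agreed; lower values indicate that the agent's answers vary between runs for the same input.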

Supported Agent Patterns

LangGraph create_react_agent (Recommended)

from langgraph.prebuilt import create_react_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langgraph.checkpoint.memory import MemorySaver

custom_prompt = ChatPromptTemplate([
    ("system", "You are a research assistant."),
    MessagesPlaceholder("messages")
])

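# In-memory checkpointer that persists conversation state between calls
memory = MemorySaver()

# search_web and calculator_tool are tools defined elsewhere in your project
agent = create_react_agent(
    model="openai:gpt-4",
    tools=[search_web, calculator_tool],
    prompt=custom_prompt,
    checkpointer=memory
)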

Legacy create_react_agent

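from langchain.agents import create_react_agent
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate

# The legacy API expects a prompt template, not a plain string. The template
# must include the {tools}, {tool_names}, and {agent_scratchpad} variables.
prompt = PromptTemplate.from_template(
    "You are a helpful assistant with access to these tools:\n{tools}\n\n"
    "Follow the format: Thought, Action (one of [{tool_names}]), "
    "Action Input, Observation, Final Answer.\n\n"
    "Question: {input}\nThought:{agent_scratchpad}"
)

llm = init_chat_model("openai:gpt-4")
agent = create_react_agent(
    llm=llm,
    tools=[search_web, calculator_tool],
    prompt=prompt
)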

Legacy initialize_agent

from langchain.agents import initialize_agent
from langchain.chat_models import init_chat_model

llm = init_chat_model("openai:gpt-4")
agent = initialize_agent(
    llm=llm,
    tools=[search_web, calculator_tool]
)

Supported Tool Patterns

@tool Decorator

from langchain_core.tools import tool
from pydantic import BaseModel, Field
from typing import Literal

# Tool with Pydantic schema and custom name
class WeatherInput(BaseModel):
    location: str = Field(description="City name or coordinates")
    units: Literal["celsius", "fahrenheit"] = Field(default="celsius")

@tool("get_weather", args_schema=WeatherInput)
def get_weather(location: str, units: str = "celsius") -> str:
    """Get current weather for a location."""
    return f"Weather in {location}: 72°{units[0].upper()}, sunny"

StructuredTool.from_function

from langchain_core.tools import StructuredTool

def calculate_sum(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

calculator_tool = StructuredTool.from_function(
    func=calculate_sum,
    name="Calculator",
    description="Perform addition of two numbers"
)

BaseTool Subclass

from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field

class CustomSearchInput(BaseModel):
    query: str = Field(description="Search query")
    max_results: int = Field(default=10)

class CustomSearchTool(BaseTool):
    name: str = "CustomSearch"
    description: str = "Search with customizable result limits"
    args_schema: type[BaseModel] = CustomSearchInput

    def _run(self, query: str, max_results: int = 10) -> str:
        return f"Found {max_results} results for: {query}"

custom_search = CustomSearchTool()

Plain Functions

from datetime import datetime
from langgraph.prebuilt import create_react_agent

def get_time() -> str:
    """Get the current time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# Pass the plain function directly; LangGraph wraps it as a tool
time_agent = create_react_agent(
    model="openai:gpt-3.5-turbo",
    tools=[get_time],
    prompt="You are a time-keeping assistant."
)
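
When a plain function is used as a tool, its name, docstring, and type hints are the metadata a framework has available to describe the tool to the model, which is why a docstring is worth including. The standard-library sketch below shows what that metadata looks like. `describe_function` is a hypothetical helper written for this illustration; it is not part of LangChain, LangGraph, or AgentCI.

```python
import inspect
from datetime import datetime
from typing import Any, Callable


def describe_function(func: Callable[..., Any]) -> dict[str, Any]:
    """Collect the name, docstring, and parameters of a plain function."""
    params: dict[str, dict[str, Any]] = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = None  # no type hint was given
        else:
            type_name = getattr(annotation, "__name__", str(annotation))
        params[name] = {
            "type": type_name,
            # A parameter without a default value must be supplied by the caller
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }


def get_time() -> str:
    """Get the current time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def calculate_sum(a: int, b: int = 0) -> int:
    """Add two numbers."""
    return a + b


print(describe_function(get_time))
print(describe_function(calculate_sum))
```

A function with no docstring yields an empty description here, which gives the model nothing to go on when deciding whether to call the tool.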

What Gets Auto-Discovered

AgentCI automatically finds:

  • create_react_agent() calls with model, prompt, and tools parameters
  • Functions decorated with @tool
  • Tools created with StructuredTool.from_function()
  • Classes inheriting from BaseTool
  • Plain Python functions used as tools

No configuration or code changes required.
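
The patterns listed above can all be recognized by reading source code without executing it. The sketch below shows one plausible way to do that with Python's built-in `ast` module. It is an illustration under stated assumptions, not AgentCI's actual discovery implementation; `discover`, `dotted_name`, and the `SOURCE` sample are hypothetical, and the sketch matches names literally, so it would miss aliased imports.

```python
import ast

# Sample module text to scan (assumption: stands in for a file in your repo)
SOURCE = '''
from langchain_core.tools import tool, BaseTool, StructuredTool
from langgraph.prebuilt import create_react_agent

@tool
def search_web(query: str) -> str:
    """Search the web."""
    return query

class CustomSearchTool(BaseTool):
    name: str = "CustomSearch"

def calculate_sum(a: int, b: int) -> int:
    return a + b

calculator_tool = StructuredTool.from_function(func=calculate_sum, name="Calculator")
agent = create_react_agent(model="openai:gpt-4", tools=[search_web, calculator_tool])
'''


def dotted_name(node: ast.AST) -> str:
    """Return 'a.b' for Name/Attribute chains, unwrapping calls such as @tool(...)."""
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return f"{dotted_name(node.value)}.{node.attr}"
    return ""


def discover(source: str) -> dict[str, list]:
    """Scan source text for agent and tool definitions without running it."""
    found: dict[str, list] = {
        "agents": [],            # line numbers of create_react_agent(...) calls
        "agent_tools": [],       # names listed in tools=[...]
        "decorated_tools": [],   # functions decorated with @tool
        "structured_tools": [],  # functions passed to StructuredTool.from_function
        "tool_classes": [],      # classes inheriting from BaseTool
    }
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(dotted_name(d) == "tool" for d in node.decorator_list):
                found["decorated_tools"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            if any(dotted_name(b) == "BaseTool" for b in node.bases):
                found["tool_classes"].append(node.name)
        elif isinstance(node, ast.Call):
            name = dotted_name(node)
            if name == "create_react_agent":
                found["agents"].append(node.lineno)
                for kw in node.keywords:
                    if kw.arg == "tools" and isinstance(kw.value, ast.List):
                        found["agent_tools"].extend(dotted_name(e) for e in kw.value.elts)
            elif name == "StructuredTool.from_function":
                for kw in node.keywords:
                    if kw.arg == "func":
                        found["structured_tools"].append(dotted_name(kw.value))
    return found


print(discover(SOURCE))
```

Because the scan only parses text, it needs no API keys and never runs your agent code, which is consistent with discovery that requires no code changes.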

Next Steps