Custom Tools are deterministic Python workflows that assist your agents with fast, reliable operations. Instead of letting the agent figure out multi-step logic each time, you package it as a custom tool — predictable, testable, and reusable. Once deployed, custom tools become REST API endpoints and MCP tools — callable from agents, Claude conversations, external systems, or scheduled jobs.
Custom Tools are separate from agents. Agents are autonomous and use reasoning. Custom Tools are deterministic — same input, same output, every time.
Using Claude Code? Run /datagen:create-custom-tool for a guided workflow that handles schema design, implementation, testing, and deployment.

Create via Claude Code

Ask Claude to build and deploy your tool:
"Create a custom tool that enriches company leads using LinkedIn and Perplexity"
Or use the guided workflow:
/datagen:create-custom-tool
This walks you through:
  1. Plan — Define the tool’s purpose and input/output schema
  2. Implement — Write the Python logic using the DataGen SDK
  3. Test — Run in a sandbox with executeCode
  4. Deploy — Deploy as a reusable API endpoint

Writing the Code

Use the DataGen SDK to call MCP tools as Python functions:
from datagen_sdk import DatagenClient

client = DatagenClient()

# Call MCP tools
issues = client.execute_tool("mcp_Linear_list_issues", {
    "filter": {"state": {"name": {"eq": "In Progress"}}}
})

# Process data
active_count = len(issues)
result = f"Found {active_count} active issues"
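
One way to keep this kind of logic easy to check is to put the processing step in a function that receives the client as an argument, so it can be exercised with a stand-in before any real MCP call is made. The sketch below is illustrative only: `FakeClient` and `summarize_active_issues` are hypothetical names, not part of the DataGen SDK, and the code assumes `execute_tool` returns a list with one entry per issue, which this page does not confirm.

```python
class FakeClient:
    """Hypothetical stand-in for DatagenClient, for local checks only."""

    def __init__(self, canned):
        # Maps a tool name to the value execute_tool should return.
        self.canned = canned
        self.calls = []

    def execute_tool(self, name, params):
        # Record the call so a test can inspect what was requested.
        self.calls.append((name, params))
        return self.canned[name]


def summarize_active_issues(client):
    """Count in-progress Linear issues and build a summary string."""
    issues = client.execute_tool("mcp_Linear_list_issues", {
        "filter": {"state": {"name": {"eq": "In Progress"}}}
    })
    # Assumption: the tool returns a list with one entry per issue.
    active_count = len(issues)
    return f"Found {active_count} active issues"


fake = FakeClient({"mcp_Linear_list_issues": [{"id": "A-1"}, {"id": "A-2"}]})
summary = summarize_active_issues(fake)
print(summary)
```

Passing the client in, rather than constructing it inside the function, is a design choice that makes the same function usable with the real `DatagenClient()` and with a stand-in.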

Input Schema

Define what parameters your tool accepts using JSON Schema:
{
  "type": "object",
  "properties": {
    "campaign_id": {
      "type": "string",
      "description": "The campaign ID to analyze"
    },
    "days": {
      "type": "integer",
      "description": "Number of days to look back",
      "default": 30
    }
  },
  "required": ["campaign_id"]
}
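
To make the roles of `default` and `required` concrete, here is a minimal sketch of how a caller's inputs could be resolved against the schema above. `resolve_inputs` is a hypothetical helper written for illustration; this page does not say whether DataGen applies defaults or checks required parameters this way.

```python
def resolve_inputs(schema, provided):
    """Return the provided inputs with schema defaults filled in.

    Raises ValueError if a required parameter is still missing.
    """
    properties = schema.get("properties", {})
    resolved = dict(provided)

    # Fill in defaults for parameters the caller left out.
    for name, spec in properties.items():
        if name not in resolved and "default" in spec:
            resolved[name] = spec["default"]

    # Required parameters must be present after defaults are applied.
    missing = [name for name in schema.get("required", []) if name not in resolved]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return resolved


schema = {
    "type": "object",
    "properties": {
        "campaign_id": {
            "type": "string",
            "description": "The campaign ID to analyze",
        },
        "days": {
            "type": "integer",
            "description": "Number of days to look back",
            "default": 30,
        },
    },
    "required": ["campaign_id"],
}

# The caller omits "days", so the schema default is used.
inputs = resolve_inputs(schema, {"campaign_id": "abc123"})
print(inputs)
```

This sketch does not check types; a full JSON Schema validator would also reject, for example, a string passed for `days`.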

Output Variables

List the variables your code produces. These become the tool’s return values:
# Your code assigns these variables
result = {"total_leads": 150, "conversion_rate": 0.12}
summary = "Campaign performed above average"

# Output variables: ["result", "summary"]
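
As a mental model for how named variables can become return values, the following sketch runs a snippet in a fresh namespace and picks out the declared names. It is an assumption for illustration, not a description of how DataGen executes tools; `run_and_collect` is a hypothetical helper.

```python
def run_and_collect(code, output_variables, inputs=None):
    """Execute code in a fresh namespace and return the named variables."""
    # Inputs are made available to the code as ordinary variables.
    namespace = dict(inputs or {})
    exec(code, namespace)

    # Fail loudly if the code never assigned a declared output.
    missing = [name for name in output_variables if name not in namespace]
    if missing:
        raise KeyError(f"Code did not assign: {missing}")
    return {name: namespace[name] for name in output_variables}


tool_code = '''
result = {"total_leads": 150, "conversion_rate": 0.12}
summary = "Campaign performed above average"
'''

outputs = run_and_collect(tool_code, ["result", "summary"])
print(outputs["summary"])
```

The practical takeaway is that every name in the output variables list should be assigned on every code path, including error paths.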

Dependencies

Type          Description                     Example
MCP Servers   Which MCP servers to connect    ["Linear", "Gmail"]
Secrets       Environment variables needed    ["OPENAI_API_KEY"]
Imports       Python packages to import       ["pandas", "httpx"]

Testing

Test with executeCode

Before deploying, test your code interactively:
"Run this code to test my campaign analysis logic"
Claude calls executeCode to run your code in a sandbox and return results immediately.

Test a Deployed Tool

After deployment, test with submitCustomToolRun:
"Run my analyze_campaign tool with campaign_id='abc123'"

Deploying

Ask Claude to deploy using natural language:
"Deploy this as a custom tool called 'analyze_campaign'"
Or with more detail:
"Deploy this tool with campaign_id and date_range parameters, schedule daily at 9 AM"
Claude calls createCustomTool with your code, schema, and configuration.

Using Your Deployed Tool

Once deployed, your tool is available as:
  • MCP tool — Callable from Claude conversations via submitCustomToolRun
  • REST API — HTTP endpoint for external integrations
  • Scheduled job — Run automatically via schedules

Managing Custom Tools

Find Your Tools

"Search for my campaign tools"
Claude calls searchCustomTools to list your deployed tools.

Update a Tool

"Update my analyze_campaign tool to also return top performing messages"
Claude calls updateCustomTool to modify the code, schema, or dependencies.

Check Run Status

"Check the status of my last campaign analysis run"
Claude calls checkRunStatus to show the current state and output.

CLI Management

# List all custom tools
datagen tools list

# Show tool details
datagen tools show <uuid>

# Run a tool
datagen tools run <uuid> --input '{"campaign_id": "abc123"}'

Example: Lead Enrichment Tool

from datagen_sdk import DatagenClient

client = DatagenClient()

# Input: list of company domains
domains = input_domains  # From input schema

enriched = []
for domain in domains:
    # Web research
    research = client.execute_tool("mcp_Perplexity_search", {
        "query": f"What does {domain} company do? Who are the founders?"
    })

    # LinkedIn data
    company = client.execute_tool("mcp_LinkedIn_get_company", {
        "domain": domain
    })

    enriched.append({
        "domain": domain,
        "description": research.get("answer"),
        "employee_count": company.get("employeeCount"),
        "industry": company.get("industry")
    })

# Output variables
result = enriched
total_enriched = len(enriched)
Input Schema:
{
  "type": "object",
  "properties": {
    "input_domains": {
      "type": "array",
      "items": {"type": "string"},
      "description": "List of company domains to enrich"
    }
  },
  "required": ["input_domains"]
}
MCP Servers: ["Perplexity", "LinkedIn"]
Output Variables: ["result", "total_enriched"]
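
In the loop above, a single failed lookup stops the whole run. A possible variant, sketched here under stated assumptions, records the failure and moves on to the next domain. `StubClient` and `enrich_domains` are hypothetical names introduced for illustration, the stub's return values are made up, and the `tool_error` parameter defaults to `Exception` only because this sketch does not import the SDK; in a real tool you would pass the SDK's own error type.

```python
class StubClient:
    """Hypothetical stand-in for DatagenClient; fails for one domain."""

    def execute_tool(self, name, params):
        if name == "mcp_Perplexity_search":
            return {"answer": "Example description"}
        if params["domain"] == "broken.example":
            raise RuntimeError("lookup failed")
        return {"employeeCount": 42, "industry": "Software"}


def enrich_domains(client, domains, tool_error=Exception):
    """Enrich each domain, collecting failures instead of aborting."""
    enriched, failed = [], []
    for domain in domains:
        try:
            research = client.execute_tool("mcp_Perplexity_search", {
                "query": f"What does {domain} company do? Who are the founders?"
            })
            company = client.execute_tool("mcp_LinkedIn_get_company", {
                "domain": domain
            })
        except tool_error as exc:
            # Record the failure and continue with the next domain.
            failed.append({"domain": domain, "error": str(exc)})
            continue
        enriched.append({
            "domain": domain,
            "description": research.get("answer"),
            "employee_count": company.get("employeeCount"),
            "industry": company.get("industry"),
        })
    return enriched, failed


result, failed = enrich_domains(StubClient(), ["acme.example", "broken.example"])
total_enriched = len(result)
print(total_enriched, failed)
```

If you adopt this shape, `failed` would need to be added to the tool's output variables so callers can see which domains were skipped.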

Best Practices

Each tool should do one thing well. Create separate tools for different tasks rather than one massive tool.

Handle tool errors explicitly so that a failed call produces a clear message:
# Assumes client is a DatagenClient and DatagenToolError has been imported
try:
    result = client.execute_tool("mcp_Linear_create_issue", {...})
except DatagenToolError as e:
    error_message = f"Failed to create issue: {e}"

Name tools clearly: enrich_company_data, not tool1. Good descriptions help Claude understand when to use your tool.

Always test with executeCode first. Deployed tools are harder to debug.