How to integrate Firecrawl MCP with CrewAI

Firecrawl
Authentication: API key

Firecrawl automates large-scale web crawling and data extraction. It helps organizations efficiently gather, index, and analyze content from online sources.

29 Tools

Introduction

This guide walks you through connecting Firecrawl to CrewAI using the Composio tool router. By the end, you'll have a working Firecrawl agent that can extract product prices from an e-commerce site, crawl competitor blogs for summaries of their latest articles, and map all subpages linked from a homepage URL, all through natural language commands.

This guide will help you understand how to give your CrewAI agent real control over a Firecrawl account through Composio's Firecrawl MCP server.

Before we dive in, let's take a quick look at the key ideas and tools involved.

TL;DR

Here's what you'll learn:
  • Get a Composio API key and configure your Firecrawl connection
  • Set up CrewAI with an MCP-enabled agent
  • Create a Tool Router session or standalone MCP server for Firecrawl
  • Build a conversational loop where your agent can execute Firecrawl operations

What is CrewAI?

CrewAI is a powerful framework for building multi-agent AI systems. It provides primitives for defining agents with specific roles, creating tasks, and orchestrating workflows through crews.

Key features include:

  • Agent Roles: Define specialized agents with specific goals and backstories
  • Task Management: Create tasks with clear descriptions and expected outputs
  • Crew Orchestration: Combine agents and tasks into collaborative workflows
  • MCP Integration: Connect to external tools through Model Context Protocol

What is the Firecrawl MCP server, and what's possible with it?

The Firecrawl MCP server is an implementation of the Model Context Protocol that connects AI agents and assistants such as Claude and Cursor directly to your Firecrawl account. It provides structured, secure access to automated web crawling, scraping, and data extraction, so your agent can index sites, extract structured content, map URLs, and search the web on your behalf.

  • Automated web crawling and indexing: Let your agent launch and manage web crawl jobs to gather content or index entire websites efficiently.
  • Structured data extraction: Instruct your agent to extract targeted data from web pages using custom prompts or schemas, turning unstructured sites into actionable information.
  • URL mapping and discovery: Have the agent explore and map all URLs within a website, including options for subdomain inclusion, sitemap processing, or search-based discovery.
  • On-demand scraping and content retrieval: Enable your agent to scrape specific URLs, retrieve page content, and even extract structured JSON using LLM-powered methods.
  • Integrated web search and data collection: Task your agent with running web searches, scraping top result pages, and returning relevant details, all in one workflow.

What is the Composio tool router, and how does it fit here?

What is the Composio SDK?

The Composio SDK helps agents find the right tools for a task at runtime. You can plug in multiple toolkits (like Gmail, HubSpot, and GitHub), and the agent will identify the relevant app and action to complete multi-step workflows. This can reduce token usage and improve the reliability of tool calls. Read more here: Getting started with Composio SDK

The tool router generates a secure MCP URL that your agents can access to perform actions.

How the Composio SDK works

The Composio SDK follows a three-phase workflow:

  1. Discovery: Searches for tools matching your task and returns relevant toolkits with their details.
  2. Authentication: Checks for active connections. If missing, creates an auth config and returns a connection URL via Auth Link.
  3. Execution: Executes the action using the authenticated connection.
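The three phases above can be pictured with a small in-memory model. This is only a sketch of the control flow, not the Composio API: `ToyRouter`, its methods, and the `example.invalid` connect URL are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyRouter:
    """In-memory stand-in for a tool router (hypothetical, not Composio's API)."""

    # toolkit name -> {action name -> callable}
    toolkits: dict
    # toolkits the user has already connected
    connected: set = field(default_factory=set)

    def discover(self, task: str) -> list:
        """Phase 1: return toolkits whose name appears in the task text."""
        return [name for name in self.toolkits if name in task.lower()]

    def authenticate(self, toolkit: str) -> Optional[str]:
        """Phase 2: return None if connected, else a placeholder connect URL."""
        if toolkit in self.connected:
            return None
        return f"https://example.invalid/connect/{toolkit}"

    def execute(self, toolkit: str, action: str, **kwargs):
        """Phase 3: run the action, but only over an active connection."""
        if toolkit not in self.connected:
            raise PermissionError(f"{toolkit} is not connected")
        return self.toolkits[toolkit][action](**kwargs)


router = ToyRouter(
    toolkits={"firecrawl": {"scrape": lambda url: f"scraped {url}"}},
)
found = router.discover("Scrape a page with Firecrawl")
auth_url = router.authenticate("firecrawl")  # not connected yet, so a URL comes back
router.connected.add("firecrawl")            # pretend the user completed the auth link
result = router.execute("firecrawl", "scrape", url="https://example.com")
```

The point of the ordering is that execution never runs without a connection: the toy `execute` refuses to call the action until the authentication phase has succeeded.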

Step-by-step Guide

1

Prerequisites

Before starting, make sure you have:
  • Python 3.10 or higher (current CrewAI releases do not support Python 3.9)
  • A Composio account and API key
  • A Firecrawl connection authorized in Composio
  • An OpenAI API key for the CrewAI LLM
  • Basic familiarity with Python
2

Getting API Keys for OpenAI and Composio

OpenAI API Key
  • Go to the OpenAI dashboard and create an API key. You'll need credits to use the models, or you can connect to another model provider.
  • Keep the API key safe.
Composio API Key
  • Log in to the Composio dashboard.
  • Navigate to your API settings and generate a new API key.
  • Store this key securely as you'll need it for authentication.
3

Install dependencies

bash
pip install composio crewai "crewai-tools[mcp]" python-dotenv
What's happening:
  • composio connects your agent to Firecrawl via MCP
  • crewai provides Agent, Task, Crew, and LLM primitives
  • crewai-tools[mcp] includes MCP helpers
  • python-dotenv loads environment variables from .env
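A quick way to confirm the install worked is to check that each package can be found by the interpreter you plan to run the agent with. The sketch below uses only the standard library; the import names in `REQUIRED` are assumed from the import statements used later in this guide.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in module_names if find_spec(name) is None]


# Import names assumed from the imports used later in this guide.
REQUIRED = ["composio", "crewai", "crewai_tools", "dotenv"]

missing = missing_modules(REQUIRED)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If a package shows up as missing even though pip reported success, the usual cause is that pip installed into a different virtual environment than the one running the script.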
4

Set up environment variables

bash
COMPOSIO_API_KEY=your_composio_api_key_here
COMPOSIO_USER_ID=your_user_id_here
OPENAI_API_KEY=your_openai_api_key_here

Create a .env file in your project root with the values above.

What's happening:

  • COMPOSIO_API_KEY authenticates with Composio
  • COMPOSIO_USER_ID scopes the session to your account; the name must match what the code reads with os.getenv
  • OPENAI_API_KEY lets CrewAI use your chosen OpenAI model
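Because a misnamed variable only surfaces when the script runs, it can help to validate all required variables in one place and report every missing one at once. The helper below is a minimal sketch; `require_env` is a hypothetical name, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def require_env(names, env=None):
    """Return {name: value} for each name, or raise listing every missing one."""
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Demonstration with a plain dict standing in for the process environment.
demo_env = {"COMPOSIO_API_KEY": "demo-key", "COMPOSIO_USER_ID": "demo-user"}
settings = require_env(["COMPOSIO_API_KEY", "COMPOSIO_USER_ID"], env=demo_env)
```

In the real script you would call `require_env([...])` after `load_dotenv()` and without the `env` argument, so it reads from `os.environ`.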
5

Import dependencies

python
import os
from composio import Composio
from crewai import Agent, Task, Crew
from crewai_tools import MCPServerAdapter
import dotenv

dotenv.load_dotenv()

COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set")
What's happening:
  • CrewAI classes define agents and tasks, and run the workflow
  • MCPServerAdapter connects the agent to an MCP endpoint
  • Composio will give you a short-lived Firecrawl MCP URL
6

Create a Composio Tool Router session for Firecrawl

python
composio_client = Composio(api_key=COMPOSIO_API_KEY)
session = composio_client.create(user_id=COMPOSIO_USER_ID, toolkits=["firecrawl"])

url = session.mcp.url
What's happening:
  • You create a Firecrawl-only session through Composio
  • Composio returns an MCP HTTP URL that exposes Firecrawl tools
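If you want this step to fail early with a clear message, you can wrap it in a small function. The sketch below relies only on the two things shown above, `create(user_id=..., toolkits=...)` and `session.mcp.url`; `get_mcp_url` and `FakeClient` are hypothetical names, and the fake exists so the wrapper can be tried without network access or credentials.

```python
from types import SimpleNamespace


def get_mcp_url(client, user_id, toolkits):
    """Create a session and return its MCP URL, failing loudly if it is empty."""
    session = client.create(user_id=user_id, toolkits=toolkits)
    url = getattr(getattr(session, "mcp", None), "url", None)
    if not url:
        raise RuntimeError("Session did not include an MCP URL")
    return url


class FakeClient:
    """Test double that mimics only the create() call shown in this step."""

    def __init__(self, url):
        self._url = url
        self.calls = []

    def create(self, user_id, toolkits):
        # Record the arguments so a test can check what was requested.
        self.calls.append((user_id, tuple(toolkits)))
        return SimpleNamespace(mcp=SimpleNamespace(url=self._url))


fake = FakeClient("https://example.invalid/mcp")
url = get_mcp_url(fake, "user-123", ["firecrawl"])
```

With the real client you would pass `composio_client` in place of `fake`; nothing else in the wrapper changes.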
7

Initialize the MCP Server

python
server_params = {
    "url": url,
    "transport": "streamable-http",
    "headers": {"x-api-key": COMPOSIO_API_KEY},
}

with MCPServerAdapter(server_params) as tools:
    agent = Agent(
        role="Firecrawl Assistant",
        goal="Help users crawl, scrape, and extract data from the web",
        backstory="You are a helpful assistant with access to Firecrawl tools.",
        tools=tools,
        verbose=False,
        max_iter=10,
    )
What's happening:
  • Server Configuration: The code sets up connection parameters including the MCP server URL, streamable HTTP transport, and Composio API key authentication.
  • MCP Adapter Bridge: MCPServerAdapter acts as a context manager that converts Composio MCP tools into a CrewAI-compatible format.
  • Agent Setup: Creates a CrewAI Agent with a defined role (Firecrawl Assistant), goal (help with crawling, scraping, and data extraction), and access to the MCP tools.
  • Configuration Options: The agent sets verbose=False for clean output and max_iter=10 to cap the number of iterations per task.
  • Dynamic Tool Usage: Once created, the agent can call any Firecrawl tool exposed by the session and decides when to use each one based on user queries.
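The connection dictionary is easy to get subtly wrong, and it carries a secret. The sketch below builds the same three keys used in this step, rejects URLs that are not HTTPS, and produces a copy that is safe to print. Both function names are hypothetical, and the HTTPS requirement is an assumption about the session URL rather than something this guide states.

```python
from urllib.parse import urlparse


def build_server_params(url, api_key):
    """Build the MCP connection dict, rejecting non-HTTPS or malformed URLs."""
    parsed = urlparse(url)
    # Assumption: the MCP URL is served over HTTPS.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Expected an https MCP URL, got: {url!r}")
    return {
        "url": url,
        "transport": "streamable-http",
        "headers": {"x-api-key": api_key},
    }


def redact(params):
    """Return a copy that is safe to log: every header value is masked."""
    safe = dict(params)
    safe["headers"] = {name: "***" for name in params["headers"]}
    return safe


params = build_server_params("https://example.invalid/mcp", "secret-key")
print(redact(params))  # the API key never reaches the log
```

`redact` returns a new dictionary, so the original `params` can still be passed to `MCPServerAdapter` unchanged.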
8

Create a CLI chat loop and define the Crew

python
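# Note: in the full script this loop is indented inside the
# `with MCPServerAdapter(server_params) as tools:` block from the previous step,
# so the MCP connection is still open while the agent runs.
print("Chat started! Type 'exit' or 'quit' to end.\n")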

conversation_context = ""

while True:
    user_input = input("You: ").strip()

    if user_input.lower() in ["exit", "quit", "bye"]:
        print("\nGoodbye!")
        break

    if not user_input:
        continue

    conversation_context += f"\nUser: {user_input}\n"
    print("\nAgent is thinking...\n")

    task = Task(
        description=(
            f"Conversation history:\n{conversation_context}\n\n"
            f"Current request: {user_input}"
        ),
        expected_output="A helpful response addressing the user's request",
        agent=agent,
    )

    crew = Crew(agents=[agent], tasks=[task], verbose=False)
    result = crew.kickoff()
    response = str(result)

    conversation_context += f"Agent: {response}\n"
    print(f"Agent: {response}\n")
What's happening:
  • Interactive CLI Setup: The code creates an infinite loop that continuously prompts for user input and maintains the entire conversation history in a string variable.
  • Input Validation: Empty inputs are ignored to prevent processing blank messages and keep the conversation clean.
  • Context Building: Each user message is appended to the conversation context, which preserves the full dialogue history for better agent responses.
  • Dynamic Task Creation: For every user input, a new Task is created that includes both the full conversation history and the current request as context.
  • Crew Execution: A Crew is instantiated with the agent and task, then kicked off to process the request and generate a response.
  • Response Management: The agent's response is converted to a string, added to the conversation context, and displayed to the user, maintaining conversational continuity.
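Because the loop appends every message to one string, the task description grows with each turn. One way to cap it is to keep only the most recent messages. The class below is a sketch of that idea using `collections.deque`; `BoundedHistory` and `max_messages` are hypothetical names, and the sample messages are made up.

```python
from collections import deque


class BoundedHistory:
    """Keep only the most recent messages of a conversation."""

    def __init__(self, max_messages=6):
        # A deque with maxlen silently drops the oldest entry when full.
        self._messages = deque(maxlen=max_messages)

    def add(self, speaker, text):
        self._messages.append(f"{speaker}: {text}")

    def render(self):
        """Return the retained messages as one newline-separated string."""
        return "\n".join(self._messages)


history = BoundedHistory(max_messages=2)
history.add("User", "Map https://example.com")
history.add("Agent", "Done.")
history.add("User", "Scrape the first URL")
context = history.render()  # only the last two messages remain
```

In the chat loop, `history.add(...)` would replace the two `conversation_context +=` lines and `history.render()` would supply the "Conversation history" part of the task description. The trade-off is that the agent forgets anything older than the window.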

Complete Code

Here's the complete code to get you started with Firecrawl and CrewAI:

python
from crewai import Agent, Task, Crew, LLM
from crewai_tools import MCPServerAdapter
from composio import Composio
from dotenv import load_dotenv
import os

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

# Fail early if any required key is missing
if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set in the environment.")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set in the environment.")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set in the environment.")

# Initialize Composio and create a session
composio = Composio(api_key=COMPOSIO_API_KEY)
session = composio.create(
    user_id=COMPOSIO_USER_ID,
    toolkits=["firecrawl"],
)
url = session.mcp.url

# Configure LLM
llm = LLM(
    model="gpt-5",
    api_key=os.getenv("OPENAI_API_KEY"),
)

server_params = {
    "url": url,
    "transport": "streamable-http",
    "headers": {"x-api-key": COMPOSIO_API_KEY},
}

with MCPServerAdapter(server_params) as tools:
    agent = Agent(
        role="Firecrawl Assistant",
        goal="Help users crawl, scrape, and extract data from the web",
        backstory="You are a helpful assistant with access to Firecrawl tools.",
        tools=tools,
        llm=llm,
        verbose=False,
        max_iter=10,
    )

    print("Chat started! Type 'exit' or 'quit' to end.\n")

    conversation_context = ""

    while True:
        user_input = input("You: ").strip()

        if user_input.lower() in ["exit", "quit", "bye"]:
            print("\nGoodbye!")
            break

        if not user_input:
            continue

        conversation_context += f"\nUser: {user_input}\n"
        print("\nAgent is thinking...\n")

        task = Task(
            description=(
                f"Conversation history:\n{conversation_context}\n\n"
                f"Current request: {user_input}"
            ),
            expected_output="A helpful response addressing the user's request",
            agent=agent,
        )

        crew = Crew(agents=[agent], tasks=[task], verbose=False)
        result = crew.kickoff()
        response = str(result)

        conversation_context += f"Agent: {response}\n"
        print(f"Agent: {response}\n")
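
In the loop above, any exception raised by `crew.kickoff()` ends the whole chat session. If you would rather report the failure and keep going, you can route the call through a small wrapper. This is a sketch under that assumption; `run_with_retries` is a hypothetical helper and the stand-in functions only simulate failures.

```python
def run_with_retries(run, attempts=2):
    """Call run() up to `attempts` times; return (ok, result_or_error_text)."""
    last_error = None
    for _ in range(attempts):
        try:
            return True, str(run())
        except Exception as exc:  # broad on purpose: keep the chat loop alive
            last_error = exc
    return False, f"Request failed: {last_error}"


# Stand-in for crew.kickoff that fails once, then succeeds.
state = {"calls": 0}


def flaky_kickoff():
    state["calls"] += 1
    if state["calls"] == 1:
        raise RuntimeError("temporary failure")
    return "done"


ok, response = run_with_retries(flaky_kickoff, attempts=2)
```

Inside the chat loop this would read `ok, response = run_with_retries(crew.kickoff)`, followed by the existing lines that store and print `response`. Retrying blindly can repeat side effects such as starting a second crawl, so a single attempt may be the safer default for jobs that consume credits.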

Conclusion

You now have a CrewAI agent connected to Firecrawl through Composio's Tool Router. The agent can perform Firecrawl operations through natural language commands.

Next steps:

  • Add role-specific instructions to customize agent behavior
  • Plug in more toolkits for multi-app workflows
  • Chain tasks for complex multi-step operations

Supported Tools

Every Firecrawl action and event your agent gets out of the box.

Cancel an agent job

Tool to cancel an in-progress agent job by its ID.

Batch scrape multiple URLs

Tool to scrape multiple URLs in batch with concurrent processing.

Cancel a batch scrape job

Tool to cancel a running batch scrape job using its unique identifier.

Get batch scrape status

Retrieves the current status and results of a batch scrape job using the job ID.

Get errors from batch scrape job

Tool to retrieve error details from a batch scrape job, including failed URLs and URLs blocked by robots.txt.

Start a web crawl

Initiates a Firecrawl web crawl from a given URL, applying various filtering and content extraction rules, and polls until the job is complete; ensure the URL is accessible and any regex patterns for paths are valid.

Cancel a crawl job

Cancels an active or queued web crawl job using its ID; attempting to cancel completed, failed, or previously canceled jobs will not change their state.

Cancel a crawl job

Tool to cancel a running crawl job by its ID.

Get crawl job status

Tool to retrieve the status and results of a Firecrawl crawl job.

Get errors from a crawl job

Tool to retrieve errors from a Firecrawl crawl job.

Get all active crawl jobs

Tool to retrieve all active crawl jobs for the authenticated team.

Preview crawl parameters

Preview crawl parameters before starting a crawl by generating optimal configuration from natural language instructions.

Start a web crawl (v2) [NEW]

[NEW v2 API] Initiates a Firecrawl v2 web crawl with enhanced features over v1: natural language prompts for automatic crawler configuration, crawlEntireDomain for sibling/parent page discovery, better depth control with maxDiscoveryDepth, subdomain support, and full webhook configuration.

Get team credit usage

Tool to get current team credit usage information.

Get historical team credit usage

Tool to retrieve historical team credit usage on a monthly basis.

Extract structured data

Extracts structured data from web pages by initiating an extraction job and polling for completion; requires a natural language `prompt` or a JSON `schema` (one must be provided).

Get extract job status

Tool to retrieve the status and results of a previously submitted extract job.

Get agent job status

Tool to get the status and results of an agent job.

Get deep research status

Retrieves the status and results of a deep research job by its ID.

Get the status of a crawl job

Retrieves the current status, progress, and details of a web crawl job, using the job ID obtained when the crawl was initiated.

Generate LLMs.txt for a website

Initiates an async job to generate an LLMs.txt file for a website.

Get LLMs.txt generation job status

Tool to get the status and results of an LLMs.txt generation job.

Map multiple URLs

Maps a website by discovering URLs from a starting base URL, with options to customize the crawl via search query, subdomain inclusion, sitemap handling, and result limits; search effectiveness is site-dependent.

Get team queue status

Tool to retrieve metrics about the team's scrape queue.

Scrape URL

Scrapes a publicly accessible URL, optionally performing pre-scrape browser actions or extracting structured JSON using an LLM, to retrieve content in specified formats.

Search

Performs a web search for a query, scrapes content from the top search results using Firecrawl, and returns details in specified formats.

Start an agent job

Tool to start an agent job for agentic web extraction with multi-page navigation and interaction capabilities.

Get team token usage

Tool to retrieve the current team's token usage and balance information for Firecrawl's Extract feature.

Get historical team token usage

Tool to retrieve historical team token usage on a monthly basis.


Frequently asked questions

With a standalone Firecrawl MCP server, the agents and LLMs can only access a fixed set of Firecrawl tools tied to that server. However, with the Composio Tool Router, agents can dynamically load tools from Firecrawl and many other apps based on the task at hand, all through a single MCP endpoint.

Yes, you can. CrewAI fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Firecrawl tools.

Yes, absolutely. You can configure which Firecrawl scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Firecrawl data and credentials are handled as safely as possible.
