Web Crawler MCP

Model Context Protocol Integration

Overview

Configurable web crawler that extracts structured content from websites while respecting robots.txt rules and offering customizable settings for depth, delay, and concurrency.

Installation Instructions


README: https://github.com/jitsmaster/WebScrapeMCPServer

Web Crawler MCP Server Deployment Guide

Prerequisites

  • Node.js (v18+)
  • npm (v9+)

Installation

  1. Clone the repository:

    git clone https://github.com/jitsmaster/web-crawler-mcp.git
    cd web-crawler-mcp
    
  2. Install dependencies:

    npm install
    
  3. Build the project:

    npm run build
    

Configuration

Create a .env file with the following environment variables:

CRAWL_LINKS=false
MAX_DEPTH=3
REQUEST_DELAY=1000
TIMEOUT=5000
MAX_CONCURRENT=5
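
The variables above arrive in the process as strings, so they have to be converted before use. The following is a minimal sketch of one way to do that, not the server's actual code: the CrawlerConfig shape and the loadConfig and intFromEnv helpers are hypothetical names, and the fallback values are the defaults listed in this guide.

```typescript
// Hypothetical config loader. Variable names and defaults come from this
// guide; the types and helper names are assumptions for illustration.
interface CrawlerConfig {
  crawlLinks: boolean;
  maxDepth: number;
  requestDelayMs: number;
  timeoutMs: number;
  maxConcurrent: number;
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer; fall back to the default on missing or bad input.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const parsed = Number.parseInt(env[key] ?? "", 10);
  return Number.isNaN(parsed) || parsed < 0 ? fallback : parsed;
}

function loadConfig(env: Env): CrawlerConfig {
  return {
    // Only the literal string "true" (any case) enables link following.
    crawlLinks: (env["CRAWL_LINKS"] ?? "false").toLowerCase() === "true",
    maxDepth: intFromEnv(env, "MAX_DEPTH", 3),
    requestDelayMs: intFromEnv(env, "REQUEST_DELAY", 1000),
    timeoutMs: intFromEnv(env, "TIMEOUT", 5000),
    maxConcurrent: intFromEnv(env, "MAX_CONCURRENT", 5),
  };
}
```

In Node.js this would typically be called as loadConfig(process.env) once at startup.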

Running the Server

Start the MCP server:

npm start

MCP Configuration

Add the following to your MCP settings file:

{
  "mcpServers": {
    "web-crawler": {
      "command": "node",
      "args": ["/path/to/web-crawler/build/index.js"],
      "env": {
        "CRAWL_LINKS": "false",
        "MAX_DEPTH": "3",
        "REQUEST_DELAY": "1000",
        "TIMEOUT": "5000",
        "MAX_CONCURRENT": "5"
      }
    }
  }
}

Usage

The server exposes a crawl tool through MCP. Example tool arguments:

{
  "url": "https://example.com",
  "depth": 1
}
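
An MCP client normally builds the protocol message for you, but it can help to see what the arguments above look like on the wire. The sketch below wraps them in a JSON-RPC 2.0 tools/call request, which is how MCP invokes tools. It assumes the tool is registered under the name crawl and accepts only the url and depth fields shown above; buildCrawlRequest and the type names are hypothetical.

```typescript
// Only "url" and "depth" appear in this guide; other fields are not assumed.
interface CrawlArguments {
  url: string;
  depth?: number;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: CrawlArguments };
}

// Build the JSON-RPC message an MCP client would send to call the tool.
function buildCrawlRequest(id: number, args: CrawlArguments): ToolCallRequest {
  // Cheap client-side check: only absolute http(s) URLs are passed on.
  if (!/^https?:\/\//i.test(args.url)) {
    throw new Error(`not an http(s) URL: ${args.url}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    // Assumption: the tool name is "crawl", as described in this section.
    params: { name: "crawl", arguments: args },
  };
}
```

Calling buildCrawlRequest(1, { url: "https://example.com", depth: 1 }) yields an object that can be serialized with JSON.stringify and sent over the server's stdio transport.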

Configuration Options

Environment Variable   Default   Description
CRAWL_LINKS            false     Whether to follow links
MAX_DEPTH              3         Maximum crawl depth
REQUEST_DELAY          1000      Delay between requests (ms)
TIMEOUT                5000      Request timeout (ms)
MAX_CONCURRENT         5         Maximum concurrent requests
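
To illustrate how CRAWL_LINKS and MAX_DEPTH could bound a crawl, here is a sketch of a depth-limited breadth-first traversal over an in-memory link graph. It is an illustration under stated assumptions, not the server's implementation: it treats the start page as depth 0 and the depth limit as inclusive, and it leaves out request delay, timeouts, and concurrency entirely. LinkGraph and planCrawl are hypothetical names.

```typescript
// Hypothetical link graph: page URL -> URLs that page links to.
type LinkGraph = Record<string, string[]>;

// Return the pages a depth-limited breadth-first crawl would visit, in order.
// Assumption: the start page is depth 0 and maxDepth is inclusive.
function planCrawl(
  start: string,
  graph: LinkGraph,
  crawlLinks: boolean,
  maxDepth: number,
): string[] {
  const visited = new Set<string>([start]);
  const order: string[] = [];
  let frontier: string[] = [start];
  for (let depth = 0; frontier.length > 0; depth++) {
    order.push(...frontier);
    // Stop expanding when link following is off or the depth limit is reached.
    if (!crawlLinks || depth >= maxDepth) break;
    const next: string[] = [];
    for (const page of frontier) {
      for (const link of graph[page] ?? []) {
        if (!visited.has(link)) {
          visited.add(link); // each page is queued at most once
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return order;
}
```

With link following disabled, only the start page is returned, which matches the default CRAWL_LINKS=false behaviour described in the table under these assumptions.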
