Crawlee for Python

Quickly build reliable web crawling tools

  • Written in modern Python with type hints, which enables code auto-completion in the IDE.
  • Built on Playwright for browser crawling; a crawler can be switched from plain HTTP to a headless browser within 3 lines of code.
  • Supports multiple browsers, such as Chrome and Firefox.
  • Automatically manages and rotates proxies, intelligently discarding poorly performing ones.
  • Provides CLI tools to quickly create new projects with template code.
  • Supports data extraction and dataset export, which makes data management and analysis easier.
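
To make the dataset export idea concrete, here is a minimal sketch that uses only the Python standard library. It is not Crawlee's dataset API: the helper names `export_json` and `export_csv` and the sample records are hypothetical, chosen only to show how scraped records could be serialized for later analysis.

```python
import csv
import io
import json


def export_json(records: list[dict]) -> str:
    """Serialize scraped records as a JSON array (hypothetical helper)."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def export_csv(records: list[dict]) -> str:
    """Serialize scraped records as CSV (hypothetical helper).

    Column order follows the first appearance of each key, and keys
    missing from a record are written as empty cells.
    """
    fieldnames: list[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Sample records standing in for data extracted by a crawler.
    scraped = [
        {"url": "https://example.com", "title": "Example"},
        {"url": "https://example.org"},
    ]
    print(export_json(scraped))
    print(export_csv(scraped))
```

In a real project the library's own dataset and export functions would replace these helpers; the sketch only shows the kind of output such an export produces.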

Product Details

Crawlee is a Python library for building reliable web crawlers. It is built by professional web crawler developers and is used to crawl millions of pages every day. Crawlee supports JavaScript rendering, and an HTTP crawler can be switched to a browser crawler without rewriting code. It also provides automatic scaling based on available system resources, as well as proxy management that intelligently rotates proxies and discards those that frequently time out or return network errors.
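
The proxy behaviour described above can be illustrated with a short standard-library sketch. This is not how Crawlee implements proxy management. The class names `ProxyStats` and `ProxyRotator`, the `max_error_rate` and `min_samples` thresholds, and the proxy URLs are all assumptions made for illustration. The sketch rotates proxies round-robin and retires any proxy whose error rate goes above a threshold.

```python
from dataclasses import dataclass


@dataclass
class ProxyStats:
    """Success and failure counters for one proxy (hypothetical structure)."""

    url: str
    successes: int = 0
    failures: int = 0

    @property
    def total(self) -> int:
        return self.successes + self.failures

    @property
    def error_rate(self) -> float:
        return self.failures / self.total if self.total else 0.0


class ProxyRotator:
    """Round-robin rotation that discards proxies with a high error rate."""

    def __init__(self, urls, max_error_rate=0.5, min_samples=4):
        self._proxies = [ProxyStats(url) for url in urls]
        self._max_error_rate = max_error_rate  # assumed threshold
        self._min_samples = min_samples        # avoid judging on too little data
        self._cursor = 0

    @property
    def active(self) -> list[str]:
        return [proxy.url for proxy in self._proxies]

    def next_proxy(self) -> str:
        """Return the next proxy URL in rotation."""
        if not self._proxies:
            raise RuntimeError("no healthy proxies left")
        proxy = self._proxies[self._cursor % len(self._proxies)]
        self._cursor += 1
        return proxy.url

    def report(self, url: str, ok: bool) -> None:
        """Record a request outcome; timeouts and network errors count as ok=False."""
        for proxy in self._proxies:
            if proxy.url != url:
                continue
            if ok:
                proxy.successes += 1
            else:
                proxy.failures += 1
            too_many_errors = proxy.error_rate > self._max_error_rate
            if proxy.total >= self._min_samples and too_many_errors:
                self._proxies.remove(proxy)  # safe: we return right after
            return


if __name__ == "__main__":
    rotator = ProxyRotator(
        ["http://proxy-a.example:8000", "http://proxy-b.example:8000"]
    )
    for _ in range(4):
        rotator.report("http://proxy-a.example:8000", ok=False)
        rotator.report("http://proxy-b.example:8000", ok=True)
    print(rotator.active)
```

A production crawler would typically also tie proxies to sessions and let retired proxies recover after a cool-down period; the sketch leaves both out to keep the rotation and discard logic visible.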