Apify

We're making the web more programmable.

Pinned

  1. crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

    Python · 3.9k stars · 265 forks

  2. crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

    TypeScript · 14.9k stars · 622 forks

  3. proxy-chain Public

    Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.

    JavaScript · 835 stars · 139 forks

  4. apify-sdk-js Public

    Apify SDK monorepo

    TypeScript · 119 stars · 31 forks

  5. got-scraping Public

    HTTP client made for scraping based on got.

    TypeScript · 521 stars · 38 forks

  6. fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript · 896 stars · 94 forks

Repositories

Showing 10 of 127 repositories
  • apify-sdk-python Public

    The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.

    Python · 115 stars · Apache-2.0 · 11 forks · 10 issues · 1 pull request · Updated Sep 17, 2024
  • apify-client-python Public

    Apify API client for Python

    Python · 46 stars · Apache-2.0 · 11 forks · 8 issues · 5 pull requests · Updated Sep 17, 2024
  • crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Python · 3,866 stars · Apache-2.0 · 265 forks · 67 issues · 1 pull request · Updated Sep 17, 2024
  • apify-cli Public

    Apify command-line interface helps you create, develop, build and run Apify actors, and manage the Apify cloud platform.

    TypeScript · 120 stars · 18 forks · 33 issues (1 issue needs help) · 6 pull requests · Updated Sep 17, 2024
  • workflows Public

    Apify's reusable GitHub workflows

    6 stars · 3 forks · 2 issues · 3 pull requests · Updated Sep 16, 2024
  • apify-docs Public

    This project is the home of Apify's documentation.

    API Blueprint · 26 stars · Apache-2.0 · 72 forks · 63 issues · 21 pull requests · Updated Sep 16, 2024
  • crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    TypeScript · 14,941 stars · Apache-2.0 · 622 forks · 110 issues (1 issue needs help) · 17 pull requests · Updated Sep 16, 2024
  • actor-templates Public

    This project is the home of Apify actor template projects to help users quickly get started.

    Python · 25 stars · 15 forks · 7 issues · 3 pull requests · Updated Sep 16, 2024
  • apify-shared-js Public

    Utilities and constants shared across Apify projects.

    TypeScript · 12 stars · Apache-2.0 · 10 forks · 4 issues · 3 pull requests · Updated Sep 16, 2024
  • rag-web-browser Public

    Retrieve website content from the top Google Search Results Pages (SERPs)

    0 stars · 0 forks · 0 issues · 1 pull request · Updated Sep 16, 2024