Scrapy, a fast high-level web crawling & scraping framework for Python.
#scraping
Repositories 1,223
Pythonic HTML Parsing for Humans™
A scalable web crawler framework for Java.
Java
Updated Nov 7, 2018
Elegant Scraper and Crawler Framework for Golang
Distributed crawler powered by Headless Chrome
JavaScript
Updated Mar 21, 2019
Declarative web scraping
Getting started with Puppeteer and Chrome Headless for Web Scraping
JavaScript
Updated Oct 18, 2018
A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous…
HTML
Updated Mar 3, 2019
A browser testing and web crawling library for PHP and Symfony
Get info from any web service or page
PHP
Updated Mar 21, 2019
artoo.js - the client-side scraping companion.
JavaScript
Updated Mar 4, 2019
Scrape the Instagram frontend. Inspired by twitter-scraper by @kennethreitz.
Python
Updated Jun 29, 2018
Creating Scrapy scrapers via the Django admin interface
Python
Updated Feb 15, 2019
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
[Unmaintained] A simple and clean video/music/image downloader 👾
A curated list of awesome puppeteer resources.
Updated Mar 21, 2019
Free Web Scraping Tool with Java
JavaScript
Updated Feb 3, 2019
Analyze the Facebook copy of your data with Ruby. Download the zip file from Facebook and get info about friends ran…
A framework for creating semi-automatic web content extractors
Topics: python, css-selector, xpath-expression, web-scraper, web-scraping, scrapers, scraping, scrapy, selector, extractor, crawler, selector-expression, tutorial, lxml, beautifulsoup
Python
Updated Jan 7, 2019
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Python
Updated Jan 22, 2019
Web scraping library made by the Phantombuster team. Modern, simple & works on all websites.
JavaScript
Updated Aug 9, 2018
Jekyll-based static site for The Programming Historian
Topics: programming-historian, text-analysis, api, data-management, data-manipulation, data-mining, pedagogy, linked-open-data, mapping, network-analysis, exhibits, scraping, python, dh, digital-humanities
HTML
Updated Mar 22, 2019
Jsoup Annotations POJO
Java
Updated May 23, 2017
Extract structured data from websites. Website scraping.
Topics: golang, golang-library, extract-data, chrome-fetcher, scraping-websites, crawling, scraper, scraping, cdp, go, headless
Go
Updated Mar 9, 2019
A flexible, friendly crawler framework
Python
Updated Dec 13, 2017
Use SQL on various data sources
C#
Updated Mar 19, 2019
Simple but useful Python web scraping tutorial code.
Jupyter Notebook
Updated Jul 25, 2018
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Go
Updated Feb 23, 2019
Topic-wise PDFs of GeeksforGeeks articles. (Last updated in October 2018)
Python
Updated Nov 5, 2018