Scrapy, a fast high-level web crawling & scraping framework for Python.
#crawler
Repositories 3,097
A Powerful Spider(Web Crawler) System in Python.
Python
Updated Mar 18, 2019
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Python
Updated Mar 17, 2019
A scalable web crawler framework for Java.
Java
Updated Nov 7, 2018
Elegant Scraper and Crawler Framework for Golang
Python crawler proxy IP pool (proxy pool)
[Crawler for Golang] Pholcus is a distributed, high concurrency and powerful web crawler software.
crawler
spider
multi-interface
golang
distributed-crawler
high-concurrency-crawler
fastest-crawler
cross-platform-crawler
Go
Updated Mar 5, 2019
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
JavaScript
Updated Mar 12, 2019
Incredibly fast crawler designed for OSINT.
Python
Updated Mar 21, 2019
Distributed crawler powered by Headless Chrome
JavaScript
Updated Mar 21, 2019
Redis-based components for Scrapy.
Python
Updated Mar 19, 2019
Declarative web scraping
A collection of awesome web crawlers and spiders in different languages
Updated Feb 11, 2019
Python
Updated Nov 11, 2018
WeChat official account crawler interface based on Sogou WeChat search
Python
Updated Mar 22, 2019
Every web site provides APIs.
Python
Updated Dec 6, 2018
Python scripts: simulated Zhihu login, crawlers, Excel operations, WeChat official accounts, remote power-on
Python
Updated Mar 8, 2019
Web Application Security Scanner Framework
Intelligent proxy pool for Humans™ [Maintainer needed]
The DomCrawler component eases DOM navigation for HTML and XML documents.
PHP
Updated Mar 22, 2019
Web crawling framework based on asyncio.
Python
Updated Mar 19, 2018
A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous…
HTML
Updated Mar 3, 2019
DotnetSpider, a .NET Standard web crawling library similar to WebMagic and Scrapy. It is a lightweight, efficient and…
Polite, slim and concurrent web crawler.
Go
Updated Apr 29, 2018
Easy-to-use lightweight web crawler
Java
Updated Mar 12, 2019
[Crawler framework (Golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can be ex…
Go
Updated Nov 16, 2017
Dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern JavaScript websites
ssr
react
vue
angular
reactjs
vuejs
go
golang
chrome-headless
chrome-devtools
javascript
seo
seo-optimization
server-side-rendering
dynamic-rendering
spa
crawler
puppeteer
Go
Updated Jan 25, 2019
Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭
proxy
anonymity
privacy
socks
http-proxy
crawler
proxy-server
anonymous
proxy-checker
proxy-list
proxypool
proxies
Python
Updated Mar 13, 2019