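Hands-on 🐍 crawlers 🕷 for a variety of websites and e-commerce data. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, Alibaba tasks, Cnblogs, Weibo, Baidu Tieba, Douban Movies, ibaotu, Quanjing, Douban Music, a provincial drug administration, Sohu News, machine learning text collection, fofa asset collection, Autohome, National Bureau of Statistics, Baidu keyword index counts, spider pan-directory, Toutiao, Douban film reviews, Ctrip, Xiaomi App Store, Anjuke, Tujia homestays ❤️❤️❤️. WeChat crawler showcase project: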

Updated Jul 8, 2021
Python

scrapy-plugins / scrapy-splash

Scrapy+Splash for JavaScript integration

scrapy hacktoberfest headless-browsers

Updated May 9, 2021
Python

DormyMo / SpiderKeeper

admin ui for scrapy/open source scrapinghub

spider dashboard scrapy scrapyd scrapy-ui scrapyd-ui scrapyd-dashboard

Updated Mar 19, 2021
Python

Gerapy / Gerapy

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

docker vuejs django spider dashboard vue distributed scrapy scrapyd webspider gerapy

Updated Jun 10, 2021
Python

nghuyong / WeiboSpider

This is a Sina Weibo spider built with Scrapy [Weibo crawler / actively maintained]

python docker redis docker-compose scrapy weibo docker-stack sina weibospider

Updated Jun 6, 2021
Python

my8100 / scrapydweb

Open

Linux: HTTPConnectionPool(host='192.168.0.24', port=6801): Max retries exceeded with url: /listprojects.json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f0a78b2d828>: Failed to establish a new connection: [Errno 111] Connection refused',))
Windows: HTTPConnectionPool(host='localhost', port=6801): Max retries exceeded with url: /jobs (Caused by Ne

wkunzhi / Python3-Spider

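Python crawlers in practice - simulated login to major websites, including but not limited to: slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, Taobao. If you like it, please star ❤️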

python crawler spider selenium crawl scrapy splash geek taobao scrapy-crawler meituan dianping pyppeteer

Updated Jul 24, 2020
Python

LuckyZXL2016 / Movie_Recommend

A Spark-based movie recommendation system, including a crawler project, a web site, a back-end admin system, and a Spark recommendation system

mysql nginx scala hive hadoop spark-streaming scrapy ssm-maven spark-mllib

Updated Apr 1, 2019
Java

fabienvauchelles / scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

nodejs angularjs crawler scraper cloud proxy scrapy blacklisting

Updated Dec 18, 2020
JavaScript

sczhengyabin / Image-Downloader

Download images from Google, Bing, and Baidu.

google spider bing scrapy google-images baidu pyqt image-downloader

Updated Jun 2, 2021
Python

librauee / Reptile

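🏀 Python3 web crawlers in practice (some with detailed tutorials): Maoyan, Tencent Video, Douban, Yanzhao (graduate admissions site), Weibo, Biquge novels, Baidu Hot Topics, Bilibili, CSDN, NetEase Cloud Reading, Ali Literature, Baidu Stocks, Toutiao, WeChat official accounts, NetEase Cloud Music, Lagou, Youdao, unsplash, Shixiseng, Autohome, League of Legends Box, Dianping, Lianjia, LPL schedule, typhoons, Fantasy Westward Journey and Onmyoji Cangbaoge (treasure pavilion), weather, Nowcoder, Baidu Wenku, bedtime stories, Zhihu, Wish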

spider python3 requests scrapy

Updated Apr 19, 2021
Python

holgerd77 / django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

python scraper django spider scraping scrapy webscraping

Updated Jun 27, 2021
Python

istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

python redis kafka scraping distributed scrapy

Updated Apr 7, 2021
Python

ScrapingBoot / JSpider

JSpider publishes the JS decryption method for at least one website every week. Stars are welcome. WeChat for discussion: 13298307816

nodejs javascript spider python3 scrapy

Updated Feb 2, 2020
JavaScript

mtianyan / FunpySpiderSearchEngine

Word2vec per-user personalized search + Scrapy2.3.0 (data crawling) + ElasticSearch7.9.1 (data storage and an external RESTful API) + Django3.1.1 search

mysql python redis search-engine elasticsearch django spider zhihu scrapy lagou elasticsearch-analysis-ik

Updated Jun 8, 2021
Python

vifreefly / kimuraframework

Kimurai is a modern web scraping framework written in Ruby which works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and lets you scrape and interact with JavaScript-rendered websites

crawler scraper scrapy headless-chrome kimurai

Updated Apr 28, 2021
Ruby

kezhenxu94 / house-renting

Possibly the best practice of Scrapy 🕷 and renting a house 🏡

python docker scrapy-spider scrapy scrapy-crawler scrapyd

Updated Jun 9, 2021
Python

jonbakerfish / TweetScraper

TweetScraper is a simple crawler/spider for Twitter Search without using API

twitter tweets scrapy twitter-search

Updated Apr 3, 2021
Python

ramsayleung / jd_spider

Two silly-cute distributed crawlers for JD.com.

docker mongodb graphite python3 scrapy

Updated Apr 8, 2019
Python

juancarlospaco / faster-than-requests

Faster requests on Python 3

Updated Jun 23, 2021
Nim

scrapinghub / scrapyrt

HTTP API for Scrapy spiders

python crawler scraper crawling twisted scrapy webcrawler webcrawling

Updated Jun 1, 2021
Python

hellock / icrawler

A multi-thread crawler framework with many builtin image crawlers provided.

python crawler spider scrapy google-images flickr-api bing-image

Updated Jul 18, 2021
Python

lb2281075105 / Python-Spider

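Douban Movies Top 250, crawling Douyu JSON data and pictures of beautiful women, Taobao, Youyuan, CrawlSpider crawling of partial basic profile information from the Hongniang matchmaking site, plus distributed crawling of Hongniang with Redis storage, small crawler demos, Selenium, crawling Dmall, Django API development, crawling Youyuan profile information, simulated Zhihu login, simulated GitHub login, simulated Tuchong login, crawling the entire Dmall store site, crawling WeChat official account article history, crawling articles shared in WeChat groups or by WeChat friends, itchat monitoring of articles shared by specified WeChat official accounts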

mysql python redis django spider mongodb selenium xpath scrapy pymysql itchat crawlspider weichat beautifulsoup4

Updated Dec 24, 2020
Python

MorvanZhou / easy-scraping-tutorial

Simple but useful Python web scraping tutorial code.

crawler regex scraping crawling requests asyncio scrapy beautifulsoup distributed-scraper urllib

Updated Mar 21, 2021
Jupyter Notebook

clemfromspace / scrapy-selenium

Scrapy middleware to handle javascript pages using selenium

crawling selenium scrapy

Updated May 16, 2021
Python

scrapy

Here are 2,618 public repositories matching this topic...

crawlab-team / crawlab

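Deployed with docker-compose; requests to the front-end page return 404

Is it not possible to use a MongoDB other than the one bundled with crawlab?

Configurable spider: after editing it in the UI, a "saved successfully" message appears, but nothing is actually saved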

lining0806 / PythonSpiderNotes

chyroc / WechatSogou

rmax / scrapy-redis

SpiderClub / haipproxy

DropsDevopsOrg / ECommerceCrawlers

scrapy-plugins / scrapy-splash

DormyMo / SpiderKeeper

Gerapy / Gerapy

nghuyong / WeiboSpider

my8100 / scrapydweb

User Guide | Q&A | User Guide (Chinese) | Q&A (Chinese)

wkunzhi / Python3-Spider

LuckyZXL2016 / Movie_Recommend

fabienvauchelles / scrapoxy

sczhengyabin / Image-Downloader

librauee / Reptile

holgerd77 / django-dynamic-scraper

istresearch / scrapy-cluster

ScrapingBoot / JSpider

mtianyan / FunpySpiderSearchEngine

vifreefly / kimuraframework

kezhenxu94 / house-renting

jonbakerfish / TweetScraper

ramsayleung / jd_spider

juancarlospaco / faster-than-requests

scrapinghub / scrapyrt

hellock / icrawler

lb2281075105 / Python-Spider

MorvanZhou / easy-scraping-tutorial

clemfromspace / scrapy-selenium
