Updated Feb 5, 2021 - Makefile
# web-scraping
Here are 2,002 public repositories matching this topic...
List of libraries, tools and APIs for web scraping and data processing.
javascript, ruby, python, go, golang, php, awesome, proxy, proxy-server, web-scraping, awesome-list, proxy-list, proxylist, data-processing, captcha-solving, captcha-breaking, captcha-solver, anti-captcha, captcha-recognition, proxyserver
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
python, crawler, machine-learning, scraper, automation, ai, scraping, artificial-intelligence, web-scraping, scrape, webscraping, webautomation
Updated Feb 3, 2021 - Python
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
api, php, http, client, json, framework, curl, xml, proxy, restful, class, http-client, http-proxy, api-client, web-scraper, requests, web-scraping, php-curl, web-service, php-curl-library
Updated Jan 22, 2021 - PHP
Web Scraping Framework
Updated Dec 10, 2020 - Python
A New Version of 30 Days of Python is nearly here. Get started today.
python, api, flask, automation, tutorial, csv, jupyter, rest-api, selenium, pandas, python3, web-scraping, selenium-webdriver, fastapi
Updated Feb 2, 2021 - HTML
General Assembly's 2015 Data Science course in Washington, DC
python, data-science, machine-learning, natural-language-processing, course, clustering, naive-bayes, linear-regression, scikit-learn, jupyter-notebook, pandas, data-visualization, web-scraping, data-analysis, ensemble-learning, logistic-regression, decision-trees, regular-expressions, data-cleaning, model-evaluation
Updated Apr 18, 2016 - Jupyter Notebook
A Devtools driver for web automation and scraping
scraper, automation, chrome-devtools, headless, devtools, web-scraping, cdp, chrome-headless, rod, chrome-devtools-protocol, devtools-protocol
Updated Feb 5, 2021 - Go
Snoop: an open-source intelligence (OSINT) reconnaissance tool (OSINT world)
windows, linux, security, osint, scanner, geo, geolocation, web-scraping, ip, police, infosec, ctf, termux, pentest, nickname, blueteam, redteam, username-checker, intelligence-service, username-search
Updated Feb 5, 2021 - Python
Collection of scripts corresponding to LucidProgramming YouTube tutorials
python, python3, web-scraping, youtube-tutorial, python-tutorial, ctci-solutions, lucidprogramming, python3-tutorial, technical-interview
Updated Feb 6, 2021 - Python
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Updated Dec 27, 2020 - Ruby
Next.js server to query websites with GraphQL
Updated Jan 29, 2021 - JavaScript
Faster requests on Python 3
python, curl, high-performance, cython, python-library, web-scraper, python3, speed, open-data, http-requests, web-scraping, scrapy, ndjson, python-requests, urllib, download-file, urllib3, faster-than-requests, requests3, requests-toolbelt
Updated Jan 21, 2021 - Python
Random User-Agent middleware based on fake-useragent
Updated Sep 17, 2020 - Python
A JavaScript library for generating random user agents with data that's updated daily.
javascript, user-agent, random, randomization, navigator, web-scraping, browsers, browser-automation, user-agent-spoofer
Updated Dec 11, 2020 - JavaScript
A framework for creating semi-automatic web content extractors
python, crawler, tutorial, extractor, scraping, web-scraper, selector, css-selector, web-scraping, scrapy, scrapers, beautifulsoup, xpath-expression, lxml, selector-expression
Updated Oct 24, 2020 - Python
UI.Vision: Open-Source RPA Software (formerly Kantu) - Modern Robotic Process Automation with Selenium IDE++
opencv, automation, webassembly, web-scraping, autohotkey, browser-extension, imacros, selenium-ide, browser-automation, visual-recognition, sikulix, web-automation, ui-tests, uipath, data-driven-tests
Updated Oct 28, 2020 - JavaScript
The Python Code Tutorials
python, python-tutorials, machine-learning, natural-language-processing, computer-vision, text-classification, tutorials, python3, web-scraping, face-detection, scapy, network-analysis, network-programming, programming-tutorial, ethical-hacking, network-security, socket-programming, scapy-tutorials
Updated Feb 3, 2021 - Jupyter Notebook
Python binding to Modest engine (fast HTML5 parser with CSS selectors).
Updated Jan 9, 2021 - Python
ACHE is a web crawler for domain-specific search.
Updated Jan 10, 2021 - Java
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Updated Nov 24, 2019 - Go
NBA Stats API via Basketball Reference
Updated Jan 13, 2021 - HTML
A tool for crawling and analyzing keywords from Vietnamese online news sites
Updated Feb 1, 2021 - Python
alaimia commented Feb 26, 2020
Hello,
I need to scrape LinkedIn posts: extract comments, views, and profiles of the people who interact with the post...
So please, Austin or anyone else, do you have any idea how to do it using scrape company?
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Updated Apr 26, 2019 - Python
pjsier commented Jan 25, 2021
URL: https://www2.illinois.gov/sites/hfsrb/events/Pages/Board-Meetings.aspx
Spider Name: il_health_facilities
Agency Name: Illinois Health Facilities and Services Review Board
Python scripts for building 'Short Jokes' dataset, featured on Kaggle
Updated Oct 1, 2020 - Python
Guide, reference and cheatsheet on web scraping using rvest, httr and RSelenium.
Updated Dec 17, 2020 - R
The main examples on the Apify SDK webpage, in the GitHub repo, and in the CLI templates should demonstrate how to manipulate the DOM and retrieve data from it.
Also add one example of scraping with Apify SDK + jQuery to https://sdk.apify.com/docs/examples/basiccrawler
Feedback from: https://medium.com/better-programming/do-i-need-python-scrapy-to-build-a-web-scraper-7cc7cac2081d
I lost an hour trying to make