Frequent 'python+web-scraping' Questions

13 votes

3 answers

2k views

AniPop - The anime downloader

Note: The topics of performance and Selenium/BS4 have not yet been addressed, so this question can still receive a better answer! Chat Room: https://chat.stackexchange.com/rooms/100275/anipop-...

T145

2,947

asked Oct 20, 2019 at 21:42

5 votes

1 answer

590 views

Instagram scraper Posts (Videos and Photos)

I wrote this code which has the ability to download images and videos from a specific Instagram profile. Using multiprocessing and threading I managed to speed up the extraction of data. My goal is ...

AlexDotis

417

asked Mar 11, 2020 at 14:38

3 votes

1 answer

3k views

Scraping Instagram with selenium, extract URLs, download posts

I made a very simple Instagram bot that can download a user's images and videos, such as galleries with photos or videos. It saves the data in a folder. How it works: Creating directory for saving ...

AlexDotis

417

asked Mar 14, 2020 at 20:50

6 votes

1 answer

524 views

Web scraper that extracts urls from Amazon and eBay

Description: This is a simple script for scraping Amazon and eBay category, sub-category and product URLs and saving contents to files. In case of previously saved files, the files will be read and no ...

watch-this

1

asked Oct 16, 2019 at 2:51

6 votes

1 answer

2k views

Scraping Instagram - Download posts, photos - videos

Python script that can download images and videos from public and private profiles, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium ...

AlexDotis

417

asked Apr 12, 2020 at 16:27

5 votes

0 answers

1k views

Scraping OddsPortal with requests only

This is a scraper written to do most of what had been attempted by another user in this question: How can I optimise this webscraping code. I did the rewrite because I felt bad that the new user didn't ...

Reinderien

55.5k

asked Jun 24, 2021 at 22:27

3 votes

1 answer

737 views

Web scraping using selenium, multiprocessing, InstagramBot

An Instagram bot which downloads the posts from a profile. I have to mention my previous posts: Instagram scraper Posts (Videos and Photos); Scraping Instagram with selenium, extract URLs, download ...

AlexDotis

417

asked Mar 16, 2020 at 1:14

2 votes

1 answer

630 views

Instagram Scraping Using Selenium

Python script that can download a user's images and videos, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium and navigate to ...

AlexDotis

417

asked Mar 28, 2020 at 1:40

2 votes

1 answer

307 views

Download pictures (or videos) from Instagram using Selenium

Python script that can download images and videos from public and private profiles, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium ...

AlexDotis

417

asked Apr 9, 2020 at 3:07

2 votes

1 answer

1k views

Instagram Scraping Posts Using Selenium

Python script that can download a user's images and videos, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium and navigate to ...

AlexDotis

417

asked Apr 4, 2020 at 20:15

1 vote

1 answer

718 views

Instagram Bot, selenium, web scraping

I made some changes to my code from the previous post. The changes that I made: I put all the functions into the class. I moved all the global arrays into the class too. Created ...

AlexDotis

417

asked Mar 17, 2020 at 20:01

44 votes

3 answers

2k views

We'll be counting stars

Lately, I've been, I've been losing sleep Dreaming about the things that we could be But baby, I've been, I've been praying hard, Said, no more counting dollars We'll be counting stars, yeah we'll be ...

Simon Forsberg

58.7k

asked Apr 26, 2014 at 19:14

20 votes

4 answers

10k views

Web scraping the titles and descriptions of trending YouTube videos

This scrapes the titles and descriptions of trending YouTube videos and writes them to a CSV file. What improvements can I make? ...

austingae

1,080

asked Dec 30, 2018 at 20:57

14 votes

2 answers

840 views

VIM colors downloader in Python

Recently, I wanted to change my vim colors to something new. So I went to the vim colors website and then I decided that I wanted to download ALL the colors. So I ...

Kernel.Panic

173

asked Aug 30, 2016 at 14:04

12 votes

3 answers

1k views

Minimal webcrawler - bad structure and error handling?

I wrote this code in one day as part of a job application, where they wanted me to make a minimal webcrawler in any language. The purpose was to crawl a site, find all of the URLs on that page, and ...

bjornasm

417

asked May 11, 2014 at 20:16

10 votes

1 answer

636 views

River Flood Warning system in Python

This code represents my first real Python 3 program. It retrieves flood data from the NWS weather center for the river near my home and posts a warning to my Facebook page whenever certain flood ...

Aaron Nelson

163

asked Jun 6, 2017 at 14:29

10 votes

1 answer

832 views

Let's read a random Goodreads book in an optimal way

I have made the following program to gather data on random books from Goodreads, via their random books feature. ...

esote

3,770

asked Apr 29, 2017 at 4:04

9 votes

1 answer

2k views

A library for interacting with Pinnacle Sports Bets API

My code provides the following functionality for interacting with Pinnacle Bets API: retrieving betting history; retrieving fixtures (future events); retrieving odds for the given leagues (competitions)...

Konstantin Kostanzhoglo

143

asked Jul 29, 2020 at 2:46

8 votes

2 answers

2k views

Image downloader for a website v2

This code takes a website and downloads all .jpg images in the webpage. It supports only websites that use the img element with a src containing a .jpg link. The previous version can be found here ...

Salah Eddine

303

asked Apr 30, 2017 at 5:23

8 votes

2 answers

2k views

Parsing Wikipedia table with Python

I am new to Python and recently started exploring web crawling. The code below parses the S&P 500 List Wikipedia page and writes the data of a specific table into a database. While this script is ...

DatenBergwerker

83

asked Feb 26, 2017 at 5:58

8 votes

2 answers

4k views

Finding words that rhyme

Preface: I was trying to review this question on the same topic, but in the end many points I wanted to make were excellently explained by @ferada, so I felt that posting my code and explaining the ...

Caridorc

27.3k

asked Aug 30, 2016 at 21:50

6 votes

3 answers

692 views

Scraping a webpage copying with the logic of scrapy

Today, I came across a tutorial made by ScrapingHub on Scrapy about how it usually deals with a webpage while scraping its content. I could see that the same logic applied in Scrapy can be ...

SIM

2,471

asked Aug 31, 2017 at 12:31

6 votes

3 answers

9k views

Email a notification when detecting changes on a website

The text of a website is checked at a given time interval. If there are any changes, a mail is sent. There is an option to show/mail the new parts of the website. What could be improved? ...

questionanswer

73

asked Feb 14, 2016 at 11:24

5 votes

1 answer

545 views

Ultra fast Amazon scraper multi-threaded

This is a follow-up to the code here: Web scraper that extracts urls from Amazon and eBay. A multi-threaded modification to the previous version that is Amazon focused and most of the necessary ...

watch-this

1

asked Oct 22, 2019 at 1:29

5 votes

1 answer

210 views

Richelieu - product scraper

I wanted to see how I would deal with a large amount of data being scraped and written into a CSV file, so I decided to get the info out of a random website. First off, I found a way to search for ...

Cajuu'

331

asked Nov 18, 2017 at 23:06

4 votes

1 answer

884 views

Cleaner way of appending data to List in BeautifulSoup

So I've been experimenting with various ways to get data from a variety of websites, such as using JSON or BeautifulSoup. Currently, I have written a scraper to collect data such as <...

Minial

229

asked Jan 10, 2019 at 4:54

4 votes

0 answers

124 views

The anime downloader [duplicate]

NOTE: Here's the latest version of this program, since this question idled out. This is a recreational script made to update my home server w/ the latest anime from HorribleSubs. I'd like to know if ...

T145

2,947

asked Oct 16, 2019 at 0:41

4 votes

2 answers

10k views

Scraping HTML using Beautiful Soup

I have written a script using Beautiful Soup to scrape some HTML, do some processing, and produce HTML back. However, I am not convinced by my code and I am looking for some improvements. Structure of ...

avi

973

asked Sep 9, 2013 at 12:05

4 votes

1 answer

2k views

Web scraper for Football (Soccer) data with BeautifulSoup and Requests

I wrote a web scraper to get football scores from here. I'm getting the data for all seasons for the three major German leagues. It all works at the moment, but I'm sure it's possible to make it a lot ...

iuvbio

442

asked Nov 25, 2017 at 20:18

4 votes

1 answer

697 views

Scraping a dynamic website with Scrapy (or Requests) and Selenium

I am trying to use Scrapy for one of the sites I've scraped before using Selenium over here. Because the search field for this site is dynamically generated and requires the user to hover the cursor ...

Sati

435

asked Aug 2, 2021 at 9:33

4 votes

1 answer

333 views

Beginner web scraper for Nagios

I am attempting to learn Python. It was suggested to me to try a web scraper, so I thought to get myself to look at multiple Nagios instances. I have not programmed in Python before, but learned from ...

Canadian Luke

811

asked Jan 22, 2020 at 22:56

3 votes

2 answers

2k views

Instagram Scraping Using Selenium - Download Posts - Photos - Videos

Python script that can download images and videos from public and private profiles, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium ...

AlexDotis

417

asked Apr 19, 2020 at 13:51

3 votes

1 answer

84 views

Collect Pokémon from a URL and store them in a dataframe

...

Ashok

31

asked Jul 1, 2021 at 15:29

3 votes

0 answers

119 views

requests vs selenium vs scrapy

This is a follow-up of my question over here. I have been working on my web-scraping techniques by trying out different approaches and rewriting code for a handful of online databases. I've recently ...

Sati

435

asked Jun 23, 2021 at 15:39

3 votes

1 answer

478 views

Scraping current day Counter-Strike match results from a website

As a fan of competitive Counter-Strike, I like to keep up with who is currently winning and who is losing. There is a website that provides me with just that. I thought it would be cool if I could ...

Luke

1,090

asked Jul 12, 2017 at 16:05

3 votes

1 answer

147 views

Reaching the philosophy wiki page - Follow Up

This is a follow up to my original post: I've written a class that will start from a random Wikipedia page, then choose the first link in the main body, and then navigate following the links until ...

loremIpsum1771

459

asked Apr 3, 2017 at 4:02

3 votes

3 answers

2k views

Image downloader for a website

This code takes a website and downloads all .jpg images in the webpage. It supports only websites that have the <img> element and ...

Salah Eddine

303

asked Apr 29, 2017 at 15:46

3 votes

1 answer

116 views

Reaching the philosophy wiki page

I've written a class that will start from a random Wikipedia page, then choose the first link in the main body, and then navigate following the links until it finds the Philosophy page. When I run the ...

loremIpsum1771

459

asked Apr 2, 2017 at 20:04

3 votes

1 answer

589 views

Web crawlers for three image sites

I'm very new to Python and only vaguely remember OOP from doing some Java a few years ago, so I don't know what the best way to do this is. I've built a bunch of classes that represent a crawler that ...

davegri

153

asked Oct 2, 2015 at 13:55

2 votes

1 answer

113 views

Take information from a webpage and compare to previous request

I have made some improvements since my previous code review. I have used that knowledge to become a better coder, but now I'm here again asking for a code review where I think it could ...

PythonNewbie

609

asked Apr 22, 2021 at 16:09

2 votes

0 answers

2k views

A web crawler for scraping images from stock photo websites

I created a web crawler that uses Beautiful Soup to crawl images from a website and scrape them to a database. In order to use it you have to create a class that inherits from Crawler and implements 4 ...

davegri

153

asked Oct 3, 2015 at 15:02

1 vote

0 answers

20 views

Organizing things together to form a minimum viable Scraper App (part 2)

This is a follow-up of my question over here. Response to @Reinderien's answer: I have corrected the more trivial issues highlighted in @Reinderien's answer below as follows. ...

Sati

435

asked Jul 7, 2021 at 10:56

1 vote

2 answers

1k views

Basic search engine

I want to improve efficiency of this search engine. It works in about 10 seconds for a search depth of 1, but 4 minutes at 2 etc. I tried to give straightforward comments and variable names, any ...

Ralph

111

asked Oct 2, 2014 at 10:18

1 vote

1 answer

5k views

Crawl multiple pages at once

This is an update to my last question. I want to process multiple pages at once, pulling URLs from tier_list in the crawl_web ...

Ralph

111

asked Oct 4, 2014 at 23:52

1 vote

0 answers

2k views

Email a notification when detecting changes on a website - follow-up

I read through other questions here and improved the code and added a new feature. The old question can be found at: Email a notification when detecting changes on a website The improvements that are ...

questionanswer

73

asked Feb 16, 2016 at 18:07

1 vote

1 answer

13k views

Extract html content based on tags, specifically headers

I want the function to take as input a JSON file containing html_body with its corresponding url and return a list of tuples containing headers and their corresponding url (so could be tuple with one ...

oba2311

197

asked Jun 26, 2017 at 13:36

1 vote

1 answer

118 views

Organizing things together to form a minimum viable Scraper App

This is a follow-up of my group of scraper questions starting from here. I have thus far, with the help of @Reinderien, written 4 separate "modules" that expose a ...

Sati

435

asked Jun 26, 2021 at 1:09
