All Questions

13 votes
3 answers
2k views

AniPop - The anime downloader

Note: The topics of performance and Selenium/BS4 have not yet been addressed, so this question can still receive a better answer! Chat Room: https://chat.stackexchange.com/rooms/100275/anipop-...
  • 2,947
5 votes
1 answer
590 views

Instagram scraper Posts (Videos and Photos)

I wrote this code to download images and videos from a specific Instagram profile. Using multiprocessing and threading, I managed to speed up the extraction of data. My goal is ...
  • 417
3 votes
1 answer
3k views

Scraping Instagram with selenium, extract URLs, download posts

I made a very simple Instagram bot that can download a user's images and videos, such as galleries with photos or videos. It saves the data in a folder. How it works: Creating directory for saving ...
  • 417
6 votes
1 answer
524 views

Web scraper that extracts urls from Amazon and eBay

Description: This is a simple script for scraping Amazon and eBay category, sub-category and product URLs and saving contents to files. In case of previously saved files, the files will be read and no ...
6 votes
1 answer
2k views

Scraping Instagram - Download posts, photos - videos

Python script that can download images and videos from public and private profiles, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium ...
  • 417
5 votes
0 answers
1k views

Scraping OddsPortal with requests only

This is a scraper written to do most of what had been attempted by another user in this question: "How can I optimise this webscraping code". I did the rewrite because I felt bad that the new user didn't ...
  • 55.5k
3 votes
1 answer
737 views

Web scraping using selenium, multiprocessing, InstagramBot

An Instagram bot which downloads the posts from a profile. I have to mention my previous posts: Instagram scraper Posts (Videos and Photos), Scraping Instagram with selenium, extract URLs, download ...
  • 417
2 votes
1 answer
630 views

Instagram Scraping Using Selenium

Python script that can download a user's images and videos, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium and navigate to ...
  • 417
2 votes
1 answer
307 views

Download pictures (or videos) from Instagram using Selenium

Python script that can download images and videos from public and private profiles, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium ...
  • 417
2 votes
1 answer
1k views

Instagram Scraping Posts Using Selenium

Python script that can download a user's images and videos, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium and navigate to ...
  • 417
1 vote
1 answer
718 views

Instagram Bot, selenium, web scraping

I made some changes to my code from the previous post. The changes that I made: I put all the functions into the class; I moved all the global arrays into the class too; Created ...
  • 417
44 votes
3 answers
2k views

We'll be counting stars

Lately, I've been, I've been losing sleep Dreaming about the things that we could be But baby, I've been, I've been praying hard, Said, no more counting dollars We'll be counting stars, yeah we'll be ...
20 votes
4 answers
10k views

Web scraping the titles and descriptions of trending YouTube videos

This scrapes the titles and descriptions of trending YouTube videos and writes them to a CSV file. What improvements can I make? ...
  • 1,080
14 votes
2 answers
840 views

VIM colors downloader in Python

Recently, I wanted to change my vim colors to something new. So I went to the vim colors website and then I decided that I wanted to download ALL the colors. So I ...
12 votes
3 answers
1k views

Minimal webcrawler - bad structure and error handling?

I wrote this code in one day as part of a job application, where they wanted me to make a minimal webcrawler in any language. The purpose was to crawl a site, find all of the URLs on that page, and ...
  • 417
10 votes
1 answer
636 views

River Flood Warning system in Python

This code represents my first real Python 3 program. It retrieves flood data from the NWS weather center for the river near my home and posts a warning to my Facebook page whenever certain flood ...
10 votes
1 answer
832 views

Let's read a random Goodreads book in an optimal way

I have made the following program to gather data on random books from Goodreads, via their random books feature. ...
  • 3,770
9 votes
1 answer
2k views

A library for interacting with Pinnacle Sports Bets API

My code provides the following functionality for interacting with Pinnacle Bets API: retrieving betting history retrieving fixtures (future events) retrieving odds for the given leagues (competitions)...
8 votes
2 answers
2k views

Image downloader for a website v2

This code takes a website and downloads all .jpg images in the webpage. It supports only websites that have the img element and src contains a .jpg link. The previous version can be found here ...
8 votes
2 answers
2k views

Parsing Wikipedia table with Python

I am new to Python and recently started exploring web crawling. The code below parses the S&P 500 List Wikipedia page and writes the data of a specific table into a database. While this script is ...
8 votes
2 answers
4k views

Finding words that rhyme

Preface I was trying to review this question on the same topic, but in the end many points I wanted to make were excellently explained by @ferada so I felt that posting my code and explaining the ...
  • 27.3k
6 votes
3 answers
692 views

Scraping a webpage copying with the logic of scrapy

Today, I came across a tutorial made by ScrapingHub on Scrapy about how it usually deals with a webpage while scraping its content. I could see that the same logic applied in Scrapy can be ...
  • 2,471
6 votes
3 answers
9k views

Email a notification when detecting changes on a website

The text of a website is checked in a given time period. If there are any changes, an email is sent. There is an option to show/mail the new parts in the website. What could be improved? ...
5 votes
1 answer
545 views

Ultra fast Amazon scraper multi-threaded

This is a follow up to the code here: Web scraper that extracts urls from Amazon and eBay A multi-threaded modification to the previous version that is Amazon focused and most of the necessary ...
5 votes
1 answer
210 views

Richelieu - product scraper

I wanted to see how I would deal with a large amount of data being scraped and written into a CSV file, so I decided to get the info out of a random website. First off, I found a way to search for ...
  • 331
4 votes
1 answer
884 views

Cleaner way of appending data to List in BeautifulSoup

I've been experimenting with various ways to get data from a variety of websites, such as using JSON or BeautifulSoup. Currently, I have written a scraper to collect data such as <...
  • 229
4 votes
0 answers
124 views

The anime downloader [duplicate]

NOTE: Here's the latest version of this program, since this question idled out. This is a recreational script made to update my home server w/ the latest anime from HorribleSubs. I'd like to know if ...
  • 2,947
4 votes
2 answers
10k views

Scraping HTML using Beautiful Soup

I have written a script using Beautiful Soup to scrape some HTML, do some processing, and produce HTML back. However, I am not convinced by my code and I am looking for some improvements. Structure of ...
  • 973
4 votes
1 answer
2k views

Web scraper for Football (Soccer) data with BeautifulSoup and Requests

I wrote a web scraper to get football scores from here. I'm getting the data for all seasons for the three major German leagues. It all works at the moment, but I'm sure it's possible to make it a lot ...
  • 442
4 votes
1 answer
697 views

Scraping a dynamic website with Scrapy (or Requests) and Selenium

I am trying to use Scrapy for one of the sites I've scraped before using Selenium over here. Because the search field for this site is dynamically generated and requires the user to hover the cursor ...
  • 435
4 votes
1 answer
333 views

Beginner web scraper for Nagios

I am attempting to learn Python. It was suggested to me to try a web scraper, so I thought to get myself to look at multiple Nagios instances. I have not programmed in Python before, but learned from ...
3 votes
2 answers
2k views

Instagram Scraping Using Selenium - Download Posts - Photos - Videos

Python script that can download images and videos from public and private profiles, such as galleries with photos or videos. It saves the data in a folder. How it works: Log in to Instagram using Selenium ...
  • 417
3 votes
1 answer
84 views

Collect Pokémon from a URL and store them in a dataframe

...
  • 31
3 votes
0 answers
119 views

requests vs selenium vs scrapy

This is a follow-up of my question over here. I have been working on my web-scraping techniques by trying out different approaches and rewriting code for a handful of online databases. I've recently ...
  • 435
3 votes
1 answer
478 views

Scraping current day Counter-Strike match results from a website

As a fan of competitive Counter-Strike, I like to keep up with who is currently winning and who is losing. There is a website that provides me with just that. I thought it would be cool if I could ...
  • 1,090
3 votes
1 answer
147 views

Reaching the philosophy wiki page - Follow Up

This is a follow up to my original post: I've written a class that will start from a random Wikipedia page, then choose the first link in the main body, and then navigate following the links until ...
3 votes
3 answers
2k views

Image downloader for a website

This code takes a website and downloads all .jpg images in the webpage. It supports only websites that have the <img> element and ...
3 votes
1 answer
116 views

Reaching the philosophy wiki page

I've written a class that will start from a random Wikipedia page, then choose the first link in the main body, and then navigate following the links until it finds the Philosophy page. When I run the ...
3 votes
1 answer
589 views

Web crawlers for three image sites

I'm very new to Python and only vaguely remember OOP from doing some Java a few years ago, so I don't know what the best way to do this is. I've built a bunch of classes that represent a crawler that ...
  • 153
2 votes
1 answer
113 views

Take information from a webpage and compare to previous request

I have made some improvements following my previous code review. I have used that knowledge to become a better coder, but now I'm here again asking for a code review where I think it could ...
2 votes
0 answers
2k views

A web crawler for scraping images from stock photo websites

I created a web crawler that uses Beautiful Soup to crawl images from a website and scrape them to a database. In order to use it, you have to create a class that inherits from Crawler and implements 4 ...
  • 153
1 vote
0 answers
20 views

Organizing things together to form a minimum viable Scraper App (part 2)

This is a follow-up of my question over here. Response to @Reinderien's answer: I have corrected the more trivial issues highlighted in @Reinderien's answer below as follows. ...
  • 435
1 vote
2 answers
1k views

Basic search engine

I want to improve the efficiency of this search engine. It works in about 10 seconds for a search depth of 1, but 4 minutes at 2 etc. I tried to give straightforward comments and variable names, any ...
  • 111
1 vote
1 answer
5k views

Crawl multiple pages at once

This is an update to my last question. I want to process multiple pages at once, pulling URLs from tier_list in the crawl_web ...
  • 111
1 vote
0 answers
2k views

Email a notification when detecting changes on a website - follow-up

I read through other questions here and improved the code and added a new feature. The old question can be found at: Email a notification when detecting changes on a website The improvements that are ...
1 vote
1 answer
13k views

Extract html content based on tags, specifically headers

I want the function to take as input a JSON file containing html_body with its corresponding url and return a list of tuples containing headers and their corresponding url (so could be tuple with one ...
  • 197
1 vote
1 answer
118 views

Organizing things together to form a minimum viable Scraper App

This is a follow-up of my group of scraper questions starting from here. I have thus far, with the help of @Reinderien, written 4 separate "modules" that expose a ...
  • 435