All Questions

46 questions with no upvoted or accepted answers

7 votes · 0 answers · 150 views

Get a movie directory, rename it and save IMDb info as a webpage

I have my movie folders formatted as follows: ~ means no sub included; {} means already watched; nothing means sub included, ready to watch. The program asks for the movie directory, extracts the name and searches IMDb for the ...

6 votes · 0 answers · 73 views

Booking an East London Tennis Court

Description: I'm not sure if it's Covid-19, but lately it is impossible to book a tennis court in my area on time. It's always full, or maybe I just don't check often enough :) To beat the queue and get notified ...

6 votes · 2 answers · 132 views

Web scraper for driver's license test times

I have created a small Selenium script that checks for available times to write a test for a driver's license. The program runs every minute and takes approximately 50 seconds to run. I have noticed that it's ...

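The "runs every minute but takes ~50 seconds" pattern this question describes usually calls for sleeping only the remainder of the interval rather than a fixed 60 seconds. A minimal sketch (the function name and constant are illustrative, not the asker's code):

```python
INTERVAL = 60.0  # target: one check per minute

def remaining_sleep(started_at, now):
    """Return how long to sleep after a run, so a ~50 s run still yields
    one check per minute instead of one check every ~110 s."""
    elapsed = now - started_at
    return max(0.0, INTERVAL - elapsed)

# a 50 s run leaves 10 s of sleep; an overlong run sleeps not at all
print(remaining_sleep(0.0, 50.0))   # 10.0
print(remaining_sleep(0.0, 110.0))  # 0.0
```

In the real loop the two timestamps would come from time.monotonic() taken before and after the Selenium check.
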
6 votes · 0 answers · 849 views

Parsing different categories using Scrapy from a webpage

I've written a script in Python with Scrapy to parse the "model", "country" and "year" of various bikes from a webpage. There are several subcategories to track to reach ...

4 votes · 0 answers · 60 views

Scraping a hiring website using Python's requests and BeautifulSoup

I'm designing a scraping application using Python, requests and BeautifulSoup 4. I decided to divide the logic into two classes: Spider: gets the base url ...

4 votes · 0 answers · 88 views

List all links in a website

I wrote this code as part of a project I'm working on. It's supposed to get all links from a website and then perform black-box tests on them. How can I improve this code to be faster and more ...

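The core of such a link lister can be sketched with just the standard library (LinkCollector is a hypothetical class for illustration; real projects often use BeautifulSoup for the same job):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # urljoin turns relative hrefs into absolute URLs
                    self.links.append(urljoin(self.base_url, value))

collector = LinkCollector("https://example.com/")
collector.feed('<p><a href="/a">A</a> <a href="https://other.example/b">B</a></p>')
print(collector.links)  # ['https://example.com/a', 'https://other.example/b']
```

Each collected link can then be fetched in turn for the black-box checks.
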
4 votes · 0 answers · 812 views

Speeding up web scraping in Python 3

I want to get information from two websites and display it in 'real time' in the console. To get the information from the websites I am using BeautifulSoup 4. I have read that the bottleneck of ...

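Since the bottleneck in this kind of script is waiting on the network rather than parsing, the usual fix is to fetch both sites concurrently. A rough sketch with a thread pool (fetch is a stand-in for the real HTTP request, and the URLs are made up):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # stand-in for something like requests.get(url).text;
    # the network round trip is what dominates the runtime
    return f"<html>{url}</html>"

urls = ["https://site-a.example", "https://site-b.example"]

# both requests run at the same time; map preserves input order
with ThreadPoolExecutor(max_workers=8) as pool:
    pages = list(pool.map(fetch, urls))

print(pages)
```

Parsing with BeautifulSoup then happens on each element of pages as before.
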
3 votes · 0 answers · 60 views

Scraping OddsPortal with requests only

This is a scraper written to do most of what had been attempted by another user in the question "How can I optimise this webscraping code". I did the rewrite because I felt bad that the new user didn't ...

3 votes · 0 answers · 47 views

Speed Up API Requests & Overall Python Code

I'm not asking for help solving a problem, but rather for possible ways to improve the speed of my program. Essentially, what it does is track market data by pulling the data from ...

3 votes · 0 answers · 68 views

Scraping Reddit using Python

My objective is to find out what other subreddits users from r/(subreddit) are posting on; you can see my code below. It works pretty well, but I am curious to know if I could improve it by: First, ...

3 votes · 0 answers · 45 views

Scraping local news sites

This is my first Python web scraper (and overall my first Python project). I am also relatively new to OOP but do understand its core fundamentals. The script below scrapes two local news sites for ...

3 votes · 0 answers · 299 views

Python web scraper

This is my first attempt at a sizeable amount of code in Python, and I've made an attempt to standardize a script into a library so that it can be reused. However, ...

3 votes · 0 answers · 159 views

Web scrape data for a list of stocks

I wrote a script that will web-scrape data for a list of stocks. The scraper has to get the data from 2 separate pages, so for each stock symbol it must scrape 2 different pages. If I run the process on a ...

3 votes · 0 answers · 181 views

Scrape a stock value and add it to a CSV file along with other data [Python 3]

Any and all suggestions are welcome; I'm mainly looking for suggestions on clean-up and on making it more readable, as I feel that it is inefficient and not very clean. Some specific things: a better way ...

3 votes · 0 answers · 51 views

VIM colors downloader in Python, using multiprocessing

I recently posted this script: VIM colors downloader in Python. But since I'm not allowed to update the code there, I wanted to get an idea on this version, which uses multiprocessing: ...

3 votes · 0 answers · 203 views

Checking a paginated website for new entries

I'm interested in determining the best way to check a paginated website for new entries. I want to be able to scrape pages 1, 2, 3, ... as necessary to get all updates. However, the scraping is fairly ...

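One common shape for this "scrape pages 1, 2, 3, ... until nothing new" problem is to remember the newest entry from the previous run and stop as soon as it reappears. A toy sketch (fetch_page and the entry ids are invented for illustration; a real version would make HTTP requests):

```python
def fetch_page(n):
    # stand-in for scraping page n; returns entry ids, newest first
    pages = {1: [9, 8, 7], 2: [6, 5, 4], 3: [3, 2, 1]}
    return pages.get(n, [])

def new_entries(last_seen):
    """Walk pages 1, 2, 3, ... and stop as soon as an
    already-seen entry (or an empty page) is reached."""
    found = []
    page = 1
    while True:
        entries = fetch_page(page)
        if not entries:
            return found
        for entry in entries:
            if entry == last_seen:
                return found
            found.append(entry)
        page += 1

print(new_entries(6))  # [9, 8, 7] -- only the entries newer than id 6
```

Because the walk stops at the first known entry, most runs touch only the first page.
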
3 votes · 0 answers · 562 views

Scraping links from the first page of Google using Kivy

I'm making a scraper/web crawler using Kivy. When I run the code it works, but I'm not sure whether what I'm doing is Pythonic, because all the material I can find is about using the Kivy library. I'm unsure ...

2 votes · 0 answers · 64 views

Python script to scrape Google Maps without the API

I wrote a Python script to scrape Google Maps for my app. I really want my code to be readable and have tried to follow PEP 8 wherever I could, so I have come to you all for guidance. It uses Selenium ...

2 votes · 0 answers · 32 views

requests vs selenium vs scrapy

This is a follow-up to my question over here. I have been working on my web-scraping techniques by trying out different approaches and rewriting code for a handful of online databases. I've recently ...

2 votes · 0 answers · 278 views

Discord bot in Python: Selenium screenshots

I wrote my own Discord bot which takes screenshots of a specific website. The script is very simple and I thought it should work fast, but it doesn't. I have read a lot, and I think I'm not able to improve ...

2 votes · 0 answers · 42 views

Scraping forum tables with links using Beautiful Soup

This code scrapes post links (among other information) from a table on a forum. While the current code works, I would like to know if there is a better/simpler way of writing it (maybe not as many for-...

2 votes · 0 answers · 834 views

Downloading multiple URLs with aiohttp in Python 3

I am trying to use the aiohttp library in Python to download information from URLs. I have about 300,000 URLs, saved in the file "my_file.txt". When I get a web page, I extract pairs of a question and ...

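With ~300,000 URLs, the usual asyncio/aiohttp pattern is to bound concurrency with a semaphore so the event loop doesn't try to open every connection at once. A runnable sketch with a stand-in fetch (the aiohttp call it replaces is shown in a comment; URLs and names are illustrative):

```python
import asyncio

async def fetch(url):
    # with aiohttp this body would be roughly:
    #   async with session.get(url) as resp:
    #       return await resp.text()
    await asyncio.sleep(0)  # stand-in for the network round trip
    return f"page for {url}"

async def download_all(urls, limit=100):
    # the semaphore caps how many fetches run at once,
    # so 300,000 URLs don't all open sockets simultaneously
    sem = asyncio.Semaphore(limit)

    async def bounded(url):
        async with sem:
            return await fetch(url)

    # gather preserves the input order of the results
    return await asyncio.gather(*(bounded(u) for u in urls))

results = asyncio.run(download_all(["http://a.example", "http://b.example"]))
print(results)
```

The question/answer extraction would then run on each downloaded page, ideally as part of the bounded coroutine so memory stays flat.
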
2 votes · 0 answers · 298 views

Wrapping a web scraper in a RESTful API

The problem I am looking to solve is wrapping a web scraper in a RESTful API such that it can be called programmatically from another application, frontend or microservice. The overall goal is that ...

2 votes · 0 answers · 968 views

Grabbing information traversing multiple pages

I've written a script in Python, in combination with Selenium, to parse different information from a webpage and store the collected data in a CSV file. The data is coming through correctly. The email ...

2 votes · 0 answers · 831 views

Recursively scrape links from web pages and check them

I'm new to programming and especially new to object-oriented programming. I have built a web scraper using functional programming and am trying to build another using OOP principles. The overall idea ...

2 votes · 0 answers · 158 views

Handling IndexError using a lambda function within Scrapy

I've written a script using Python's Scrapy library to parse some fields from Craigslist. The spider I've created here is more ordinary than what is usually considered ideal for review. However, I'...

2 votes · 0 answers · 169 views

Football Web Scraper Part 2

A revised version of the code in this question. Things I have done so far: adjusted some formatting details such as constant and variable naming and indentation, and wrapped most of the functions into a ...

2 votes · 0 answers · 53 views

River Flood Warning System v2.1 - Those Pesky NoneTypes

Below is my River Flood Warning System, version 2 build 1. After following the help and advice given for version 1, the whole program is looking and behaving much better. The original code would only ...

2 votes · 0 answers · 1k views

Google searching bot with proxy support

I have been asked by a client to program a bot which searches Google and shows how many results I get. Note: I know about the Google Custom Search API, and it will not produce the exact output ...

2 votes · 0 answers · 2k views

A web crawler for scraping images from stock photo websites

I created a web crawler that uses Beautiful Soup to crawl images from a website and scrape them to a database. In order to use it, you have to create a class that inherits from Crawler and implements 4 ...

2 votes · 0 answers · 266 views

BeautifulSoup web spider for driver links

The following spider will grab some driver links, the OS version, and the name. All the info is in a table class, but some pages might differ a little in the location and number of cells in each row. ...

2 votes · 0 answers · 329 views

Prototype spider for indexing RSS feeds

This code is super slow. I'm looking for advice on how to improve its performance. ...

1 vote · 0 answers · 16 views

Organizing things together to form a minimum viable Scraper App (part 2)

This is a follow-up to my question over here. Response to @Reinderien's answer: I have corrected the more trivial issues highlighted in @Reinderien's answer below, as follows. ...

1 vote · 0 answers · 52 views

Optimising this web-scraping code

A member of SO whom I immensely respect just told me that the code below makes him uncomfortable. ...

1 vote · 0 answers · 56 views

Efficient multithreading in Python 3 - web scraping behind a login

I'm a total beginner at Python and I would like someone to rate my code and give me tips about communication between threads in Python. I don't know if I need this many threads for this script, but by ...

1 vote · 0 answers · 37 views

Web-scraping code to import logs from a website which is about to die

We all know that the famous Twitch log site OverRustleLogs is getting shut down. So I decided to do some web scraping to download my favourite streamer's logs using BeautifulSoup. How can I make this ...

1 vote · 0 answers · 142 views

Web scraping dynamic content

Hi, I'm fairly new to coding and would appreciate some feedback on the code. This is for a site that has a dynamic login page and infinite scrolling. I wanted to use Scrapy, not necessarily because it ...

1 vote · 0 answers · 44 views

bs4 HTML cleaner function with repetitive nested if clauses

I made a little web scraper that downloads the source HTML files as well. Now, for the sake of saving storage space, I wrote a small function to delete quite a bit of stuff in the HTML file (specific ...

1 vote · 0 answers · 111 views

Scraping and printing titles from Craigslist

I've written a very tiny script using a class to scrape some titles of products from Craigslist. My intention is to make use of the __str__() method so that my script can ...

1 vote · 0 answers · 62 views

Python 3.x Download (async) + Process (bs4) + Save (EPUB)

I had a simple web scraper with Beautiful Soup 4 which downloaded novel chapters from a website and converted them to an EPUB file. It was straightforward, simple imperative programming. Then I thought, ...

1 vote · 0 answers · 237 views

Extracting data from a used car sales site

I am developing code for extracting data from a used car sales site. There are 4 sites in total. For 3 of them I use requests and BeautifulSoup. The time taken to extract data from these sites was ...

1 vote · 0 answers · 435 views

Scraping web data using asynchronous requests

I've written a script using Python to grab different categories from a webpage. I used "grequests" in my scraper to perform the activity. My intention here was to perform the action swiftly, making ...

1 vote · 0 answers · 1k views

GUI in Tkinter to log events for a web scraper

I'm creating a GUI with Tkinter that will handle starting, stopping, and logging events for a web scraper (scraper not created yet). The current code is working... but I've been gathering my ...

1 vote · 0 answers · 2k views

Email a notification when detecting changes on a website - follow-up

I read through other questions here, improved the code, and added a new feature. The old question can be found at: Email a notification when detecting changes on a website. The improvements that are ...

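A compact way to implement the "detect changes" core of such a script is to store only a hash of the fetched page between runs and compare digests (the function name is illustrative, not from the asker's code):

```python
import hashlib

def fingerprint(html: str) -> str:
    """Hash the fetched page so only a short digest needs to be
    stored between runs; a different digest means the page changed."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

previous = fingerprint("<html>old content</html>")  # loaded from disk in a real run
current = fingerprint("<html>new content</html>")   # freshly fetched page
changed = previous != current
print(changed)  # True
```

When changed is true, the script would send the notification email and persist the new digest.
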
0 votes · 0 answers · 25 views

Adding page iteration capability to a Requests scraper

I am trying to build on @Reinderien's answer to my previous question over here to add page-iteration functionality to the code: ...

0 votes · 0 answers · 71 views

Asynchronous web scraping

This is my solution to a "vacancy test" task. I'm not at all sure that I have implemented the task correctly, but here is my solution. Goals of the code: parse rows of a table from a URL and ...
