All Questions

3 votes
1 answer
106 views

Multi-Page Web Scraping Code Using Selenium with Multithreading

I have written a web scraping script using Selenium to crawl blog content from multiple URLs. The script processes URLs in batches of 1000 and uses multithreading with the ThreadPoolExecutor to ...
Minnie's user avatar
  • 31
5 votes
2 answers
687 views

Readability and error handling improvements for Python web scraping class

Description I recently wrote a Python script to download files from the Library of Congress (LOC) based on a search query. The code fetches metadata, extracts file ...
IntegerEuler's user avatar
2 votes
1 answer
80 views

Scrapy Spider for fetching product data from multiple pages of a website

I have written a Scrapy spider to scrape product data from a website. The spider navigates through multiple pages to reach a specific product and extracts details such as the product name, price, ...
I DON'T KNOW's user avatar
3 votes
2 answers
93 views

Validating a web crawlers page visits with a decorator

I am writing a crawler that is going to end up in production and I was trying to come up with a way to validate its page visits. It scrapes ASP.NET pages, so each scraping process involves a few ...
Gustavo Costa's user avatar
5 votes
3 answers
839 views

code format and steps web scraping using beautiful soup

I've done some simple web scraping and want to make sure all my steps are correct. Is it considered clean code? Is there a better way to use the multi-page scraping feature? ...
Lpython's user avatar
  • 51
0 votes
2 answers
158 views

Drayage Webscraper: Limited to table structure

This is my first working scraper. I'm sure a lot can be improved. My biggest question is how can I better specify what data to pull? All the data I'm currently grabbing is needed, but I couldn't ...
wigglesthe3rd's user avatar
2 votes
1 answer
72 views

A selenium web scraper to package NBA data

I'm building a selenium web scraper for basketball-reference.com that takes a player name and returns data in either a JSON format or Pandas DataFrame object. The class in question is one of many that ...
BluffShove's user avatar
5 votes
1 answer
196 views

Scraping the Divar.ir

I've written code to scrape Divar, which is the Iranian equivalent of eBay. I have a few questions: Am I doing the error handling and logging OK? Is there a better way to optimize this code? (note ...
Amirhossein Rezaei's user avatar
1 vote
2 answers
186 views

Web scraping spider

I'm currently working on my first web scraping project and I need to scrape a lot of websites. With my current code it takes more than a day but for my project I need to scan the same websites every 5 ...
Max's user avatar
  • 27
0 votes
1 answer
116 views

Poetry Web Scraping in Python [closed]

I have a script that obtains URLs that lead to a specific poem. This code currently works and uses multiprocessing pools. I am currently being restricted or blocked in some way by the website that I ...
watrgoat's user avatar
3 votes
1 answer
72 views

HTTP scraper for Python Package

I'm trying to make my first Python package as a learning experience. There are a lot of things that I suspect I am doing poorly, but this post is specifically about my HttpRequest class. I made this ...
JTB's user avatar
  • 277
3 votes
1 answer
72 views

URL link scraper and analyser

I recently wrote a testing tool (called plink) for retrieving all the links from a website (and then retrieving links from the linked pages, and so on). Essentially,...
Jessica's user avatar
  • 890
4 votes
2 answers
266 views

Test generator I made for practice

Made this generator to practice using imports from other modules and better readability for coding. What could I have done better and what did I do wrong? File called test_generator.py ...
Beginner's user avatar
  • 199
2 votes
1 answer
116 views

Search Stack Overflow and GitHub for code in a specified language

This code is designed to scrape Stack Overflow and GitHub, pulling information based on a user-specified programming language and processing the data into a format for AI learning. It uses a number of ...
Robert3737's user avatar
3 votes
1 answer
229 views

A simple web scraper for nature.com news articles

I have created a simple web scraper that fetches news article previews from nature.com and saves each article to a file containing the article preview text. I am learning independently, so I would ...
razzleDazzle's user avatar
