Newest 'web-scraping' Questions

3

votes

0 answers

30 views

Regularly watch recent posts of a blog for specific words with HTML scraping

Task: I want to watch the "Recent Posts" section of a blog for changes/new posts, but only for posts containing a pre-defined word. Afterwards a list should be output to the console with ...

asked Apr 4 at 13:38

sceiler

17015

3

votes

1 answer

48 views

Reaching the philosophy wiki page - Follow Up

This is a follow-up to my original post: I've written a class that will start from a random Wikipedia page, then choose the first link in the main body, and then navigate following the links until ...

python web-scraping matplotlib wikipedia

asked Apr 3 at 4:02

loremIpsum1771

1965

3

votes

1 answer

22 views

Reaching the philosophy wiki page

I've written a class that will start from a random Wikipedia page, then choose the first link in the main body, and then navigate following the links until it finds the Philosophy page. When I run the ...

python web-scraping matplotlib wikipedia

asked Apr 2 at 20:04

loremIpsum1771

1965

0

votes

2 answers

47 views

Scraping articles on a web site

I'm trying to create a scraping API in Express. The API scrapes the different articles featured on the home page. Here are the issues I'm trying to resolve: My code's turning into a ton of jQuery ...

javascript node.js web-scraping express.js

asked Mar 31 at 14:44

beckah

1377

3

votes

1 answer

35 views

Scraping data from a table in python

I'm new to Python, and after doing a few tutorials, some about scraping, I've been trying some simple scraping on my own. Using BeautifulSoup I managed to get data from webpages where everything has ...

python html web-scraping

asked Mar 30 at 19:15

Pablo

186

3

votes

1 answer

40 views

Email finding bot

I'm working on a lead generation bot that helps you find the emails of people you want to reach out to. The bot grabs your spreadsheet from gdrive, logs into several email finding tools, and collects ...

ruby web-scraping email google-sheets webdriver

asked Mar 30 at 19:02

Tommy Adey

1183

3

votes

0 answers

32 views

Parsing HTML to download e-books

I'm currently writing a little tool to get into Go. As I'm not familiar with the language, I'm especially looking for conventional Go stuff. utility.go feels wrong. Should I wrap the client and email/...

beginner regex go web-scraping

asked Mar 27 at 22:42

Nordiii

312

3

votes

1 answer

48 views

March Madness Simulator

It's a March Madness simulator for the Final Four! Please be as critical as you can :) ...

python web-scraping beautiful-soup

asked Mar 22 at 17:58

Matias K

161

3

votes

2 answers

91 views

Article date extractor

I am quite new to Rust and this is my first library written in it. It's an article date extractor heavily inspired by the original Python library as well as its Haskell port. It is fairly small and ...

beginner parsing datetime web-scraping rust

asked Mar 22 at 15:49

Alexey Zabelin

214

1

vote

1 answer

57 views

Composing best web page fetcher function by HttpClientHandler for C#

The returned class as result: ...

c# http web-scraping

asked Mar 21 at 1:56

MonsterMMORPG

121129

-1

votes

0 answers

78 views

Webscraper using threads and grequest in Python

I have to scrape a government website and retrieve a particular food's calorie information. I wrote a program that does this. I have been all over the internet looking at ...

python multithreading web-scraping

asked Mar 20 at 16:51

CasualCoder3

43

3

votes

0 answers

89 views

Thousands of GET requests for brute-force authentication attempts

I am applying brute force to discover 123456 passwords on a given site (I am not going to say which one, of course). The HTML gets a ...

performance authentication f# concurrency web-scraping

asked Mar 18 at 5:45

Gabriel

35513

8

votes

1 answer

44 views

Scan a webpage to find the start time and date for an event

I am working on a simple web crawler that returns the start time and date for an event listed on a webpage. The webpage can be in two different formats and there are multiple other dates listed on the ...

python web-scraping

asked Mar 14 at 23:37

BruceM

433

5

votes

1 answer

51 views

Link checker using Go channels

I've started to learn Golang and its channels. I decided to write a simple application: a recursive link checker. Given some URL, it tries to retrieve pages, parses them and goes deeper. Here's a code ...

beginner http concurrency go web-scraping

asked Mar 13 at 18:12

Eugene Lisitsky

1365

6

votes

1 answer

100 views

Improving the performance of a webscraper

I have here a modified version of a web scraping code I wrote some weeks back. With some help from this forum, this modified version is faster (at 4 secs per iteration) than the earlier version. ...

python performance python-2.7 web-scraping beautiful-soup

asked Mar 13 at 12:28

Nobi

1206

6

votes

1 answer

54 views

Python siren control

Below is some code I've put together to control a siren for a fire service. It works by webscraping a paging feed and looks for set triggers. Is there a better way of writing my code or is this "...

python beginner python-2.7 web-scraping raspberry-pi

asked Mar 9 at 11:33

shaggs

1336

4

votes

1 answer

72 views

BFS/DFS Web Crawler

I've built a web crawler that starts at an origin URL and crawls the web using a BFS or DFS method. Everything is working fine, but the performance is horrendous. I think the major cause of this is ...

python performance web-scraping breadth-first-search depth-first-search

asked Mar 3 at 23:24

123

220310

7

votes

2 answers

331 views

Parsing Wikipedia table with Python

I am new to Python and recently started exploring web crawling. The code below parses the S&P 500 List Wikipedia page and writes the data of a specific table into a database. While this script is ...

python python-3.x parsing web-scraping wikipedia

asked Feb 26 at 5:58

DatenBergwerker

383

5

votes

1 answer

42 views

Magpi Magazine Downloader

I have created a simple program at https://github.com/Epic0ne/magpi-downloader in Python which downloads all issues of the MagPi magazine by parsing this web page for links ending in ...

python python-3.x web-scraping

asked Feb 25 at 18:57

An Epic Person

1726

7

votes

2 answers

209 views

Optimizing the speed of a web scraper

I have just written this code to scrape some data from a website. In its current state it works fine, however, going by my tests on the script, I discovered that with the amount of data I am ...

python performance python-2.7 web-scraping beautiful-soup

asked Feb 18 at 10:45

Nobi

1206

5

votes

1 answer

140 views

Simple recursive web crawler

I wrote a simple web crawler. I know there are many better ones out there; I did this just for learning purposes. The problem is that I think there are some things I could improve here. I commented the ...

python python-3.x web-scraping

asked Feb 7 at 10:48

Miguel

1257

6

votes

1 answer

126 views

Basic IMDb scraper and movie generator

I just built my first scraper and I'd like to get your thoughts on the structure and the way I went about it. The basic premise of the script: Get a random movie from IMDb's Top 250; ask the user ...

python python-3.x web-scraping

asked Jan 30 at 0:12

user1318340

313

2

votes

1 answer

73 views

Scrapy Username Spider

This currently starts out with a speed of 2000 pages/min, but shortly after starting it becomes very slow, with a speed of about 200 pages/min. Why is this happening? How can I improve this scraper? ...

python performance web-scraping scrapy

asked Jan 18 at 18:47

edsheeran

655

3

votes

2 answers

249 views

Scraping HTML via async controller & classes + HTML agility pack

I've developed a simple application to grab golfer index scores from a website that has no API. The application works but is very slow: updating 6 users takes 60 seconds. I've tried ...

c# entity-framework web-scraping async-await asp.net-mvc-5

asked Jan 12 at 13:39

Waragi

1379

0

votes

0 answers

332 views

Python Scrapy code using Selenium

I have written some Python code that uses Scrapy and Selenium to scrape restaurant names and addresses from a website. I needed to use Selenium because the button to show more restaurants on a page is ...

python web-scraping selenium scrapy

asked Dec 22 '16 at 19:43

nevster

284

5

votes

1 answer

119 views

Webscraping calendar events using Python 3, with or without BeautifulSoup

I'm trying to find out why my web-scraping code with BeautifulSoup (BS) is slower than my code without BS. I would think that BS code would be faster than the other code - so, maybe I'm doing ...

python performance comparative-review web-scraping beautiful-soup

asked Dec 19 '16 at 23:47

nick

261

5

votes

1 answer

76 views

Scraping names of directors from a website

I am scraping the names of the directors from a website using Python / Scrapy. I am very new to coding (under a year and after work) - any views would be appreciated. The reason I have a ...

python web-scraping scrapy

asked Dec 19 '16 at 21:28

nevster

284

4

votes

2 answers

73 views

Scrape 4chan for alive images

I've been trying to learn Clojure recently and I thought writing a simple web app would be a good way to dive in. This function gets the list of alive threads from the API and reduces, filters and maps ...

clojure web-scraping

asked Nov 13 '16 at 13:51

user114084

233

0

votes

3 answers

123 views

Image hosting site image downloader using requests and BeautifulSoup

I went about this the way I have because Selenium is slow and inconvenient (opens browser) and I was unable to find the href attribute in the link element using ...

python web-scraping beautiful-soup

asked Nov 10 '16 at 16:09

Michael Johnson

1847

2

votes

1 answer

139 views

Scrape an infinite-scroll page

My algorithm scrapes an infinite-scroll page but it takes too long. It scrolls three times but I'm wondering if there is a way to do a ScrollBottom() so no need of ...

javascript node.js web-scraping

asked Nov 9 '16 at 18:59

tribet

296110

4

votes

2 answers

867 views

Simple Python job vacancies downloader

I have created a BeautifulSoup vacancies parser, which works, but I do not like how it looks. Therefore, I'd be very happy if somebody could give me any improvements. ...

python web-scraping beautiful-soup

asked Nov 7 '16 at 12:48

Vadim Kuznetsov

333

3

votes

1 answer

107 views

VBA - XMLHTTP web scraping

I navigate with IE, do various things, then select all results option from a list and fire on click event. Once all results have been listed, I loop through their URLs, using the following code to ...

performance vba web-scraping

asked Oct 29 '16 at 9:06

Ryszard Jędraszyk

1363

0

votes

0 answers

79 views

Scraping websites and saving to MySQL

I have the following piece of code which scrapes websites and saves some information back to MySQL. At the moment it is consuming all the memory on my machine every time it runs. I've refactored the ...

javascript mysql node.js web-scraping memory-optimization

asked Oct 11 '16 at 11:57

tribet

296110

4

votes

1 answer

276 views

Web scraping VBA - Internet Explorer

The code below extracts data from one web page - I emulate search, select all results from the list and when the list appears (42000 items) I loop through these items. I get an id value from their ...

vba web-scraping internet-explorer

asked Oct 2 '16 at 0:39

Ryszard Jędraszyk

1363

1

vote

1 answer

256 views

Basic web scrape project written in NodeJS

Here is a short web scraping program written in Node.js. I'm just getting to grips with Node and this is the first thing I've written with it. I'm liking it so far though I guess I'm kinda ...

javascript node.js web-scraping promise

asked Sep 26 '16 at 5:26

bloppit

362

1

vote

0 answers

171 views

GUI in Tkinter to log events for a web-scraper

I'm creating a GUI with tkinter that will handle starting, stopping, and logging events for a web-scraper (scraper not created yet). The current code is working... but I've been gathering my ...

python web-scraping gui tkinter

asked Sep 21 '16 at 20:36

DjH

61

3

votes

1 answer

162 views

Sample scraping Project Gutenberg using Beautiful Soup and requests

I am trying to learn web scraping in Python using Beautiful Soup and requests. My program goes to the book page on Project Gutenberg with the given book number (Example). It then finds the link for ...

python beginner python-3.x web-scraping beautiful-soup

asked Sep 20 '16 at 10:14

syed saad

404

2

votes

1 answer

63 views

Downloading and saving news articles

To be honest, I am pretty new to coding. I want to analyse the articles my local newspaper publishes on their website. To do that, I have a list of URLs and from that, I want to download and save all ...

python performance python-3.x web-scraping beautiful-soup

asked Sep 17 '16 at 16:26

gempie

112

4

votes

1 answer

50 views

Haikuifier (Or at least Haiku Identifier)

All the usual stuff. Style, substance, algorithm please! I'm vaguely considering plugging it into a bot, hence the lacklustre catching of exceptions right now. ...

python python-3.x web-scraping natural-language-proc

asked Sep 14 '16 at 12:59

Yann

1,7261734

2

votes

1 answer

140 views

Translating text using Google Translate mobile site

I have this code to translate text using the Google Translate mobile site. Currently text size is limited by the request method. Everything else seems to work just fine. I am also about to post this on ...

python web-scraping

asked Sep 14 '16 at 9:06

mou

1885

1

vote

0 answers

241 views

Google Searching Bot with Proxy support

I have been asked by a client to program a bot which searches Google and shows how many results I get. Note: I know about the Google Custom Search API and it will not produce the exact output ...

python web-scraping proxy

asked Sep 12 '16 at 14:24

VISWESWARAN1998

2410

0

votes

2 answers

115 views

Movie data scraping

I enter the IMDb link and YouTube trailer link for a movie on the command line, and the first main program loads all the info about the movie. The second main program uses an IMDb link to the movie ...

php web-scraping curl

asked Sep 6 '16 at 0:21

user3736114

1

3

votes

0 answers

26 views

VIM colors downloader in Python, using multiprocessing

I recently posted this script: VIM colors downloader in Python. But since I'm not allowed to update the code there, I wanted to get an idea on this version, that uses multiprocessing: ...

python python-3.x web-scraping multiprocessing

asked Sep 1 '16 at 12:12

Kernel.Panic

885

8

votes

2 answers

160 views

Finding words that rhyme

Preface: I was trying to review this question on the same topic, but in the end many points I wanted to make were excellently explained by @ferada so I felt that posting my code and explaining the ...

python strings python-2.7 web-scraping

asked Aug 30 '16 at 21:50

Caridorc

21.9k32993

14

votes

2 answers

731 views

VIM colors downloader in Python

Recently, I wanted to change my vim colors to something new. So I went to the vim colors website and then I decided that I wanted to download ALL the colors. So I ...

python python-3.x web-scraping beautiful-soup network-file-transfer

asked Aug 30 '16 at 14:04

Kernel.Panic

885

3

votes

1 answer

221 views

Python Document Downloader

This is a python document (PDF) downloader I made to download some question papers automatically. However, it is very slow. Any better way to do this? The code: ...

python web-scraping

asked Aug 26 '16 at 6:08

Shantanu Bedajna

1256

9

votes

2 answers

961 views

Simple Python username scraper

I started learning Python recently and I really like it, so I decided to share one of my first projects mainly in hopes of someone telling me what I can do to make it run faster (threading/...

python web-scraping beautiful-soup

asked Aug 26 '16 at 0:15

edsheeran

655

5

votes

2 answers

294 views

Multithreaded webcrawler

I've been trying to learn Java for the last day or two. This is the first project I am working on, so please bear with me. I worked on a multithreaded web crawler. It is fairly simple but I'd like to ...

java beginner multithreading comparative-review web-scraping

asked Aug 20 '16 at 14:37

Jan

262

-1

votes

1 answer

144 views

Selenium Kijiji web scraper

I have this script working pretty well but I know that there must be many things that I could do better to make it more efficient. ...

python python-2.7 web-scraping selenium

asked Aug 16 '16 at 20:06

gtdwat

61

5

votes

1 answer

87 views

Python Politico API attempt

I love politics, and I love programming, so I figured why not try and combine the two for something to do? I'm making a work-in-progress (but runnable at this stage) Politico API that I call "...

python api web-scraping

asked Aug 15 '16 at 14:33

n1c9

23016
