Newest 'web-scraping' Questions

4

votes

2answers

61 views

Scrape 4chan for alive images

I've been trying to learn Clojure recently, and I thought writing a simple web app would be a good way to dive in. This function gets the list of alive threads from the API and reduces, filters and maps ...

clojure web-scraping

asked Nov 13 at 13:51

user114084

233

0

votes

3answers

89 views

Image hosting site image downloader using requests and BeautifulSoup

I went about this the way I have because Selenium is slow and inconvenient (opens browser) and I was unable to find the href attribute in the link element using ...

asked Nov 10 at 16:09

Michael Johnson

1627

2

votes

1answer

59 views

Scrape an infinite-scroll page

My algorithm scrapes an infinite-scroll page but it takes too long. It scrolls three times but I'm wondering if there is a way to do a ScrollBottom() so no need of ...

javascript node.js web-scraping

asked Nov 9 at 18:59

tribet

296110

4

votes

2answers

832 views

Simple Python job vacancies downloader

I have created a BeautifulSoup vacancies parser, which works, but I do not like how it looks. Therefore, I'd be very happy if somebody could give me any improvements. ...

python web-scraping beautiful-soup

asked Nov 7 at 12:48

Vadim Kuznetsov

333

2

votes

1answer

44 views

I navigate with IE, do various things, then select all results option from a list and fire on click event. Once all results have been listed, I loop through their URLs, using the following code to ...

performance vba web-scraping

asked Oct 29 at 9:06

Ryszard Jędraszyk

1273

0

votes

0answers

47 views

Scraping websites and saving to MySQL

I have the following piece of code which scrapes websites and saves some information back to MySQL. At the moment is consuming all the memory on my machine every time it runs. I've refactored the ...

javascript mysql node.js web-scraping memory-optimization

asked Oct 11 at 11:57

tribet

296110

4

votes

1answer

97 views

Web scraping VBA - Internet Explorer

The code below extracts data from one web page - I emulate search, select all results from the list and when the list appears (42000 items) I loop through these items. I get an id value from their ...

vba web-scraping internet-explorer

asked Oct 2 at 0:39

Ryszard Jędraszyk

1273

1

vote

1answer

107 views

Basic web scrape project written in NodeJS

Here is a short web scraping program written in Node.js. I'm just getting to grips with Node and this is the first thing I've written with it. I'm liking it so far though I guess I'm kinda ...

javascript node.js web-scraping promise

asked Sep 26 at 5:26

bloppit

112

1

vote

0answers

51 views

GUI in Tkinter to log events for a web-scraper

I'm creating a GUI with tkinter that will handle starting/stopping/and logging events for a web-scraper (scraper not created yet). The current code is working... but I've been gathering my ...

python web-scraping gui tkinter

asked Sep 21 at 20:36

DjH

61

2

votes

1answer

77 views

Sample scraping Project Gutenberg using Beautiful Soup and requests

I am trying to learn web scraping in Python using Beautiful Soup and requests. My program goes to the book page on Project Gutenberg with the given book number (Example). It then finds the link for ...

python beginner python-3.x web-scraping beautiful-soup

asked Sep 20 at 10:14

syed saad

304

2

votes

1answer

55 views

Downloading and saving news articles

To be honest, I am pretty new to coding. I want to analyse the articles my local newspaper publishes on their website. To do that, I have a list of URLs and from that, I want to download and save all ...

python performance python-3.x web-scraping beautiful-soup

asked Sep 17 at 16:26

gempie

112

4

votes

1answer

41 views

Haikuifier (Or at least Haiku Identifier)

All the usual stuff. Style, substance, algorithm please! I'm vaguely considering plugging it into a bot, hence the lacklustre catching of exceptions right now. ...

python python-3.x web-scraping natural-language-proc

asked Sep 14 at 12:59

Yann

1,6861734

2

votes

1answer

83 views

Translating text using Google Translate mobile site

I have this code to translate text using the Google Translate mobile site. Currently, text size is limited by the request method. Everything else seems to work just fine. I am also about to post this on ...

python web-scraping

asked Sep 14 at 9:06

mou

1535

1

vote

0answers

86 views

Google Searching Bot with Proxy support

I have been asked by a client to program a bot which searches Google and shows how many results I get. Note: I know about the Google Custom Search API and it will not produce the exact output ...

python web-scraping proxy

asked Sep 12 at 14:24

VISWESWARAN1998

1910

0

votes

2answers

94 views

Movie data scraping

I enter the IMDb link and YouTube trailer link for a movie on the command line, and the first main program loads all the info about the movie. The second main program uses an IMDb link to the movie ...

php web-scraping curl

asked Sep 6 at 0:21

user3736114

1

3

votes

0answers

25 views

VIM colors downloader in Python, using multiprocessing

I recently posted this script: VIM colors downloader in Python But since I'm not allowed to update the code there, I wanted to get an idea on this version, that uses multiprocessing: ...

python python-3.x web-scraping multiprocessing

asked Sep 1 at 12:12

Kernel.Panic

885

8

votes

2answers

77 views

Finding words that rhyme

Preface I was trying to review this question on the same topic, but in the end many points I wanted to make were excellently explained by @ferada so I felt that posting my code and explaining the ...

python strings python-2.7 web-scraping

asked Aug 30 at 21:50

Caridorc

21.1k32786

14

votes

2answers

721 views

VIM colors downloader in Python

Recently, I wanted to change my vim colors to something new. So I went to the vim colors website and then I decided that I wanted to download ALL the colors. So I ...

python python-3.x web-scraping beautiful-soup network-file-transfer

asked Aug 30 at 14:04

Kernel.Panic

885

3

votes

1answer

127 views

Python Document Downloader

This is a Python document (PDF) downloader I made to download some question papers automatically. However, it is very slow. Is there a better way to do this? The code: ...

python web-scraping

asked Aug 26 at 6:08

Shantanu Bedajna

1256

5

votes

2answers

138 views

Multithreaded webcrawler

I've been trying to learn Java for the last day or two. This is the first project I am working on, so please bear with me. I worked on a multithreaded web crawler. It is fairly simple but I'd like to ...

java beginner multithreading comparative-review web-scraping

asked Aug 20 at 14:37

Jan

262

-1

votes

1answer

99 views

Selenium Kijiji web scraper

I have this script working pretty well but I know that there must be many things that I could do better to make it more efficient. ...

python python-2.7 web-scraping selenium

asked Aug 16 at 20:06

gtdwat

61

5

votes

1answer

60 views

Python Politico API attempt

I love politics, and I love programming, so I figured why not try and combine the two for something to do? I'm making a work-in-progress (but runnable at this stage) Politico api that I call "...

python api web-scraping

asked Aug 15 at 14:33

n1c9

22516

1

vote

0answers

19 views

Wikipath stack in Java - Part II/IV - The implicit Wikipedia article graph

This question is the continuation of the Wikipath stack series: the two classes that - given a Wikipedia article \$A\$ - return the lists of neighbour articles. The forward node expander returns the ...

java json graph web-scraping pathfinding

asked Aug 10 at 15:20

coderodde

9,29621052

0

votes

1answer

64 views

Retrieving remote pages and parsing html

This is my code: ...

performance php web-scraping xpath

asked Aug 5 at 13:16

Lior

1342

1

vote

1answer

392 views

Downloading yahoo finance stock historical data as CSV using C++

This post is a continuation of my previous post where I used jsoncpp package to fetch exchange rates from fixer.io. In this post I have reused the above code and used it to fetch stock historical ...

c++ json finance web-scraping

asked Jul 23 at 16:14

Eka

1876

3

votes

1answer

67 views

Fetching specific foreign exchange rates from fixer using curl and jsoncpp in C++

I am trying to create my own algorithmic trading system using C++. I have searched the web for a nice tutorial on such systems and I didn't find any. Then I started to learn about ...

c++ json web-scraping

asked Jul 23 at 6:19

Eka

1876

2

votes

1answer

60 views

Optimizing Java HTML parser

I wrote a program that goes through a webpage and returns matches of regex. I used it on my letterboxd.com account to go through all of my movies (over 900 entries) and then find genres field for each ...

java performance html parsing web-scraping

asked Jul 22 at 17:55

Martin Lukáš

111

6

votes

2answers

97 views

Crawl site getting URL and status code

I wrote a crawler that collects the status code for every page visited. Below is my solution. Is this code optimizable? ...

python http web-scraping

asked Jul 22 at 7:32

Sam

1334

5

votes

1answer

174 views

Session handling using Python Requests client

I'm using this code to login to an experiment login system created by me for this purpose. ...

python python-3.x http web-scraping session

asked Jul 21 at 16:51

Miguel

987

4

votes

1answer

93 views

Web scraping VBA and VB Script

I am working on a project on VBA where the objective is to have a "program" that fetches rates from a website called X-Rates, and outputs to excel the monthly averages of a chosen country. Initially ...

vba web-scraping vbscript

asked Jul 21 at 15:42

svacxpython

775

7

votes

0answers

61 views

Regex-guided crawler that downloads regex-matching images up to a crawling level

This is a simple crawler that downloads images from websites. The URL of each website to be crawled must match the regex, as must the URL of any image to download. (Also, I know, I made my own thread ...

python regex web-scraping

asked Jul 2 at 21:16

Gustavo6046

362111

6

votes

1answer

92 views

Simple image scraping

I wrote this code over the last few days and I learned a lot, and it works as I expect. However I suspect it's woefully inefficient: ...

python beginner web-scraping beautiful-soup

asked Jul 1 at 5:50

Lachy Vass

333

10

votes

1answer

586 views

Scraping after login using Scrapy

I just finished a scraper in Python using Scrapy. The scraper logs in to a certain page and then scrapes a list of other pages using the authenticated session. It retrieves the title of these pages ...

python web-scraping logging scrapy

asked Jun 30 at 7:13

OutOfTheBox

537

6

votes

1answer

42 views

Script to download sequentially named files, rename them, and delete smaller files

I've written a little script to download sequentially named files, rename them, and delete files smaller than a certain number of kilobytes. I came up with this but I'm not too happy. Any advice for ...

bash http web-scraping network-file-transfer

asked Jun 24 at 15:48

BashN00b

311

2

votes

1answer

304 views

PHP web crawler

I'm working on a "nice" crawler that starts with one URL and finds the other URLs to process each page, a kind of "Google" crawler, to index pages. I worked hard on this crawler to respect many points ...

performance php parsing web-scraping

asked Jun 24 at 14:40

Cyril N.

11811

4

votes

2answers

64 views

I'll visit the 18th

I wrote this program, whose purpose is to visit the 18th link on the list of links and then on the new page visit the 18th link again. This program works as intended, but it's a little repetitive and ...

python web-scraping beautiful-soup

asked Jun 17 at 15:29

Ekaterina1234

1213

4

votes

1answer

93 views

Recursive Web Crawler in Go

This is probably my third Go application. It essentially takes one or two command line arguments of wikipedia articles and pulls every /wiki/ link that isn't a special page, memoizes them to avoid ...

beginner go web-scraping

asked Jun 10 at 0:29

Keozon

1215

0

votes

0answers

50 views

Finding shortest paths in a Wikipedia article graph using Java - second attempt

I have improved Finding shortest paths in a Wikipedia article graph using Java. Now I have this: AbstractWikipediaShortestPathFinder.java: ...

java algorithm graph web-scraping pathfinding

asked Jun 7 at 14:27

coderodde

9,29621052

9

votes

3answers

227 views

Scraping the date of most recent post from various social media services

Task I have a large spreadsheet where each line should include: The URL of a social media account A field indicating whether the account is "active" A name and UID number for each account I have to ...

python beginner web-scraping beautiful-soup selenium

asked Jun 1 at 23:40

A_S00

1488

4

votes

0answers

88 views

Finding shortest paths in a Wikipedia article graph using Java

(See also Finding shortest paths in a Wikipedia article graph using Java - second attempt.) I have this sort of a web crawler that asks for two (English) Wikipedia article titles (the source and the ...

java algorithm graph web-scraping wikipedia

asked May 29 at 6:51

coderodde

9,29621052

10

votes

1answer

390 views

The YouTube crawler

I have coded a program to scrape YouTube data (for educational purposes). When the link of the channel is entered it scrapes the channel name, description of the channel, the videos posted by the ...

python python-3.x web-scraping tkinter youtube

asked May 21 at 10:51

VISWESWARAN1998

1910

1

vote

2answers

60 views

Crawling SPOJ through cURL and C++

I am trying to write industry standard code. https://www.quora.com/How-do-I-follow-a-user-on-Spoj-for-solving-problems-Refer-Details Someone gave me this A2A. And I wrote this code for it ...

c++ c++11 bash web-scraping curl

asked May 20 at 12:13

Dhruv Sehgal

756

4

votes

1answer

77 views

Get a stock quote from a web page using F#

I want to extract a price for a single index fund, the price of which is available on dynamic web pages. Being new to this, my original idea was to download the single page of static HTML and get ...

f# finance web-scraping

asked May 18 at 9:42

Jack Chidley

414

1

vote

0answers

180 views

Scrape google apps page and store application details in database

Below is a Python script which scrapes a specific Google apps URL, for example https://play.google.com/store/apps/details?id=com.mojang.minecraftpe, and saves the ...

python mysql python-2.7 web-scraping xpath

asked May 18 at 5:02

satch_boogie

1946

4

votes

2answers

567 views

Web crawler in F#

I have been writing a web crawler in F# that downloads pages with stylesheets and scripts. Can somebody give me suggestions on improving this code, please? Would appreciate any feedback that could ...

.net f# web-scraping

asked May 17 at 13:03

Diy

212

2

votes

1answer

53 views

Web scraping with VBA

I have this code that fetches rates from a website called X-Rates, and outputs to Excel the monthly averages of a chosen country. The code runs quite fast, but I still think there are improvements to ...

vba excel web-scraping

asked May 17 at 8:45

svacxpython

775

8

votes

2answers

914 views

Fast(er) web scraping with VBA

I have code that fetches rates from a website called X-Rates, and outputs to Excel the monthly averages of a chosen country. The code runs quite fast, but I still think I could improve the code a ...

vba excel web-scraping

asked May 12 at 11:39

svacxpython

775

0

votes

0answers

45 views

SQL vulnerability pentesting tool

I've written a program that scrapes websites for SQL vulnerability (it does not exploit websites, it just provides a list of possible exploitable sites). I would like some critique on what I've done. ...

ruby web-scraping

asked May 2 at 1:37

13aal

16124

6

votes

2answers

85 views

Steam Community Market Strange Part Scraper

A while back I wrote a simple little PHP script that searches the Steam community market for any TF2 strange weapons with strange parts on the first page of results for that weapon type. It works by ...

php web-scraping

asked Apr 30 at 2:14

quartata

13317

-1

votes

1answer

36 views

Skipping Google search results that point to certain sites

I have the following code that will skip certain URLs if needed: ...

ruby web-scraping url

asked Apr 7 at 18:02

13aal

16124
