Web scraping is the use of a program to simulate human interaction with a web server or to extract specific information from a web page.


1 vote · 0 answers · 35 views

Scraping the date of most recent post from various social media services

I have a large spreadsheet (csvToRead in my code). Each line includes (among other things I don't care about): the URL of a social media account (not always ...

2 votes · 0 answers · 30 views

Finding shortest paths in a Wikipedia article graph using Java

I have this sort of a web crawler that asks for two (English) Wikipedia article titles (the source and the target), and proceeds to compute the shortest path between the two. My code is as follows: ...
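
The shortest-path search this question describes is, at heart, a breadth-first search over the article link graph. A minimal sketch in Python (not the asker's Java code), using a toy `graph` and an injected `get_links` function as stand-ins for real Wikipedia requests:

```python
from collections import deque

def shortest_path(source, target, get_links):
    """Breadth-first search from source to target.

    get_links(title) must return the titles linked from a page;
    it is injected here so the search logic stays testable offline.
    """
    if source == target:
        return [source]
    queue = deque([source])
    parent = {source: None}
    while queue:
        page = queue.popleft()
        for link in get_links(page):
            if link in parent:
                continue  # already visited
            parent[link] = page
            if link == target:
                # walk the parent chain back to the source
                path = [link]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(link)
    return None  # target unreachable

# toy link graph standing in for live Wikipedia requests
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(shortest_path("A", "D", lambda t: graph.get(t, [])))  # ['A', 'B', 'D']
```

Because links are followed level by level, the first time the target is reached the path is guaranteed to be shortest in number of hops.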

7 votes · 0 answers · 89 views

The YouTube crawler

I have coded a program to scrape YouTube data (for educational purposes). When the link of a channel is entered, it scrapes the channel name, the description of the channel, the videos posted by the ...

1 vote · 2 answers · 42 views

Crawling SPOJ through cURL and C++

I am trying to write industry-standard code. https://www.quora.com/How-do-I-follow-a-user-on-Spoj-for-solving-problems-Refer-Details Someone gave me this A2A, and I wrote this code for it ...

2 votes · 1 answer · 38 views

Get data from a web page, with client code, using F#

I want to extract a price for a single index fund, the price of which is available on dynamic web pages. Being new to this, my original idea was to download the single page of static HTML and get ...

1 vote · 0 answers · 24 views

Scrape Google Play apps page and store application details in a database

Below is a Python script which scrapes a specific Google Play apps URL, for example https://play.google.com/store/apps/details?id=com.mojang.minecraftpe, and saves the ...

3 votes · 2 answers · 433 views

Web crawler in F#

I have been writing a web crawler in F# that downloads pages with stylesheets and scripts. Can somebody give me suggestions on improving this code, please? Would appreciate any feedback that could ...

2 votes · 1 answer · 32 views

Web scraping with VBA

I have this code that fetches rates from a website called X-Rates and outputs to Excel the monthly averages of a chosen country. The code runs quite fast, but I still think there are improvements to ...

6 votes · 2 answers · 193 views

Fast(er) web scraping with VBA

I have code that fetches rates from a website called X-Rates and outputs to Excel the monthly averages of a chosen country. The code runs quite fast, but I still think I could improve the code a ...

0 votes · 0 answers · 15 views

SQL vulnerability pentesting tool written in Ruby

I've written a program that scrapes websites for SQL vulnerabilities (it only searches; it does not exploit). I would like some critique on what I've done. Is there a way I can write the ...

5 votes · 2 answers · 65 views

Steam Community Market Strange Part Scraper

A while back I wrote a simple little PHP script that searches the Steam community market for any TF2 strange weapons with strange parts on the first page of results for that weapon type. It works by ...

6 votes · 0 answers · 77 views

Node.js parallel file download, the ES6 way

I wrote a script that downloads all PDFs found on the web page of a particular government agency. I would have chosen bash for such a task, but I want the script to ...

0 votes · 0 answers · 36 views

Web-scraping library

Here are two functions that make a request for a given URL, take the response body (HTML), and load it into the cheerio library: scrapeListing.js ...

2 votes · 2 answers · 66 views

Java web scraping robots

I am developing an application that goes through 2 websites and gets all the articles, but my code is identical in most parts. Is there a way to optimize this code? (TL and DN are the naming ...

2 votes · 1 answer · 36 views

Wikipedia indexer and shortest link finder

I have the following code, how can I make it more efficient? Also, it doesn't always find the shortest route. (See Cat -> Tree) ...

0 votes · 0 answers · 69 views

Web crawler with Python and the asyncio library

I am trying to experiment with Python 3.5 async/await and the whole asyncio library. I tried ...

3 votes · 2 answers · 77 views

Parsing HTML from multiple webpages simultaneously

My friend wrote a scraper in Go that takes the results from a house listing webpage and finds listings for houses that he's interested in. The initial search returns listings, they are filtered by ...
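
Fetching several pages at once, as this scraper does, maps naturally onto a thread pool. A sketch in Python (not the asker's Go) using `concurrent.futures`, with a placeholder `fetch` standing in for real HTTP requests:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder fetch: a real version would use urllib.request or a
    # similar HTTP client; returning a fake body keeps the sketch
    # runnable offline.
    return "<html>%s</html>" % url

def fetch_all(urls, workers=8):
    """Fetch several pages concurrently; results come back in the
    same order as urls, even though requests overlap in time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))

pages = fetch_all(["http://example.com/1", "http://example.com/2"])
```

Threads suit this workload because each worker spends most of its time blocked on network I/O rather than computing.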

0 votes · 1 answer · 121 views

CRAP index (56) in Web Scraper Engine

I am working on a Web Scraper for the first time using Test-Driven Development; however, I have ended up with a huge CRAP (Change Risk Anti-Patterns) index (56) and I cannot seem to find a ...

4 votes · 0 answers · 70 views

Web scraping with Nokogiri

At work we need to know which printers are running dangerously low on toner, paper, etc. So I've created a program that pulls the printer information off the websites the ...

0 votes · 0 answers · 40 views

Email a notification when detecting changes on a website - follow-up

I read through other questions here and improved the code and added a new feature. The old question can be found at: Email a notification when detecting changes on a website The improvements that are ...

19 votes · 5 answers · 960 views

An OEIS Lookup tool in Python

I'm from PPCG so I was making an esolang and I decided to write it in Python. Eventually it went from an esolang to an OEIS (Online Encyclopedia of Integer Sequences) lookup tool. I'm very new to ...

0 votes · 0 answers · 25 views

Extracting articles mentioned in comments in three GET requests

I need to make several HTTP GET requests and do the following: when all of them have completed, I need to parse each HTML response; after HTML parsing, or in case of any error, I need to set a flag ...

5 votes · 3 answers · 281 views

Email a notification when detecting changes on a website

The text of a website is checked at a given time interval. If there are any changes, a mail is sent. There is an option to show/mail the new parts of the website. What could be improved? ...
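
The core of such a change detector is comparing a stored fingerprint of the page text against a fresh one. A minimal sketch of the hashing step only; a real version would fetch the page over HTTP and send mail via `smtplib`:

```python
import hashlib

def fingerprint(text):
    """Stable digest of a page's text; store it between runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def has_changed(old_digest, new_text):
    """True when the page text no longer matches the stored digest."""
    return fingerprint(new_text) != old_digest

stored = fingerprint("price: 10 EUR")
print(has_changed(stored, "price: 10 EUR"))  # False
print(has_changed(stored, "price: 12 EUR"))  # True
```

Storing only the digest (rather than the full page) keeps state small, at the cost of not being able to show *which* parts changed; showing the new parts requires keeping the previous text and diffing.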

5 votes · 1 answer · 104 views

Newest Reddit submissions grabber

My program does exactly what I want it to do and it works well. However, I feel like it's very clunky. I'd like my code to be more efficient. By that I mean, I'd like it to accomplish what it already ...

2 votes · 2 answers · 86 views

PHP crawler to collect comments on articles

I have code that parses through web pages, finds comments, and saves the comment info in a DB. I have an array where all the necessary pages are stored. I iterate through all these pages one by one and ...

1 vote · 0 answers · 38 views

Checking paginated website for new entries

I'm interested in determining the best way to check a paginated website for new entries. I want to be able to scrape pages 1, 2, 3, ... as necessary to get all updates. However, the scraping is fairly ...
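
One common strategy for this problem is to walk pages in order, collecting entries until the newest already-seen entry appears, so only as many pages are scraped as necessary. A sketch with a hypothetical injected `fetch_page` that returns entry ids, keeping the stopping logic testable offline:

```python
def new_entries(fetch_page, last_seen_id):
    """Walk pages 1, 2, 3, ... collecting entries until we hit the
    newest entry we already know about, or run out of pages."""
    found, page = [], 1
    while True:
        entries = fetch_page(page)  # injected: newest entries first
        if not entries:
            return found  # no more pages
        for entry in entries:
            if entry == last_seen_id:
                return found  # everything older is already known
            found.append(entry)
        page += 1

# fake paginated site: newest entries first, three per page
site = {1: [9, 8, 7], 2: [6, 5, 4]}
print(new_entries(lambda p: site.get(p, []), 6))  # [9, 8, 7]
```

After each run, the caller would store the first id it collected as the new `last_seen_id` for the next check.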

3 votes · 0 answers · 69 views

Scraping links from the first page of Google using Kivy

I'm making a scraper/web crawler using Kivy. When I run the code it works, but I'm not sure whether what I'm doing is Pythonic, because all the documentation I can find is about using the Kivy library. I'm unsure ...

3 votes · 1 answer · 69 views

Factory pattern in F# for a web scraper

I'm trying to learn F# by creating a little web scraper that will do custom scraping based on the URL domain. For this, I need to create and select the correct kind of scraper. I figured I would use a ...

0 votes · 1 answer · 46 views

Implementation of bridge design pattern for a web scraping app - follow-up

Earlier today I tried to implement an example of the bridge design pattern, but I ended up misinterpreting it. I made a lot of changes: ...

0 votes · 1 answer · 53 views

Implementation of Bridge Design Pattern

I made an implementation of the Bridge pattern to handle the ever-changing crawler APIs that I'm using in my app. ...

2 votes · 0 answers · 33 views

HTML Scraper for Plex downloads page

I have written a scraper in Python 3 using Beautiful Soup 4 to retrieve the latest version of Plex Media Server from https://plex.tv, and I'd like some feedback on how to improve it. The HTML the ...

4 votes · 2 answers · 200 views

Page Scraper and DOM manipulator

This code is a page scraper using HtmlAgilityPack that creates a DOM document upon construction and allows for node manipulation afterward. HtmlAgilityPack uses XPath Selectors for selecting nodes. ...

6 votes · 1 answer · 38 views

Helper functions to extract SEDE query results into more user-friendly format

This Python module contains helper functions to download the result page of SEDE queries and extract columns from it, most prominently: ...

14 votes · 2 answers · 168 views

OLog Userscript - Logging messages, planets and researches

For the online text-based browser game OGame, I am working on an application whose aim is to assist the users where possible. For this I have a server-side part and a client-side part, the respective ...

4 votes · 1 answer · 115 views

Web crawler that charts stock ticker data using matplotlib

I've built a web crawler using the BeautifulSoup library that pulls stock ticker data from CSV files on Yahoo finance, and charts the data using ...

14 votes · 2 answers · 117 views

Getting to Wikipedia's “Philosophy” article using Python

On Wikipedia, if you click the first non-italicised internal link in the main text of an article that's not within parentheses, and then repeat the process, you usually end up on the "Philosophy" ...
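
The "first link" walk described above is a simple iteration with cycle detection: follow one link per page, stop at the target, a dead end, a repeat, or a hop limit. A sketch using a toy link map in place of parsing live Wikipedia pages:

```python
def chain_to(start, target, first_link, limit=100):
    """Follow first_link(title) repeatedly until target, a dead end,
    a cycle, or the hop limit is reached; returns the visited chain."""
    chain, seen = [start], {start}
    while chain[-1] != target and len(chain) < limit:
        nxt = first_link(chain[-1])
        if nxt is None or nxt in seen:
            break  # dead end or loop
        chain.append(nxt)
        seen.add(nxt)
    return chain

# toy first-link map standing in for parsed Wikipedia pages
links = {"Cat": "Animal", "Animal": "Science", "Science": "Philosophy"}
print(chain_to("Cat", "Philosophy", links.get))
# ['Cat', 'Animal', 'Science', 'Philosophy']
```

The `seen` set is what prevents the crawler from spinning forever when two articles' first links point at each other.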

6 votes · 1 answer · 109 views

Java batch movie downloader

The idea is to batch download a list of movies (torrents) off a torrent site and add them to your server. I have a little bit of Java experience (sophomore in college), so I'm looking for things that ...

6 votes · 1 answer · 34 views

Scraping SEDE query results with caching

I use this script to scrape the results of a SEDE page and return as a BeautifulSoup object. A small twist is that if I don't use a SEDE query manually in the browser for a few days, then ...

6 votes · 2 answers · 42 views

Scraping columns from SEDE results

I use the following script to download the result of a SEDE query and scrape a specific column from it using BeautifulSoup: ...

9 votes · 2 answers · 232 views

Web Scraping with VBA

I wrote this to scrape album review data from AOTY into a spreadsheet. Check it out and let me know what I could've done better. ...

6 votes · 2 answers · 130 views

Web Scraper in Python

So, this is my first web scraper (or part of it at least), and I'm looking for things that I may have done wrong, or things that could be improved, so I can learn from my mistakes. I made a few short ...
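
For a first scraper, the standard library alone can already pull structure out of HTML. A sketch using `html.parser` to collect link targets; the HTML snippet fed to it here is purely illustrative:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

parser = LinkExtractor()
parser.feed('<p><a href="/one">1</a> <a href="/two">2</a></p>')
print(parser.links)  # ['/one', '/two']
```

Third-party parsers such as Beautiful Soup are more forgiving of malformed markup, but this shows the core idea without any dependencies.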

2 votes · 1 answer · 418 views

Multithreaded Webcrawler in Java

I am working on a multi-threaded webcrawling program in Java. Each WebCrawler starts at a root page and repeatedly extracts new links and writes them to a database. ...

2 votes · 0 answers · 184 views

Movie torrent-site web scraper with IMDb info and streaming

I'm completely new to JavaScript and Node.js and functional programming in general. The code below scrapes a torrent website containing movies, gets info about each movie from the OMDb API, and lets a ...

3 votes · 2 answers · 74 views

Program to retrieve key/message from a multiple times used one time pad

I wrote a program to retrieve the key/messages from 10 different ciphertexts which were all encrypted with the same key using an XOR one-time-pad method, via crib dragging. To do this, I wrote a Python ...
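
The attack this question relies on works because XOR-ing two ciphertexts encrypted under the same pad cancels the key, leaving only the XOR of the two plaintexts, which crib dragging then exploits. A small demonstration; the key and messages here are made up:

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

key = b"\x13\x37\x42\x99\x01"          # illustrative 5-byte pad
c1 = xor_bytes(b"hello", key)          # first message, encrypted
c2 = xor_bytes(b"world", key)          # second message, same pad (the mistake)

# Reusing the pad means c1 XOR c2 == m1 XOR m2: the key cancels out.
assert xor_bytes(c1, c2) == xor_bytes(b"hello", b"world")
```

With ten ciphertexts, guessing a plausible word ("crib") in one message and XOR-ing it in immediately reveals the aligned fragment of another, which is exactly what crib dragging automates.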

6 votes · 3 answers · 238 views

IP and router connections

How can I make my code more Pythonic? I definitely think there is a way to make this code a lot more readable, clear, and shorter, but I haven't found an effective way. Any techniques I can use to ...

1 vote · 2 answers · 346 views

Scrapy spider for products on a site

I recently submitted a code sample for a web scraping project and was rejected without feedback as to what they didn't like. The prompt, while I cannot give it here verbatim, basically stated that I ...

5 votes · 1 answer · 79 views

Mixed scripting language API to determine file upload location and scrape city government website to find corresponding government official

I wrote this script as an NYC-specific-API for file upload for a mobile app. Users upload a video file and also their geographic coordinates. I then use an external API to get the corresponding ...

4 votes · 3 answers · 629 views

Multithreaded web scraper with proxy and user agent switching

I am trying to improve the performance of my scraper and plug up any possible security leaks (identifying information being revealed). Ideally, I would like to achieve a performance of 10 pages per ...
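
User-agent and proxy switching of the kind described here is often just round-robin rotation over two pools. A sketch with `itertools.cycle`; the pool contents are illustrative placeholders, not real proxies:

```python
from itertools import cycle

# illustrative pools; a real scraper would load these from config
USER_AGENTS = cycle([
    "Mozilla/5.0 (X11; Linux x86_64)",
    "Mozilla/5.0 (Windows NT 10.0)",
])
PROXIES = cycle(["http://proxy-a:8080", "http://proxy-b:8080"])

def next_identity():
    """Return the next (headers, proxy) pair in round-robin order,
    so consecutive requests do not present the same fingerprint."""
    return {"User-Agent": next(USER_AGENTS)}, next(PROXIES)

headers, proxy = next_identity()  # first identity in the rotation
```

Each request would then pass `headers` and `proxy` to the HTTP client; combining rotation with per-thread sessions is what lets a multithreaded scraper spread its requests across identities.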

2 votes · 0 answers · 351 views

A web crawler for scraping images from stock photo websites

I created a web crawler that uses Beautiful Soup to crawl images from a website and scrape them to a database. In order to use it, you have to create a class that inherits from Crawler and implements 4 ...

3 votes · 1 answer · 125 views

Web crawlers for three image sites

I'm very new to Python and only vaguely remember OOP from doing some Java a few years ago, so I don't know what the best way to do this is. I've built a bunch of classes that represent a crawler that ...