Newest 'web-scraping' Questions

0

votes

0 answers

36 views

Whitewidow, SQL vulnerability web scraper

I believe it was yesterday I posted a program here that scrapes Google for SQL-vulnerable web pages. I never got an answer, so I just went all out and made it look a little prettier; it still only ...

asked Apr 6 at 4:50

13aal

1435

3

votes

0 answers

43 views

SQL vulnerability web scraper

I've created a program that will be used for pentesting. It scrapes the first page of Google out of an array of searches. It then attempts to find an SQL error within the site by adding an apostrophe ...

ruby web-scraping

asked Apr 5 at 1:55

13aal

1435

5

votes

0 answers

49 views

Node.js parallel file download, the ES6 way

I wrote a script that downloads all PDFs found on the web page of a particular government agency. I would have chosen bash for such a task, but I want the script to ...

node.js asynchronous http web-scraping ecmascript-6

asked Mar 14 at 10:48

Nicolas Raoul

1263

0

votes

0 answers

32 views

Web-scraping library

Here are two functions that make a request for a given URL and then take the response body (HTML) and load it into the cheerio library: scrapeListing.js ...

javascript oop node.js web-scraping ecmascript-6

asked Mar 14 at 0:19

therewillbecode

11

2

votes

2 answers

55 views

Java web scraping robots

I am developing an application that goes through 2 websites and gets all the articles, but my code is identical in most parts. Is there a way to optimize this code? (TL and DN are the naming ...

java performance web-scraping

asked Mar 7 at 7:33

imoteb

455

2

votes

1 answer

33 views

Wikipedia indexer and shortest link finder

I have the following code. How can I make it more efficient? Also, it doesn't always find the shortest route. (See Cat -> Tree) ...

python regex web-scraping

asked Mar 1 at 19:39

Daniel Gee

161

0

votes

0 answers

47 views

Web crawler with Python and the asyncio library

I am trying to experiment with Python 3.5 async/await and the whole asyncio library. I tried ...

python python-3.x asynchronous web-scraping async-await

asked Mar 1 at 15:51

meto

229129

3

votes

2 answers

75 views

Parsing HTML from multiple webpages simultaneously

My friend wrote a scraper in Go that takes the results from a house listing webpage and finds listings for houses that he's interested in. The initial search returns listings, they are filtered by ...

multithreading regex web-scraping rust

asked Feb 27 at 13:10

Explosion Pills

1185

0

votes

1 answer

118 views

CRAP index (56) in Web Scraper Engine

I am working on a Web Scraper for the first time using Test Driven Development; however, I have ended up with a huge CRAP (Change Risk Anti-Patterns) index (56) and I cannot seem to find a ...

php unit-testing web-scraping

asked Feb 26 at 17:26

GiamPy

696

4

votes

0 answers

61 views

Web scraping with Nokogiri

At work we need to know which printers are getting dangerously low on toner, what their paper consumption is, etc. So I've created a program that pulls the printer information off the websites the ...

ruby web-scraping yaml

asked Feb 17 at 12:29

user97942

212

0

votes

0 answers

27 views

Email a notification when detecting changes on a website - follow-up

I read through other questions here, improved the code, and added a new feature. The old question can be found at: Email a notification when detecting changes on a website. The improvements that are ...

python python-3.x email web-scraping

asked Feb 16 at 18:07

questionanswer

284

19

votes

5 answers

930 views

An OEIS Lookup tool in Python

I'm from PPCG so I was making an esolang and I decided to write it in Python. Eventually it went from an esolang to an OEIS (Online Encyclopedia of Integer Sequences) lookup tool. I'm very new to ...

python beginner parsing web-scraping beautiful-soup

asked Feb 16 at 1:25

Downgoat

535113

0

votes

0 answers

25 views

Extracting articles mentioned in comments in three GET requests

I need to make several HTTP GET requests and do the following: when all of them have completed, I need to parse each HTML response; after HTML parsing, or in case of any error, I need to set flag ...

javascript jquery ajax http web-scraping

asked Feb 15 at 10:19

FrozenHeart

1012

5

votes

3 answers

250 views

Email a notification when detecting changes on a website

The text of a website is checked at a given time interval. If there are any changes, an email is sent. There is an option to show/mail the new parts of the website. What could be improved? ...

python python-3.x email web-scraping

asked Feb 14 at 11:24

questionanswer

284

5

votes

1 answer

94 views

Newest Reddit submissions grabber

My program does exactly what I want it to do and it works well. However, I feel like it's very clunky. I'd like my code to be more efficient. By that I mean, I'd like it to accomplish what it already ...

c# web-scraping reddit webdriver

asked Feb 12 at 1:02

Owen

1837

2

votes

2 answers

74 views

PHP crawler to collect comments on articles

I have code that parses web pages, finds comments, and saves the comment info in a DB. I have an array where all necessary pages are stored. I iterate through all these pages one by one and ...

php web-scraping mysqli

asked Feb 7 at 21:23

dreamPr

111

1

vote

0 answers

30 views

Checking paginated website for new entries

I'm interested in determining the best way to check a paginated website for new entries. I want to be able to scrape pages 1, 2, 3, ... as necessary to get all updates. However the scraping is fairly ...

python web-scraping

asked Feb 4 at 22:36

Six

1754

3

votes

0 answers

58 views

Scraping links from the first page of Google using Kivy

I'm making a scraper/web crawler using Kivy. When I run the code it works, but I'm not sure if what I'm doing is Pythonic because all the language I can find is about using the Kivy library. I'm unsure ...

python web-scraping kivy

asked Feb 2 at 22:13

Eric MacLeod

161

3

votes

1 answer

53 views

Factory pattern in F# for a web scraper

I'm trying to learn F# by creating a little web scraper that will do custom scraping based on the url domain. For this, I need to create and select the correct kind of scraper. I figure I would use a ...

f# web-scraping factory-method

asked Feb 2 at 3:33

ceiling cat

182

0

votes

1 answer

43 views

Implementation of bridge design pattern for a web scraping app - follow-up

Earlier today I tried to implement an example of the bridge design pattern, but I ended up misinterpreting it. I made a lot of changes: ...

java oop design-patterns web-scraping

asked Jan 31 at 0:31

alexpfx

10410

0

votes

1 answer

50 views

Implementation of Bridge Design Pattern

I made an implementation of the Bridge Pattern to handle the ever-changing crawler APIs that I'm using in my app. ...

java design-patterns web-scraping

asked Jan 30 at 18:10

alexpfx

10410

2

votes

0 answers

22 views

HTML Scraper for Plex downloads page

I have written a scraper in Python 3 using Beautiful Soup 4 to retrieve the latest version of Plex Media Server from https://plex.tv, and I'd like some feedback on how to improve it. The HTML the ...

python html python-3.x web-scraping beautiful-soup

asked Jan 19 at 19:40

Jack Wilsdon

487416

4

votes

2 answers

192 views

Page Scraper and DOM manipulator

This code is a page scraper using HtmlAgilityPack that creates a DOM document upon construction and allows for node manipulation afterward. HtmlAgilityPack uses XPath Selectors for selecting nodes. ...

c# web-scraping

asked Jan 18 at 3:53

Quill

9,52942381

6

votes

1 answer

35 views

Helper functions to extract SEDE query results into more user-friendly format

This Python module contains helper functions to download the result page of SEDE queries and extract columns from it, most prominently: ...

python python-3.x web-scraping stackexchange beautiful-soup

asked Dec 25 '15 at 22:01

janos♦

72.7k878282

14

votes

2 answers

163 views

OLog Userscript - Logging messages, planets and researches

For the online text-based browser game OGame I am working on an application that aims to assist users where possible. For this I have a server-side part and a client-side part, the respective ...

javascript web-scraping userscript

asked Dec 25 '15 at 19:32

skiwi

5,31421778

4

votes

1 answer

93 views

Web crawler that charts stock ticker data using matplotlib

I've built a web crawler using the BeautifulSoup library that pulls stock ticker data from CSV files on Yahoo Finance, and charts the data using ...

python web-scraping beautiful-soup matplotlib

asked Dec 21 '15 at 6:01

123

1487

14

votes

2 answers

109 views

Getting to Wikipedia's “Philosophy” article using Python

On Wikipedia, if you click the first non-italicised internal link in the main text of an article that's not within parentheses, and then repeat the process, you usually end up on the "Philosophy" ...

python performance web-scraping

asked Dec 20 '15 at 12:37

Sumit

1716

6

votes

1 answer

97 views

Java batch movie downloader

The idea is to batch download a list of movies (torrents) off a torrent site and add them to your server. I have a little bit of Java experience (sophomore in college), so I'm looking for things that ...

java web-scraping network-file-transfer ssh

asked Dec 15 '15 at 6:12

Ben

311

6

votes

1 answer

32 views

Scraping SEDE query results with caching

I use this script to scrape the results of a SEDE page and return them as a BeautifulSoup object. A small twist is that if I don't use a SEDE query manually in the browser for a few days, then ...

python python-3.x web-scraping stackexchange beautiful-soup

asked Dec 14 '15 at 23:30

janos♦

72.7k878282

6

votes

2 answers

40 views

Scraping columns from SEDE results

I use the following script to download the result of a SEDE query and scrape a specific column from it using BeautifulSoup: ...

python python-3.x web-scraping stackexchange beautiful-soup

asked Dec 11 '15 at 21:42

janos♦

72.7k878282

9

votes

2 answers

207 views

Web Scraping with VBA

I wrote this to scrape album review data from AOTY into a spreadsheet. Check it out and let me know what I could've done better. ...

vba web-scraping

asked Dec 2 '15 at 16:30

ForrestA

1135

6

votes

2 answers

121 views

Web Scraper in Python

So, this is my first web scraper (or part of it at least) and I'm looking for things that I may have done wrong, or things that could be improved so I can learn from my mistakes. I made a few short ...

python python-3.x recursion web-scraping beautiful-soup

asked Nov 27 '15 at 3:28

Pythonista

1856

2

votes

1 answer

293 views

Multithreaded Webcrawler in Java

I am working on a multi-threaded webcrawling program in Java. Each WebCrawler starts at a root page and repeatedly extracts new links and writes them to a database. ...

java multithreading web-scraping

asked Nov 19 '15 at 10:45

Simon Zhu

1453

2

votes

0 answers

160 views

Movie torrent-site web scraper with IMDb info and streaming

I'm completely new to JavaScript and Node.js and functional programming in general. The code below scrapes a torrent website containing movies, gets info about the movie from the OMDb API and lets a ...

javascript beginner node.js web-scraping

asked Nov 6 '15 at 12:30

tsorn

111

3

votes

2 answers

71 views

Program to retrieve key/message from a multiple times used one time pad

I wrote a program to retrieve the key/messages from 10 different ciphertexts which were all encrypted with the same key using an XOR one-time-pad method, via crib dragging. To do this, I wrote a Python ...

python cryptography web-scraping

asked Oct 28 '15 at 23:25

Acru

183

6

votes

3 answers

233 views

IP and router connections

How can I make my code more Pythonic? I definitely think there is a way to make this code a lot more readable, clearer, and shorter... But I haven't found an effective way. Any techniques I can use to ...

python regex web-scraping

asked Oct 25 '15 at 18:20

Marie Anne

393

1

vote

2 answers

300 views

Scrapy spider for products on a site

I recently submitted a code sample for a web scraping project and was rejected without feedback as to what they didn't like. The prompt, while I cannot give it here verbatim, basically stated that I ...

python web-scraping scrapy

asked Oct 23 '15 at 13:10

trendsetter37

337

5

votes

1 answer

77 views

Mixed scripting language API to determine file upload location and scrape city government website to find corresponding government official

I wrote this script as an NYC-specific API for file upload for a mobile app. Users upload a video file and also their geographic coordinates. I then use an external API to get the corresponding ...

python php api r web-scraping

asked Oct 15 '15 at 22:18

sunny

1,292122

4

votes

3 answers

468 views

Multithreaded web scraper with proxy and user agent switching

I am trying to improve the performance of my scraper and plug up any possible security leaks (identifying information being revealed). Ideally, I would like to achieve a performance of 10 pages per ...

python multithreading python-2.7 web-scraping

asked Oct 9 '15 at 18:14

ScrapeHeap

211

2

votes

0 answers

280 views

A web crawler for scraping images from stock photo websites

I created a web crawler that uses Beautiful Soup to crawl images from a website and scrape them to a database. In order to use it you have to create a class that inherits from Crawler and implements 4 ...

python python-3.x web-scraping django

asked Oct 3 '15 at 15:02

davegri

1282

3

votes

1 answer

106 views

Web crawlers for three image sites

I'm very new to Python and only vaguely remember OOP from doing some Java a few years ago, so I don't know what the best way to do this is. I've built a bunch of classes that represent a crawler that ...

python oop python-3.x web-scraping

asked Oct 2 '15 at 13:55

davegri

1282

3

votes

0 answers

186 views

Web scraping from the Google Play store

I am using this R function to scrape data from the Google Play store. Is there a way to increase its efficiency using R? This code takes about 4 seconds for 14 URLs with my machine/internet ...

performance r web-scraping

asked Sep 3 '15 at 7:49

CFM

161

4

votes

1 answer

48 views

Webscraping Bing wallpapers

I wanted to scrape all the wallpapers from the Bing wallpaper gallery. This was for personal use and to learn about web scraping. The gallery progressively gets images using JavaScript as the user ...

python python-3.x web-scraping

asked Aug 31 '15 at 15:08

amt528

1835

10

votes

2 answers

152 views

Financial Data From Webqueries in Excel

I'm new (to CR and to programming in general). I wrote my first VBA on Monday. This is my first working project. It takes a bunch of financial data from a company called Financial Analytics and a ...

beginner vba excel web-scraping

asked Aug 31 '15 at 1:43

BlackHatGuy

889

3

votes

1 answer

81 views

Scraping a table from Texas Dept. of Criminal Justice website

The script scrapes a table from the website mentioned, looks at the last 2 columns, takes that information, and then sorts it (and then returns the largest county, and the set of races and their ...

python web-scraping beautiful-soup

asked Aug 7 '15 at 20:07

Scoutdrago3

6016

5

votes

2 answers

472 views

Downloading stock information from Yahoo! Finance

The program downloads stock information from Yahoo! Finance and displays it in the spreadsheet. On my Mac the program takes 10 minutes to get data for approximately 4000 stocks and on the PC it takes ...

vba excel finance web-scraping

asked Jul 29 '15 at 19:08

Chris Atkeson

313

4

votes

1 answer

435 views

Yellow Pages scraper

What are your thoughts and comments relating to areas I could improve on? ...

python web-scraping

asked Jul 29 '15 at 1:44

Halcyon Abraham Ramirez

233

2

votes

0 answers

65 views

OOP Web scraper using regex to grab tag contents

I'm learning about implementing the SOLID principles in PHP. I want to create a simple content crawler/grabber for some websites. This crawler will grab the content from the website URL. Since we ...

oop php5 web-scraping

asked Jul 27 '15 at 23:09

Fatimah Wulandari

211

3

votes

2 answers

195 views

Google News scraper to fetch links with similar stories

The following code takes either a URL or the title of an existing news article, searches Google News using the title, and collects all links from the search results. ...

python web-scraping

asked Jul 17 '15 at 14:25

user2573339

183

2

votes

1 answer

91 views

Reducing execution time of an HTML parsing script

The script is intended to return an array with texts containing specific words in English and the equivalent texts in Polish from EUR-Lex - a website with EU documents. The script downloads the page ...

php performance web-scraping

asked Jul 15 '15 at 12:11

Lukasz

112
