All Questions

4 votes
1 answer
117 views

Java classes for downloading all in-coming/out-going links of an article in the Wikipedia article graph

(The entire project is in GitHub.) Introduction This project provides facilities for generating in-coming or out-going links in a given Wikipedia page. Code ...
coderodde
  • 31.3k
4 votes
1 answer
102 views

Webscraping tennis data 1.1

I incorporated the substantial changes suggested in my previous question that involved building a web-scraper for gathering tennis data. The improved code is shown below: ...
cloudy_eclispse
8 votes
1 answer
265 views

Webscraping tennis data

So as a starter Java project I decided to web scrape some data (specifically all historically No. 1 ranked players for weeks starting from 1973) from the ATP website, and do something with it (IPR). I'...
cloudy_eclispse
2 votes
2 answers
708 views

Jsoup connection to URL

I have a simple class and want to ask whether there is any way to improve it. To me it looks poor. Is there any way to use try-with-resources, stream or ...
mara122
  • 183
0 votes
1 answer
3k views

YouTube page scraping using Jsoup

I am trying to scrape the YouTube video streaming page to get the metadata of the video. I am considering this YouTube page as an example. You can find the HTML contents of that page over here (I have ...
Nandan Desai
2 votes
1 answer
122 views

Part of web crawler

Following on from the first part: Forum crawler - counts statistics for words in chosen forum topic. I took Janos's review into account and created an Iterator for my classes. This is part of the whole ...
wBacz
  • 133
1 vote
1 answer
190 views

Forum crawler - counts statistics for words in chosen forum topic

I have made a skeleton part of the crawler and would like to ask you for a review. I'm especially unsure about how I divided the app into classes. What does the app do? It scrapes through all the ...
wBacz
  • 133
2 votes
1 answer
2k views

Class for scraping images with JSoup

I refactored this class as far as I'm currently able to, but I wonder whether it could be better. One thing that I'm not sure of is that I take parameters from a method, which is not a constructor, ...
progonkpa
  • 131
3 votes
0 answers
52 views

Regularly watch recent posts of a blog for specific words with HTML scraping

Task I want to watch the "Recent Posts" section of a blog for changes/new posts, but only for posts containing a pre-defined word. Afterwards a list should be printed to the console with ...
sceiler
  • 305
6 votes
2 answers
4k views

Multithreaded webcrawler

I've been trying to learn Java for the last day or two. This is the first project I am working on, so please bear with me. I worked on a multithreaded web crawler. It is fairly simple but I'd like to ...
Jan
  • 161
2 votes
1 answer
160 views

Optimizing Java HTML parser

I wrote a program that goes through a webpage and returns regex matches. I used it on my letterboxd.com account to go through all of my movies (over 900 entries) and then find the genres field for each ...
mlukas
  • 21
5 votes
1 answer
461 views

Finding shortest paths in a Wikipedia article graph using Java

(See also Finding shortest paths in a Wikipedia article graph using Java - second attempt.) I have this sort of a web crawler that asks for two (English) Wikipedia article titles (the source and the ...
coderodde
  • 31.3k
2 votes
2 answers
280 views

Java web scraping robots

I am developing an application that goes through 2 websites and gets all the articles, but my code is identical in most parts. Is there a way to optimize this code? (TL and DN are the naming ...
Tano
  • 197
0 votes
1 answer
857 views

Implementation of bridge design pattern for a web scraping app - follow-up

Earlier today I tried to implement an example of the bridge design pattern, but I ended up misinterpreting it. I made a lot of changes: ...
alexpfx
  • 524
0 votes
1 answer
177 views

Implementation of Bridge Design Pattern

I made an implementation of the Bridge Pattern to handle the ever-changing crawler APIs that I'm using in my app. ...
alexpfx
  • 524
