I am trying to connect to a website so that I can extract its HTML contents. My application never connects to the site; it only times out.

Here is my code:

URL url = new URL("www.website.com");
URLConnection connection = url.openConnection();
connection.setConnectTimeout(2000);
connection.setReadTimeout(2000);
BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String line;

while ((line = reader.readLine()) != null) {
  // do stuff with line
}

reader.close();

Any ideas would be greatly appreciated. Thanks!

Are you able to access that website using your browser? Is any proxy set? – Jigar Joshi Dec 27 '10 at 16:42
Yeah, I'm able to access the website just fine via the browser. Shouldn't be a proxy set. How can I tell? – webren Dec 27 '10 at 17:04
Are you really setting the timeout to 2 seconds? How complicated of a page are you loading? Change the timeout to something much higher like 10 minutes and see if you are able to load any data. – Andrew Finnell Dec 27 '10 at 17:14
Are you getting a connection timeout or a read timeout? Are you seeing any exceptions? Have you tried telnet to the host you are connecting to, to check whether you can connect at all? – Pangea Dec 27 '10 at 17:26
Andrew, I've tested it without any limits on the time out and let the web page try to load until Tomcat throws a ConnectException, proclaiming the connect timed out. The page is not very complicated - a static page with an HTML table. – webren Dec 27 '10 at 17:50
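One way to answer the "connection timeout or read timeout?" question is to call connect() explicitly before asking for the input stream, so that each phase fails separately. The following is only a rough sketch: the class name TimeoutPhaseSketch, the method probe, and the returned strings are made up for illustration and are not from the original post.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

class TimeoutPhaseSketch {

    // Reports which phase failed: building the URL, connecting, or reading the response.
    // The labels in the returned strings are hypothetical, chosen for this sketch only.
    static String probe(String spec, int connectMillis, int readMillis) {
        try {
            URLConnection connection = new URL(spec).openConnection();
            connection.setConnectTimeout(connectMillis);
            connection.setReadTimeout(readMillis);

            try {
                // Opens the connection explicitly; the connect timeout applies here.
                connection.connect();
            } catch (IOException e) {
                return "connect failed: " + e;
            }

            try (InputStream in = connection.getInputStream()) {
                // Waits for the first byte of the body; the read timeout applies here.
                in.read();
            } catch (IOException e) {
                return "response failed: " + e;
            }
            return "ok";
        } catch (IOException e) {
            // Covers MalformedURLException (for example, a missing protocol).
            return "setup failed: " + e;
        }
    }

    public static void main(String[] args) {
        String spec = args.length > 0 ? args[0] : "http://www.website.com";
        System.out.println(probe(spec, 2000, 2000));
    }
}
```

If the result starts with "connect failed", the host is probably unreachable from the machine running the code (firewall, proxy, DNS), and raising the read timeout would not be expected to help.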

1 Answer

I believe the URL needs a protocol:

URL url = new URL("http://www.website.com"); 

If that doesn't help, post a real SSCCE that demonstrates the problem so we don't have to guess what you are actually doing. As it stands, we can't tell whether you are using your try/catch block correctly or just ignoring exceptions.
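For reference, a test program of that kind might look something like the sketch below. It is not the asker's real code: the class name UrlFetchSketch, the helper names, the default URL, and the UTF-8 charset are all assumptions made for illustration. Note that new URL(...) with no protocol throws a MalformedURLException instead of timing out, which is why it matters whether exceptions are being swallowed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class UrlFetchSketch {

    // Returns true if the string can be parsed as a URL with a known protocol.
    static boolean hasProtocol(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // Reads the whole response body as text. UTF-8 is an assumption of this sketch.
    static String fetch(String spec, int timeoutMillis) throws IOException {
        URL url = new URL(spec);
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);

        StringBuilder body = new StringBuilder();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String spec = args.length > 0 ? args[0] : "http://www.website.com";
        try {
            System.out.print(fetch(spec, 2000));
        } catch (IOException e) {
            // Print the full exception instead of ignoring it.
            e.printStackTrace();
        }
    }
}
```

Running it against a public site first, and then against the real URL, should show whether the problem is in the code or in reaching that particular host.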

Right. The URL context: <scheme>://<authority><path>?<query>#<fragment> – Vash Dec 27 '10 at 16:48
In my actual code, it had the http protocol. What you see within my code is pretty much all there is, except for the method signature, which is public String testURLConnection() throws IOException. The body of the while loop prints the line. My main objective is to parse the webpage's HTML contents, but I figured it would be best to start small and make sure the connection can be made first. – webren Dec 27 '10 at 17:05
I agree, you should test the connection first before parsing the contents. So your test program should be about 15 lines of code. Post your SSCCE so we don't have to guess what you are doing. That way we can also copy/paste and test the code ourselves. Also, why are you setting the timeout value for a simple test? Have you tried other URLs, like stackoverflow for example? – camickr Dec 27 '10 at 17:24
camickr, you won't be able to test the code because the URL leads to a webpage that is on an intranet. This is why I put the bogus webpage.com link as the URL. I set the timeout value because without it, the connection would stall for a few minutes. I did this so that if the connection isn't made immediately like it should be, it halts and saves me some time. – webren Dec 27 '10 at 17:54
Setting connection and read timeouts is the preferred option. Try to make them configurable on a per-URL basis, though, and do some testing to find the optimal values. – Pangea Dec 27 '10 at 18:40
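The per-URL timeout idea could be sketched roughly as a small lookup keyed by host with a fallback value. Everything here is hypothetical: the class TimeoutConfigSketch, its method names, and the choice to key on the host instead of the full URL are assumptions for illustration only.

```java
import java.net.URI;
import java.time.Duration;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class TimeoutConfigSketch {
    private final Map<String, Duration> perHost = new HashMap<>();
    private final Duration fallback;

    TimeoutConfigSketch(Duration fallback) {
        this.fallback = fallback;
    }

    // Registers a timeout for one host; host names are compared case-insensitively.
    void set(String host, Duration timeout) {
        perHost.put(host.toLowerCase(Locale.ROOT), timeout);
    }

    // Looks up the timeout for a URL string, using the fallback when the
    // host is unknown or the string has no host part (e.g. no protocol).
    Duration timeoutFor(String spec) {
        String host = URI.create(spec).getHost();
        if (host == null) {
            return fallback;
        }
        return perHost.getOrDefault(host.toLowerCase(Locale.ROOT), fallback);
    }
}
```

The result could then be passed to the connection, for example connection.setConnectTimeout((int) config.timeoutFor(spec).toMillis()), with the actual values tuned by testing against each host.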
