Bookmark utility part 4 - 404 - URL not found

URL download, HTTP responses

2002 - Week 10 - Havard Rast Blok

When maintaining a collection of links, you can be sure that some of the pages will change, disappear or move. It would therefore be good to know whether the server returns any errors for the URLs you already have. This week we will look at such error responses.

As with the exercises of the last few weeks, this one starts with a chance to revisit some of the previous tasks. Once again, the URL download from week 48 last year proves helpful. Please have a look at it if you are not familiar with reading web pages.

When a requested web page does not exist, most web servers send back an error code: 404. This number can usually be found within the title or body tags, or both, of the returned HTML document.

Implement an application that reports whether a URL returned a real page or a 404 error message, by reading the HTML code returned. Be aware that the Java reader or stream you are using may throw a FileNotFoundException. After some small tests, it seems that this is the case for URLs not ending with a file extension, i.e. directories.

Hint: To search the HTML document, you may also look at the exercises from two weeks ago.
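
As a rough illustration of the text search described above, here is a minimal sketch. It assumes the HTML has already been downloaded into a String (for instance with the URL download from week 48), and it only looks for the number 404 inside the title and body elements. The class name Html404Check and the method looksLike404 are made up for this example and are not taken from the solution file. Note that this is only a heuristic: an ordinary page that happens to mention 404 in its text would also be flagged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for this exercise; the names are illustrative only.
class Html404Check {
    // Contents of the first <title> element, case-insensitive, spanning line breaks.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Contents of the first <body> element.
    private static final Pattern BODY =
            Pattern.compile("<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // "404" as a standalone number, so that e.g. "14040" does not match.
    private static final Pattern CODE_404 = Pattern.compile("(?<!\\d)404(?!\\d)");

    // Returns true if the title or the body of the document mentions 404.
    static boolean looksLike404(String html) {
        return contains404(TITLE, html) || contains404(BODY, html);
    }

    private static boolean contains404(Pattern element, String html) {
        Matcher m = element.matcher(html);
        return m.find() && CODE_404.matcher(m.group(1)).find();
    }

    public static void main(String[] args) {
        String errorPage = "<html><head><title>404 Not Found</title></head>"
                + "<body><h1>Not Found</h1></body></html>";
        String normalPage = "<html><head><title>Bookmarks</title></head>"
                + "<body><p>Welcome</p></body></html>";
        System.out.println(looksLike404(errorPage));
        System.out.println(looksLike404(normalPage));
    }
}
```

When you add the download step, remember to catch the FileNotFoundException mentioned above and decide how to report it.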


GetResponse.java

However, 404 is not the only error code returned by web servers. The HTTP specification defines a wide range of return codes. Find these codes and see if there are more that your application should report.
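
To get started on that, it may help to know that HTTP groups its status codes by their first digit: 1xx informational, 2xx success, 3xx redirection, 4xx client error and 5xx server error. The sketch below sorts a code into one of these groups. The class StatusClass and the rule in shouldReport (report everything outside the success range) are assumptions made for this example; you may well want a different rule for your bookmark utility.

```java
// Hypothetical helper; the class and method names are illustrative only.
class StatusClass {
    // Maps a status code to the name of its class, based on the first digit.
    static String describe(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    // Assumed policy: anything that is not a 2xx success is worth reporting,
    // e.g. 301 (moved permanently), 410 (gone) or 503 (service unavailable).
    static boolean shouldReport(int code) {
        return code / 100 != 2;
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 410, 503};
        for (int code : samples) {
            System.out.println(code + ": " + describe(code)
                    + (shouldReport(code) ? " (report)" : ""));
        }
    }
}
```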

A problem with the approach above is that not all web servers include the 404 message in the page they return. It is a rather technical message, and the more popular sites would rather show a polite notification that the page could not be found. Furthermore, this message might not be in English.

To deal with this problem, you would have to work with HTTP requests and responses instead. The URL class of java.net lets you access the response messages. Implement a new application or extend the one above to return the HTTP response code and message from a web server.

Hint: URL.openConnection(), HttpURLConnection.getResponseCode() and HttpURLConnection.getResponseMessage().
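
The three calls in the hint can be combined roughly as follows. This is only a sketch under a few assumptions: the address uses http or https (otherwise the cast to HttpURLConnection fails), a HEAD request is enough because only the status line is of interest, and five seconds is a reasonable timeout. The class name ResponseSketch and the helpers format and fetchStatus are invented for this example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical example class; names and timeout values are assumptions.
class ResponseSketch {
    // Formats code and message for display; the message may be null.
    static String format(int code, String message) {
        return code + " " + (message == null ? "(no message)" : message);
    }

    // Asks the server for the status of the given http/https address.
    static String fetchStatus(String address) throws IOException {
        URL url = new URL(address);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("HEAD");   // assumption: headers only, no page body
            conn.setConnectTimeout(5000);    // milliseconds
            conn.setReadTimeout(5000);
            return format(conn.getResponseCode(), conn.getResponseMessage());
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ResponseSketch <url> [<url> ...]");
            return;
        }
        for (String address : args) {
            System.out.println(address + " -> " + fetchStatus(address));
        }
    }
}
```

Some servers do not answer HEAD requests properly; if you run into that, dropping the setRequestMethod call makes the connection use the default GET instead.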


GetResponse.java



site: Håvard Rast Blok
updated: 16 July 2010