Bookmark utility part 4 - 404 - URL not found
URL download, HTTP responses
When maintaining a collection of links, you can be sure that some of the pages
will change, disappear or move. It would therefore be good to know whether the
server returns any errors for the URLs you already have. This week we will look
at such error responses.
|
Like last week's exercises, this one starts with a chance to revisit
some of the previous tasks. Again, the URL download exercise from week
48 last year proves helpful. Please have a look at it if you are not
familiar with reading web pages.
|
|
When a requested web page does not exist, most web servers send back
an error code: 404. This number can usually be found within the title tag,
the body tag or both of the returned HTML document.
Implement an application that reports whether a URL returned a real
page or a 404 error message by reading the HTML code returned. Be aware
that the Java reader or stream you are using may throw a FileNotFoundException.
After some small tests, this seems to be the case for URLs that do not end
with a file extension, i.e. directories.
Hint: To search the HTML document, you may also look at the
exercises from two weeks ago.
GetResponse.java
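The search described above could be sketched roughly as follows. This is only one possible approach, not the course solution: the class name NotFoundScanner and its methods are made up for illustration, the page is assumed to be UTF-8 encoded, and treating a FileNotFoundException as "page not found" is an assumption based on the observation above.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and methods are illustrative only.
class NotFoundScanner {

    // Contents of the title and body elements, ignoring case and line breaks.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern BODY = Pattern.compile(
            "<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // "404" as a stand-alone number, so that e.g. "14045" does not match.
    private static final Pattern CODE_404 = Pattern.compile("(?<!\\d)404(?!\\d)");

    // Returns true if the number 404 appears inside the title or the body.
    static boolean looksLike404(String html) {
        return containsCode(TITLE, html) || containsCode(BODY, html);
    }

    private static boolean containsCode(Pattern element, String html) {
        Matcher m = element.matcher(html);
        return m.find() && CODE_404.matcher(m.group(1)).find();
    }

    // Reads the whole document behind the address into a string
    // (assumes UTF-8, which may be wrong for some pages).
    static String download(String address) throws IOException {
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(address).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java NotFoundScanner <url>");
            return;
        }
        try {
            String html = download(args[0]);
            System.out.println(looksLike404(html) ? "404 error page" : "real page");
        } catch (FileNotFoundException e) {
            // Assumption: the stream refused to deliver the page, so treat it as missing.
            System.out.println("page not found (FileNotFoundException)");
        } catch (IOException e) {
            System.out.println("Could not read " + args[0] + ": " + e);
        }
    }
}
```

Note that a page which merely mentions the number 404 in its text (a room number, say) would be reported as an error page; this is a weakness of the text-search approach itself.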
|
|
However, 404 is not the only error code that web servers return.
The HTTP specification defines a wide range of status codes. Find these
codes and see whether there are more that your application should
report.
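As a starting point for this task, here is a small sketch that groups status codes by their first digit, which is how the HTTP specification organises them, and singles out a few codes that seem relevant to a bookmark collection. The class StatusClassifier, its method names and the advice texts are made up for illustration; which codes are worth reporting is left as a judgment call. The constants are the ones defined in java.net.HttpURLConnection.

```java
import java.net.HttpURLConnection;

// Hypothetical helper class; the selection of codes and the texts are illustrative.
class StatusClassifier {

    // The first digit of a status code gives its class.
    static String category(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    // A few codes that a bookmark utility might want to report specially.
    static String advice(int code) {
        switch (code) {
            case HttpURLConnection.HTTP_MOVED_PERM:   // 301
                return "moved permanently: update the bookmark";
            case HttpURLConnection.HTTP_FORBIDDEN:    // 403
                return "forbidden: the page exists but access is denied";
            case HttpURLConnection.HTTP_NOT_FOUND:    // 404
                return "not found: the page is missing";
            case HttpURLConnection.HTTP_GONE:         // 410
                return "gone: the page was removed on purpose";
            case HttpURLConnection.HTTP_UNAVAILABLE:  // 503
                return "service unavailable: try again later";
            default:
                return category(code);
        }
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 410, 503};
        for (int code : samples) {
            System.out.println(code + " -> " + advice(code));
        }
    }
}
```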
|
|
A problem with the approach above is that not all web servers put
the 404 code in the HTML they return. It is a rather technical message, and
more popular sites would rather return a polite notification that the page
could not be found. Furthermore, this message might not be in English.
To deal with this problem, you have to work with the HTTP requests
and responses themselves instead. The URL class of java.net gives you access
to the response code and message through the connection it opens. Implement
a new application or extend the one above to return
the HTTP response code and message from a web server.
Hint:
URL.openConnection(), HttpURLConnection.getResponseCode()
and HttpURLConnection.getResponseMessage()
GetResponse.java
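A minimal sketch of how the hinted methods fit together is given below. It is not necessarily how GetResponse.java does it: the class name ResponseProbe, the use of a HEAD request, the five-second timeouts and the decision not to follow redirects are all assumptions made for this example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class; the name and settings are illustrative only.
class ResponseProbe {

    // Formats the status of a connection as "<code> <message>".
    static String statusLine(HttpURLConnection connection) throws IOException {
        return connection.getResponseCode() + " " + connection.getResponseMessage();
    }

    // Asks the server for the status of the given address without reading the page.
    static String probe(String address) throws IOException {
        URLConnection raw = new URL(address).openConnection();
        if (!(raw instanceof HttpURLConnection)) {
            throw new IOException("Not an HTTP(S) URL: " + address);
        }
        HttpURLConnection connection = (HttpURLConnection) raw;
        // Assumption: HEAD is enough here; a few servers only answer GET properly.
        connection.setRequestMethod("HEAD");
        // Report 301/302 as they are instead of silently following them.
        connection.setInstanceFollowRedirects(false);
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            return statusLine(connection);
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java ResponseProbe <url>");
            return;
        }
        try {
            System.out.println(args[0] + " -> " + probe(args[0]));
        } catch (IOException e) {
            System.out.println("Could not contact " + args[0] + ": " + e);
        }
    }
}
```

Because the status code is read from the response itself, this works even when the server returns a polite error page in another language.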