December 2005 Archives

2005-12-12

Ooops... typo

I just found out that the description of my web page (“technical articles, scientific publications, personnal stuff, etc.”) has had a typo since last May: “personnal” should be written “personal”. What a shame... ^_^


Posted by Romain Lenglet | Permanent Link | Categories: Web | Comments

2005-12-08

Registered to Feedster

Although I have experienced problems with Feedster's robot, I have decided to give Feedster a chance and have registered on their web site, even though I prefer a standalone news feed aggregator such as KDE's Akregator to a web-based aggregator such as Feedster.

Here is the special cryptic link that they asked me to put into my blog to be able to “claim” control of my weblog RSS feed in the Feedster search engine (do not click that link: it is not meant to be clicked).

By the way, the web interface for user registration is also buggy and unusable, because the registration form, when submitted, points to a nonexistent page. I guessed that only the host name of that URL was wrong, and that it should be feedster.com instead of feedster.net. I have therefore simply configured my system to resolve the feedster.net host name to the same IP address as feedster.com, so that the form is submitted to the feedster.com server instead: this made the registration form work nicely! To do this, I have added the following line to the /etc/hosts file on my Linux system (64.95.116.10 is the IP address of feedster.com):

64.95.116.10    feedster.net

In addition, the forms appear twice on the login page, for no reason.

Those are strong signs of “web site rot”...


Posted by Romain Lenglet | Permanent Link | Categories: Web | Comments

2005-12-08

Problems with Feedster's robot

Since yesterday, I have found many hits in our web server's logs (four hits every 30 minutes) from a robot at IP address 64.95.116.1. According to whois(1), this address belongs to "Feedster". This is how I discovered the existence of the Feedster blog search engine...

Special note to the person who registered my web page with Feedster yesterday: that was nice of you, but you should rather have registered my real RSS feed URL (http://www.csg.is.titech.ac.jp/~lenglet/rss.xml) instead of my web page (http://www.csg.is.titech.ac.jp/~lenglet/), because the result is a lot of hits to nonexistent URLs from the dumb Feedster robot, cf. this extract of our web server's logs:

...
64.95.116.1 - - [08/Dec/2005:12:49:33 +0900] "GET /~lenglet HTTP/1.1" 301 336 -
64.95.116.1 - - [08/Dec/2005:12:49:33 +0900] "GET /~lenglet/ HTTP/1.1" 200 27650 -
64.95.116.1 - - [08/Dec/2005:12:49:34 +0900] "GET /atom.xml HTTP/1.1" 404 294 -
64.95.116.1 - - [08/Dec/2005:12:49:34 +0900] "GET /index.xml HTTP/1.1" 404 295 -
64.95.116.1 - - [08/Dec/2005:12:49:34 +0900] "GET /rss.xml HTTP/1.1" 404 293 -
64.95.116.1 - - [08/Dec/2005:13:17:48 +0900] "GET /~lenglet HTTP/1.1" 301 336 -
64.95.116.1 - - [08/Dec/2005:13:17:48 +0900] "GET /~lenglet/ HTTP/1.1" 200 27650 -
64.95.116.1 - - [08/Dec/2005:13:17:49 +0900] "GET /atom.xml HTTP/1.1" 403 298 -
64.95.116.1 - - [08/Dec/2005:13:17:49 +0900] "GET /index.xml HTTP/1.1" 403 299 -
64.95.116.1 - - [08/Dec/2005:13:17:49 +0900] "GET /rss.xml HTTP/1.1" 403 297 -
...

So if you could correct the URL of my feed in your Feedster account, or do anything to stop those wrong accesses, it would be very nice, thanks.
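To spot this kind of robot in a log, the hits of one client can be counted per HTTP status code. Here is a minimal Python sketch, assuming log lines shaped exactly like the extract above; the `count_statuses` function and the `LOG_LINE` pattern are hypothetical names made up for this illustration, not part of any existing tool.

```python
import re
from collections import Counter

# Matches lines shaped like the log extract above:
# client - - [timestamp] "METHOD path PROTOCOL" status size ...
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_statuses(lines, client):
    """Count the HTTP status codes of the requests made by one client address."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("client") == client:
            counts[int(match.group("status"))] += 1
    return counts


# Three lines copied from the log extract above.
sample = [
    '64.95.116.1 - - [08/Dec/2005:12:49:33 +0900] "GET /~lenglet HTTP/1.1" 301 336 -',
    '64.95.116.1 - - [08/Dec/2005:12:49:33 +0900] "GET /~lenglet/ HTTP/1.1" 200 27650 -',
    '64.95.116.1 - - [08/Dec/2005:12:49:34 +0900] "GET /atom.xml HTTP/1.1" 404 294 -',
]
print(count_statuses(sample, "64.95.116.1"))
```

A real log file could be passed in the same way, for instance by iterating over an open file object instead of the `sample` list.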

Here is why I say above that Feedster's robot is dumb:

  1. It does not respect the Robots Exclusion Standard, under which web robots such as Feedster's fetch a file named robots.txt from every web server they access, to check whether their accesses are welcome. Not only does Feedster's robot ignore this standard, which is disrespectful, but it also accesses feeds every 30 minutes, which I consider excessive.
  2. It seems to incorrectly interpret <link rel="alternate".../> elements in HTML page headers. For instance, my XHTML web page, which has been accessed every 30 minutes by Feedster's robot, contains such elements in its header, and the robot seems to interpret them incorrectly. This leads to accesses to nonexistent URLs (as shown in the web logs above by the 404 and 403 HTTP error codes): it should have accessed /~lenglet/atom.xml instead of /atom.xml, etc.
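For comparison, here is a minimal Python sketch of what a polite robot could do before fetching a URL, using the standard library's `urllib.robotparser` module. The robots.txt content, the "ExampleBot" user agent and the example.org URLs are all made up for this illustration; a real robot would download the file from the server it is about to visit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: "ExampleBot" is kept out of /private/,
# every other robot is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Tell whether the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.org/private/feed.xml"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.org/rss.xml"))
```

The sketch parses the rules from a string so that it runs without network access; `RobotFileParser` also offers `set_url()` and `read()` to fetch the file from a server.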

For information, here are the <link rel="alternate".../> elements in my web page headers:

<link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="./atom.xml">
<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="./rss.xml">
<link rel="alternate" type="application/rss+xml" title="RSS 1.0" href="./index.xml">
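Relative references such as ./atom.xml are meant to be resolved against the URL of the page that contains them. The following Python sketch shows that resolution with the standard `urllib.parse.urljoin` function. It also shows one possible explanation for the wrong requests, which is only a guess and is not confirmed by anything I know about the robot: resolving against the URL without its trailing slash (the one that gets the 301 redirect in the logs) yields exactly /atom.xml. The `resolve_feed_urls` helper is a hypothetical name for this illustration.

```python
from urllib.parse import urljoin

# URL of the page that contains the <link rel="alternate"> elements.
PAGE_URL = "http://www.csg.is.titech.ac.jp/~lenglet/"
# The href values of those elements.
HREFS = ["./atom.xml", "./rss.xml", "./index.xml"]


def resolve_feed_urls(page_url, hrefs):
    """Resolve each relative href against the URL of the containing page."""
    return [urljoin(page_url, href) for href in hrefs]


# Correct base URL (with the trailing slash).
for url in resolve_feed_urls(PAGE_URL, HREFS):
    print(url)

# Guess at the bug: same resolution, but against the URL without the slash.
for url in resolve_feed_urls(PAGE_URL.rstrip("/"), HREFS):
    print(url)
```

With the trailing slash, the feeds resolve under /~lenglet/; without it, the last path segment is replaced and the feeds resolve at the root of the server.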

I am certain that the accesses by Feedster's robot to /atom.xml, /index.xml and /rss.xml are due to its interpretation of those <link rel="alternate".../> elements, because since I denied the robot any access to my web page, it no longer tries to access /atom.xml, /index.xml and /rss.xml. Here are the lines that I have added to my root .htaccess file to specifically deny Feedster's robot access to my web page:

<Limit GET>
order allow,deny
deny from 64.95.116.1
allow from all
</Limit>
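Note that with "order allow,deny", the position of the "allow from all" line after the "deny" line does not matter: Apache evaluates all the allow rules first, then all the deny rules, and a request must match an allow rule and no deny rule to be accepted. The following Python sketch only emulates that evaluation order to make it explicit; it is not how Apache is implemented, the `allowed_allow_deny` function is a hypothetical name, and "allow from all" is approximated by the IPv4 network 0.0.0.0/0.

```python
from ipaddress import ip_address, ip_network


def allowed_allow_deny(client, allow, deny):
    """Emulate 'order allow,deny': match an allow rule and no deny rule."""
    addr = ip_address(client)
    matches_allow = any(addr in ip_network(net) for net in allow)
    matches_deny = any(addr in ip_network(net) for net in deny)
    # A client matching no rule at all is rejected, like in Apache.
    return matches_allow and not matches_deny


ALLOW = ["0.0.0.0/0"]       # stands in for "allow from all" (IPv4 only)
DENY = ["64.95.116.1/32"]   # "deny from 64.95.116.1"

print(allowed_allow_deny("64.95.116.1", ALLOW, DENY))  # Feedster's robot
print(allowed_allow_deny("192.0.2.7", ALLOW, DENY))    # any other client
```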

I still get accesses from Feedster's robot every 30 minutes, but they are now denied, and I get these lines in our web server's logs:

...
64.95.116.1 - - [08/Dec/2005:14:01:02 +0900] "GET /~lenglet HTTP/1.1" 403 298 -
64.95.116.1 - - [08/Dec/2005:14:20:14 +0900] "GET /~lenglet HTTP/1.1" 403 298 -
...

Once they have corrected my feed's URL, I will probably re-enable access for that robot, but they should still fix their robot's implementation...


Posted by Romain Lenglet | Permanent Link | Categories: Web | Comments

2005-12-06

Fast scrolling in KDE

I just inadvertently found a trick to scroll faster with a wheel mouse in anything that can be scrolled in KDE applications, e.g. in the list of emails in KMail, in the list of articles in Akregator, in a long web page displayed in Konqueror, in a long text displayed in KWrite, etc. Just hold the Shift key while scrolling with the mouse wheel: scrolling becomes much faster!


Posted by Romain Lenglet | Permanent Link | Categories: Desktop | Comments

2005-12-02

Secrets of good hypertext

The Art. Lebedev Studio are the designers of the wonderful Optimus keyboard and Mus computer mouse, among other things.

In addition, they publish many interesting in-depth articles about design on their web site, some of them about web design. Article number 83, titled “Secrets of good hypertext”, is about how to write good hyperlinks in web pages.

Today, I have tried to polish the hyperlinks in my blog's articles according to this article, but this still requires some work.


Posted by Romain Lenglet | Permanent Link | Categories: Web | Comments