Robots
Almost all websites get regular visitors that aren't human. They're called robots or spiders, and their main purpose is to read content for search engines. Well-behaved robots identify themselves clearly in the access log files your web hosting provider gives you.
The table below summarizes the robots I noticed in my access logs:
| Name | Read robots.txt? | User Agent |
|---|---|---|
| Yahoo! | YES | Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) |
| Look Smart | YES | Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com) |
| Alexa | YES | ia_archiver |
Notable absences are Google and MSN. Apparently, they don't visit often enough for me to notice (the table above is based on two weeks of access logs).
Currently, my website does not have a robots.txt, but the robots can't know that until they ask for it. Their requests are consequently recorded as 404s.
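An overview like the table above can be pulled out of an access log with a few lines of Python. The following is only a minimal sketch: it assumes the log uses the Apache-style Combined Log Format (your provider's format may differ), and the function name `summarize_robots` and the sample log lines are made up for illustration. It lists every user agent that asked for `/robots.txt`, together with the status codes it received.

```python
import re
from collections import defaultdict

# Assumed format (Combined Log Format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def summarize_robots(lines):
    """Map each user agent that requested /robots.txt to the status codes it got."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if path == "/robots.txt":
            seen[match.group("agent")].add(int(match.group("status")))
    return dict(seen)

# Made-up sample lines, not taken from a real log.
sample = [
    '192.0.2.1 - - [01/Jan/2005:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 404 209 "-" "ia_archiver"',
    '192.0.2.1 - - [01/Jan/2005:10:00:05 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "ia_archiver"',
    '192.0.2.7 - - [01/Jan/2005:11:30:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

for agent, statuses in summarize_robots(sample).items():
    print(agent, sorted(statuses))
```

To run it on a real log, pass an open file instead of the sample list, for example `summarize_robots(open("access.log"))`, where `access.log` stands for whatever file your provider hands you. A status of 404 next to a robot's name means it looked for a robots.txt and found none.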