|
Like any other self-respecting webhead, I wouldn't exactly
mind being in search engines like Google or HotBot. The trouble is
that the Robots Exclusion Protocol that dictates appropriate netiquette for
spidering is an "optional" one. This translates to something like "no spider
gives a shit about WHAT is in your robots.txt file": expect it to pound
your entire site relentlessly, no matter what you say. I got the bright idea
that maybe, if I denied the spiders with a security violation (HTTP error
403), their business rules would pick up on it and desist. No such luck:
online spiders are more than content to generate a security violation
once every 5 seconds for months on end. Multiply this
by about 3 spiders per search engine times ten search engines and I will show
you a bunch of unwanted activity wasting my bandwidth and cycles. Coming from
the bad guys, this same behavior would look exactly like someone trying to
break in, and it is also the building block of what is otherwise known as a
Denial of Service attack. About the only strategy left to me would
be to respond in kind and literally force them to stop. I would
not be surprised to find that "success" in stopping commercial
search engines would result in ME receiving the short end of the
stick yet again, only this time in the form of costly litigation,
however justified my behavior is.
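
To put rough numbers on the above, here is a minimal sketch of a 403 denial keyed on the User-Agent header, plus the arithmetic for the resulting load. It is an illustration only, not the actual setup on this site: the spider names in `DENIED_AGENTS` and the function names `status_for` and `requests_per_day` are made up, and it assumes every spider retries on a fixed interval.

```python
# Hypothetical spider name fragments, for illustration only;
# real crawlers identify themselves with their own strings.
DENIED_AGENTS = ("examplebot", "samplespider")


def status_for(user_agent: str) -> int:
    """Return HTTP 403 for a denied spider and 200 for everyone else."""
    agent = user_agent.lower()
    if any(name in agent for name in DENIED_AGENTS):
        return 403
    return 200


def requests_per_day(seconds_between_hits: float,
                     spiders_per_engine: int,
                     engines: int) -> float:
    """Total daily hits, assuming each spider retries on a fixed interval."""
    per_spider = 86400 / seconds_between_hits  # 86400 seconds in a day
    return per_spider * spiders_per_engine * engines


print(status_for("ExampleBot/1.0"))   # 403
print(requests_per_day(5, 3, 10))     # 518400.0
```

Under those assumptions, one hit every 5 seconds from 3 spiders at each of ten engines works out to about 6 requests per second, or roughly half a million refused requests a day, each of which still costs bandwidth and cycles to answer.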
|