www.nettime.org
Nettime mailing list archives

Re: <nettime> fast-changing propaganda website archiving tools?
Morlock Elloi on Thu, 4 Feb 2010 06:19:46 +0100 (CET)




There is a more general issue here: the concept of aggregated spidering
(one entity spiders and then serves the wider audience, public or not). As
competition develops, there will be more and more of these spiders - Bing
of course won't trust Google, NSA won't trust CIA and Intel won't trust
AMD's espionage facility. To push it all the way to the boundary case, why
would I or anyone else trust any of the above for spidering services?

This is a purely technological problem. Initially the constraint was on the
aggregating side (only Google and Yahoo could afford industrial-strength
spidering), but that changed and now many can afford it; soon everyone will
be able to afford it. My search patterns can be serviced from my own
computers: I know what I am looking for, and can create a custom spider
that will do a better job than Google, for me. So much for the aggregating
side.
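
To make the "custom spider" idea concrete, here is a minimal sketch of the
part that decides which links on a page are worth following. It is an
illustration only: the names LinkCollector and links_worth_following, the
"my-spider" agent string and the keyword matching are all assumptions, and
actual page fetching is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collects (absolute_url, anchor_text) pairs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the <a> element currently open, if any
        self._text = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def links_worth_following(html, base_url, keywords, robots_lines,
                          agent="my-spider"):
    """Return links whose anchor text matches my own interests
    and which the site's robots.txt rules allow this agent to fetch."""
    robots = RobotFileParser()
    robots.parse(robots_lines)  # robots.txt content, already split in lines
    collector = LinkCollector(base_url)
    collector.feed(html)
    wanted = []
    for url, text in collector.links:
        relevant = any(k.lower() in text.lower() for k in keywords)
        if relevant and robots.can_fetch(agent, url):
            wanted.append(url)
    return wanted
```

The point of the sketch is that the relevance test is mine, not a search
engine's: swapping the keyword check for any other private criterion is a
one-line change.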

On the source side it will be harder to solve, as "interesting sites"
without commercial backing cannot afford to service all these private and
public spiders. This is a classical publishing problem, and the solution
(on the source side) will have to somehow involve money or an equivalent
barrier. No pay, no spider access. Which is, of course, Google's nightmare:
paying for content.
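
One possible shape of "no pay, no spider access", sketched under heavy
assumptions: the site hands a signed token to each spider operator who has
paid, and answers declared spiders without a valid token with HTTP 402
(Payment Required). The names SECRET, issue_token and status_for_request
are hypothetical, and how payment itself happens is not addressed.

```python
import hashlib
import hmac

SECRET = b"site-owner-secret"  # hypothetical signing key held by the site


def issue_token(spider_id):
    """Token the site would give a spider operator after payment."""
    return hmac.new(SECRET, spider_id.encode(), hashlib.sha256).hexdigest()


def status_for_request(spider_id, token):
    """Return the HTTP status to answer with.

    spider_id is None for an ordinary reader; otherwise it is the
    identifier the spider declares, and token is what it presents.
    """
    if spider_id is None:
        return 200  # not a declared spider, serve normally
    expected = issue_token(spider_id)
    if token is not None and hmac.compare_digest(expected, token):
        return 200  # paid spider
    return 402  # Payment Required
```

This only gates spiders that identify themselves; a spider that pretends
to be an ordinary reader is a separate problem.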


> As search engine technology spun off hundreds, maybe thousands, of bot
> programs empowered individuals, institutions, governments, competitors,
> thieves, good hearts and idiot savants to rake in files without
> restraint. They came to average 25-30% of bandwidth usage despite use of
> robots.txt and htaccess.
> 
> The more powerful bots would take over the site until it was completely
> drained. When it hosted a few hundred files, that was


#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mail.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime {AT} kein.org