Icy Phoenix

     
 


Post Spammed by Google - or Google used almost all my bandwith in 
 
I am reading this now:

Quote:
Spammed by Google - or Google used almost all my bandwith in 10 days!

It addresses selective crawling without going to all the no-event days. This should cut down on a huge google hit on my site. May apply to others with similar event calendars.

August 30th, 2005, 12:48
Hi,
I run a design and hosting service, and several of our clients have suddenly been inundated with Google bots in the last week. These are sites that have been indexed for a couple of years with no problem. One of the sites usually draws 1.2 GB per month but lost 20 GB in 4 days to one bot.
We had to deny the IP of that bot.

I'm guessing Google has changed its software, as I'm seeing more and more reports on this.




Is this a Google error or not? That is the question.


Other links
Here are some useful links:
Google Information for Webmasters
http://books.google.com/webmasters/bot.html
Why isn't Googlebot obeying my robots.txt file?

To save bandwidth, Googlebot downloads the robots.txt file only once a day, or whenever we have fetched a significant number of pages from the server. It may therefore take Googlebot a while to catch up with changes to your robots.txt file. In addition, Googlebot is distributed across several machines, each of which keeps its own record of your robots.txt file.

We always suggest checking that the syntax is correct by comparing it with the standard at http://www.robotstxt.org/wc/exclusion.html#robotstxt. A common source of problems is that the robots.txt file is not located in the top-level directory of the server (for example, www.mihost.com/robots.txt); placing the file in a subdirectory will have no effect.
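A quick way to see the root-location rule from the paragraph above is to derive the robots.txt address from any page URL. This is only a minimal sketch in Python; the function name robots_txt_url and the forum path in the example are made up for illustration and do not come from Google's documentation.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the top-level robots.txt URL for the host serving page_url.

    Hypothetical helper: crawlers only look at the server root, so the
    path of the page itself is discarded.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host (with port, if any); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # The page lives in a subdirectory, but robots.txt must sit at the root.
    print(robots_txt_url("http://www.mihost.com/forum/viewtopic.php?t=1"))
    # prints http://www.mihost.com/robots.txt
```

A robots.txt placed at www.mihost.com/forum/robots.txt would never be the URL this returns, which is why such a file has no effect.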

There is also a small difference between the way Googlebot uses the robots.txt file and the way it should be used according to the robots.txt standard (keeping in mind the distinction between "should" and "must"). The standard says we should obey the first applicable rule, but Googlebot obeys the longest (that is, the most specific) rule. This more intuitive practice matches what people actually do with what they expect us to do. For example, consider the following robots.txt file:
User-Agent: *
Allow: /
Disallow: /cgi-bin

The webmaster's intention is clearly to allow robots to crawl everything except the /cgi-bin directory. Consequently, that is what we do at Google.
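The difference between the two readings of that robots.txt example can be sketched in a few lines of Python. This is a simplified illustration, not Googlebot's actual code: it only does plain prefix matching (no wildcards, no tie-breaking), and the names RULES, first_match and longest_match are invented for the example.

```python
# The three-line robots.txt example, as (rule type, path prefix) pairs.
RULES = [("allow", "/"), ("disallow", "/cgi-bin")]


def first_match(path, rules):
    """First applicable rule wins (the reading attributed to the standard)."""
    for kind, prefix in rules:
        if path.startswith(prefix):
            return kind == "allow"
    return True  # no rule applies: crawling is allowed


def longest_match(path, rules):
    """Longest (most specific) applicable rule wins (the Googlebot reading)."""
    best = None
    for kind, prefix in rules:
        if path.startswith(prefix):
            if best is None or len(prefix) > len(best[1]):
                best = (kind, prefix)
    return True if best is None else best[0] == "allow"


if __name__ == "__main__":
    path = "/cgi-bin/script.pl"
    print(first_match(path, RULES))    # True: "Allow: /" is hit first
    print(longest_match(path, RULES))  # False: "/cgi-bin" is more specific
```

With first-match, the "Allow: /" line would swallow every path and the Disallow line would never be reached; with longest-match, /cgi-bin is blocked and everything else stays crawlable, which matches the webmaster's intention.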

For more information, please see the robots FAQ page. If there is still a problem, please let us know.

http://www.robotstxt.org/wc/faq.html

http://desarrollo.blogalia.com/historias/4764
http://google.dirson.com/post/1103/

http://www.maxglaser.net/crawl-cach...ra-de-bigdaddy/


http://www.maxglaser.net/category/noticias/page/2/


In English:
Spammed by Google - or Google used almost all my bandwith in 10 days!
http://forum.mamboserver.com/archive/index.php/t-58771.html
Sorry to "spam" so much here, but I'm working through solutions. I just found the following link that explains how to adjust googlebot for searching Events Calendars.

http://mamboforge.net/forum/forum.p...77&forum_id=296

It addresses selective crawling without going to all the no-event days. This should cut down on a huge google hit on my site. May apply to others with similar event calendars.

August 30th, 2005, 12:48
Hi,
I run a design and hosting service, and several of our clients have suddenly been inundated with Google bots in the last week. These are sites that have been indexed for a couple of years with no problem. One of the sites usually draws 1.2 GB per month but lost 20 GB in 4 days to one bot.
We had to deny the IP of that bot.

I'm guessing Google has changed its software, as I'm seeing more and more reports on this.
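For anyone who wants to check which client is eating the bandwidth before denying an IP, here is a rough Python sketch that adds up bytes sent per client address from a web server access log. It assumes the log is in the Common Log Format (or the combined format, which starts the same way); the sample lines, the addresses and the function name bytes_per_client are all made up for illustration.

```python
import re
from collections import Counter

# Client address, two ignored fields, [timestamp], "request", status, size.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (\d+|-)')


def bytes_per_client(lines):
    """Sum the response sizes per client address; skip unparseable lines."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        client, size = match.groups()
        if size != "-":  # "-" means no body was sent
            totals[client] += int(size)
    return totals


# Invented sample data, not taken from any real log.
sample = [
    '192.0.2.10 - - [30/Aug/2005:12:48:00 +0000] "GET /calendar?day=1 HTTP/1.1" 200 5000',
    '192.0.2.10 - - [30/Aug/2005:12:48:01 +0000] "GET /calendar?day=2 HTTP/1.1" 200 7000',
    '198.51.100.7 - - [30/Aug/2005:12:48:02 +0000] "GET / HTTP/1.1" 200 1200',
    '198.51.100.7 - - [30/Aug/2005:12:48:03 +0000] "GET /x HTTP/1.1" 304 -',
]
totals = bytes_per_client(sample)

if __name__ == "__main__":
    print(totals.most_common(1))  # [('192.0.2.10', 12000)]
```

On a real log you would pass the open file instead of the sample list and look at the top few entries of most_common().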

http://forums.indiegamer.com/showthread.php?t=7882

http://www.michaelklouda.com/Articl...-your-bandwidth

http://forums.digitalpoint.com/showthread.php?t=43552

http://forums.digitalpoint.com/showthread.php?t=26192
http://warpedvisions.org/2005/08/26/microsoft-sucks-bandwidth/

It looks more like a Google BUG... or doesn't it? Some of the messages are from the summer of 2005, and most are from the last 2 months.
Time to complain en masse at the first address, isn't it?
And if not, let them pay for our server!
 
 




____________
jack of all trades, master of none
http://www.mieloma.com/ - http://www.casimedicos.com/ - http://www.egalego.com/ - http://www.casimedicos.com.es/ - http://www.medicosmir.com/
 
Last edited by casimedicos on Fri 22 Sep, 2006 16:21; edited 1 time in total 
casimedicos
Post Re: Spammed by Google - or Google used almost all my bandwit 
 
Hail casimedicos, this is an unusual find... do you have HTML links enabled? I've not seen this problem myself yet, but as I don't have HTML enabled, I wouldn't expect to see it. (Instead I just put a meta tag in an index.html file in root with all the relevant info about my site for search bots; the forum itself is a sideline, so I'm not bothered about having bots search it.)

Also, I just clicked on some of the links in your posts, but they're not working...


Does anyone know if Info>>Google Bot Detector also logs MSN bots, or Googlebot only?
 



 
moreteavicar
Post Re: Spammed by Google - or Google used almost all my bandwit 
 
Sorry, that was a copy-and-paste problem.

The original, in Spanish, is here:
http://www.phpbb-es.com/foro/5-vt6307.html?start=40
I have now fixed the first post.
 




 
casimedicos
Post Re: Spammed by Google - or Google used almost all my bandwit 
 
Useful post... thanks!
 




____________
Luca
SEARCH is the quickest way to get support.
Icy Phoenix ColorizeIt - CustomIcy - HON
 
Mighty Gorgon