Googlebot (Google Web search)
My guess: during domain name resolution, the first of the Google crawlers to arrive is Googlebot (Google Web search).
+-----+----------------+---------------------+-------------------------+------------------+
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 119.147.32.253 | -- :: | Unidentified User Agent | |
| | 183.57.53.197 | -- :: | Mozilla 5.0 | iOS |
| | 123.56.233.103 | -- :: | Unidentified User Agent | |
| | 112.90.142.207 | -- :: | Firefox 3.0 | Windows XP |
| | 183.232.120.37 | -- :: | Firefox 3.0 | Windows XP |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | Safari 534.30 | Android |
| | 117.136.40.218 | -- :: | Safari 534.30 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 55.0.2883.87 | Windows |
| | 177.193.53.212 | -- :: | Googlebot | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 139.162.108.53 | -- :: | Chrome 50.0.2661.102 | Windows |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 61.142.176.19 | -- :: | Firefox 3.6. | Windows |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 61.142.176.20 | -- :: | Unidentified User Agent | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 125.39.207.33 | -- :: | Unidentified User Agent | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 183.60.48.110 | -- :: | Unidentified User Agent | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 101.226.51.229 | -- :: | Chrome 45.0.2454.101 | Windows XP |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
+-----+----------------+---------------------+-------------------------+------------------+
https://support.google.com/webmasters/answer/1061943?hl=en
Google crawlers
"Crawler" is a generic term for any program (such as a robot or spider) used to automatically discover and scan websites by following links from one webpage to another. Google's main crawler is called Googlebot. This table lists information about the common Google crawlers you may see in your referrer logs, and how they should be specified in robots.txt, the robots meta tags, and the X-Robots-Tag HTTP directives.
| Crawler | User agent token | Full user agent string (as seen in website log files) |
|---|---|---|
| Googlebot (Google Web search) | Googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) or (rarely used): Googlebot/2.1 (+http://www.google.com/bot.html) |
| Googlebot News | Googlebot-News (Googlebot) | Googlebot-News |
| Googlebot Images | Googlebot-Image (Googlebot) | Googlebot-Image/1.0 |
| Googlebot Video | Googlebot-Video (Googlebot) | Googlebot-Video/1.0 |
| Google Smartphone | Googlebot | |
| Google Mobile AdSense | Mediapartners-Google or | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) |
| Google AdSense | Mediapartners-Google, Mediapartners (Googlebot) | Mediapartners-Google |
| Google AdsBot landing page quality check | AdsBot-Google | AdsBot-Google (+http://www.google.com/adsbot.html) |
| Google app crawler (used to fetch resources for mobile apps; obeys AdsBot-Google robots rules) | AdsBot-Google-Mobile-Apps | AdsBot-Google-Mobile-Apps |
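To make the table concrete, here is a minimal sketch of how a log line's user agent string could be mapped back to one of the tokens listed above. The function name `crawler_token` and the `CRAWLER_TOKENS` list are hypothetical helpers written for this post, not anything Google provides, and a matching string only shows what the client claims to be, since user agent strings are easy to forge.

```python
# Tokens taken from the table above. Longer, more specific tokens come first
# so that "Googlebot-Image/1.0" is not reported as plain "Googlebot".
CRAWLER_TOKENS = [
    "AdsBot-Google-Mobile-Apps",
    "AdsBot-Google",
    "Mediapartners-Google",
    "Googlebot-News",
    "Googlebot-Image",
    "Googlebot-Video",
    "Googlebot",
]


def crawler_token(user_agent):
    """Return the first table token found in the user agent string, or None.

    This is a plain case-insensitive substring check (an assumption of this
    sketch); it does not verify that the request really came from Google.
    """
    ua = user_agent.lower()
    for token in CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Googlebot-Image/1.0",
        "AdsBot-Google (+http://www.google.com/adsbot.html)",
        "Mozilla/5.0 (Windows NT 5.1; rv:3.0) Firefox/3.0",
    ]
    for sample in samples:
        print(crawler_token(sample), "<-", sample)
```

Ordering the list from most to least specific keeps the lookup a simple loop; a row such as the `Googlebot` entry in the access log above would map to the `Googlebot` token.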
robots.txt
Where several user-agents are recognized in the robots.txt file, Google will follow the most specific. If you want all of Google to be able to crawl your pages, you don't need a robots.txt file at all. If you want to block or allow all of Google's crawlers from accessing some of your content, you can do this by specifying Googlebot as the user-agent. For example, if you want all your pages to appear in Google search, and if you want AdSense ads to appear on your pages, you don't need a robots.txt file. Similarly, if you want to block some pages from Google altogether, blocking the user-agent Googlebot will also block all Google's other user-agents.
But if you want more fine-grained control, you can get more specific. For example, you might want all your pages to appear in Google Search, but you don't want images in your personal directory to be crawled. In this case, use robots.txt to disallow the user-agent Googlebot-Image from crawling the files in your /personal directory (while allowing Googlebot to crawl all files), like this:
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal
To take another example, say that you want ads on all your pages, but you don't want those pages to appear in Google Search. Here, you'd block Googlebot, but allow Mediapartners-Google, like this:
User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
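The "most specific user-agent wins" rule behind the two robots.txt examples can be sketched in a few lines of Python. This is a simplified illustration, not Google's actual matcher: `parse_groups` and `is_allowed` are hypothetical helpers, only `User-agent` and `Disallow` lines are handled (no `Allow`, no wildcards), and the caller is assumed to pass the crawler's tokens in order of preference, for example `["Googlebot-Image", "Googlebot"]` as suggested by the table.

```python
def parse_groups(robots_txt):
    """Parse robots.txt text into {lowercase user-agent: [disallowed prefixes]}."""
    groups = {}
    current = []       # user-agent tokens of the group being read
    seen_rule = False  # True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a rule line closed the previous group
                current, seen_rule = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            seen_rule = True
            if value:  # an empty Disallow blocks nothing
                for token in current:
                    groups[token].append(value)
    return groups


def is_allowed(robots_txt, tokens, path):
    """Check path against the first (most specific) token that has a group.

    Falls back to the "*" group, and allows everything if no group applies.
    """
    groups = parse_groups(robots_txt)
    rules = groups.get("*", [])
    for token in tokens:
        if token.lower() in groups:
            rules = groups[token.lower()]
            break
    return not any(path.startswith(prefix) for prefix in rules)


if __name__ == "__main__":
    images_blocked = (
        "User-agent: Googlebot\nDisallow:\n\n"
        "User-agent: Googlebot-Image\nDisallow: /personal\n"
    )
    ads_only = (
        "User-agent: Googlebot\nDisallow: /\n\n"
        "User-agent: Mediapartners-Google\nDisallow:\n"
    )
    print(is_allowed(images_blocked, ["Googlebot-Image", "Googlebot"], "/personal/cat.jpg"))
    print(is_allowed(images_blocked, ["Googlebot"], "/personal/cat.jpg"))
    print(is_allowed(ads_only, ["Googlebot"], "/index.html"))
    print(is_allowed(ads_only, ["Mediapartners-Google", "Googlebot"], "/index.html"))
```

Passing the token list explicitly keeps the fallback order visible: if a file only has a `Googlebot` group, a call with `["Googlebot-Image", "Googlebot"]` falls through to it, which mirrors the statement that blocking Googlebot also blocks the more specific crawlers.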
robots meta tag
Some pages use multiple robots meta tags to specify directives for different crawlers, like this:
<meta name="robots" content="nofollow"><meta name="googlebot" content="noindex">
In this case, Google will use the sum of the negative directives, and Googlebot will follow both the noindex and nofollow directives. Google's help pages give more detailed information about controlling how Google crawls and indexes your site.
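The "sum of the negative directives" behaviour can be illustrated with Python's standard `html.parser`. This is only a sketch under stated assumptions: `RobotsMetaCollector` and `effective_directives` are hypothetical names, the combination is modelled as a plain set union of the generic `robots` tag and the crawler-specific tag, and positive directives such as `index` or `follow` are not treated specially.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the content of every <meta name=... content=...> tag."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # lowercase meta name -> set of directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name:
            tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
            self.directives.setdefault(name, set()).update(tokens)


def effective_directives(html, crawler="googlebot"):
    """Union of the generic "robots" directives and the crawler's own tag."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    generic = collector.directives.get("robots", set())
    specific = collector.directives.get(crawler.lower(), set())
    return generic | specific


if __name__ == "__main__":
    page = ('<meta name="robots" content="nofollow">'
            '<meta name="googlebot" content="noindex">')
    print(sorted(effective_directives(page)))
```

For the two tags shown above, the union contains both `nofollow` and `noindex`, matching the description that Googlebot follows both.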