Googlebot (Google Web search)
My guess: "During domain name resolution, the first of the Google crawlers to show up is Googlebot (Google Web search)."
+-----+----------------+---------------------+-------------------------+------------------+
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 119.147.32.253 | -- :: | Unidentified User Agent | |
| | 183.57.53.197 | -- :: | Mozilla 5.0 | iOS |
| | 123.56.233.103 | -- :: | Unidentified User Agent | |
| | 112.90.142.207 | -- :: | Firefox 3.0 | Windows XP |
| | 183.232.120.37 | -- :: | Firefox 3.0 | Windows XP |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | ZTE | Android |
| | 117.136.40.218 | -- :: | Safari 534.30 | Android |
| | 117.136.40.218 | -- :: | Safari 534.30 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 37.0.0.0 | Android |
| | 117.136.40.218 | -- :: | Chrome 55.0.2883.87 | Windows |
| | 177.193.53.212 | -- :: | Googlebot | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 139.162.108.53 | -- :: | Chrome 50.0.2661.102 | Windows |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 61.142.176.19 | -- :: | Firefox 3.6. | Windows |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 61.142.176.20 | -- :: | Unidentified User Agent | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 23.251.63.45 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 125.39.207.33 | -- :: | Unidentified User Agent | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 183.60.48.110 | -- :: | Unidentified User Agent | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 101.226.51.229 | -- :: | Chrome 45.0.2454.101 | Windows XP |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
+-----+----------------+---------------------+-------------------------+------------------+
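A log table like this is easier to read once it is tallied. Below is a minimal Python sketch, assuming the rows are available as text in the same pipe-delimited layout and that the columns are, in order, an id, the IP, a timestamp, the user agent, and the platform. The function name `tally` is hypothetical, and the sample holds only three rows copied from the table.

```python
from collections import Counter

def tally(table_text):
    """Count requests per IP and per user agent in a pipe-delimited log table."""
    by_ip, by_agent = Counter(), Counter()
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) < 4:
            continue  # not a data row
        # Assumed column order: id, IP, timestamp, user agent, platform.
        by_ip[cells[1]] += 1
        by_agent[cells[3]] += 1
    return by_ip, by_agent

SAMPLE = """\
+-----+----------------+---------------------+-------------------------+------------------+
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
| | 177.193.53.212 | -- :: | Googlebot | Unknown Platform |
| | 111.251.93.170 | -- :: | Unidentified User Agent | |
+-----+----------------+---------------------+-------------------------+------------------+
"""

by_ip, by_agent = tally(SAMPLE)
print(by_ip.most_common(1))    # the busiest IP in the sample
print(by_agent["Googlebot"])   # how many rows claim to be Googlebot
```

Run over the full table, a count like this makes it easy to see how few of the requests actually identify themselves as Googlebot.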
https://support.google.com/webmasters/answer/1061943?hl=en
Google crawlers
"Crawler" is a generic term for any program (such as a robot or spider) used to automatically discover and scan websites by following links from one webpage to another. Google's main crawler is called Googlebot. This table lists information about the common Google crawlers you may see in your referrer logs, and how they should be specified in robots.txt, the robots meta tags, and the X-Robots-Tag HTTP directives.
| Crawler | User agent token | Full user agent string (as seen in website log files) |
|---|---|---|
| Googlebot (Google Web search) | Googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) or (rarely used): Googlebot/2.1 (+http://www.google.com/bot.html) |
| Googlebot News | Googlebot-News (Googlebot) | Googlebot-News |
| Googlebot Images | Googlebot-Image (Googlebot) | Googlebot-Image/1.0 |
| Googlebot Video | Googlebot-Video (Googlebot) | Googlebot-Video/1.0 |
| Google Smartphone | Googlebot | |
| Google Mobile AdSense | Mediapartners-Google or | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) |
| Google AdSense | Mediapartners-Google, Mediapartners (Googlebot) | Mediapartners-Google |
| Google AdsBot landing page quality check | AdsBot-Google | AdsBot-Google (+http://www.google.com/adsbot.html) |
| Google app crawler (used to fetch resources for mobile apps; obeys AdsBot-Google robots rules) | AdsBot-Google-Mobile-Apps | AdsBot-Google-Mobile-Apps |
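To match log entries against the tokens listed above, something like the following Python sketch could work. It is only an illustration: the function name `crawler_token` and the list order are my own choices, and the check is a plain substring test on a string the client supplies, so it shows what a request claims to be, not what it is.

```python
# Tokens copied from the table above. Order matters: several tokens share
# a prefix, so the more specific ones are tried first.
CRAWLER_TOKENS = [
    "AdsBot-Google-Mobile-Apps",
    "AdsBot-Google",
    "Mediapartners-Google",
    "Googlebot-News",
    "Googlebot-Image",
    "Googlebot-Video",
    "Googlebot",
]

def crawler_token(user_agent):
    """Return the first (most specific) token found in the string, else None."""
    lowered = user_agent.lower()
    for token in CRAWLER_TOKENS:
        if token.lower() in lowered:
            return token
    return None

print(crawler_token("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(crawler_token("Googlebot-Image/1.0"))
print(crawler_token("AdsBot-Google (+http://www.google.com/adsbot.html)"))
```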
robots.txt
Where several user-agents are recognized in the robots.txt file, Google will follow the most specific. If you want all of Google to be able to crawl your pages, you don't need a robots.txt file at all. If you want to block or allow all of Google's crawlers from accessing some of your content, you can do this by specifying Googlebot as the user-agent. For example, if you want all your pages to appear in Google search, and if you want AdSense ads to appear on your pages, you don't need a robots.txt file. Similarly, if you want to block some pages from Google altogether, blocking the user-agent Googlebot will also block all Google's other user-agents.
But if you want more fine-grained control, you can get more specific. For example, you might want all your pages to appear in Google Search, but you don't want images in your personal directory to be crawled. In this case, use robots.txt to disallow the user-agent Googlebot-Image from crawling the files in your /personal directory (while allowing Googlebot to crawl all files), like this:
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal
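The "most specific user-agent wins" rule can be illustrated with a small Python sketch. This is a simplified model and not Google's actual matcher: it assumes that "most specific" means the longest user-agent token that is a prefix of the crawler's name, and it handles only `Disallow` path prefixes (no `Allow` lines, no wildcards). The function names `parse_groups` and `is_allowed` and the path `/personal/cat.jpg` are hypothetical.

```python
def parse_groups(text):
    """Parse robots.txt text into a list of (agents, disallow_prefixes) groups."""
    groups = []
    agents, rules, seen_rule = [], [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a rule line closed the previous group
                groups.append((agents, rules))
                agents, rules, seen_rule = [], [], False
            agents.append(value.lower())
        elif field == "disallow":
            seen_rule = True
            if value:  # an empty Disallow blocks nothing
                rules.append(value)
    if agents:
        groups.append((agents, rules))
    return groups

def is_allowed(text, crawler, path):
    """Apply only the group whose token is the longest match for the crawler."""
    token = crawler.lower()
    best_rules, best_len = [], -1
    for agents, rules in parse_groups(text):
        for agent in agents:
            matches = agent == "*" or token.startswith(agent)
            length = 0 if agent == "*" else len(agent)
            if matches and length > best_len:
                best_rules, best_len = rules, length
    return not any(path.startswith(prefix) for prefix in best_rules)

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal
"""

print(is_allowed(ROBOTS_TXT, "Googlebot", "/personal/cat.jpg"))        # True
print(is_allowed(ROBOTS_TXT, "Googlebot-Image", "/personal/cat.jpg"))  # False
```

Picking the longest matching token, rather than the first group in file order, is what lets the Googlebot-Image group override the more general Googlebot group here.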
To take another example, say that you want ads on all your pages, but you don't want those pages to appear in Google Search. Here, you'd block Googlebot, but allow Mediapartners-Google, like this:
User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
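These rules can be checked locally with Python's standard `urllib.robotparser` module. The sketch below is only a sanity check: the path `/any-page.html` is hypothetical, and the standard-library parser applies the first group that matches in file order, which is enough for this example but is not the same as Google's most-specific matching.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "/any-page.html" is a hypothetical path used only for illustration.
results = {
    agent: parser.can_fetch(agent, "/any-page.html")
    for agent in ("Googlebot", "Mediapartners-Google")
}
print(results)
```

With these rules, the check reports Googlebot as blocked and Mediapartners-Google as allowed.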
robots meta tag
Some pages use multiple robots meta tags to specify directives for different crawlers, like this:
<meta name="robots" content="nofollow"><meta name="googlebot" content="noindex">
In this case, Google will use the sum of the negative directives, and Googlebot will follow both the noindex and nofollow directives.
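The combination of directives can be sketched in Python with the standard `html.parser` module. This is a simplified model, assuming that "sum of the negative directives" can be approximated as the union of everything listed in the generic `robots` tag and in the crawler-specific tag. The class name `RobotsMetaCollector` is hypothetical.

```python
from html.parser import HTMLParser

class RobotsMetaCollector(HTMLParser):
    """Collect directives from robots meta tags that apply to one crawler."""

    def __init__(self, crawler="googlebot"):
        super().__init__()
        # The generic "robots" tag applies to every crawler.
        self.names = {"robots", crawler.lower()}
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.names:
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

collector = RobotsMetaCollector()
collector.feed(
    '<meta name="robots" content="nofollow">'
    '<meta name="googlebot" content="noindex">'
)
print(sorted(collector.directives))  # ['nofollow', 'noindex']
```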