ROBOTS.TXT Blocking Notes, Code, and Examples

Notes on the robots.txt blocking rules used on my own site, along with some code and examples:

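Admin directory: for security, use a two-level admin path such as /a/xxxx/ and block only /a/ for spiders. This keeps the real admin path out of robots.txt while still stopping spiders from crawling the admin directory.

Cache: stop spiders from crawling static cache files.

Downloads: stop spiders from crawling the download directory; if the directory is unused, delete it.

Editors: stop spiders from crawling the editor directory, which also keeps it from being discovered and becoming a security risk.

Email: stop spiders from crawling static email templates.

Other pages: block pages that have no indexing value.

Images: stop spiders from crawling any image type other than JPG/jpg files.

Core file directory: stop spiders from crawling includes and its subdirectories directly (functions, class libraries, models, templates, and so on).

Media directory: stop spiders from crawling the directory of playable media; if it is unused, delete it.

Pages with extra parameters: stop spiders from crawling pages that carry query parameters.

RAR, ZIP, and GZ file types.

Block useless and malicious spiders.

Specify the location of sitemap.xml.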

Directory blocking:

User-agent: *
Disallow: /a/
Disallow: /cache/
Disallow: /download/
Disallow: /editors/
Disallow: /email/
Disallow: /extras/
Disallow: /images/
Disallow: /includes/
Disallow: /media/
Disallow: /pub/
Disallow: /nddbc.html
Disallow: /page_not_found.php
Disallow: /login.html
Disallow: /privacy.html
Disallow: /conditions.html
Disallow: /contact_us.html
Disallow: /gv_faq.html
Disallow: /discount_coupon.html
Disallow: /unsubscribe.html
Disallow: /shopping_cart.html
Disallow: /ask_a_question.html
Disallow: /popup_image_additional.html
Disallow: /product_reviews_write.html
Disallow: /tell_a_friend.html
Disallow: /pages-popup_image.html
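
As a quick sanity check, directory rules like these can be tested with Python's standard-library urllib.robotparser. This is only a minimal sketch: RULES is an abbreviated copy of the group above, the host www.xxx.jp is the placeholder used in the Sitemap line, and the helper name allowed and the test paths /includes/functions/ and /index.html are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the directory rules. There is no blank line inside the
# group, because this parser treats a blank line as the end of a group.
RULES = """\
User-agent: *
Disallow: /a/
Disallow: /cache/
Disallow: /includes/
Disallow: /login.html
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under RULES (hypothetical helper)."""
    return parser.can_fetch(agent, "http://www.xxx.jp" + path)


# /a/xxxx/ is blocked by the /a/ prefix without the full admin path being listed.
for path in ("/a/xxxx/", "/includes/functions/", "/login.html", "/index.html"):
    print(path, "allowed" if allowed(path) else "blocked")
```

Rules in this parser are plain prefix matches, so blocking /a/ also covers everything beneath it.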

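Block spiders from crawling non-JPG images (product images are restricted to JPG). Path patterns must start with / or *, so each extension rule is written as /*.ext$. Note that a crawler obeys only the most specific group that matches it, so Googlebot follows this group and ignores the User-agent: * rules; repeat here any directory rules that Googlebot should also obey.

User-agent: Googlebot
Allow: /*.jpg$
Disallow: /*.jpeg$
Disallow: /*.gif$
Disallow: /*.png$
Disallow: /*.bmp$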
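
Python's urllib.robotparser only does prefix matching and does not interpret * or $, so extension rules cannot be checked with it. Below is a rough sketch of wildcard matching in the style described by RFC 9309 (* matches any run of characters, $ anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie). The function names pattern_to_regex and is_allowed are hypothetical, the sample paths are made up, and real crawlers may differ in details.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """rules is a list of (directive, pattern); the longest matching pattern wins."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


IMAGE_RULES = [
    ("Allow", "/*.jpg$"),
    ("Disallow", "/*.jpeg$"),
    ("Disallow", "/*.gif$"),
    ("Disallow", "/*.png$"),
    ("Disallow", "/*.bmp$"),
]

for path in ("/images/p1.jpg", "/images/p1.png", "/images/p1.png?v=2"):
    print(path, "allowed" if is_allowed(IMAGE_RULES, path) else "blocked")
```

The third path shows what the $ anchor does: once a query string is appended, the URL no longer ends in .png, so the rule stops applying.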

Block spiders from crawling compressed files

User-agent: *
Disallow: /*.zip$
Disallow: /*.rar$
Disallow: /*.gz$
Disallow: /*.tar$

Specify the sitemap location

Sitemap: http://www.xxx.jp/sitemap.xml
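
To confirm that a parser actually picks up the Sitemap line, the standard-library RobotFileParser exposes site_maps() (available from Python 3.8). A small sketch, assuming a cut-down robots.txt held in the illustrative variable ROBOTS_TXT:

```python
from urllib.robotparser import RobotFileParser

# Cut-down robots.txt for illustration; the Sitemap line sits outside any group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cache/

Sitemap: http://www.xxx.jp/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of Sitemap URLs, or None when none are declared.
sitemaps = parser.site_maps()
print(sitemaps)
```

The Sitemap line is independent of the User-agent groups, so it can be placed anywhere in the file.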

Blocking other useless and malicious spiders:

User-Agent: almaden
Disallow: /

User-Agent: ASPSeek
Disallow: /

User-Agent: Axmo
Disallow: /

User-Agent: BaiduSpider
Disallow: /

User-Agent: booch
Disallow: /

User-Agent: DTS Agent
Disallow: /

User-Agent: Downloader
Disallow: /

User-Agent: EmailCollector
Disallow: /

User-Agent: EmailSiphon
Disallow: /

User-Agent: EmailWolf
Disallow: /

User-Agent: Expired Domain Sleuth
Disallow: /

User-Agent: Franklin Locator
Disallow: /

User-Agent: Gaisbot
Disallow: /

User-Agent: grub
Disallow: /

User-Agent: HughCrawler
Disallow: /

User-Agent: iaea.org
Disallow: /

User-Agent: lcabotAccept
Disallow: /

User-Agent: IconSurf
Disallow: /

User-Agent: Iltrovatore-Setaccio
Disallow: /

User-Agent: Indy Library
Disallow: /

User-Agent: IUPUI
Disallow: /

User-Agent: Kittiecentral
Disallow: /

User-Agent: larbin
Disallow: /

User-Agent: lwp-trivial
Disallow: /

User-Agent: MetaTagRobot
Disallow: /

User-Agent: Missigua Locator
Disallow: /

User-Agent: NetResearchServer
Disallow: /

User-Agent: NextGenSearch
Disallow: /

User-Agent: NPbot
Disallow: /

User-Agent: Nutch
Disallow: /

User-Agent: ObjectsSearch
Disallow: /

User-Agent: Oracle Ultra Search
Disallow: /

User-Agent: PEERbot
Disallow: /

User-Agent: PictureOfInternet
Disallow: /

User-Agent: PlantyNet
Disallow: /

User-Agent: QuepasaCreep
Disallow: /

User-Agent: ScSpider
Disallow: /

User-Agent: SOFT411
Disallow: /

User-Agent: spider.acont.de
Disallow: /

User-Agent: Sqworm
Disallow: /

User-Agent: SSM Agent
Disallow: /

User-Agent: TAMU
Disallow: /

User-Agent: TheUsefulbot
Disallow: /

User-Agent: TurnitinBot
Disallow: /

User-Agent: Tutorial Crawler
Disallow: /

User-Agent: TutorGig
Disallow: /

User-Agent: WebCopier
Disallow: /

User-Agent: WebZIP
Disallow: /

User-Agent: ZipppBot
Disallow: /

User-Agent: Xenu
Disallow: /

User-Agent: Wotbox
Disallow: /

User-Agent: Wget
Disallow: /

User-Agent: NaverBot
Disallow: /

User-Agent: mozDex
Disallow: /

User-Agent: Sosospider
Disallow: /
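
A long block list like this is easy to mistype by hand, so it may be simpler to generate it from a list of names and then verify the result. The sketch below assumes a hypothetical helper build_block_rules and uses only three names from the list above; the check again relies on the standard-library urllib.robotparser, and www.xxx.jp is the placeholder host from the Sitemap line.

```python
from urllib.robotparser import RobotFileParser

# A few names taken from the list above; extend as needed.
BLOCKED_AGENTS = ["EmailCollector", "WebCopier", "Wget"]


def build_block_rules(agents):
    """Return robots.txt text with one 'Disallow: /' group per agent."""
    groups = [f"User-agent: {name}\nDisallow: /" for name in agents]
    # Groups are separated by a blank line; none appears inside a group.
    return "\n\n".join(groups) + "\n"


robots_txt = build_block_rules(BLOCKED_AGENTS)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A listed agent is blocked everywhere; an unlisted one is unaffected.
for agent in ("Wget", "Googlebot"):
    ok = parser.can_fetch(agent, "http://www.xxx.jp/index.html")
    print(agent, "allowed" if ok else "blocked")
```

Keep in mind that robots.txt is advisory: it only stops spiders that choose to honor it.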
