Batch-reading URLs from a text file, keeping the ones that respond normally, and recording their IPs

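# coding=utf-8
import requests

# Probe 20150602<i>.html for i = 3000..4998 and print every page that exists.
for i in range(3000, 4999):
    url = 'http://192.168.88.139:8888/20150602' + str(i) + '.html'
    try:
        r = requests.get(url, timeout=5)
    except requests.RequestException:
        # Skip URLs that cannot be reached instead of aborting the scan.
        continue
    if r.status_code == 200:
        print(url)
        print(r.text)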

Original post

http://zone.wooyun.org/content/20885
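
The range scan above mixes URL construction with the network call. Below is a minimal sketch (Python 3) that separates the two, so the 200 filter can be exercised without a server. The names `candidate_urls`, `live_urls`, and the `get_status` callable are hypothetical and introduced only for illustration; they are not part of the original post.

```python
def candidate_urls(prefix, start, stop, suffix=".html"):
    """Yield prefix + n + suffix for every n in range(start, stop)."""
    for n in range(start, stop):
        yield "{}{}{}".format(prefix, n, suffix)


def live_urls(urls, get_status):
    """Yield the URLs for which get_status(url) returns 200.

    get_status is any callable that takes a URL and returns an HTTP
    status code, or raises an exception on network failure.
    """
    for url in urls:
        try:
            if get_status(url) == 200:
                yield url
        except Exception:
            # Treat an unreachable host the same as a non-200 answer.
            continue
```

With requests installed, `get_status` could be `lambda u: requests.get(u, timeout=5).status_code`, and the scan becomes `live_urls(candidate_urls('http://192.168.88.139:8888/20150602', 3000, 4999), get_status)`.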

Multithreading + reading the text file line by line + resolving each URL's host to an IP + writing the results

# -*- coding: utf-8 -*-
import logging
import queue
import socket
import threading
import time
from urllib.parse import urlsplit

import requests

print("start:" + time.strftime("%H:%M:%S"))

logging.basicConfig(
    level=logging.WARNING,
    format="[%(asctime)s] %(message)s"
)

# Serializes appends to the result files across worker threads.
write_lock = threading.Lock()


def append_line(path, line):
    with write_lock:
        with open(path, 'a') as f:
            f.write(line + '\n')


class BatchThreads(threading.Thread):
    def __init__(self, url_queue):
        super(BatchThreads, self).__init__()
        self.queue = url_queue

    def run(self):
        while True:
            # get_nowait() avoids blocking forever when another thread
            # takes the last item between an empty() check and get().
            try:
                tempurl = self.queue.get_nowait()
            except queue.Empty:
                break
            url = 'http://' + tempurl
            try:
                r = requests.get(url, timeout=5)
                if r.status_code == 200:
                    print(url + ' access OK: 200')
                    # Resolve only the host part, so entries that carry
                    # a port or a path can still be looked up.
                    ip = socket.gethostbyname(urlsplit(url).hostname)
                    append_line('yes.txt', url + '        ' + ip)
            except Exception:
                # Any request or DNS failure marks the URL as not accessible.
                logging.warning("%s error", url)
                append_line('noaccess.txt', url)


def batch_queue(url_queue, thread_number):
    with open('url-hz.txt') as f:
        urls = [line.strip() for line in f]
    # Drop blank lines and '#' comments, and de-duplicate.
    urls = set(url for url in urls if url and not url.startswith("#"))
    if not urls:
        return
    for url in urls:
        url_queue.put(url)
    # Never start more threads than there are URLs to check.
    thread_number = min(thread_number, url_queue.qsize())
    threads = [BatchThreads(url_queue) for _ in range(thread_number)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


thread_number = 20
batch_queue(queue.Queue(), thread_number)

print("end:" + time.strftime("%H:%M:%S"))
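
`socket.gethostbyname` expects a bare host name, so a line that carries a port or a path needs its host part extracted before the lookup. Below is a minimal sketch of that step in isolation, using `urllib.parse` from the Python 3 standard library. `host_of` and `resolve` are hypothetical helper names introduced here, not part of the original script.

```python
import socket
from urllib.parse import urlsplit


def host_of(entry):
    """Return the bare hostname from a line such as 'example.com:8080/path'."""
    # urlsplit only recognises the host when a scheme separator is present,
    # so prepend one if the line has none.
    if "://" not in entry:
        entry = "http://" + entry
    return urlsplit(entry).hostname


def resolve(entry):
    """Return the IPv4 address for the entry's host, or None if lookup fails."""
    host = host_of(entry)
    if not host:
        return None
    try:
        return socket.gethostbyname(host)
    except socket.gaierror:
        return None
```

An entry that is already an IPv4 address, such as `192.168.88.139:8888`, is returned unchanged by `socket.gethostbyname`, so the same helper covers both domain names and raw addresses.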
