
Improving a Website Resource Detection Tool (Adding Proxy Support)


This post adjusts the website resource detection tool and adds proxy support, so that repeated requests do not get the source IP banned.
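For context, requests routes traffic through a proxy when it is given a proxies dict mapping a URL scheme to a proxy address. A minimal sketch of the mechanism the tool relies on (the proxy address and target URL below are placeholders, not real servers):

import requests

# Placeholder proxy address, for illustration only
proxies = {'http': 'http://203.0.113.10:8080'}
r = requests.get('http://example.com/admin/', proxies=proxies, timeout=5)
print(r.status_code)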

# Fetch proxies from the proxy-list page and store them in agent_lists
# (requests and BeautifulSoup are imported at the top of the full script below)
def agent_list(url):
    global agent_lists
    agent_lists = []
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0'}
    r = requests.get(url, headers=header)
    # Parse the proxy table and skip the header row
    agent_info = BeautifulSoup(r.content, 'html.parser').find(id="ip_list").find_all('tr')[1:]
    for i in range(len(agent_info)):
        info = agent_info[i].find_all('td')
        # Row layout: cell 1 = IP, cell 2 = port, cell 5 = protocol.
        # requests matches lowercase scheme keys, and a proxy address
        # needs its port, so both are fixed here.
        agents = {info[5].string.lower(): 'http://' + info[1].string + ':' + info[2].string}
        agent_lists.append(agents)
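Each entry appended to agent_lists has the shape {'http': 'http://ip:port'}, so one entry drawn at random can be passed straight to requests. A hedged usage sketch (the proxy-list URL is the one the original post targets, and the page may no longer be reachable):

import random
import requests

agent_list('http://www.xicidaili.com/nn/1')   # populate agent_lists
proxy = random.sample(agent_lists, 1)[0]      # one {scheme: address} dict
r = requests.get('http://example.com/', proxies=proxy, timeout=5)
print(r.status_code)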

The modified website resource scanner:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# by 默不知然 2018-03-15

import threading
from bs4 import BeautifulSoup
import time
import sys
import requests
import getopt
import random

# Thread class that requests one candidate URL through a random proxy
class scan_thread(threading.Thread):
    global real_url_list
    real_url_list = []

    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url

    def run(self):
        try:
            # Pick one proxy at random for each request
            proxy = random.sample(agent_lists, 1)[0]
            r = requests.get(self.url, proxies=proxy)
            print(self.url, '------->>', str(r.status_code))
            if r.status_code == 200:
                real_url_list.append(self.url)
        except Exception as e:
            print(e)
        finally:
            # Always free the thread slot, even when the request fails
            l[0] = l[0] - 1

# Read the dictionary file, build candidate URLs, and spawn scan threads
def url_makeup(dicts, url, threshold):
    global url_list
    global l
    url_list = []
    l = [0]  # running-thread counter shared with scan_thread
    dic = str(dicts)
    with open(dic, 'r') as f:
        code_list = f.readlines()
    for i in code_list:
        url_list.append(url + i.replace('\r', '').replace('\n', ''))
    while len(url_list):
        try:
            if l[0] < threshold:
                n = url_list.pop(0)
                l[0] = l[0] + 1
                thread = scan_thread(n)
                thread.start()
        except KeyboardInterrupt:
            print('User interrupted the program; directory scan stopped.')
            sys.exit()

# Parse command-line arguments
def get_args():
    global get_url
    global get_dicts
    global get_threshold
    try:
        options, args = getopt.getopt(sys.argv[1:], "u:f:n:")
    except getopt.GetoptError:
        print("Invalid arguments")
        sys.exit()
    for option, arg in options:
        if option == '-u':
            get_url = arg
        if option == '-f':
            get_dicts = arg
        if option == '-n':
            get_threshold = int(arg)

# Fetch proxies from the proxy-list page and store them in agent_lists
def agent_list(url):
    global agent_lists
    agent_lists = []
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0'}
    r = requests.get(url, headers=header)
    # Parse the proxy table and skip the header row
    agent_info = BeautifulSoup(r.content, 'html.parser').find(id="ip_list").find_all('tr')[1:]
    for i in range(len(agent_info)):
        info = agent_info[i].find_all('td')
        # Row layout: cell 1 = IP, cell 2 = port, cell 5 = protocol
        agents = {info[5].string.lower(): 'http://' + info[1].string + ':' + info[2].string}
        agent_lists.append(agents)

# Main: fetch proxies, parse arguments, run the scan, save results
def main():
    agent_url = 'http://www.xicidaili.com/nn/1'
    agent_list(agent_url)
    get_args()
    url = get_url
    dicts = get_dicts
    threshold = get_threshold
    url_makeup(dicts, url, threshold)
    # Wait for the remaining scan threads to finish before reporting
    while l[0] > 0:
        time.sleep(0.5)
    print('Directories found on the target site:', '\n', real_url_list)
    with open(r'scan_results.txt', 'w') as f:
        for i in real_url_list:
            f.write(i)
            f.write('\n')

if __name__ == '__main__':
    main()
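Assuming the script is saved as scan.py (a hypothetical filename) and dict.txt is a wordlist with one path per line, a typical run looks like this; -u sets the base URL, -f the dictionary file, and -n the concurrent-thread threshold:

python3 scan.py -u http://www.example.com/ -f dict.txt -n 20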


Scan results for a target site: [screenshot omitted]

Reposted from: https://www.cnblogs.com/kaiho/p/8578848.html
