[Python Crawler] Fetching Top-Level Domains and Their WHOIS Servers (whois-servers.txt Download)
Use a Python crawler to fetch the top-level domains and their corresponding WHOIS servers, and save them as whois-servers.txt, a file usable by WhoisCL.exe.
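The crawler below relies on the IANA root zone database marking each TLD with a span element of class "domain tld". A minimal sketch of that extraction step, run against a hand-made HTML fragment (the fragment only mimics the relevant structure; it is not IANA's actual markup):

```python
from bs4 import BeautifulSoup

# Hand-made fragment mimicking the structure the crawler relies on;
# the real page at https://www.iana.org/domains/root/db is far larger.
html = """
<table>
  <tr><td><span class="domain tld"><a href="/domains/root/db/com.html">.com</a></span></td></tr>
  <tr><td><span class="domain tld"><a href="/domains/root/db/org.html">.org</a></span></td></tr>
</table>
"""
soup = BeautifulSoup(html, 'html.parser')
# class_='domain tld' matches the exact class string used on the IANA page.
tlds = [tag.get_text() for tag in soup.find_all('span', class_='domain tld')]
print(tlds)  # ['.com', '.org']
```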
環(huán)境:
from bs4 import BeautifulSoup
iurl = 'https://www.iana.org/domains/root/db'
res = requests.get(iurl, timeout=600)
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text, 'html.parser')
list1 = []
list2 = []
jsonStr = {}
for tag in soup.find_all('span', class_='domain tld'):
d_suffix = tag.get_text()
print(d_suffix)獲取頂級域名及對應(yīng)的 WHOIS Server 并保存可用于 WhoisCL.exe 的文件 whois-servers.txtimport requests
from bs4 import BeautifulSoup
import re
import time
iurl = 'https://www.iana.org/domains/root/db'
res = requests.get(iurl, timeout=600)
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text, 'html.parser')
list1 = []
list2 = []
jsonStr = {}
for tag in soup.find_all('span', class_='domain tld'):
d_suffix = tag.get_text()
print(d_suffix)
list2.append(d_suffix)
n_suffix = d_suffix.split('.')[1]
new_url = iurl + '/' + n_suffix
server = ''
try:
res2 = requests.get(new_url, timeout=600)
res2.encoding = 'utf-8'
soup2 = BeautifulSoup(res2.text, 'html.parser') retxt = re.compile(r'WHOIS Server: (.*?)\n') arr = retxt.findall(res2.text) if len(arr) > 0: server = arr[0] list2.append(server) print(server) time.sleep(1) except Exception as e: print('超時(shí)') with open('whois-servers.txt', "a", encoding='utf-8') as my_file: my_file.write(n_suffix + " " + server+'\n')
print('抓取結(jié)束')
環(huán)境:
Windows 10
Python 3.9.1
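Each TLD's detail page lists the registry's WHOIS server as a plain-text line, which the script captures with a regular expression rather than via the parsed HTML. A small stand-alone check of that pattern and of the output-line format (the page excerpt is illustrative; whois.verisign-grs.com is the registry WHOIS server for .com):

```python
import re

# Illustrative excerpt of a TLD detail page; the real pages contain
# a line of exactly this "WHOIS Server: <hostname>" form.
page_text = ('Registry Information\n'
             'WHOIS Server: whois.verisign-grs.com\n'
             'URL for registration services: http://www.verisigninc.com\n')
retxt = re.compile(r'WHOIS Server: (.*?)\n')
arr = retxt.findall(page_text)
server = arr[0] if arr else ''
# One line per TLD, "<suffix> <server>", the format written to whois-servers.txt.
line = 'com' + ' ' + server + '\n'
print(line, end='')  # com whois.verisign-grs.com
```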
To download the generated whois-servers.txt, reply "whois-servers.txt" in the official account.