
Scraping Dynamic Web Pages with a Python Crawler (complete code included; corrections welcome!)

Published: 2024/4/13

System environment:

OS: Windows 8.1 Professional, 64-bit. Python: Anaconda, Python 2.7. Python modules: requests, random, json.

Background:

For a static page, we can simply pass the URL from the browser's address bar to a GET request and get the page's data. Often, though, a GET request to that URL returns no data, "View page source" shows none either, and clicking through to page 2, page 3 and so on does not change the URL in the address bar at all. Such pages are dynamic pages, for example:

Self-regulatory measures - National Equities Exchange and Quotations (NEEQ): www.neeq.com.cn

Solution:

The key to scraping a dynamic page is to first work out how the page requests its data and handles pagination, and only then write code. Below, using the page above as an example, we walk through scraping a dynamic page with Python.

1. Analyze how the page requests data and paginates:

As shown in the figure above, open the page, press F12 to bring up Chrome DevTools, and click "Network" -> XHR (sometimes JS). Then click "2" in the pagination bar to jump to the second page: a new request appears at the bottom of the request list on the left (the blue row in the lower-left figure). Click it, and the right-hand pane shows that request's header information. From Headers we learn that Request URL is the URL the page actually requests, Request Method tells us it is a POST request, and Request Headers lists the headers the request needs. Since this is a POST request, scroll further down the Headers pane to see what data it submits.

From the Form Data in the figure above, the POST request submits two key fields: disclosureType and page. That completes the analysis of how this dynamic page requests data and paginates; next we write code to scrape it.

2. Coding:

# -*- coding: utf-8 -*-
"""
Created on Tue May 01 18:52:49 2018
@author: gmn
"""
import requests  # HTTP library
import random    # for picking a random UA / proxy
import json      # for parsing the returned JSON

# =============================================================================
# Anti-crawler countermeasures
# =============================================================================
# User-Agent list (easy to find online), used to disguise the browser UA.
# Note: the original listing was missing the comma after the first entry,
# which silently concatenated the first two strings; fixed here.
USER_AGENTS = [
    "Mozilla/5.0 (Windows; U; Windows NT 5.2) Gecko/2008070208 Firefox/3.0.1",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)",
    "Opera/9.80 (Windows NT 5.1; U; zh-cn) Presto/2.9.168 Version/11.50",
    "Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0",
    "Mozilla/5.0 (Windows NT 5.2) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.122 Safari/534.30",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.11 TaoBrowser/2.0 Safari/536.11",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.71 Safari/537.1 LBBROWSER",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; LBBROWSER)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SV1; QQDownload 732; .NET4.0C; .NET4.0E; 360SE)",
    "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.84 Safari/535.11 SE 2.X MetaSr 1.0",
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)",
    "Mozilla/5.0 (Windows; U; Windows NT 5.2) Gecko/2008070208 Firefox/3.0.1",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20070309 Firefox/2.0.0.3",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20070803 Firefox/1.5.0.12",
]
# Proxy IP list
IP_AGENTS = [
    "http://58.240.53.196:8080",
    "http://219.135.99.185:8088",
    "http://117.127.0.198:8080",
    "http://58.240.53.194:8080",
]
# Pick a random HTTP proxy
proxies = {"http": random.choice(IP_AGENTS)}
# =============================================================================
# Everything above is anti-crawler camouflage, unrelated to this specific page
# =============================================================================

# =============================================================================
# Everything below follows from the step-1 analysis: set the Cookie, url,
# headers and POST parameters from the lower-right part of the DevTools figure
# =============================================================================
# Cookie (copy your own from DevTools; this one will have expired)
Cookie = "Hm_lvt_b58fe8237d8d72ce286e1dbd2fc8308c=1525162758; BIGipServerNEEQ_8000-NEW=83952564.16415.0000; JSESSIONID=E50D2B8270D728502754D4330CB0E275; Hm_lpvt_b58fe8237d8d72ce286e1dbd2fc8308c=1525165761"
# URL of the dynamic data endpoint
url = 'http://www.neeq.com.cn/disclosureInfoController/infoResult.do?callback=jQuery18307528463705200819_1525173495230'
# Headers for the requests call
headers = {
    'User-agent': random.choice(USER_AGENTS),  # disguise the browser UA
    'Cookie': Cookie,
    'Connection': 'keep-alive',
    'Accept': 'text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Host': 'www.neeq.com.cn',
    'Referer': 'http://www.neeq.com.cn/disclosure/supervise.html',
}
# Page index
pageIndex = 0
# Parameters for the POST request
data = {'page': pageIndex, 'disclosureType': 8}

# Send the POST request
req = requests.post(url, data=data, headers=headers, proxies=proxies)
print(req.content)
# Printing req.content shows the POST returns JSON, delivered as a string
# (under Python 3, req.content is bytes; use req.text for a str)
#str_data = req.content
## extract the JSON string
#str_json = str_data[8:-2]
#print(str_json)
## parse the JSON into a dict
#json_Info = json.loads(str_json)

Running it produces the following output:

We can see that req.content is JSON data, but it is preceded by "jQuery18307528463705200819_1525173495230([" and followed by "])", so we need to strip those two parts and keep only the JSON in between. Notice that "jQuery18307528463705200819_1525173495230" is exactly the value of the url's "callback" parameter, so to get rid of the long digit string after "jQuery" we can simply change the "callback" value to "jQuery" (any other value works too), and the url becomes:

http://www.neeq.com.cn/disclosureInfoController/infoResult.do?callback=jQuery
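As an aside, instead of renaming the callback you can strip any JSONP wrapper programmatically, so the url can stay exactly as DevTools showed it. A small sketch (the helper name and the sample payload are illustrative, not part of the original code):

```python
import json
import re

def unwrap_jsonp(payload):
    """Strip a JSONP wrapper such as callbackName(...) and parse the JSON inside."""
    match = re.match(r'^[\w$.]+\((.*)\)\s*;?\s*$', payload, re.S)
    if match is None:
        raise ValueError("not a JSONP payload")
    return json.loads(match.group(1))

# Works for any callback name, including the long auto-generated one:
wrapped = 'jQuery18307528463705200819_1525173495230([{"page": 0}])'
print(unwrap_jsonp(wrapped))  # [{'page': 0}]
```

This is more robust than fixed-offset slicing, which silently breaks if the callback name length ever changes.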

Running the code again, we get:

We also find that req.content is a string, so we can use:

# extract the JSON string
str_json = str_data[8:-2]

to keep just the JSON data in the middle. The code now looks like this:

# -*- coding: utf-8 -*-
"""
Created on Tue May 01 18:52:49 2018
@author: gmn
"""
import requests
import random
import json

# Anti-crawler setup, identical to the first listing
# (the USER_AGENTS list is abbreviated here; use the full list from above)
USER_AGENTS = [
    "Mozilla/5.0 (Windows; U; Windows NT 5.2) Gecko/2008070208 Firefox/3.0.1",
    # ... remaining entries as in the first listing ...
    "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20070803 Firefox/1.5.0.12",
]
IP_AGENTS = [
    "http://58.240.53.196:8080",
    "http://219.135.99.185:8088",
    "http://117.127.0.198:8080",
    "http://58.240.53.194:8080",
]
proxies = {"http": random.choice(IP_AGENTS)}

# Cookie, url, headers and POST parameters from the step-1 analysis
Cookie = "Hm_lvt_b58fe8237d8d72ce286e1dbd2fc8308c=1525162758; BIGipServerNEEQ_8000-NEW=83952564.16415.0000; JSESSIONID=E50D2B8270D728502754D4330CB0E275; Hm_lpvt_b58fe8237d8d72ce286e1dbd2fc8308c=1525165761"
# URL of the dynamic data endpoint, with the callback value shortened to "jQuery"
url = 'http://www.neeq.com.cn/disclosureInfoController/infoResult.do?callback=jQuery'
headers = {
    'User-agent': random.choice(USER_AGENTS),  # disguise the browser UA
    'Cookie': Cookie,
    'Connection': 'keep-alive',
    'Accept': 'text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Host': 'www.neeq.com.cn',
    'Referer': 'http://www.neeq.com.cn/disclosure/supervise.html',
}
pageIndex = 0  # page index
data = {'page': pageIndex, 'disclosureType': 8}  # POST parameters

# Send the POST request
req = requests.post(url, data=data, headers=headers, proxies=proxies)
#print(req.content)
# Get the string containing the JSON data
str_data = req.content
# Strip the 'jQuery([' prefix and '])' suffix
str_json = str_data[8:-2]
print(str_json)
# Parse the JSON into a dict
#json_Info = json.loads(str_json)
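One caveat: the listing above targets Python 2, where req.content is a str. Under Python 3, req.content is bytes and must be decoded before slicing and json.loads. A minimal offline sketch with a simulated payload (no network needed):

```python
import json

# Simulated response body; in real code this would be req.content (Python 3: bytes).
raw = b'jQuery([{"page": 0}])'
text = raw.decode("utf-8")   # or simply use req.text, which decodes for you
str_json = text[8:-2]        # strip 'jQuery([' (8 chars) and '])' (2 chars)
print(str_json)              # {"page": 0}
json_Info = json.loads(str_json)
print(json_Info["page"])     # 0
```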

Running it produces:

Copy the printed str_json into an online JSON formatter to study the structure of the data:

From the figure on the right we can see the layout of the JSON. Next, convert str_json into a dict:

# parse the JSON into a dict
json_Info = json.loads(str_json)

From there, ordinary dict operations give us whatever page data we need.

The complete code:

# -*- coding: utf-8 -*-
"""
Created on Tue May 01 18:52:49 2018
@author: gmn
"""
import requests
import random
import json

# Anti-crawler setup, identical to the first listing
# (the USER_AGENTS list is abbreviated here; use the full list from above)
USER_AGENTS = [
    "Mozilla/5.0 (Windows; U; Windows NT 5.2) Gecko/2008070208 Firefox/3.0.1",
    # ... remaining entries as in the first listing ...
    "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20070803 Firefox/1.5.0.12",
]
IP_AGENTS = [
    "http://58.240.53.196:8080",
    "http://219.135.99.185:8088",
    "http://117.127.0.198:8080",
    "http://58.240.53.194:8080",
]
proxies = {"http": random.choice(IP_AGENTS)}

# Cookie, url, headers and POST parameters from the step-1 analysis
Cookie = "Hm_lvt_b58fe8237d8d72ce286e1dbd2fc8308c=1525162758; BIGipServerNEEQ_8000-NEW=83952564.16415.0000; JSESSIONID=E50D2B8270D728502754D4330CB0E275; Hm_lpvt_b58fe8237d8d72ce286e1dbd2fc8308c=1525165761"
url = 'http://www.neeq.com.cn/disclosureInfoController/infoResult.do?callback=jQuery'
headers = {
    'User-agent': random.choice(USER_AGENTS),  # disguise the browser UA
    'Cookie': Cookie,
    'Connection': 'keep-alive',
    'Accept': 'text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Host': 'www.neeq.com.cn',
    'Referer': 'http://www.neeq.com.cn/disclosure/supervise.html',
}
pageIndex = 0  # page index
data = {'page': pageIndex, 'disclosureType': 8}  # POST parameters

# Send the POST request
req = requests.post(url, data=data, headers=headers, proxies=proxies)
#print(req.content)
# Get the string containing the JSON data (under Python 3, use req.text)
str_data = req.content
# Strip the 'jQuery([' prefix and '])' suffix
str_json = str_data[8:-2]
#print(str_json)
# Parse the JSON into a dict
json_Info = json.loads(str_json)

Caveats:

Sometimes, even after following the steps above, we still cannot pin down the URL that serves the data. In that case, consider scraping the dynamic page with selenium plus a browser driver (e.g. chromedriver) instead; it is considerably slower, but it works.
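A minimal sketch of that fallback (assumes selenium and chromedriver are installed; clicking the pagination link by its link text is an assumption about the page's markup, not verified here):

```python
def fetch_neeq_page(page_number):
    """Render the page in a real browser and return its HTML after paginating."""
    # Imported lazily so the function can be defined without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()          # requires chromedriver on PATH
    try:
        driver.implicitly_wait(10)       # give the JS-rendered content time to appear
        driver.get("http://www.neeq.com.cn/disclosure/supervise.html")
        if page_number > 1:
            # click the pagination link labelled with the page number
            driver.find_element(By.LINK_TEXT, str(page_number)).click()
        return driver.page_source
    finally:
        driver.quit()
```

This sidesteps the request analysis entirely at the cost of running a full browser per fetch.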
