Python URL concatenation: writing a Python crawler to download original images

Published: 2023/12/2 · python · 豆豆


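Environment: Python 3.6 + PyCharm

"Prey": http://www.polayoutu.com (for learning purposes only)

Motive 1: I want to grab some large photos (large in size, not in raciness) to feast my eyes on, nothing more.

Motive 2: I am learning to write Python crawlers and want to put that into practice.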

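I. Analyzing the target site:

1. Find the URL:

The photos are published in numbered issues. Scrolling the page down to issue 140, we see the browser request this URL:

"http://www.polaxiong.com/collections/get_entries_by_collection_id/140?{}"

2. A bold guess: can the "?{}" at the end of the URL be dropped? Let's try it and enter "http://www.polaxiong.com/collections/get_entries_by_collection_id/140" directly. The response is a JSON string (I am rather fond of JSON strings, because they convert to and from dictionaries so easily).

3. Look, a lead: the value of the data field holds entries numbered 0-11. Expanding one of them shows a file description, which is exactly the caption of the image on the page. Better still, the value of the "full_res" field is the URL of the original image we are after.

4. Open the full_res URL and check the image size: 2.9M. That has to be the original; no site serves 2.9M thumbnails.

5. What remains is to convert the JSON string into a dictionary and read out the value of full_res for each entry.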
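As a minimal sketch of step 5, the snippet below parses a JSON string shaped like the response described above and collects the full_res values. Only the data list and the full_res key come from the article; the sample payload, its example.com URLs, the description key, and the helper name extract_full_res are invented for illustration, and the real response contains more fields.

```python
import json

# Hypothetical payload: only the "data" list and the "full_res" key come from
# the article; the URLs and the "description" key are invented for the example.
SAMPLE_RESPONSE = '''
{
  "data": [
    {"description": "first photo", "full_res": "http://example.com/uploads/140/1/full.jpg"},
    {"description": "second photo", "full_res": "http://example.com/uploads/140/2/full.jpg"}
  ]
}
'''


def extract_full_res(json_text):
    """Return the full_res URL of every entry in the payload's data list."""
    payload = json.loads(json_text)    # JSON string -> Python dict
    entries = payload.get("data", [])  # tolerate a missing data field
    return [entry["full_res"] for entry in entries if "full_res" in entry]


urls = extract_full_res(SAMPLE_RESPONSE)
print(urls)
```

Using payload.get("data", []) rather than payload["data"] means an issue that returns no data field yields an empty list instead of raising KeyError.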

II. The complete crawler code:

#!/bin/python
# coding: utf-8
# This Python crawler is for learning purposes only.
import json
import os
import urllib.request

import requests

path = r'C:/IM/pic/'  # download directory; check_dir() creates it if it is missing
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1'}


def check_dir(path):
    if not os.path.exists(path):
        os.makedirs(path)
        print("Directory created, starting download...")
    else:
        print("Directory already exists, starting download...")


# Download one image file
def save_img(url, path, name):
    try:
        urllib.request.urlretrieve(url, "%s%s" % (path, name))
    except Exception as e:
        # try/except catches failures such as a missing target directory
        print(e)


def get_img(url, header, path):
    try:
        js_content = requests.get(url, headers=header).content  # send the request
        js_dict = json.loads(js_content)  # convert the JSON string to a Python dict
        new_dict = js_dict['data']  # value of the data field
        for i in range(len(new_dict)):
            file_url = new_dict[i]['full_res']  # URL of the original image
            # Build the file name: "issue_imagenumber.jpg"
            file_name = file_url.split('/')[4] + '_' + file_url.split('/')[5] + '.jpg'
            save_img(file_url, path, file_name)  # download the image and save it locally
            print("%s: downloaded" % file_name)
    except Exception as e:
        print(e)


if __name__ == '__main__':
    check_dir(path)
    for num in range(10, 100):  # loops over issues 10 through 99
        # Build the URL by appending the issue number
        url = 'http://www.polaxiong.com/collections/get_entries_by_collection_id/' + str(num)
        get_img(url, header, path)
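The two string operations the crawler relies on, appending the issue number to the endpoint and deriving a file name from the image URL, can be isolated and checked without touching the network. The following is a sketch under stated assumptions, not the author's code: the helper names build_collection_url and file_name_from_url are hypothetical, and the example.com image URL is invented, since the article does not show what a real full_res URL looks like.

```python
from urllib.parse import urlsplit

# Endpoint taken from the article; the issue number is appended to it.
BASE_URL = "http://www.polaxiong.com/collections/get_entries_by_collection_id/"


def build_collection_url(issue):
    """Concatenate the endpoint and an issue number into a request URL."""
    return BASE_URL + str(issue)


def file_name_from_url(file_url):
    """Build "issue_imagenumber.jpg" from the 2nd and 3rd path segments.

    For a well-formed URL these are the same pieces as split('/')[4] and
    split('/')[5] in the crawler, but a URL with too few path segments
    raises a clear error instead of an IndexError.
    """
    segments = [s for s in urlsplit(file_url).path.split("/") if s]
    if len(segments) < 3:
        raise ValueError("unexpected image URL layout: %s" % file_url)
    return "%s_%s.jpg" % (segments[1], segments[2])


# range(10, 100) yields 10..99; range(10, 101) would include issue 100.
issue_urls = [build_collection_url(n) for n in range(10, 101)]
print(issue_urls[0])
# Hypothetical image URL, used only to exercise the helper.
print(file_name_from_url("http://example.com/uploads/140/3/full.jpg"))
```

Parsing the path with urlsplit keeps the scheme and host out of the index arithmetic, so the file name no longer depends on how many slashes precede the path.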

III. Let's look at the result:

2 images total 19M, so what we downloaded really are the originals.
