Batch-downloading images from a website with Python

Published: 2025/3/14 · python · 36 · 豆豆


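Today, while "accidentally" browsing photos of models, I wandered onto a site that has plenty of them. Annoyingly, each page shows only one or two images, and my home WiFi is slow. So, in true programmer fashion, I wrote a small script to pull the images down and view them locally. Plenty of experts have already built similar tools, but in the spirit of learning and practice I worked through it myself.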

First, a look at the result:

Python code:

# -*- coding:utf8 -*-
# Python 2.7 script (uses urllib2 and the "except Exception, ex" syntax).
import urllib2
import re
import requests
from lxml import etree
import os


def check_save_path(save_path):
    # Create the target folder; ignore the error if it already exists.
    try:
        os.mkdir(save_path)
    except OSError:
        pass


def get_image_name(image_link):
    # Use the last path component of the URL as the local file name.
    file_name = os.path.basename(image_link)
    return file_name


def save_image(image_link, save_path):
    file_name = get_image_name(image_link)
    file_path = os.path.join(save_path, file_name)
    print("Preparing to download %s" % image_link)
    try:
        # Download first, so a failed request does not leave an empty file behind.
        image_data = urllib2.urlopen(url=image_link, timeout=5).read()
        # The original called file_handler.closed(), which is not a method;
        # a with-block closes the file reliably.
        with open(file_path, "wb") as file_handler:
            file_handler.write(image_data)
    except Exception, ex:
        print(ex)


def get_image_link_from_web_page(web_page_link):
    image_link_list = []
    print(web_page_link)
    try:
        html_content = urllib2.urlopen(url=web_page_link, timeout=5).read()
        html_tree = etree.HTML(html_content)
        print(str(html_tree))
        link_list = html_tree.xpath('//p/img/@src')
        for link in link_list:
            # print(link)
            # str.find() returns -1 (truthy) on a miss, so test membership with "in".
            if "uploadfile" in str(link):
                image_link_list.append("http://www.xgyw.cc/" + link)
    except Exception, ex:
        pass
    return image_link_list


def get_page_link_list_from_index_page(base_page_link):
    try:
        html_content = urllib2.urlopen(url=base_page_link, timeout=5).read()
        html_tree = etree.HTML(html_content)
        print(str(html_tree))
        # The pagination block holds the links to every page of the gallery.
        link_tmp_list = html_tree.xpath('//div[@class="page"]/a/@href')
        page_link_list = []
        for link_tmp in link_tmp_list:
            page_link_list.append("http://www.xgyw.cc/" + link_tmp)
        return page_link_list
    except Exception, ex:
        print(ex)
        return []


def get_page_title_from_index_page(base_page_link):
    try:
        html_content = urllib2.urlopen(url=base_page_link, timeout=5).read()
        html_tree = etree.HTML(html_content)
        print(str(html_tree))
        page_title_list = html_tree.xpath('//td/div[@class="title"]')
        page_title_tmp = page_title_list[0].text
        print(page_title_tmp)
        return page_title_tmp
    except Exception, ex:
        print(ex)
        return ""


def get_image_from_web(base_page_link, save_path):
    check_save_path(save_path)
    page_link_list = get_page_link_list_from_index_page(base_page_link)
    for page_link in page_link_list:
        image_link_list = get_image_link_from_web_page(page_link)
        for image_link in image_link_list:
            save_image(image_link, save_path)


base_page_link = "http://www.xgyw.cc/tuigirl/tuigirl1346.html"
page_title = get_page_title_from_index_page(base_page_link)
# Fall back to a generic folder when the title is empty or missing.
if page_title:
    save_path = "N:\\PIC\\" + page_title
else:
    save_path = "N:\\PIC\\other\\"
get_image_from_web(base_page_link, save_path)
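The listing above targets Python 2.7, where urllib2 is available. As a rough sketch of how the download step might look on Python 3, assuming urllib.request stands in for urllib2, the following reworks save_image. The timeout parameter and the returned path are additions for illustration and are not part of the original script.

```python
import os
import urllib.request


def save_image(image_link, save_dir, timeout=5):
    """Download one image into save_dir and return the local file path."""
    os.makedirs(save_dir, exist_ok=True)  # create the folder if it is missing
    file_name = os.path.basename(image_link)  # last URL component as file name
    file_path = os.path.join(save_dir, file_name)
    # Read the whole response first, so a failed request leaves no empty file.
    with urllib.request.urlopen(image_link, timeout=timeout) as response:
        data = response.read()
    with open(file_path, "wb") as file_handler:
        file_handler.write(data)
    return file_path
```

Any exception from the request or the file write propagates to the caller here, so a loop over many image links would probably want to wrap each call in try/except and continue with the next link, as the original script does.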

How the code works:

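The script fetches each page with urllib2.urlopen(url).read(), then parses it with etree.HTML() into an element tree, so that XPath expressions can pull out the values of specific nodes. It walks every page linked from the index page, collects the image URLs to download, and saves each image locally.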
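The link-extraction step can be tried without touching the network. The sketch below assumes Python 3 and uses the standard library's html.parser as a stand-in for lxml's etree.HTML() plus the XPath query //p/img/@src, so it only approximates that query: it accepts any img nested inside a p, not just direct children, and it expects p tags to be closed. The class name, function name and sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ParagraphImageParser(HTMLParser):
    """Collect the src attribute of every <img> found inside a <p>."""

    def __init__(self):
        super().__init__()
        self.p_depth = 0   # how many <p> elements are currently open
        self.sources = []  # src values in document order

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.p_depth += 1
        elif tag == "img" and self.p_depth > 0:
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)

    def handle_endtag(self, tag):
        if tag == "p" and self.p_depth > 0:
            self.p_depth -= 1


def extract_image_links(html_text, base_url, keyword="uploadfile"):
    """Return absolute URLs of in-paragraph images whose path contains keyword."""
    parser = ParagraphImageParser()
    parser.feed(html_text)
    # Membership is tested with "in"; str.find() returns -1 (truthy) on a miss.
    return [urljoin(base_url, src) for src in parser.sources if keyword in src]


# Invented sample page: one matching image, one logo, one image outside <p>.
sample_html = """
<html><body>
<p><img src="/uploadfile/201602/a1.jpg"></p>
<p><img src="/static/logo.png"></p>
<div><img src="/uploadfile/201602/outside.jpg"></div>
</body></html>
"""
links = extract_image_links(sample_html, "http://www.xgyw.cc/")
print(links)  # ['http://www.xgyw.cc/uploadfile/201602/a1.jpg']
```

urljoin builds the absolute URL from the site root and the src value, which avoids doubled or missing slashes that plain string concatenation can produce.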

--=========================================================

Installing Python packages:

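Many Python packages ship no Windows installer, or no x64 build, which makes it hard for beginners to get started quickly. You can use pip or easy_install to install the packages you need. For installation instructions, see: https://pypi.python.org/pypi/setuptools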

I use easy_install. My machine has Python 2.7 installed at C:\Python27\python.exe. Download ez_setup.py, save it to the root of the C: drive, then open cmd and run the following command:

C:\Python27\python.exe "c:\ez_setup.py"

This installs easy_install. When it finishes, easy_install-2.7.exe appears under C:\Python27\Scripts. To install the requests package locally, for example, try the following command:

"C:\Python27\Scripts\easy_install-2.7.exe" requests

--==========================================================
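After installing packages with easy_install or pip as described above, a quick check that the interpreter can actually find them saves some guesswork. This is a minimal sketch assuming Python 3.4 or later, where importlib.util.find_spec is available. The helper name is_installed is made up for illustration, and the two package names are the third-party imports used by the script.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the third-party packages the script imports.
for name in ("requests", "lxml"):
    status = "installed" if is_installed(name) else "missing"
    print("%s: %s" % (name, status))
```

The check only looks the package up and does not import it, so it stays cheap even for large packages.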

As usual, the post closes with a photo: 推女郎 (TuiGirl) issue 68. Anyone who wants the images can search for them on Baidu.

Reposted from: https://www.cnblogs.com/TeyGao/p/5225940.html
