Scraping image elements from a web page and saving them to the local machine

Published: 2023/12/31

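The approach here is built on streams: Java simulates a browser visit to the page to fetch its HTML, jsoup parses the fetched HTML to find the image elements, and stream handling then saves each image to the local machine.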
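The stream-saving step can be looked at on its own. The following is a minimal sketch, assuming Java 7 or later, that swaps the manual read/write loop for `java.nio.file.Files.copy`; the class `StreamSaver` and its `save` method are hypothetical names used only for illustration and are not part of the original post. An in-memory byte array stands in for the network stream, so it runs without a connection.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of the original post.
class StreamSaver {
    // Copies every byte from `in` into `target`, creating missing parent
    // directories first. Returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the image data read from the network.
        byte[] fakeImage = {1, 2, 3, 4, 5};
        Path target = Files.createTempDirectory("picture")
                .resolve("demo")
                .resolve("demo.jpg");
        long written = save(new ByteArrayInputStream(fakeImage), target);
        System.out.println(written + " bytes written to " + target);
    }
}
```

In a real download, the `InputStream` would come from `HttpURLConnection.getInputStream()` and should be closed by the caller, for example with try-with-resources.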

```java
package getpicture;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class getPicture {
    public static void main(String[] args) {
        new Thread(new Spider()).start();
    }
}

// Fetches the page and extracts the image addresses.
class Spider implements Runnable {
    private String firstUrl = "http://jandan.net/ooxx/page-"; // e.g. page-1111#comments
    private String connUrl = "#comments";
    private int beginIndex = 1115;

    public Spider() {
    }

    @Override
    public void run() {
        try {
            URL newURL = new URL(firstUrl + beginIndex + connUrl);
            HttpURLConnection conn = (HttpURLConnection) newURL.openConnection();
            conn.setRequestProperty("Connection", "keep-alive");
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36");
            // A plain GET has no request body, so setDoOutput(true) is not
            // called: it would turn the request into a POST.

            // Only parse the page when the request succeeded.
            if (conn.getResponseCode() != 200) {
                return;
            }

            // Read the page HTML. A StringBuilder avoids the "null" prefix that
            // appending to an uninitialized String field would produce.
            StringBuilder preHtml = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "utf-8"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    preHtml.append(line).append('\n');
                }
            }
            System.out.println(preHtml);

            // Parse the HTML and start one download thread per image element.
            Document doc = Jsoup.parse(preHtml.toString());
            Elements elements = doc.select(".row img");
            for (Element e : elements) {
                String imgSrc = e.attr("src");
                new Thread(new DownloadImage(imgSrc)).start();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// Downloads one image into a directory named after today's date.
class DownloadImage implements Runnable {
    private String imageSrc;
    private String imageName;

    public DownloadImage(String imageSrc) {
        this.imageSrc = imageSrc;
    }

    @Override
    public void run() {
        String[] splits = imageSrc.split("/");
        imageName = splits[splits.length - 1];
        SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
        // Use the formatted date, not the formatter object, as the directory name.
        String random = sdf.format(new Date());
        File file = new File("E:\\picture\\" + random + "\\" + imageName);
        // Create the parent directory if it does not exist.
        // FileOutputStream creates the file itself.
        if (!file.getParentFile().exists()) {
            file.getParentFile().mkdirs();
        }
        System.out.println("Starting download: " + imageName);
        try {
            // Protocol-relative sources ("//host/path") need a scheme.
            URL newURL = new URL(imageSrc.startsWith("//") ? "http:" + imageSrc : imageSrc);
            HttpURLConnection conn = (HttpURLConnection) newURL.openConnection();
            conn.setRequestProperty("Connection", "keep-alive");
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36");
            byte[] data = new byte[1024];
            // Read the image bytes from the input stream and write them to the
            // file; both streams are closed even if the copy fails.
            try (InputStream inputStream = conn.getInputStream();
                 FileOutputStream fos = new FileOutputStream(file)) {
                int len;
                while ((len = inputStream.read(data)) != -1) {
                    // Write len bytes of the buffer, starting at offset 0.
                    fos.write(data, 0, len);
                }
            }
            System.out.println("Download finished: " + imageName);
        } catch (Exception e) {
            System.err.println("Could not download this image: " + imageName);
        }
    }
}
```
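The listing builds the download address by putting `http:` in front of the `src` attribute, which only fits protocol-relative addresses such as `//host/path`. As a hedged alternative, the sketch below resolves the `src` against the page address with `java.net.URI`, so absolute, protocol-relative, and relative values are all handled the same way. The class `ImageUrls`, its methods, and the `example.com` hosts are hypothetical and chosen for illustration only.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of the original post.
class ImageUrls {
    // Resolves an <img> src (absolute, protocol-relative, or relative)
    // against the address of the page it was found on.
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    // Returns the last path segment of the address, ignoring any query string.
    static String fileNameOf(String imageUrl) {
        String path = URI.create(imageUrl).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String page = "http://jandan.net/ooxx/page-1115";
        String[] sources = {
            "//img.example.com/a/b.jpg",     // protocol-relative
            "img/c.png",                     // relative to the page
            "https://cdn.example.com/d.gif"  // already absolute
        };
        for (String src : sources) {
            String absolute = resolve(page, src);
            System.out.println(absolute + " -> " + fileNameOf(absolute));
        }
    }
}
```

`URI.create` throws `IllegalArgumentException` for addresses with illegal characters such as unescaped spaces, so a real crawler would need to catch that and skip the image.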

Reposted from: https://www.cnblogs.com/feitianshaoxai/p/6595381.html
