Scraping image elements from a web page and saving them to the local computer
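The approach here is stream-based: Java simulates a browser visit to fetch the page's HTML, jsoup parses the fetched HTML to locate the image elements, and stream I/O then saves each image to the local machine.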
package getpicture;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class getPicture {
    public static void main(String[] args) {
        new Thread(new Spider()).start();
    }
}

// Fetches the page and extracts the image addresses
class Spider implements Runnable {
    private String firstUrl = "http://jandan.net/ooxx/page-"; // e.g. page-1111#comments
    private String connUrl = "#comments";
    private int beginIndex = 1115;
    // private String testPath = "http://www.mop.com/#";

    public Spider() {
    }

    @Override
    public void run() {
        try {
            URL newURL = new URL(firstUrl + beginIndex + connUrl);
            // URL newURL = new URL(testPath);
            HttpURLConnection conn = (HttpURLConnection) newURL.openConnection();
            conn.setRequestProperty("Connection", "keep-alive");
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36");
            // Input only: setDoOutput(true) would turn this GET into a POST
            conn.setDoInput(true);

            // Parse the page only when the request succeeded
            if (conn.getResponseCode() != 200) {
                System.err.println("Unexpected response code: " + conn.getResponseCode());
                return;
            }

            // Read the page's HTML; a StringBuilder avoids the leading "null"
            // that "preHtml += line" produces on an uninitialized String field
            StringBuilder preHtml = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "utf-8"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    preHtml.append(line);
                }
            }
            System.out.println(preHtml);

            // Select the image elements and download each one on its own thread
            Document doc = Jsoup.parse(preHtml.toString());
            Elements elements = doc.select(".row img");
            for (Element e : elements) {
                String imgSrc = e.attr("src");
                new Thread(new DownloadImage(imgSrc)).start();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class DownloadImage implements Runnable {
    private String imageSrc;
    private String imageName;

    public DownloadImage(String imageSrc) {
        this.imageSrc = imageSrc;
    }

    @Override
    public void run() {
        String[] splits = imageSrc.split("/");
        imageName = splits[splits.length - 1];

        // Save into a folder named after today's date
        // (the original concatenated the SimpleDateFormat object itself, not the formatted date)
        SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
        String dateDir = sdf.format(new Date());
        File file = new File("E:\\picture\\" + dateDir + "\\" + imageName);

        // Create the target directory if it does not exist;
        // FileOutputStream below creates the file itself
        if (!file.getParentFile().exists()) {
            file.getParentFile().mkdirs();
        }

        System.out.println("Starting download: " + imageName);
        try {
            // The page uses protocol-relative src values ("//host/path"); add the scheme
            URL newURL = new URL(imageSrc.startsWith("//") ? "http:" + imageSrc : imageSrc);
            HttpURLConnection conn = (HttpURLConnection) newURL.openConnection();
            conn.setRequestProperty("Connection", "keep-alive");
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36");
            conn.setDoInput(true);

            // Read the image bytes from the input stream and write them to the file
            try (InputStream inputStream = conn.getInputStream();
                 FileOutputStream fos = new FileOutputStream(file)) {
                byte[] data = new byte[1024];
                int len;
                // read() fills the buffer and returns the byte count, or -1 at end of stream
                while ((len = inputStream.read(data)) != -1) {
                    // Write len bytes of the buffer, starting at offset 0
                    fos.write(data, 0, len);
                }
                fos.flush();
            }
            System.out.println("Download finished: " + imageName);
        } catch (Exception e) {
            System.err.println("Could not download image: " + imageName);
        }
    }
}
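The listing builds each download URL from the raw src attribute and takes the file name from its last segment. As a rough sketch of a more general way to do those two steps with the JDK alone, the src can be resolved against the page URL with java.net.URI, which covers absolute, protocol-relative and relative values, and the bytes can be written with java.nio.file.Files.copy. The class ImageUrls, its method names and the host img.example.com are hypothetical and do not appear in the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of the original listing.
class ImageUrls {

    // Resolve an img src (absolute, protocol-relative "//host/x", or relative)
    // against the URL of the page it was found on.
    // Note: URI.create throws IllegalArgumentException for malformed input,
    // for example a src that contains unescaped spaces.
    static URI resolve(String pageUrl, String imgSrc) {
        return URI.create(pageUrl).resolve(imgSrc.trim());
    }

    // Use the last path segment as the local file name (query string excluded).
    static String fileName(URI imageUri) {
        String path = imageUri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Copy the stream into targetDir/name, creating the directory if needed
    // and overwriting any existing file of the same name.
    static Path save(InputStream in, Path targetDir, String name) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(name);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) {
        String page = "http://jandan.net/ooxx/page-1115#comments";
        URI image = resolve(page, "//img.example.com/large/a.jpg");
        System.out.println(image);
        System.out.println(fileName(image));
        System.out.println(resolve(page, "thumbs/b.png"));
    }
}
```

In DownloadImage, a call such as ImageUrls.save(conn.getInputStream(), dir, name), with dir being a java.nio.file.Path for the dated folder, could stand in for the manual buffer loop. Whether that is preferable depends on how much control over buffering and progress reporting is wanted.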
Reposted from: https://www.cnblogs.com/feitianshaoxai/p/6595381.html