Scraping web pages in Java with Jsoup (beginner tutorial)
1. Import the dependencies
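<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.3</version>
</dependency>
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <!-- no version is given here; Maven needs one unless a parent POM or BOM manages it -->
</dependency>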
2. Write the demo class
Be careful not to import the wrong package: the Document and Element classes used here are the ones under org.jsoup.nodes.
package com.taotao.entity;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;
/**
* Author: TaoTao 2019/9/26
*/
public class intefaceTest {
    public static void main(String[] args) throws IOException {
        String content;
        // try-with-resources closes the response and the client even if the request fails
        try (CloseableHttpClient httpClient = HttpClients.createDefault()) { // create the HttpClient
            HttpGet httpGet = new HttpGet("http://www.cnblogs.com/"); // create the HttpGet instance
            try (CloseableHttpResponse response = httpClient.execute(httpGet)) { // execute the GET request
                HttpEntity entity = response.getEntity(); // get the response entity
                content = EntityUtils.toString(entity, "utf-8"); // page content as a string
            }
        }
        Document doc = Jsoup.parse(content); // parse the page into a Document object
        Elements elements = doc.getElementsByTag("title"); // all elements whose tag is title
        if (!elements.isEmpty()) {
            Element element = elements.get(0); // take the first element
            String title = element.text(); // text() returns plain text; html() returns the inner HTML
            System.out.println("Page title: " + title);
        }
        Element element1 = doc.getElementById("site_nav_top"); // element with id=site_nav_top, or null if absent
        if (element1 != null) {
            String str = element1.text();
            System.out.println("str:" + str);
        }
    }
}
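The demo above depends on a live page, so its output changes whenever the site does. As a rough offline sketch of the same parsing calls, the example below runs Jsoup against a small inline HTML string. The class name JsoupParseSketch, the helpers textById and postLinks, the sample markup, and the example.com base URI are all assumptions made for illustration and are not part of the original tutorial. It shows getElementById returning null for a missing id, the CSS-selector method select, and absUrl resolving relative links against the base URI passed to Jsoup.parse.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class JsoupParseSketch {

    // Hypothetical sample markup standing in for a downloaded page.
    static final String SAMPLE_HTML =
            "<html><head><title>Demo page</title></head><body>"
            + "<div id=\"site_nav_top\">Code <b>changes</b> the world</div>"
            + "<a class=\"post\" href=\"/p/1.html\">First post</a>"
            + "<a class=\"post\" href=\"/p/2.html\">Second post</a>"
            + "</body></html>";

    // Returns the text of the element with the given id, or "" when no such element exists.
    static String textById(Document doc, String id) {
        Element element = doc.getElementById(id); // null when the id is absent
        return element == null ? "" : element.text();
    }

    // Collects the absolute URLs of all <a class="post"> links using a CSS selector.
    static List<String> postLinks(Document doc) {
        List<String> links = new ArrayList<>();
        for (Element link : doc.select("a.post[href]")) {
            links.add(link.absUrl("href")); // resolved against the base URI given to parse()
        }
        return links;
    }

    public static void main(String[] args) {
        // The second argument is the base URI used to resolve relative links.
        Document doc = Jsoup.parse(SAMPLE_HTML, "http://example.com/");
        System.out.println("title: " + doc.title());
        System.out.println("text(): " + textById(doc, "site_nav_top"));
        System.out.println("html(): " + doc.getElementById("site_nav_top").html());
        System.out.println("missing id is empty: " + textById(doc, "no_such_id").isEmpty());
        System.out.println("links: " + postLinks(doc));
    }
}
```

Returning an empty string for a missing id is only one possible choice; it keeps the caller free of null checks, at the cost of not distinguishing an absent element from an empty one.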
Source: https://www.cnblogs.com/book-mountain/p/11595018.html