Scraping the first page of Anjuke listings with Python's BeautifulSoup library
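Task:
The URL is https://beijing.anjuke.com/sale/.
Use the BeautifulSoup library to scrape the information on page 1. Specifically: enter each listing's page, scrape the community name, the reference budget, the publication time, and the core selling points, and print them. (I have only just started learning web scraping. If there are mistakes, please point them out.)
The code is as follows: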
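The "enter each listing's page" step implies first collecting every listing's detail-page URL from the list page. Below is a minimal sketch of that step, run against a small inline HTML sample so it works offline. The sample markup, the `a.property-ex` selector, and the helper name `collect_detail_links` are assumptions made for illustration, not a confirmed description of Anjuke's page structure.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical list-page fragment; the real Anjuke markup may differ.
SAMPLE_LIST_HTML = """
<div class="property"><a class="property-ex" href="/prop/view/A1">Listing one</a></div>
<div class="property"><a class="property-ex" href="https://beijing.anjuke.com/prop/view/A2">Listing two</a></div>
<div class="property"><a class="property-ex">No link here</a></div>
"""


def collect_detail_links(html, base_url, selector="a.property-ex"):
    """Return absolute detail-page URLs for every matching anchor that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.select(selector):
        href = anchor.get("href")
        if href:  # skip anchors that carry no link
            links.append(urljoin(base_url, href))  # resolve relative paths
    return links


detail_links = collect_detail_links(SAMPLE_LIST_HTML, "https://beijing.anjuke.com/sale/")
for link in detail_links:
    print(link)
```

Each returned URL could then be fetched with `requests.get(link, headers=headers)` and parsed in the same way as the list page, ideally with a short pause between requests.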
import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with the request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36 Edg/94.0.992.50'
}
info_lists = []

# Fetch and parse the first page of listings.
house = requests.get("https://beijing.anjuke.com/sale/", headers=headers)
soup = BeautifulSoup(house.text, "lxml")

# Each select() call returns the matching elements for one field, in page order.
names = soup.select("h3")
positions = soup.select("p.property-content-info-comm-name")
moneys = soup.select("div.property-price > p.property-price-total > span.property-price-total-num")
years = soup.select("div.property-content > div.property-content-detail > section > div:nth-of-type(1) > p:nth-of-type(5)")
points = soup.select("div.property-content > div.property-content-detail > section > div:nth-of-type(3)")

# Combine the five lists position by position into one dict per listing.
for name, position, money, year, point in zip(names, positions, moneys, years, points):
    info = {
        'name': name.get_text().strip(),
        'position': position.get_text().strip(),
        'money': money.get_text().strip(),
        'year': year.get_text().strip(),
        'point': point.get_text().strip(),
    }
    info_lists.append(info)

# Append one line per listing. Opening the file once as UTF-8 avoids the
# UnicodeEncodeError that the platform default encoding can raise, and the
# with-block closes the file even if a write fails.
with open(r'C:\Users\23993\Desktop\house_info.txt', 'a+', encoding='utf-8') as f:
    for info_list in info_lists:
        f.write(info_list["name"] + ' ' + info_list["position"] + ' ' + info_list["money"] + '萬' + ' ' + info_list["year"] + ' ' + info_list["point"] + '\n')
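One thing to watch in the code above: `zip` pairs the five result lists purely by position and stops at the shortest one, so if a single listing lacks a field, later values can end up attached to the wrong listing. A hedged alternative is to select each listing's container first and look up the fields inside it. The sketch below uses an inline HTML sample; the markup, the `div.property` container selector, and the names `text_or_empty` and `parse_cards` are assumptions for illustration rather than the site's confirmed structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup loosely modeled on the selectors used above.
# The second card deliberately has no price element.
SAMPLE_HTML = """
<div class="property">
  <h3>Bright two-bedroom</h3>
  <p class="property-content-info-comm-name">Community A</p>
  <span class="property-price-total-num">520</span>
</div>
<div class="property">
  <h3>Quiet studio</h3>
  <p class="property-content-info-comm-name">Community B</p>
</div>
"""


def text_or_empty(card, selector):
    """Return the stripped text of the first match inside card, or '' if absent."""
    node = card.select_one(selector)
    return node.get_text().strip() if node else ""


def parse_cards(html, card_selector="div.property"):
    """Build one dict per listing card so fields cannot cross between listings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select(card_selector):
        rows.append({
            "name": text_or_empty(card, "h3"),
            "position": text_or_empty(card, "p.property-content-info-comm-name"),
            "money": text_or_empty(card, "span.property-price-total-num"),
        })
    return rows


rows = parse_cards(SAMPLE_HTML)
for row in rows:
    print(row)
```

With this layout a missing field shows up as an empty string on its own listing instead of shifting every later row.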
Screenshot of partial results: