Regex Web Scraper Example

Published: 2025/4/14 46 豆豆

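# coding: utf-8
import json
import re

import requests

url = 'https://movie.douban.com/top250?start=0&filter='  # Douban Top 250


def get_page(url):
    # Fetch the page and return its HTML as text.
    response_html = requests.get(url)
    # response_html.encoding = response_html.apparent_encoding
    return response_html.text


def run(url):
    response = get_page(url)
    # Compile the matching rule and pull out the useful fields.
    # The literal HTML tags that anchored each named group are missing from
    # this listing, so the lazy groups match empty strings as written.
    obj = re.compile(r'<div class="item">.*?<em.*?>(?P<id>\d+).*?(?P<title>.*?).*?(?P<info>.*?).*?(?P<rating>.*?).*?(?P<appraise>\w+)', re.S)
    # Open once in append mode; UTF-8 is needed because ensure_ascii=False
    # writes the Chinese text unescaped.
    with open('doubian.txt', 'a', encoding='utf-8') as f:
        for i in obj.finditer(response):
            file = {i.group('id'): (i.group('title'), i.group('rating'), i.group('appraise'))}
            # print(file)
            # Serialize each entry as a JSON dict, one per line.
            f.write(json.dumps(file, ensure_ascii=False) + '\n')


# Walk every results page, 25 entries at a time (adjust for other sites).
# start runs 0, 25, ..., 225, which covers all 250 entries.
for i in range(0, 250, 25):
    url = re.sub(r'start=\d+', 'start=' + str(i), url)
    print(url)
    run(url)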
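
The named groups in the pattern above (id, title, info, rating, appraise) only capture something useful when each one sits between literal HTML tags. The following is a minimal sketch of that idea, not the original author's pattern. The tag anchors, the SAMPLE_HTML snippet, and the helper names parse_items and page_urls are all assumptions for illustration, and the real Douban markup may differ.

```python
import re

# Hypothetical markup for one list entry (an assumption, not copied from Douban).
SAMPLE_HTML = '''
<div class="item">
  <div class="pic"><em class="">1</em></div>
  <div class="info">
    <span class="title">Example Movie</span>
    <p class="">Director: Example Person</p>
    <span class="rating_num" property="v:average">9.0</span>
    <span>12345人评价</span>
  </div>
</div>
'''

# Each lazy group is bounded by an assumed opening and closing tag, so it
# captures the text between them instead of an empty string.
ITEM_PATTERN = re.compile(
    r'<div class="item">.*?<em.*?>(?P<id>\d+)</em>'
    r'.*?<span class="title">(?P<title>.*?)</span>'
    r'.*?<p class="">(?P<info>.*?)</p>'
    r'.*?<span class="rating_num".*?>(?P<rating>.*?)</span>'
    r'.*?<span>(?P<appraise>\w+)</span>',
    re.S,  # let "." match newlines, since one entry spans several lines
)


def parse_items(html):
    """Map each entry's rank to a (title, rating, appraise) tuple."""
    return {
        m.group('id'): (m.group('title'), m.group('rating'), m.group('appraise'))
        for m in ITEM_PATTERN.finditer(html)
    }


def page_urls(base_url, total=250, step=25):
    """Build one URL per results page by rewriting the start= parameter."""
    return [
        re.sub(r'start=\d+', 'start=' + str(start), base_url)
        for start in range(0, total, step)
    ]


if __name__ == '__main__':
    print(parse_items(SAMPLE_HTML))
    for page in page_urls('https://movie.douban.com/top250?start=0&filter='):
        print(page)
```

Keeping the parsing separate from the download makes it possible to check the pattern against a saved snippet before sending any requests. If the site changes its markup, only the tag anchors in ITEM_PATTERN need to be updated.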

Reposted from: https://www.cnblogs.com/mona524/p/7096190.html
