Scraping Maoyan Data

Published: 2023/12/14

Source code

# Imports
import requests
from fake_useragent import UserAgent
import json
import os
import pandas as pd
import datetime

# Target movie page: http://maoyan.com/films/42964

# Spoofed request headers
ua = UserAgent()
headers = {
    "User-agent": ua.random,  # random browser User-Agent string
    "Host": "m.maoyan.com",
    # "Referer": "http://m.maoyan.com/movie/1217236/comments?_v_=yes"
    "Referer": "http://m.maoyan.com/movie/42964/comments?_v_=yes",
}

# Request parameters
offsets = [0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180]  # offset advances in steps of 15
startTime = "0"   # unused below; each request sends the current time instead
randomTime = ""   # unused
list_info = []    # collected rows, one per comment

comment_api = 'http://m.maoyan.com/mmdb/comments/movie/42964.json'
for offset in offsets:
    # Let requests URL-encode the query string (startTime contains a space)
    params = {
        '_v_': 'yes',
        'offset': offset,
        'startTime': datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
    }
    response_comment = requests.get(comment_api, headers=headers, params=params)
    json_comment = json.loads(response_comment.text)
    # print(json_comment)
    json_response = json_comment['cmts']
    for data in json_response:
        cityName = data['cityName']
        content = data['content']
        gender = data.get('gender', 0)  # default to 0 when gender is missing
        nickName = data['nickName']
        userLevel = data['userLevel']
        score = data['score']
        list_info.append([nickName, gender, cityName, userLevel, score, content])
    # print("offset:{0},startTime:{1}".format(offset, startTime))
    # Redefine the request parameters (left as a placeholder in the original)

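print("Saving data:")
csv_path = r'D:\B_Hakkelujah\python\maoyan.csv'
# Treat a missing file as empty; os.path.getsize raises if the file does not exist
file_size = os.path.getsize(csv_path) if os.path.exists(csv_path) else 0
print("File size: {0}".format(file_size))
if file_size == 0:
    print("Adding data to the empty file")
    # Column headers: nickname, gender, city, Maoyan user level, score, comment text
    name = ['評論者昵稱', '性別', '所在城市', '貓眼等級', '評分', '評論內容']
    # Build the DataFrame
    file_test = pd.DataFrame(columns=name, data=list_info)
    # Write the data; utf_8_sig adds a BOM so Excel detects the encoding
    file_test.to_csv(csv_path, encoding='utf_8_sig', index=False)
    print("Data written")
# pd.read_csv(csv_path, encoding='utf-8')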
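
The request and parsing steps in the script can also be written as two small pure functions, which makes them testable without touching the network. The following is only a sketch: `build_comment_url` and `parse_comments` are hypothetical helper names, the sample payload is made up for illustration, and using `.get` for every field is an assumption about how missing fields should be handled (the script above only guards `gender`).

```python
from urllib.parse import urlencode

# Endpoint pattern taken from the script above
BASE_URL = "http://m.maoyan.com/mmdb/comments/movie/{movie_id}.json"


def build_comment_url(movie_id, offset, start_time):
    """Build the comment API URL with a properly encoded query string."""
    query = urlencode({"_v_": "yes", "offset": offset, "startTime": start_time})
    return BASE_URL.format(movie_id=movie_id) + "?" + query


def parse_comments(payload):
    """Turn one decoded JSON page into rows in the same column order as the script."""
    rows = []
    for item in payload.get("cmts", []):
        rows.append([
            item.get("nickName"),
            item.get("gender", 0),  # same default as the script when gender is missing
            item.get("cityName"),
            item.get("userLevel"),
            item.get("score"),
            item.get("content"),
        ])
    return rows


# Made-up payload that only mimics the field names used in the script
sample = {"cmts": [{"nickName": "user_a", "cityName": "Beijing",
                    "userLevel": 2, "score": 4.5, "content": "Good."}]}

url = build_comment_url(42964, 15, "2018-11-17 12:00:00")
rows = parse_comments(sample)
print(url)
print(rows)
```

Keeping the URL construction separate from the HTTP call also makes it easier to see which parameters change from one request to the next.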

Original article:

https://mp.weixin.qq.com/s?__biz=MjM5MjAwODM4MA==&mid=2650706418&idx=1&sn=20e57b7b1c8caa4c0b06d6dbd2b94aaa&chksm=bea6e02189d16937c8c3d934264f24b599576b14b76361018b55cca76fb73a127d4f6681af98&mpshare=1&scene=1&srcid=101045ENCgxgoTId8LKXrIaE&pass_ticket=Cgz9TOK3J64evSI%2B9Ev7kLigZCJHUOKf8eJe9%2FagJaUdYdhyn53lL%2FeRC4NnDrUq#rd

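Notes:

Data scraping record

1. Analyze the API (including how its parameters change between requests)

2. Parse the JSON data

3. Store the data (file or database)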
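
For the database option in step 3, one possibility is SQLite from the Python standard library. This is a minimal sketch and not part of the original script: the table name `comments`, its column names, the helper `save_comments`, and the demo rows are all assumptions made for illustration. The rows follow the same column order as the rows collected by the script (nickname, gender, city, user level, score, content).

```python
import sqlite3


def save_comments(conn, rows):
    """Insert comment rows into a (hypothetical) comments table and return the row count."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS comments (
               nick_name  TEXT,
               gender     INTEGER,
               city_name  TEXT,
               user_level INTEGER,
               score      REAL,
               content    TEXT
           )"""
    )
    # Parameterized insert: comment text is never interpolated into the SQL string
    conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]


# In-memory database and made-up rows, for demonstration only
conn = sqlite3.connect(":memory:")
demo_rows = [
    ["user_a", 0, "Beijing", 2, 4.5, "Good."],
    ["user_b", 1, "Shanghai", 3, 5.0, "Great."],
]
total = save_comments(conn, demo_rows)
print(total)
```

Passing a file path instead of `":memory:"` to `sqlite3.connect` would persist the data between runs, so repeated scrapes can append rows rather than writing only when the output is empty.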

Reposted from: https://www.cnblogs.com/newrohlzy/p/9973795.html
