Simulating a Douban login and posting a status with the Python requests module

Published: 2024/7/23 python 39 豆豆


How to scrape web pages: http://blog.csdn.net/chenguolinblog/article/details/45024643
A GitHub project on simulated login: https://github.com/xchaoinfo/fuck-login
Summary of simulated login for Python crawlers: http://blog.csdn.net/churximi/article/details/50917322
How does the crawler library Python Requests simulate a user login?: https://segmentfault.com/q/1010000002421773
Python crawler practice: simulated login: https://www.2cto.com/kf/201401/275152.html
Simulating a Zhihu login with Python: http://blog.csdn.net/Marksinoberg/article/details/69569353
Tips for writing simple web crawlers in Python: http://armsword.com/2014/03/31/python-in-crawler

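I had never worked on simulated login before, mainly because I was not familiar with how the flow works. After reading a lot of material online I had a rough idea of it, so I decided to try it out on Douban first.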

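The key to simulating a login is finding the URL the form is really submitted to, then POSTing the form data while carrying the cookies. Once the login succeeds, we can request other pages on the site that require a login and read their content.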
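A minimal sketch of why carrying the cookies matters, assuming the login response sets a cookie: a requests Session stores cookies in a jar and attaches them to later requests for the same host. The cookie name example_login and its value are made up for illustration (they are not Douban's real cookie), and nothing is sent over the network.

```python
import requests

# A Session keeps cookies between requests, which is what keeps us "logged in".
session = requests.Session()

# Stand-in for the cookie a successful login would set.
# The name and value are hypothetical, not Douban's real cookie.
session.cookies.set("example_login", "token123", domain="www.douban.com", path="/")

# Prepare (but do not send) a later request to a page on the same site.
request = requests.Request("GET", "https://www.douban.com/")
prepared = session.prepare_request(request)

# The session attached the stored cookie to the new request automatically.
cookie_header = prepared.headers.get("Cookie")
print(cookie_header)
```

Using plain requests.get() for each call instead would start with an empty cookie jar every time, so the site would treat every request as coming from a logged-out visitor.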

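A request is described by four elements: method, URL, headers, and body. If all four are reproduced correctly, a script can generally fetch whatever the browser can, and all four are visible in the browser under Inspect Element > Network.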
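The four elements can be seen in one place by building a request without sending it. This is only a sketch: the URL and the form field names are the ones used in this post's script, the helper name build_login_request and the credentials are placeholders, and the header values are shortened for readability.

```python
import requests
from urllib.parse import parse_qs


def build_login_request(username, password):
    """Assemble (without sending) a form POST from its four elements."""
    request = requests.Request(
        method="POST",                                # 1. method
        url="https://www.douban.com/accounts/login",  # 2. URL
        headers={                                     # 3. headers
            "Referer": "https://www.douban.com",
            "User-Agent": "Mozilla/5.0",
        },
        data={                                        # 4. body (form fields)
            "form_email": username,
            "form_password": password,
        },
    )
    return request.prepare()


prepared = build_login_request("user@example.com", "secret")
print(prepared.method, prepared.url)
print(prepared.headers["Content-Type"])
print(parse_qs(prepared.body))
```

Comparing these printed values with the entry shown in the browser's Network panel is a quick way to spot which element differs when a scripted request is rejected.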

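For the captcha, the current approach is to save the captcha image locally and type the answer in by hand. Automatic captcha recognition in Python is something to look into later.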

However, the captcha saved as a local image is not very clear (I will fix this when I have time). Opening the captcha URL in a browser shows the captcha clearly.
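One plausible cause of a damaged saved image (an assumption, the post does not confirm it) is writing the downloaded bytes in text mode. In Python 2 on Windows, text mode rewrites newline bytes, and in Python 3 it rejects bytes outright. The sketch below writes in binary mode and checks that the bytes survive. The function name save_captcha and the sample bytes are made up for illustration.

```python
import os
import tempfile


def save_captcha(image_bytes, path):
    """Write raw image bytes to disk in binary mode ('wb')."""
    with open(path, "wb") as f:
        f.write(image_bytes)
    return path


# Made-up bytes with a JPEG-like header and a newline byte (0x0A),
# the kind of byte that text-mode newline translation can alter.
sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\n\x01"
path = save_captcha(sample, os.path.join(tempfile.mkdtemp(), "captcha.jpg"))

# Read the file back and confirm nothing changed on the way to disk.
with open(path, "rb") as f:
    saved = f.read()
print(saved == sample)
```

In a real run, image_bytes would be the .content of the response that downloaded the captcha.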

The goal: log in to Douban and post a one-line status.

# -*- coding: utf-8 -*-
# Written for Python 3 (print() and input() replace the Python 2 forms).
import re

import requests
from bs4 import BeautifulSoup


class DouBan(object):
    def __init__(self):
        self.__username = "your Douban account"   # Douban account
        self.__password = "your Douban password"  # Douban password
        self.__main_url = "https://www.douban.com"
        self.__login_url = "https://www.douban.com/accounts/login"
        # Proxy used on the author's network; change or remove it for yours.
        self.__proxies = {
            "http": "http://172.17.18.80:8080",
            "https": "https://172.17.18.80:8080"
        }
        self.__headers = {
            "Host": "www.douban.com",
            "Origin": self.__main_url,
            "Referer": self.__main_url,
            "Upgrade-Insecure-Requests": "1",
            "User-Agent": 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
        }
        self.__data = {
            "source": "index_nav",
            "redir": "https://www.douban.com",
            "form_email": self.__username,
            "form_password": self.__password,
            "login": "登錄"
        }
        # The session stores cookies and sends them with every later request.
        self.__session = requests.session()
        self.__session.headers.update(self.__headers)
        self.__session.proxies = self.__proxies

    def login(self):
        r = self.__session.post(self.__login_url, data=self.__data)
        if r.status_code == 200:
            html = r.text
            soup = BeautifulSoup(html, "lxml")
            # find() returns None when the page has no captcha, so check before indexing.
            captcha_img = soup.find('img', id='captcha_image')
            if captcha_img is not None:
                # A captcha is present
                captcha_address = captcha_img['src']
                print(captcha_address)
                # Use a regular expression to get the captcha ID (first match only)
                re_captcha_id = r'<input type="hidden" name="captcha-id" value="(.*?)"/'
                match = re.search(re_captcha_id, html)
                captcha_id = match.group(1) if match else ""
                print(captcha_id)
                # Save the image locally; binary mode keeps the bytes intact
                with open('captcha.jpg', 'wb') as f:
                    f.write(self.__session.get(captcha_address).content)
                captcha = input('please input the captcha:')
                self.__data['captcha-solution'] = captcha
                self.__data['captcha-id'] = captcha_id
                r = self.__session.post(self.__login_url, data=self.__data)
                # Note: status 200 only means the request completed, not that the login was accepted.
                if r.status_code == 200:
                    print("login success")
                    data = {
                        "ck": "NBJ2",  # value from the author's session; likely differs per login
                        "comment": "模擬登錄"
                    }
                    r = self.__session.post(self.__main_url, data=data)
                    print(r.status_code)
            else:
                print("no captcha required for login")
                # The logic without a captcha is the same as the logic above
                # after the captcha has been entered; the code is omitted here.
        else:
            print("login fail", r.status_code)


if __name__ == "__main__":
    t = DouBan()
    t.login()

After logging in to the Douban account in a browser, you can see that a status was posted: "模擬登錄" (simulated login).
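As an alternative to a regular expression, the two captcha fields can be read with an HTML parser, which does not depend on the exact order or spacing of attributes. This is a sketch using only the standard library. The class name CaptchaFieldParser and the sample HTML are made up for illustration; only the id captcha_image and the field name captcha-id come from the script in this post.

```python
from html.parser import HTMLParser


class CaptchaFieldParser(HTMLParser):
    """Collect the captcha image URL and the hidden captcha-id value."""

    def __init__(self):
        super().__init__()
        self.image_url = None
        self.captcha_id = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("id") == "captcha_image":
            self.image_url = attributes.get("src")
        elif tag == "input" and attributes.get("name") == "captcha-id":
            self.captcha_id = attributes.get("value")


# Hypothetical fragment of a login page, for illustration only.
sample_html = """
<form>
  <img id="captcha_image" src="https://example.com/captcha?id=abc123"/>
  <input type="hidden" name="captcha-id" value="abc123"/>
</form>
"""

parser = CaptchaFieldParser()
parser.feed(sample_html)
print(parser.image_url, parser.captcha_id)

# A page without a captcha leaves both fields as None instead of raising.
empty = CaptchaFieldParser()
empty.feed("<form><input name='form_email'/></form>")
print(empty.image_url, empty.captcha_id)
```

Checking for None makes the "no captcha" case explicit, so the script can skip the manual input step when the page does not ask for one.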
