日韩性视频-久久久蜜桃-www中文字幕-在线中文字幕av-亚洲欧美一区二区三区四区-撸久久-香蕉视频一区-久久无码精品丰满人妻-国产高潮av-激情福利社-日韩av网址大全-国产精品久久999-日本五十路在线-性欧美在线-久久99精品波多结衣一区-男女午夜免费视频-黑人极品ⅴideos精品欧美棵-人人妻人人澡人人爽精品欧美一区-日韩一区在线看-欧美a级在线免费观看

歡迎訪問 生活随笔!

生活随笔

當前位置: 首頁 > 编程语言 > python >内容正文

python

Python 3.X 要使用urllib.request 来抓取网络资源。转

發(fā)布時間:2024/9/5 python 54 豆豆
生活随笔 收集整理的這篇文章主要介紹了 Python 3.X 要使用urllib.request 来抓取网络资源。转 小編覺得挺不錯的,現(xiàn)在分享給大家,幫大家做個參考.

Python 3.X 要使用urllib.request 來抓取網(wǎng)絡(luò)資源。

最簡單的方式:

#coding=utf-8 import?urllib.request response?=?urllib.request.urlopen('http://python.org/') buff?=?response.read() #顯示 html?=?buff.decode("utf8") response.close() print(html)

?

?

使用Request的方式:

#coding=utf-8 import?urllib.request req?=?urllib.request.Request('http://www.voidspace.org.uk') response?=?urllib.request.urlopen(req) buff?=?response.read() #顯示 the_page?=?buff.decode("utf8") response.close() print(the_page)

?

這種方式同樣可以用來處理其他URL,例如FTP:

#coding=utf-8 import?urllib.request req?=?urllib.request.Request('ftp://ftp.pku.edu.cn/') response?=?urllib.request.urlopen(req) buff?=?response.read() #顯示 the_page?=?buff.decode("utf8") response.close() print(the_page)

?

?

使用POST請求:

import?urllib.parseimport urllib.requesturl?=?'http://www.someserver.com/cgi-bin/register.cgi' values?=?{'name'?:?'Michael?Foord','location'?:?'Northampton','language'?:?'Python'?}data?=?urllib.parse.urlencode(values) req?=?urllib.request.Request(url,?data) response?=?urllib.request.urlopen(req) the_page?=?response.read()

?

?

使用GET請求:

import?urllib.request import?urllib.parse data?=?{} data['name']?=?'Somebody?Here' data['location']?=?'Northampton' data['language']?=?'Python' url_values?=?urllib.parse.urlencode(data) print(url_values) name=Somebody+Here&language=Python&location=Northampton url?=?'http://www.example.com/example.cgi' full_url?=?url?+?'?'?+?url_values data?=?urllib.request.open(full_url)

?

?

添加header:

import?urllib.parse import?urllib.requesturl?=?'http://www.someserver.com/cgi-bin/register.cgi' user_agent?=?'Mozilla/4.0?(compatible;?MSIE?5.5;?Windows?NT)' values?=?{'name'?:?'Michael?Foord','location'?:?'Northampton','language'?:?'Python'?} headers?=?{?'User-Agent'?:?user_agent?}data?=?urllib.parse.urlencode(values) req?=?urllib.request.Request(url,?data,?headers) response?=?urllib.request.urlopen(req) the_page?=?response.read()

?

?

錯誤處理:

req?=?urllib.request.Request('http://www.pretend_server.org') try:?urllib.request.urlopen(req) except?urllib.error.URLError?as?e:print(e.reason)

?

?

返回的錯誤代碼:

#?Table?mapping?response?codes?to?messages;?entries?have?the #?form?{code:?(shortmessage,?longmessage)}. responses?=?{100:?('Continue',?'Request?received,?please?continue'),101:?('Switching?Protocols','Switching?to?new?protocol;?obey?Upgrade?header'),200:?('OK',?'Request?fulfilled,?document?follows'),201:?('Created',?'Document?created,?URL?follows'),202:?('Accepted','Request?accepted,?processing?continues?off-line'),203:?('Non-Authoritative?Information',?'Request?fulfilled?from?cache'),204:?('No?Content',?'Request?fulfilled,?nothing?follows'),205:?('Reset?Content',?'Clear?input?form?for?further?input.'),206:?('Partial?Content',?'Partial?content?follows.'),300:?('Multiple?Choices','Object?has?several?resources?--?see?URI?list'),301:?('Moved?Permanently',?'Object?moved?permanently?--?see?URI?list'),302:?('Found',?'Object?moved?temporarily?--?see?URI?list'),303:?('See?Other',?'Object?moved?--?see?Method?and?URL?list'),304:?('Not?Modified','Document?has?not?changed?since?given?time'),305:?('Use?Proxy','You?must?use?proxy?specified?in?Location?to?access?this?''resource.'),307:?('Temporary?Redirect','Object?moved?temporarily?--?see?URI?list'),400:?('Bad?Request','Bad?request?syntax?or?unsupported?method'),401:?('Unauthorized','No?permission?--?see?authorization?schemes'),402:?('Payment?Required','No?payment?--?see?charging?schemes'),403:?('Forbidden','Request?forbidden?--?authorization?will?not?help'),404:?('Not?Found',?'Nothing?matches?the?given?URI'),405:?('Method?Not?Allowed','Specified?method?is?invalid?for?this?server.'),406:?('Not?Acceptable',?'URI?not?available?in?preferred?format.'),407:?('Proxy?Authentication?Required',?'You?must?authenticate?with?''this?proxy?before?proceeding.'),408:?('Request?Timeout',?'Request?timed?out;?try?again?later.'),409:?('Conflict',?'Request?conflict.'),410:?('Gone','URI?no?longer?exists?and?has?been?permanently?removed.'),411:?('Length?Required',?'Client?must?specify?Content-Length.'),412:?('Precondition?Failed',?'Precondition?in?headers?is?false.'),413:?('Request?Entity?Too?Large',?'Entity?is?too?large.'),414:?('Request-URI?Too?Long',?'URI?is?too?long.'),415:?('Unsupported?Media?Type',?'Entity?body?in?unsupported?format.'),416:?('Requested?Range?Not?Satisfiable','Cannot?satisfy?request?range.'),417:?('Expectation?Failed','Expect?condition?could?not?be?satisfied.'),500:?('Internal?Server?Error',?'Server?got?itself?in?trouble'),501:?('Not?Implemented','Server?does?not?support?this?operation'),502:?('Bad?Gateway',?'Invalid?responses?from?another?server/proxy.'),503:?('Service?Unavailable','The?server?cannot?process?the?request?due?to?a?high?load'),504:?('Gateway?Timeout','The?gateway?server?did?not?receive?a?timely?response'),505:?('HTTP?Version?Not?Supported',?'Cannot?fulfill?request.'),}

轉(zhuǎn)載于:https://www.cnblogs.com/mmbbflyer/p/6340375.html

總結(jié)

以上是生活随笔為你收集整理的Python 3.X 要使用urllib.request 来抓取网络资源。转的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯,歡迎將生活随笔推薦給好友。