URL Encoding and Decoding in Python
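Some requests or addresses cannot be used as-is when they contain Chinese characters or special symbols, so those characters have to be percent-encoded first. Conversely, an address you receive may need to be decoded; otherwise the Chinese text shows up as percent signs followed by hexadecimal digits (for example, %E7%A8%8B).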
1. URL encoding
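from urllib.parse import quote

# Encode the string '程序设计'.
# The encoding is passed by keyword: the second positional parameter
# of quote() is `safe`, not `encoding`.
text = quote("程序设计", encoding='utf-8')
print(text)
# Output:
# %E7%A8%8B%E5%BA%8F%E8%AE%BE%E8%AE%A1

2. URL decoding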
from urllib.parse import unquote

# Decode the string '%E7%A8%8B%E5%BA%8F%E8%AE%BE%E8%AE%A1'.
text = unquote("%E7%A8%8B%E5%BA%8F%E8%AE%BE%E8%AE%A1", encoding='utf-8')
print(text)
# Output:
# 程序设计

3. Source code
def quote(string, safe='/', encoding=None, errors=None):
    """quote('abc def') -> 'abc%20def'

    Each part of a URL, e.g. the path info, the query, etc., has a
    different set of reserved characters that must be quoted. The
    quote function offers a cautious (not minimal) way to quote a
    string for most of these parts.

    RFC 3986 Uniform Resource Identifier (URI): Generic Syntax lists
    the following (un)reserved characters.

    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
    reserved      = gen-delims / sub-delims
    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

    Each of the reserved characters is reserved in some component of a URL,
    but not necessarily in all of them.

    The quote function %-escapes all characters that are neither in the
    unreserved chars ("always safe") nor the additional chars set via the
    safe arg.

    The default for the safe arg is '/'. The character is reserved, but in
    typical usage the quote function is being called on a path where the
    existing slash characters are to be preserved.

    Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
    Now, "~" is included in the set of unreserved characters.

    string and safe may be either str or bytes objects. encoding and errors
    must not be specified if string is a bytes object.

    The optional encoding and errors parameters specify how to deal with
    non-ASCII characters, as accepted by the str.encode method.
    By default, encoding='utf-8' (characters are encoded with UTF-8), and
    errors='strict' (unsupported characters raise a UnicodeEncodeError).
    """
    if isinstance(string, str):
        if not string:
            return string
        if encoding is None:
            encoding = 'utf-8'
        if errors is None:
            errors = 'strict'
        string = string.encode(encoding, errors)
    else:
        if encoding is not None:
            raise TypeError("quote() doesn't support 'encoding' for bytes")
        if errors is not None:
            raise TypeError("quote() doesn't support 'errors' for bytes")
    return quote_from_bytes(string, safe)


def unquote(string, encoding='utf-8', errors='replace'):
    """Replace %xx escapes by their single-character equivalent. The optional
    encoding and errors parameters specify how to decode percent-encoded
    sequences into Unicode characters, as accepted by the bytes.decode()
    method.
    By default, percent-encoded sequences are decoded with UTF-8, and invalid
    sequences are replaced by a placeholder character.

    unquote('abc%20def') -> 'abc def'.
    """
    if '%' not in string:
        string.split
        return string
    if encoding is None:
        encoding = 'utf-8'
    if errors is None:
        errors = 'replace'
    bits = _asciire.split(string)
    res = [bits[0]]
    append = res.append
    for i in range(1, len(bits), 2):
        append(unquote_to_bytes(bits[i]).decode(encoding, errors))
        append(bits[i + 1])
    return ''.join(res)
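The signatures above have a few practical consequences worth seeing in action. The following is a minimal sketch, not part of the original article: the sample path "/docs/程序设计" and all variable names are made up for illustration. It shows how the safe argument affects slashes, what happens when 'utf-8' is passed positionally to quote(), how bytes input is handled, and how unquote() treats an incomplete escape sequence under its default errors='replace'.

```python
from urllib.parse import quote, unquote

# Hypothetical sample path, used only for this illustration.
path = "/docs/程序设计"

# Default safe='/': slashes are kept, non-ASCII characters are escaped.
kept = quote(path)

# safe='': the slashes are escaped as %2F as well.
escaped = quote(path, safe="")

# Pitfall: the second positional parameter is `safe`, so 'utf-8' here
# replaces the default safe='/' instead of selecting an encoding.
pitfall = quote(path, "utf-8")

# bytes input is accepted, but `encoding` must then be left unset.
from_bytes = quote(b"a b")
try:
    quote(b"a b", encoding="utf-8")
    bytes_with_encoding_rejected = False
except TypeError:
    bytes_with_encoding_rejected = True

# unquote() defaults to errors='replace': a truncated UTF-8 sequence
# becomes the U+FFFD placeholder instead of raising an exception.
lenient = unquote("%E7%A8")
try:
    unquote("%E7%A8", errors="strict")
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

print(kept)
print(escaped)
print(pitfall == escaped)
print(unquote(kept) == path)
```

Passing encoding by keyword, as in quote(text, encoding='utf-8'), avoids the positional pitfall. Since UTF-8 is already the default for str input, quote(text) alone gives the same result.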