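Python and MySQL encodings: an introduction to Python 3 and MySQL character encoding
Python 3 claims to have solved the encoding problem, but it still has plenty of pitfalls. This post records the Python 3 encoding problems I ran into over the past few days, with a brief look at MySQL encoding along the way.
Encoding JSON strings in Python 3
For a dict that contains Chinese text, pass ensure_ascii=False to dumps if you want the Chinese characters to appear as-is. For example: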
>>> import json
>>> a = {"name": "中国"}
>>> json.dumps(a)
'{"name": "\\u4e2d\\u56fd"}'
>>> json.dumps(a, ensure_ascii=False)
'{"name": "中国"}'
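As a small sketch of the same point, the snippet below checks that both forms decode back to the same dict, so ensure_ascii only changes how the text looks, not what it means. The variable names are made up for illustration.

```python
import json

data = {"name": "中国"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(data)
# ensure_ascii=False keeps the characters unchanged in the output string.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings are valid JSON and decode to the same object.
same_after_decoding = json.loads(escaped) == json.loads(readable) == data

# The readable form is a str; pick the byte encoding explicitly when saving it.
utf8_bytes = readable.encode("utf-8")
```

The escaped form is pure ASCII and survives any transport, while the readable form depends on everything downstream handling UTF-8 correctly.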
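For a string that contains raw control characters (such as an unescaped newline inside a JSON string value), pass strict=False to loads if you want it to parse. For example: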
>>> json.loads('{"foo":"bar\nbaz"}')
Traceback (most recent call last):
  File "", line 1, in
    json.loads('{"foo":"bar\nbaz"}')
  File "C:\Users\jonyguo\AppData\Local\Programs\Python\Python36\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\jonyguo\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\jonyguo\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 12 (char 11)
>>> json.loads('{"foo":"bar\nbaz"}', strict=False)
{'foo': 'bar\nbaz'}
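The error comes from the Python literal, not from JSON itself: '\n' in a normal Python string becomes a real newline character before the JSON parser ever sees it. A minimal sketch of the three cases (variable names are illustrative only):

```python
import json

# The Python '\n' becomes a raw newline inside the JSON string value,
# which the JSON grammar does not allow.
raw_newline = '{"foo":"bar\nbaz"}'

try:
    json.loads(raw_newline)
    strict_failed = False
except json.JSONDecodeError:
    strict_failed = True

# strict=False tells the decoder to accept control characters in strings.
lenient = json.loads(raw_newline, strict=False)

# A raw string keeps the backslash, so the JSON text contains the two
# characters \ and n, which is a valid JSON escape.
escaped_newline = r'{"foo":"bar\nbaz"}'
strict_ok = json.loads(escaped_newline)
```

If you control the producer, emitting properly escaped JSON (for example with json.dumps) is usually preferable to relaxing the parser.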
String encoding in Python 3
Python 3 has only two string types: str and bytes. Calling encode on a str produces bytes, and calling decode on bytes produces a str.
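A short sketch of that round trip, using UTF-8 as the assumed codec:

```python
text = "中国"

# str -> bytes: two characters become six bytes in UTF-8.
data = text.encode("utf-8")
print(type(data), len(text), len(data))

# bytes -> str: decoding with the same codec restores the original text.
round_trip = data.decode("utf-8")

# Decoding with a codec that cannot represent the bytes raises an error.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```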
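Data fetched from the network that contains Chinese sometimes arrives as a string of literal Unicode escape sequences. It can be turned into Chinese text by encoding it and then decoding it: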
>>> a = "\\u4e2d\\u56fd"
>>> print(a)
\u4e2d\u56fd
>>> a.encode().decode("unicode_escape")
'中国'
Alternatively, take the repr of the string, replace each pair of backslashes with a single backslash, and evaluate the result:
>>> a = "\\u4e2d\\u56fd"
>>> eval(repr(a).replace('\\\\', '\\'))
'中国'
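Both approaches above have limits: eval executes whatever the string contains, and the unicode_escape codec treats the input bytes as Latin-1, so text that mixes real Chinese characters with escapes comes out garbled. The following is only a sketch of one alternative, using a hypothetical helper named decode_unicode_escapes that handles 4-digit \uXXXX sequences and does not combine surrogate pairs:

```python
import re

# Matches a literal backslash, the letter u, and four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_unicode_escapes(text):
    """Replace literal \\uXXXX sequences with the characters they name."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


a = "\\u4e2d\\u56fd"
decoded = decode_unicode_escapes(a)

# Real characters mixed with escapes are left untouched.
mixed = decode_unicode_escapes("中\\u56fd")

print(decoded, mixed)
```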
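Character encoding with Python 3 and Apache
I had a Python 3 script running as a CGI for a front-end page and hit a strange problem. When I called the CGI from a Python script I had written, the encoding was fine, but when it was called over HTTP things went wrong. Chinese data read from the database and returned to the front end displayed correctly. However, when I combined Chinese text from the database with some other characters to build a file name and checked whether that file existed, it always failed with: UnicodeEncodeError: 'ascii' codec can't encode characters in position 46-49: ordinal not in range(128).
At first I assumed it was an Apache encoding problem, but Apache's encoding turned out to be utf8 as well, which left me stuck. A Google search eventually turned up the cause.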
https://www.raspberrypi.org/forums/viewtopic.php?t=65257 This thread explains:
The difference is that from the command line Python inherits your locale settings (probably LANG=fr_FR.UTF-8), whereas from Apache it inherits LANG=C. It knows that your strings are Unicode, but it can not print them in an ASCII environment.
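In other words, when the CGI is called from a Python script, Python inherits the locale settings, which are utf8, so the text is handled correctly (I suspect this is not quite accurate, and that Python 3's own default encoding is what applies here). Under Apache it inherits LANG=C, which means ASCII, so the text cannot be handled. Following the thread's advice, I added . /etc/default/locale (or /etc/sysconfig/i18n) to /etc/apache2/envvars. After that change the problem was still there.
More googling finally turned up a solution.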
https://stackoverflow.com/questions/9322410/set-encoding-in-python-3-cgi-scripts
Add PassEnv LANG line to the end of your /etc/apache2/apache2.conf or .htaccess.
Uncomment . /etc/default/locale line in /etc/apache2/envvars.
Make sure line similar to LANG="en_US.UTF-8" is present in /etc/default/locale.
That is, add the line PassEnv LANG to the Apache 2 configuration file, and make sure LANG is set to a UTF-8 locale.
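To check what the CGI process actually inherited, a small diagnostic along these lines can be run once from the shell and once through Apache, and the two outputs compared. This is a sketch, not part of the original fix; encoding_report is a made-up helper name, and newer Python versions may report UTF-8 even under LANG=C.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encodings Python derived from the process environment."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        # Encoding used by default for text files opened without encoding=.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used to turn str file names into bytes for the OS.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding used when printing to standard output.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


report = encoding_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

An ASCII-type value in the "filesystem" row would be consistent with the UnicodeEncodeError raised by the file-existence check described above.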
MySQL encoding
Check the current database encodings:
mysql> show variables like 'character%';
+--------------------------+--------------------------------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql-5.1.46-linux-x86_64-glibc23/share/charsets/ |
+--------------------------+--------------------------------------------------------------+
As the output above shows, the database encoding is utf8.
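● character_set_client: whatever encoding the client actually sends, the server treats the data as being in this encoding. For example, if it is UTF8 and the client sends data that is not UTF8, the result is garbled text;
● character_set_connection: usually the same as character_set_client, and it does not cause garbled text by itself. When a statement is executed, the data sent by the client is first converted to the encoding given by connection. As long as the client's data really is in the encoding declared by client, that conversion goes through without problems;
● character_set_database: the encoding of the current default database. When no database is selected, it has the same value as character_set_server;
● character_set_filesystem: the encoding used to interpret file names that appear in statements. The default, binary, means no conversion is performed;
● character_set_server: the MySQL server's default encoding, used when a database is created without an explicit encoding;
● character_set_results: MySQL converts data to this encoding before sending it to the client. For example, if it is UTF8 and the client does not decode the data as UTF8, the result is garbled text, so the client must decode with the encoding given by results;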
Data passes through a database connection as follows:
client --> connection --> server --> connection --> result
As long as client, connection and results agree, garbled text does not occur.
Running set names utf8 makes the client, connection and results encodings the same:
mysql> show variables like 'character%';
+--------------------------+--------------------------------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql-5.1.46-linux-x86_64-glibc23/share/charsets/ |
+--------------------------+--------------------------------------------------------------+
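The effect of a mismatch between what the client sends and what character_set_client declares can be imitated in plain Python, without a database. This is only an illustration: Python's latin-1 codec stands in for MySQL's latin1, which is not byte-for-byte identical, and no MySQL driver is involved.

```python
original = "中国"

# What a UTF-8 client puts on the wire.
wire_bytes = original.encode("utf-8")

# How those bytes are read if they are declared to be latin1:
# every byte turns into one unrelated character.
misread = wire_bytes.decode("latin-1")

# As long as no bytes were lost, reversing the wrong step recovers the text.
recovered = misread.encode("latin-1").decode("utf-8")

print(len(original), len(misread), recovered)
```

Two characters become six after the misreading, which is the typical shape of garbled Chinese text stored through a latin1 connection.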