[MySQL] Pitfall notes: emoji cannot be stored even though the character set is utf8mb4
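Symptom
A column's character set takes precedence over the table's, and the table's takes precedence over the database's. In theory, then, a table whose character set is utf8mb4 should be able to store emoji. Is that really the case?
The MySQL table's character set was already utf8mb4, yet writing a 4-byte emoji through JDBC failed with an error, while inserting the same 4-byte emoji with a plain SQL statement on the command line succeeded.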
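To make "4-byte emoji" concrete, here is a minimal Java sketch that inspects the UTF-8 encoding of the character used in this post. The class and method names (`EmojiBytes`, `utf8Length`, `needsUtf8mb4`, `utf8Hex`) are hypothetical helpers made up for illustration; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original project.
class EmojiBytes {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the string contains a code point outside the Basic Multilingual Plane,
    // i.e. one that UTF-8 encodes as 4 bytes.
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Hex dump of the UTF-8 bytes in the \xNN style that MySQL uses in error 1366.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String kiss = "\uD83D\uDC8B"; // U+1F48B, written as a surrogate pair
        System.out.println(utf8Length(kiss));   // 4
        System.out.println(needsUtf8mb4(kiss)); // true
        System.out.println(utf8Hex(kiss));      // \xF0\x9F\x92\x8B
    }
}
```

A check like `needsUtf8mb4` could also be used to find out which incoming strings are affected before they reach the database.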
The example is as follows:
Table structure:
    CREATE TABLE `user_info` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(11) NOT NULL,
      `age` int(4) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Writing through JDBC:
    User user = new User();
    user.setName("\uD83D\uDC8B"); // U+1F48B, a 4-byte character in UTF-8
    user.setAge(18);
    userMapper.insertUser(user);
The resulting error:
    org.springframework.jdbc.UncategorizedSQLException:
    ### Error updating database. Cause: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
    ### The error may involve com.wakzz.database.persistence.UserMapper.insertUser-Inline
    ### The error occurred while setting parameters
    ### SQL: insert into user_info (name, age) values (?, ? )
    ### Cause: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
    ; uncategorized SQLException; SQL state [HY000]; error code [1366]; Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1; nested exception is java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1
        at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:89)
        at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
        at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
        at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:73)
        at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:446)
        at com.sun.proxy.$Proxy81.insert(Unknown Source)
        at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:278)
        at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:58)
        at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59)
        at com.sun.proxy.$Proxy92.insertUser(Unknown Source)
Writing from the command line succeeds:
    mysql> insert into user_info (name,age) values ('💋',18);
    Query OK, 1 row affected
Cause of the error
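Cause: the JDBC driver (MySQL Connector/J) automatically detects the value of character_set_server on the MySQL server and issues a SET NAMES command to set the character encoding of the whole connection. The intent is to pick up the server-side character set configuration so that less has to be configured on the JDBC client. If character_set_server is utf8 on the server, the driver sets the connection encoding to utf8. MySQL's utf8 stores at most 3 bytes per character, so a 4-byte emoji cannot be written over such a connection even when the table's character set is utf8mb4.
Official note: https://dev.mysql.com/doc/relnotes/connector-j/5.1/en/news-5-1-13.html
The JDBC driver source shows: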
    // realJavaEncoding is the value of characterEncoding specified in the URL
    if (realJavaEncoding.equalsIgnoreCase("UTF-8") || realJavaEncoding.equalsIgnoreCase("UTF8")) {
        // charset names are case-sensitive

        // look up the MySQL server's character_set_server
        boolean useutf8mb4 = CharsetMapping.UTF8MB4_INDEXES.contains(this.session.getServerDefaultCollationIndex());

        if (!this.useOldUTF8Behavior.getValue()) {
            if (dontCheckServerMatch || !this.session.characterSetNamesMatches("utf8")
                    || (!this.session.characterSetNamesMatches("utf8mb4"))) {
                // execute SET NAMES xxx
                execSQL(null, "SET NAMES " + (useutf8mb4 ? "utf8mb4" : "utf8"), -1, null, false, this.database, null, false);
                this.session.getServerVariables().put("character_set_client", useutf8mb4 ? "utf8mb4" : "utf8");
                this.session.getServerVariables().put("character_set_connection", useutf8mb4 ? "utf8mb4" : "utf8");
            }
        } else {
            execSQL(null, "SET NAMES latin1", -1, null, false, this.database, null, false);
            this.session.getServerVariables().put("character_set_client", "latin1");
            this.session.getServerVariables().put("character_set_connection", "latin1");
        }

        this.characterEncoding.setValue(realJavaEncoding);
    }
After fetching the MySQL server variables, the driver resolves the character encoding:
- When character_set_server is utf8, it executes SET NAMES utf8
- When character_set_server is utf8mb4, it executes SET NAMES utf8mb4
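The two cases above can be sketched as a small, simplified model in Java. This is not driver code: `SetNamesModel`, `setNamesFor` and `fitsConnectionCharset` are hypothetical names, the model assumes characterEncoding in the URL is UTF-8, and it reduces the driver's collation-index lookup to a plain string comparison on the server character set.

```java
import java.nio.charset.StandardCharsets;

// Simplified, hypothetical model of the behaviour described above; not Connector/J code.
class SetNamesModel {

    // Which SET NAMES statement the connection ends up with,
    // assuming characterEncoding=UTF-8 was given in the JDBC URL.
    static String setNamesFor(String characterSetServer) {
        boolean useUtf8mb4 = "utf8mb4".equalsIgnoreCase(characterSetServer);
        return "SET NAMES " + (useUtf8mb4 ? "utf8mb4" : "utf8");
    }

    // Whether every character of the text fits the per-character byte limit
    // of the connection character set: 4 bytes for utf8mb4, 3 bytes for utf8.
    static boolean fitsConnectionCharset(String connectionCharset, String text) {
        int maxBytes = "utf8mb4".equalsIgnoreCase(connectionCharset) ? 4 : 3;
        return text.codePoints().allMatch(cp ->
                new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8).length <= maxBytes);
    }

    public static void main(String[] args) {
        String kiss = "\uD83D\uDC8B"; // the emoji from the JDBC example
        System.out.println(setNamesFor("utf8"));                    // SET NAMES utf8
        System.out.println(setNamesFor("utf8mb4"));                 // SET NAMES utf8mb4
        System.out.println(fitsConnectionCharset("utf8", kiss));    // false
        System.out.println(fitsConnectionCharset("utf8mb4", kiss)); // true
    }
}
```

The model only captures the outcome that matters here: with a server default of utf8 the connection is limited to 3-byte characters, regardless of the table's own character set.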
Testing SET NAMES on the command line confirms this: even when the table's character set is utf8mb4, executing SET NAMES utf8 makes the insert of a 4-byte character fail. This reproduces the error seen when writing emoji through JDBC.
    mysql> SET NAMES utf8;
    Query OK, 0 rows affected

    mysql> insert into user_info (name,age) values ('💋',18);
    1366 - Incorrect string value: '\xF0\x9F\x92\x8B' for column 'name' at row 1

    mysql> SET NAMES utf8mb4;
    Query OK, 0 rows affected

    mysql> insert into user_info (name,age) values ('💋',18);
    Query OK, 1 row affected
Solutions
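Change character_set_server to utf8mb4
Edit the MySQL configuration file my.cnf and add the following setting (under the [mysqld] section):
    character_set_server = utf8mb4
The database instance must be restarted for the change to take effect.
Character sets before the change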
    mysql> show variables like "%char%";
    +--------------------------+----------------------------------------+
    | Variable_name            | Value                                  |
    +--------------------------+----------------------------------------+
    | character_set_client     | utf8mb4                                |
    | character_set_connection | utf8mb4                                |
    | character_set_database   | latin1                                 |
    | character_set_filesystem | binary                                 |
    | character_set_results    | utf8mb4                                |
    | character_set_server     | utf8                                   |
    | character_set_system     | utf8                                   |
    | character_sets_dir       | /usr/soft/mysql-5.6.31/share/charsets/ |
    +--------------------------+----------------------------------------+
    8 rows in set

    mysql> SHOW VARIABLES LIKE 'collation%';
    +----------------------+--------------------+
    | Variable_name        | Value              |
    +----------------------+--------------------+
    | collation_connection | utf8mb4_general_ci |
    | collation_database   | latin1_swedish_ci  |
    | collation_server     | utf8_general_ci    |
    +----------------------+--------------------+
    3 rows in set
Character sets after the change
    mysql> show variables like "%char%";
    +--------------------------+----------------------------------------+
    | Variable_name            | Value                                  |
    +--------------------------+----------------------------------------+
    | character_set_client     | utf8mb4                                |
    | character_set_connection | utf8mb4                                |
    | character_set_database   | latin1                                 |
    | character_set_filesystem | binary                                 |
    | character_set_results    | utf8mb4                                |
    | character_set_server     | utf8mb4                                |
    | character_set_system     | utf8                                   |
    | character_sets_dir       | /usr/soft/mysql-5.6.31/share/charsets/ |
    +--------------------------+----------------------------------------+
    8 rows in set

    mysql> SHOW VARIABLES LIKE 'collation%';
    +----------------------+--------------------+
    | Variable_name        | Value              |
    +----------------------+--------------------+
    | collation_connection | utf8mb4_general_ci |
    | collation_database   | latin1_swedish_ci  |
    | collation_server     | utf8mb4_general_ci |
    +----------------------+--------------------+
    3 rows in set
Manually set the database connection encoding to utf8mb4
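Setting the connection encoding through JDBC:
    Connection conn = DriverManager.getConnection(url, userName, password);
    // SET NAMES returns no result set, so execute() is the safer call here than executeQuery()
    try (Statement stmt = conn.createStatement()) {
        stmt.execute("SET NAMES utf8mb4");
    }
In a Spring project, the connection can be taken from the ThreadLocal:
    // The connection that Spring's transaction management has bound to the current thread
    ConnectionHolder connectionHolder = (ConnectionHolder) TransactionSynchronizationManager.getResource(dataSource);
    Connection connection = connectionHolder.getConnection();
    try (PreparedStatement preparedStatement = connection.prepareStatement("SET NAMES utf8mb4")) {
        preparedStatement.execute();
    } catch (SQLException e) {
        e.printStackTrace();
    }
Here dataSource can be injected by Spring:
    @Autowired
    private DataSource dataSource;
Note that this approach requires an active transaction. Without one, no connection is bound to the ThreadLocal and getResource returns null.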
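Finally, here is the SQL for changing the character set of a database, a table, and a column:
    -- Change the database's default character set
    alter database DBNAME DEFAULT CHARACTER SET utf8mb4;

    -- Change the table's character set. CONVERT TO also converts the existing character columns;
    -- changing only the table default (alter table tbl_name DEFAULT CHARACTER SET ...) affects newly added columns only
    alter table tbl_name convert to character set character_name;

    -- Change a column's character set
    alter table tbl_name change col1 col1 varchar(20) CHARACTER SET utf8mb4;
Encode strings before storing them and decode them after reading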
Encode the string before writing it to the database:
    String encode = URLEncoder.encode("就🧑", StandardCharsets.UTF_8.name());
Decode the string after reading it from the database:
    URLDecoder.decode(encode, StandardCharsets.UTF_8.name());
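A minimal round-trip sketch of this encode/decode approach follows, using the emoji from the JDBC example. `EmojiUrlCodec` and its methods are hypothetical names chosen for illustration; the JDK calls are the same ones used above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original project.
class EmojiUrlCodec {

    // Turns the text into a pure-ASCII, percent-encoded form before it is stored.
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Restores the original text after it has been read back.
    static String decode(String stored) {
        try {
            return URLDecoder.decode(stored, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String kiss = "\uD83D\uDC8B";
        String stored = encode(kiss);
        System.out.println(stored);                      // %F0%9F%92%8B
        System.out.println(stored.length());             // 12
        System.out.println(decode(stored).equals(kiss)); // true
    }
}
```

One trade-off to keep in mind: the encoded form is longer than the original. A single emoji becomes 12 ASCII characters, which already exceeds the varchar(11) `name` column of the example table, so column lengths may need to be widened when using this approach.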