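MySQL Error 1048 Adventures - Alibaba Cloud Developer Community (阿里云开发者社区)
Background
Last week I ran into a bizarre replication error, error 1048. It looks like a plain NOT NULL violation, but why could the master execute the statement while the slave could not? And why did it work on a 5.1 slave but not on a 5.6 slave? With these questions in mind, I set out to get to the bottom of it.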
[ERROR] Slave SQL: Error 'Column 'type_id' cannot be null' on query. Default database: ''. Query: 'insert into if_dw_stats.da_upload_nh_score_rank_result(city_id,city_name,comm_id,region_name,paid,comm_name_nh,region_id_num,region_id,subregion_id_num,subregion_id,vcuv,vcuv_z,call_vcuv,call_vcuv_z,orders_vcuv,orders_vcuv_z,peitao,peitao_z,result_score,rank,type_id,type_name,pinyin,cal_dt) values (N), where N > 9000;
Here is a summary of the errors I have run into, split into three scenarios. All of them are caused by NULL, but error 1048 is the focus.
timestamp column: why does the statement succeed on the master but fail once it is replicated to the slave?
int column, 5.1 (master)
int column, 5.6 (master)
Now, on to the main topic.
Scenario 1
explicit_defaults_for_timestamp: things to note about timestamp
* DB architecture: Master (5.1)
* Table structure:
dbadmin:abc> desc lc_time;
+-------+-----------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------+------+-----+-------------------+-----------------------------+
| id | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------+-----------+------+-----+-------------------+-----------------------------+
1 row in set (0.00 sec)
* master
dbadmin:abc> select @@global.explicit_defaults_for_timestamp;
+------------------------------------------+
| @@global.explicit_defaults_for_timestamp |
+------------------------------------------+
| 0 |
+------------------------------------------+
1 row in set (0.00 sec)
dbadmin:abc> insert into lc_time values(null);
Query OK, 1 row affected (0.02 sec)
dbadmin:abc> select * from lc_time;
+---------------------+
| id |
+---------------------+
| 2014-11-25 13:02:14 |
+---------------------+
1 row in set (0.00 sec)
* slave
dbadmin:abc> select @@global.explicit_defaults_for_timestamp;
+------------------------------------------+
| @@global.explicit_defaults_for_timestamp |
+------------------------------------------+
| 1 |
+------------------------------------------+
1 row in set (0.00 sec)
dbadmin:abc> insert into lc_time values(null);
ERROR 1048 (23000): Column 'id' cannot be null
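Conclusion: this error occurs when explicit_defaults_for_timestamp=0 on the master and explicit_defaults_for_timestamp=1 on the slave. With the variable set to 0, inserting NULL into a TIMESTAMP NOT NULL column stores the current timestamp; with it set to 1, the NULL is rejected.
Solutions:
Keep explicit_defaults_for_timestamp consistent between master and slave.
Filter out NULL on the application side.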
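The first fix above can be checked mechanically. The following is a minimal sketch, not a definitive tool: it assumes the caller has already collected the output of SHOW GLOBAL VARIABLES from each server into a map, and the class name `VariableDriftCheck` and the watched-variable list are hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper: reports replication-sensitive variables that differ between two servers. */
final class VariableDriftCheck {
    // The two variables discussed in this article; extend the list as needed.
    static final List<String> WATCHED =
            Arrays.asList("explicit_defaults_for_timestamp", "sql_mode");

    /**
     * Each map is assumed to hold Variable_name -> Value pairs as returned by
     * SHOW GLOBAL VARIABLES on one server; fetching them is left to the caller.
     */
    static List<String> mismatches(Map<String, String> master, Map<String, String> slave) {
        List<String> out = new ArrayList<>();
        for (String name : WATCHED) {
            String m = master.get(name); // null if the server does not report the variable
            String s = slave.get(name);
            if (!Objects.equals(m, s)) {
                out.add(name + ": master=" + m + ", slave=" + s);
            }
        }
        return out;
    }
}
```

Running such a check before attaching a new slave would surface the 0-versus-1 mismatch shown above without waiting for replication to break.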
Scenario 2
* DB architecture
             master(5.1)
                  |
      -------------------------
      |                       |
 slave A(5.1)            slave B(5.6)
* Table structure
dbadmin:abc> show create table abc;
+-------+-----------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------------------------+
| abc | CREATE TABLE `abc` (
`id` int(11) DEFAULT NULL,
`id2` int(11) NOT NULL DEFAULT '6'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
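* Key parameter: sql_mode is '' on the master and on slaves A and B;
* Symptom: the following SQL statement is executed on the master: insert into abc values(1,0),(1,null);
Slave A is fine, but Slave B reports error 1048, Error 'Column 'id2' cannot be null' on query. Why?
Question1: Why does insert into abc values(1,null) fail while insert into abc values(1,0),(1,null); succeeds?
Question2: Why does it work on the 5.1 slave but not on the 5.6 slave?
Question3: Why does the same insert succeed when it is run manually on slave B?
If you already know why, you can skip the analysis below.
* Analysis:
Attentive readers will have noticed that the answer to the first question is already in the sql_mode documentation: in non-strict mode, a single-row insert of NULL into a NOT NULL column fails, whereas a multi-row insert stores the implicit default for the column type (0 for int) instead. Next, testing showed that insert into abc values(1,0),(1,null); succeeds on both 5.1 and 5.6 when sql_mode=''. That leaves only one possibility: something is wrong with sql_mode. Looking at the master binlog, I found an extra executable line in front of the insert statement: SET @@session.sql_mode=2097152. Let's see what it does: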
dbadmin:abc> SET @@session.sql_mode=2097152;
Query OK, 0 rows affected (0.00 sec)
dbadmin:abc> select @@session.sql_mode;
+---------------------+
| @@session.sql_mode |
+---------------------+
| STRICT_TRANS_TABLES |
+---------------------+
1 row in set (0.00 sec)
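The numeric value in the binlog is a bitmask, and 2097152 is 1 << 21. The sketch below decodes such a mask. Only the 2097152 -> STRICT_TRANS_TABLES pair is confirmed by the session above; the STRICT_ALL_TABLES entry is an assumption quoted from memory of the server source and should be verified, and the class name `SqlModeBits` is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical decoder for the numeric sql_mode values that appear in the binlog. */
final class SqlModeBits {
    // Deliberately partial table: bit value -> mode name.
    static final Map<Long, String> KNOWN = new LinkedHashMap<>();
    static {
        KNOWN.put(1L << 21, "STRICT_TRANS_TABLES"); // 2097152, confirmed by the session above
        KNOWN.put(1L << 22, "STRICT_ALL_TABLES");   // 4194304, assumption: verify before relying on it
    }

    /** Returns the names of all known modes whose bit is set in the mask. */
    static List<String> decode(long mask) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Long, String> e : KNOWN.entrySet()) {
            if ((mask & e.getKey()) != 0) {
                names.add(e.getValue());
            }
        }
        return names;
    }
}
```

Calling `SqlModeBits.decode(2097152L)` yields a one-element list containing STRICT_TRANS_TABLES, which matches what the server printed.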
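Now we seem to have found a clue, but that raises another question.
Where does SET @@session.sql_mode=2097152; come from? Was it written by the application, or does MySQL add it on its own?
After some digging, I traced this SQL to the Java JDBC driver.
The following code is taken from the driver's ConnectionImpl.java: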
private void setupServerForTruncationChecks() throws SQLException {
    if (getJdbcCompliantTruncation()) {
        if (versionMeetsMinimum(5, 0, 2)) {
            String currentSqlMode =
                this.serverVariables.get("sql_mode");
            boolean strictTransTablesIsSet = StringUtils.indexOfIgnoreCase(currentSqlMode, "STRICT_TRANS_TABLES") != -1;
            if (currentSqlMode == null ||
                    currentSqlMode.length() == 0 || !strictTransTablesIsSet) {
                StringBuffer commandBuf = new StringBuffer("SET sql_mode='");
                if (currentSqlMode != null && currentSqlMode.length() > 0) {
                    commandBuf.append(currentSqlMode);
                    commandBuf.append(",");
                }
                commandBuf.append("STRICT_TRANS_TABLES'");
                execSQL(null, commandBuf.toString(), -1, null,
                        DEFAULT_RESULT_SET_TYPE,
                        DEFAULT_RESULT_SET_CONCURRENCY, false,
                        this.database, null, false);
                setJdbcCompliantTruncation(false); // server's handling this for us now
            } else if (strictTransTablesIsSet) {
                // We didn't set it, but someone did, so we piggy back on it
                setJdbcCompliantTruncation(false); // server's handling this for us now
            }
        }
    }
}
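Roughly speaking: if sql_mode is '' (or otherwise lacks STRICT_TRANS_TABLES), the Java driver raises the sql_mode level by appending it: commandBuf.append("STRICT_TRANS_TABLES'");
OK, now we know that the SET comes from Java, but that raises yet another question. Even with STRICT_TRANS_TABLES set, if the statement were going to fail, the master should have reported the error. Why are the master and Slave A fine while Slave B fails to replicate?
The answer is already fairly obvious: Slave B is 5.6. To put it more plainly:
In strict mode, the statement can be executed on 5.1 but not on 5.6. Should this count as a new safety feature of 5.6?
Interested readers can test this for themselves.
Solutions
Configure Java, or modify the Java source, so that it does not change MySQL's sql_mode.
Temporary workaround: insert ignore xxx;
Standardize sql_mode across servers.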
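For the "configure Java" option, the quoted driver code only issues the SET when getJdbcCompliantTruncation() returns true, so turning that connection property off should skip it. The sketch below builds a JDBC URL with `jdbcCompliantTruncation=false`; the class name `JdbcUrlSketch`, the host, port and database are placeholders, and the effect should be verified against the driver version actually in use.

```java
/** Hypothetical helper that builds a JDBC URL with the driver's truncation checks disabled. */
final class JdbcUrlSketch {
    /**
     * host, port and database are placeholders supplied by the caller.
     * jdbcCompliantTruncation=false makes the guard in setupServerForTruncationChecks()
     * false, so the driver does not append STRICT_TRANS_TABLES to the session sql_mode.
     */
    static String urlWithoutTruncationChecks(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?jdbcCompliantTruncation=false";
    }
}
```

The trade-off is that the driver then no longer asks the server to raise errors on truncated data, so the application has to tolerate or detect truncation itself.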
Scenario 3
A bizarre error from CASE WHEN
* DB architecture: Master (5.6)
* sql_mode is '' everywhere;
* The error is as follows:
Replicate_Wild_Ignore_Table: mysql.%,test.%
Last_Errno: 1048
Last_Error: Error 'Column 'referer' cannot be null' on query. Default database: 'action_db'. Query: 'insert into oplogin_log(`cityId`,`userId`,`userName`,`uri`,`referer`,`logType`,`logDate`,`ip`,`status`)
values('','','kyqxmxyt','/login.php?rtn=1','http://xx.com:80/login.php?rtn=' RLIKE (SELECT (CASE WHEN (ORD(MID((SELECT IFNULL(CAST(COUNT(DISTINCT(schema_name)) AS CHAR),0x20) FROM INFORMATION_SCHEMA.SCHEMATA),1,1))>50) THEN 0x687474703a2f2f6f70746f6f6c732e616e6a756b652e636f6d3a38302f6c6f67696e2e7068703f72746e3d ELSE 0x28 END)) AND 'aeWZ'='aeWZ','1','1416020259','114.242.250.192','2') #v1:checklogin@login.php (15) 1416020259'
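Let me roughly paraphrase this bizarre yet impressive SQL: it counts the distinct schema names in INFORMATION_SCHEMA.SCHEMATA, takes the first character of that count, and lets a CASE WHEN choose the result based on that character, so the value written to referer depends on server-side data and can come out as NULL on one server but not on the other.
Reduced to a simpler form, the SQL looks like this: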
master:abc> desc abc;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| id2 | int(11) | NO | | 6 | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)
master:abc> select * from abc;
+------+-----+
| id | id2 |
+------+-----+
| 1 | 0 |
| 1 | 0 |
| 2 | 0 |
| 2 | 0 |
| 1 | 1 |
| 1 | 0 |
+------+-----+
6 rows in set (0.00 sec)
master:abc> select * from lc;
Empty set (0.00 sec)
master:abc> insert into abc values('1', case when (select count(*) from lc) < 1 then 1 else NULL end );
Query OK, 1 row affected (0.00 sec)
The master binlog looks like this:
*binlog*
# at 1109
#141125 12:44:51 server id 101082106 end_log_pos 1271 CRC32 0x9ec0ca94 Query thread_id=28 exec_time=0 error_code=0
SET TIMESTAMP=1416890691/*!*/;
insert into abc values('1', case when (select count(*) from lc) < 1 then 1 else NULL end )
/*!*/;
slave:abc> select * from abc;
+------+-----+
| id | id2 |
+------+-----+
| 1 | 0 |
| 1 | 0 |
| 2 | 0 |
| 2 | 0 |
| 1 | 0 |
| 1 | 0 |
+------+-----+
6 rows in set (0.00 sec)
slave:abc> select * from lc;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)
*slave status*
Last_SQL_Errno: 1048
Last_SQL_Error: Error 'Column 'id2' cannot be null' on query. Default database: 'abc'. Query: 'insert into abc values('1', case when (select count(*) from lc) < 1 then 1 else NULL end )'
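Conclusion
In the end, the binlog is not row-based (RBR), so the error occurs: the slave re-executes the statement itself, and because its lc table has 3 rows while the master's is empty, the CASE evaluates to NULL there.
Temporary workaround: insert ignore xxx, then repair the data with pt-table-checksum and pt-table-sync or similar tools.
Ban CASE WHEN statements of this kind.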
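Why the binlog format matters here can be modeled in a few lines. This is a simplified sketch, not how the server is implemented: `caseValue` mirrors the CASE expression from the reduced example, and the class name `StatementReplaySketch` and its methods are hypothetical.

```java
/** Simplified model of statement-based versus row-based replay of the reduced INSERT. */
final class StatementReplaySketch {
    /** Mirrors: case when (select count(*) from lc) < 1 then 1 else NULL end */
    static Integer caseValue(int lcRowCount) {
        return lcRowCount < 1 ? Integer.valueOf(1) : null;
    }

    /** Value the slave would try to store in id2 under each binlog format. */
    static Integer slaveValue(boolean rowBased, int masterLcRows, int slaveLcRows) {
        // Row-based: the slave applies the row image that was computed on the master.
        // Statement-based: the slave re-runs the statement against its own copy of lc.
        return rowBased ? caseValue(masterLcRows) : caseValue(slaveLcRows);
    }
}
```

With the data shown above (lc empty on the master, 3 rows on the slave), `slaveValue(false, 0, 3)` is null, which is the value the NOT NULL column id2 rejects, while `slaveValue(true, 0, 3)` is 1, the value the master stored.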