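MySQL Replication -- Replication Lag 01 -- Guessing at the Source Code
I do not understand the MySQL source code at all. Everything below is pure guesswork, and I take no responsibility if it misleads you!
Source version: MySQL 5.6.28
In sql/rpl_slave.cc, time_diff is computed by the following code: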
/* The pseudo code to compute Seconds_Behind_Master:
   if (SQL thread is running)
   {
     if (SQL thread processed all the available relay log)
     {
       if (IO thread is running)
         print 0;
       else
         print NULL;
     }
     else
       compute Seconds_Behind_Master;
   }
   else
     print NULL;
*/
if (mi->rli->slave_running)
{
  /* Check if SQL thread is at the end of relay log
     Checking should be done using two conditions
     condition1: compare the log positions and
     condition2: compare the file names (to handle rotation case)
  */
  if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
      (!strcmp(mi->get_master_log_name(), mi->rli->get_group_master_log_name())))
  {
    if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
      protocol->store(0LL);
    else
      protocol->store_null();
  }
  else
  {
    long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp)
                     - mi->clock_diff_with_master);
    /* Apparently on some systems time_diff can be <0. Here are possible
       reasons related to MySQL:
       - the master is itself a slave of another master whose time is ahead.
       - somebody used an explicit SET TIMESTAMP on the master.
       Possible reason related to granularity-to-second of time functions
       (nothing to do with MySQL), which can explain a value of -1:
       assume the master's and slave's time are perfectly synchronized, and
       that at slave's connection time, when the master's timestamp is read,
       it is at the very end of second 1, and (a very short time later) when
       the slave's timestamp is read it is at the very beginning of second
       2. Then the recorded value for master is 1 and the recorded value for
       slave is 2. At SHOW SLAVE STATUS time, assume that the difference
       between timestamp of slave and rli->last_master_timestamp is 0
       (i.e. they are in the same second), then we get 0-(2-1)=-1 as a result.
       This confuses users, so we don't go below 0: hence the max().
       last_master_timestamp == 0 (an "impossible" timestamp 1970) is a
       special marker to say "consider we have caught up".
    */
    protocol->store((longlong)(mi->rli->last_master_timestamp ?
                               max(0L, time_diff) : 0));
  }
}
else
{
  protocol->store_null();
}
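1. When the SQL thread is stopped, NULL is returned.
2. When the SQL thread is running and has already reached the end of the relay log, 0 is returned if the IO thread is connected (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT); otherwise NULL is returned.
3. When the SQL thread is running and has not yet reached the end of the relay log, replication lag = current system time on the slave (time(0)) - timestamp of the last binlog event handled by the SQL thread (mi->rli->last_master_timestamp) - clock difference between master and slave (mi->clock_diff_with_master). A negative result is clamped to 0, and last_master_timestamp == 0 is treated as "caught up", which also yields 0.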
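The three cases above can be condensed into a small standalone function. This is only a sketch that mirrors the branch structure of the quoted code. The ReplicaState and Lag structs, their field names, and seconds_behind_master are hypothetical stand-ins invented for illustration, not MySQL types, and the current time is passed in as a parameter so the arithmetic can be checked by hand.

```cpp
#include <algorithm>
#include <ctime>

// Hypothetical snapshot of the values the quoted code reads (not a MySQL type).
struct ReplicaState {
  bool sql_thread_running;            // mi->rli->slave_running
  bool io_thread_connected;           // mi->slave_running == MYSQL_SLAVE_RUN_CONNECT
  bool at_end_of_relay_log;           // log file name and position both match
  std::time_t last_master_timestamp;  // mi->rli->last_master_timestamp
  long clock_diff_with_master;        // mi->clock_diff_with_master
};

// Hypothetical result type: either NULL or a number of seconds.
struct Lag {
  bool is_null;
  long long seconds;
};

Lag seconds_behind_master(const ReplicaState &s, std::time_t now) {
  // Case 1: SQL thread stopped.
  if (!s.sql_thread_running) return Lag{true, 0};
  // Case 2: everything in the relay log has been applied.
  if (s.at_end_of_relay_log)
    return s.io_thread_connected ? Lag{false, 0} : Lag{true, 0};
  // Timestamp 0 is the special "consider we have caught up" marker.
  if (s.last_master_timestamp == 0) return Lag{false, 0};
  // Case 3: compute the difference and never report less than 0.
  long time_diff = static_cast<long>(now - s.last_master_timestamp) -
                   s.clock_diff_with_master;
  return Lag{false, std::max(0L, time_diff)};
}
```

With last_master_timestamp = 1000, clock_diff_with_master = 2 and now = 1010, the function reports 8 seconds. With a clock difference of 20 the raw result would be -10, which is clamped to 0.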
Clock difference between master and slave (mi->clock_diff_with_master)
In sql/rpl_slave.cc, the clock difference between master and slave is computed as follows:
/* Compare the master and slave's clock. Do not die if master's clock is
   unavailable (very old master not supporting UNIX_TIMESTAMP()?). */
DBUG_EXECUTE_IF("dbug.before_get_UNIX_TIMESTAMP",
                {
                  const char act[]=
                    "now "
                    "wait_for signal.get_unix_timestamp";
                  DBUG_ASSERT(opt_debug_sync_timeout > 0);
                  DBUG_ASSERT(!debug_sync_set_action(current_thd,
                                                     STRING_WITH_LEN(act)));
                };);
master_res= NULL;
if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
    (master_res= mysql_store_result(mysql)) &&
    (master_row= mysql_fetch_row(master_res)))
{
  mysql_mutex_lock(&mi->data_lock);
  mi->clock_diff_with_master=
    (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
  mysql_mutex_unlock(&mi->data_lock);
}
else if (check_io_slave_killed(mi->info_thd, mi, NULL))
  goto slave_killed_err;
else if (is_network_error(mysql_errno(mysql)))
{
  mi->report(WARNING_LEVEL, mysql_errno(mysql),
             "Get master clock failed with error: %s", mysql_error(mysql));
  goto network_err;
}
else
{
  mysql_mutex_lock(&mi->data_lock);
  mi->clock_diff_with_master= 0; /* The "most sensible" value */
  mysql_mutex_unlock(&mi->data_lock);
  sql_print_warning("\"SELECT UNIX_TIMESTAMP()\" failed on master, "
                    "do not trust column Seconds_Behind_Master of SHOW "
                    "SLAVE STATUS. Error: %s (%d)",
                    mysql_error(mysql), mysql_errno(mysql));
}
if (master_res)
{
  mysql_free_result(master_res);
  master_res= NULL;
}
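Clock difference between master and slave = current time on the slave (time((time_t*) 0)) - current time on the master (UNIX_TIMESTAMP()). The master's time is obtained by running SELECT UNIX_TIMESTAMP() on the master and parsing the result (strtoul(master_row[0], 0, 10)).
The value of clock_diff_with_master is computed when the IO thread starts. If the master's clock is changed afterwards, clock_diff_with_master becomes stale and the reported lag is skewed by the size of the change.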
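To see how a stale clock_diff_with_master distorts the result, the arithmetic can be replayed with made-up numbers. The sketch below is an illustration under assumptions: clock_diff and lag are hypothetical helpers that simply repeat the two formulas quoted in this article, and every timestamp is invented.

```cpp
#include <algorithm>
#include <cstdlib>

// Slave time minus master time. The master time arrives as text, the way
// the result of SELECT UNIX_TIMESTAMP() does in the quoted code.
long clock_diff(long slave_now, const char *master_unix_timestamp) {
  return slave_now -
         static_cast<long>(std::strtoul(master_unix_timestamp, nullptr, 10));
}

// The lag formula from the first section, including the clamp at 0.
long lag(long slave_now, long last_master_timestamp, long diff) {
  return std::max(0L, (slave_now - last_master_timestamp) - diff);
}
```

Suppose both clocks read 5000 when the IO thread starts, so clock_diff(5000, "5000") is 0. If the master's clock is later set back by 100 seconds, an event stamped 5000 on the master is applied immediately at slave time 5100. With the stale difference, lag(5100, 5000, 0) reports 100 seconds even though the slave is up to date, while a freshly measured difference of 100 would give lag(5100, 5000, 100) = 0. If the master's clock is instead set forward, the raw value goes negative and is clamped to 0, which can hide real lag.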
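The SQL thread on the slave updates last_master_timestamp after it has read an event from the relay log but before it starts executing that event. The update happens once per event.
Computing last_master_timestamp under non-parallel replication
In the exec_relay_log_event function in sql/rpl_slave.cc, the code that computes last_master_timestamp for non-parallel replication is: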
/**
  Top-level function for executing the next event in the relay log.
  This is called from the SQL thread.

  This function reads the event from the relay log, executes it, and
  advances the relay log position. It also handles errors, etc.

  This function may fail to apply the event for the following reasons:

   - The position specfied by the UNTIL condition of the START SLAVE
     command is reached.
   - It was not possible to read the event from the log.
   - The slave is killed.
   - An error occurred when applying the event, and the event has been
     tried slave_trans_retries times. If the event has been retried
     fewer times, 0 is returned.
   - init_info or init_relay_log_pos failed. (These are called
     if a failure occurs when applying the event.)
   - An error occurred when updating the binlog position.

  @retval 0 The event was applied.
  @retval 1 The event was not applied.
*/
static int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{
  DBUG_ENTER("exec_relay_log_event");

  /* We acquire this mutex since we need it for all operations except
     event execution. But we will release it in places where we will
     wait for something for example inside of next_event().
  */
  mysql_mutex_lock(&rli->data_lock);

  /* UNTIL_SQL_AFTER_GTIDS requires special handling since we have to check
     whether the until_condition is satisfied *before* the SQL threads goes on
     a wait inside next_event() for the relay log to grow. This is reuired since
     if we have already applied the last event in the waiting set but since he
     check happens only at the start of the next event we may end up waiting
     forever the next event is not available or is delayed.
  */
  if (rli->until_condition == Relay_log_info::UNTIL_SQL_AFTER_GTIDS &&
      rli->is_until_satisfied(thd, NULL))
  {
    rli->abort_slave= 1;
    mysql_mutex_unlock(&rli->data_lock);
    DBUG_RETURN(1);
  }

  Log_event *ev = next_event(rli), **ptr_ev;

  DBUG_ASSERT(rli->info_thd==thd);

  if (sql_slave_killed(thd,rli))
  {
    mysql_mutex_unlock(&rli->data_lock);
    delete ev;
    DBUG_RETURN(1);
  }
  if (ev)
  {
    enum enum_slave_apply_event_and_update_pos_retval exec_res;

    ptr_ev= &ev;
    /* Even if we don't execute this event, we keep the master timestamp,
       so that seconds behind master shows correct delta (there are events
       that are not replayed, so we keep falling behind).
       If it is an artificial event, or a relay log event (IO thread generated
       event) or ev->when is set to 0, or a FD from master, or a heartbeat
       event with server_id '0' then we don't update the last_master_timestamp.
    */
    if (!(rli->is_parallel_exec() ||
          ev->is_artificial_event() || ev->is_relay_log_event() ||
          ev->when.tv_sec == 0 ||
          ev->get_type_code() == FORMAT_DESCRIPTION_EVENT ||
          ev->server_id == 0))
    {
      rli->last_master_timestamp= ev->when.tv_sec + (time_t) ev->exec_time;
      DBUG_ASSERT(rli->last_master_timestamp >= 0);
    }
Here when.tv_sec is the time at which the event started on the master, and ev->exec_time is how long it took to execute on the master. Only Query_log_event and Load_log_event record exec_time.
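The update rule can be isolated from the surrounding function as follows. This is a minimal sketch, assuming a flattened EventInfo struct whose fields stand in for the accessor calls in the quoted condition. EventInfo and next_last_master_timestamp are hypothetical names, not part of MySQL.

```cpp
#include <ctime>

// Hypothetical flattened view of what the quoted condition inspects.
struct EventInfo {
  bool artificial;          // ev->is_artificial_event()
  bool relay_log_event;     // ev->is_relay_log_event()
  bool format_description;  // ev->get_type_code() == FORMAT_DESCRIPTION_EVENT
  unsigned server_id;       // ev->server_id (0 marks a heartbeat event)
  std::time_t when_sec;     // ev->when.tv_sec
  unsigned long exec_time;  // ev->exec_time
};

// Returns the value last_master_timestamp holds after the event is read:
// unchanged when the event is skipped, otherwise start time on the master
// plus execution time on the master.
std::time_t next_last_master_timestamp(std::time_t current,
                                       const EventInfo &ev,
                                       bool parallel_exec) {
  bool skip = parallel_exec || ev.artificial || ev.relay_log_event ||
              ev.when_sec == 0 || ev.format_description ||
              ev.server_id == 0;
  if (skip) return current;
  return ev.when_sec + static_cast<std::time_t>(ev.exec_time);
}
```

A query event that started at 1000 on the master and ran for 5 seconds moves last_master_timestamp to 1005. The same event leaves the value untouched when parallel execution is on, because that path updates the timestamp elsewhere.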
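Computing last_master_timestamp under parallel replication
Parallel replication has a dispatch queue called the gaq. The SQL thread reads binlog transactions into the gaq and then hands them out to worker threads for execution. Under parallel replication, binlog events are executed concurrently and interleaved. The gaq has a checkpoint called the lwm (low-water mark): every transaction up to and including the lwm has been executed, while of the transactions after the lwm some have been executed and some have not.
Suppose there are 2 worker threads and the gaq holds eight transactions numbered 1 through 8. Worker 1 has executed transactions 1, 4 and 6, and worker 2 has executed transactions 2 and 3. The lwm is then 4, because transaction 5 has not been executed yet.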
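The example can be checked with a few lines of code. This sketch only models the idea of a low-water mark over transactions numbered from 1; it is not how the gaq is implemented in MySQL, and low_water_mark is a hypothetical helper.

```cpp
#include <set>
#include <vector>

// Returns the largest k such that transactions 1..k have all been executed
// by some worker, or 0 when transaction 1 itself is still pending.
int low_water_mark(const std::vector<int> &worker1_done,
                   const std::vector<int> &worker2_done) {
  // Merge what both workers have finished into one ordered set.
  std::set<int> done(worker1_done.begin(), worker1_done.end());
  done.insert(worker2_done.begin(), worker2_done.end());
  // Walk forward from 1 until the first transaction that is not finished.
  int lwm = 0;
  while (done.count(lwm + 1)) ++lwm;
  return lwm;
}
```

Called with {1, 4, 6} for worker 1 and {2, 3} for worker 2, the function returns 4: transaction 6 is finished, but it stays above the mark until transaction 5 completes.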
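When parallel replication updates the gaq checkpoint, it advances the lwm and sets last_master_timestamp to the time of the ending event of the transaction at the lwm. Parallel replication therefore updates last_master_timestamp only after a transaction has finished executing, and it does so once per transaction. How often the gaq checkpoint is updated is also governed by the slave_checkpoint_period parameter.
As a result, the lag reported under parallel replication differs from the lag reported under non-parallel replication, and the difference can be as large as slave_checkpoint_period plus the time the transaction takes to execute on the slave. This is why a small lag sometimes shows up under parallel replication and disappears after switching to non-parallel replication.
In addition, when the SQL thread is waiting for the IO thread and the gaq is empty, last_master_timestamp is set to 0. In that state the slave is likewise considered caught up, and Seconds_Behind_Master is computed as 0.
Copied from https://www.kancloud.cn/taobaomysql/monthly/140089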