
A Brief Look at IN vs. NOT IN and EXISTS vs. NOT EXISTS in SQL: Differences and Performance Analysis

Published: 2025/3/20 55 豆豆


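1. IN and EXISTS

IN joins the outer table and the inner table with a hash join, while EXISTS loops over the outer table and queries the inner table once per iteration. The long-standing claim that EXISTS is always more efficient than IN is inaccurate. If the two tables are about the same size, IN and EXISTS perform about the same; if one table is small and the other large, use EXISTS when the subquery's table is the large one and IN when the subquery's table is the small one. Many modern optimizers rewrite both forms into the same semi-join, so treat this as a rule of thumb and confirm it against the actual execution plan.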

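For example, with table A (small) and table B (large):

select * from A where cc in (select cc from B);
-- less efficient: uses the index on column cc of table A

select * from A where exists (select cc from B where cc = A.cc);
-- more efficient: uses the index on column cc of table B

Conversely:

select * from B where cc in (select cc from A);
-- more efficient: uses the index on column cc of table B

select * from B where exists (select cc from A where cc = B.cc);
-- less efficient: uses the index on column cc of table A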
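Which index a given engine actually uses depends on its optimizer, so it is worth checking rather than assuming. The sketch below uses Python's built-in sqlite3 module as a stand-in database (an assumption; the article does not name an engine here), with made-up row counts and hypothetical index names. It confirms that the two forms return the same rows and prints whatever plan SQLite chooses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (cc INTEGER);            -- small table
    CREATE TABLE B (cc INTEGER);            -- large table
    CREATE INDEX idx_a_cc ON A (cc);        -- hypothetical index names
    CREATE INDEX idx_b_cc ON B (cc);
""")
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(0, 20)])
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in range(10, 5000)])

q_in = "SELECT * FROM A WHERE cc IN (SELECT cc FROM B) ORDER BY cc"
q_exists = ("SELECT * FROM A WHERE EXISTS "
            "(SELECT cc FROM B WHERE cc = A.cc) ORDER BY cc")

rows_in = conn.execute(q_in).fetchall()
rows_exists = conn.execute(q_exists).fetchall()
print("same rows:", rows_in == rows_exists)

# The plan text varies between SQLite versions, so it is printed, not asserted.
for label, query in (("IN", q_in), ("EXISTS", q_exists)):
    print(label)
    for row in conn.execute("EXPLAIN QUERY PLAN " + query):
        print("   ", row[-1])
```

On other databases the equivalent check is the engine's own EXPLAIN facility; the plans will not necessarily match SQLite's.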

2. NOT IN and NOT EXISTS

NOT IN is not logically equivalent to NOT EXISTS. If you misuse NOT IN, your program may contain a serious bug. Consider the following example:

create table #t1(c1 int, c2 int);
create table #t2(c1 int, c2 int);
insert into #t1 values(1, 2);
insert into #t1 values(1, 3);
insert into #t2 values(1, 2);
insert into #t2 values(1, null);

select * from #t1 where c2 not in (select c2 from #t2);
-- result: no rows

select * from #t1 where not exists (select 1 from #t2 where #t2.c2 = #t1.c2);
-- result: 1  3

As you can see, NOT IN returns an unexpected result set: because #t2.c2 contains a NULL, no NOT IN comparison can evaluate to true. The execution plans of the two select statements also differ; the latter uses hash_aj. So prefer NOT EXISTS (which runs a correlated subquery) over NOT IN (which runs a plain subquery) where possible. If any row returned by the subquery contains a NULL, the NOT IN query returns no rows. If the subquery column has a NOT NULL constraint, NOT IN is safe to use, and a hint (in Oracle) can make it use a HASH_AJ or MERGE_AJ join.
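The NULL pitfall above can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch: ordinary tables t1 and t2 stand in for the #t1 and #t2 temporary tables, and the third query (filtering NULLs inside the subquery) is an added illustration, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (c1 INT, c2 INT);
    CREATE TABLE t2 (c1 INT, c2 INT);
    INSERT INTO t1 VALUES (1, 2), (1, 3);
    INSERT INTO t2 VALUES (1, 2), (1, NULL);
""")

# NOT IN: the NULL in t2.c2 makes "3 NOT IN (2, NULL)" unknown, so no row passes.
not_in = conn.execute(
    "SELECT * FROM t1 WHERE c2 NOT IN (SELECT c2 FROM t2)"
).fetchall()

# NOT EXISTS: only asks whether a matching row exists, so NULL never interferes.
not_exists = conn.execute(
    "SELECT * FROM t1 WHERE NOT EXISTS "
    "(SELECT 1 FROM t2 WHERE t2.c2 = t1.c2)"
).fetchall()

# Added illustration: excluding NULLs in the subquery makes NOT IN agree.
not_in_filtered = conn.execute(
    "SELECT * FROM t1 WHERE c2 NOT IN "
    "(SELECT c2 FROM t2 WHERE c2 IS NOT NULL)"
).fetchall()

print(not_in)           # []
print(not_exists)       # [(1, 3)]
print(not_in_filtered)  # [(1, 3)]
```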

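In this execution model, a query that uses NOT IN performs a full table scan on both the inner and outer tables and uses no index, whereas the subquery of NOT EXISTS can still use the index on its table. So whichever table is larger, NOT EXISTS is generally faster than NOT IN.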

3. The difference between IN and =

select name from student where name in ('zhang','wang','zhao');

and

select name from student where name='zhang' or name='wang' or name='zhao';

return the same result.
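A quick way to check this equivalence is to run both statements against the same data. The sketch below uses Python's sqlite3 module with a hypothetical student table; the extra names and the NULL row are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?)",
    [("zhang",), ("wang",), ("zhao",), ("li",), (None,)],  # made-up rows
)

with_in = sorted(conn.execute(
    "SELECT name FROM student WHERE name IN ('zhang','wang','zhao')"
).fetchall())

with_or = sorted(conn.execute(
    "SELECT name FROM student "
    "WHERE name='zhang' OR name='wang' OR name='zhao'"
).fetchall())

print(with_in == with_or)  # True
```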

-----

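Further analysis:

1. How EXISTS executes
select * from t1 where exists ( select null from t2 where y = x )

This can be understood as:
for x in ( select * from t1 ) loop
    if ( exists ( select null from t2 where y = x.x ) ) then
        OUTPUT THE RECORD
    end if
end loop

Performance difference between IN and EXISTS:
If the subquery produces a small result set and the main query's table is large and indexed, use IN; conversely, if the outer query has few rows and the subquery's table is large and indexed, use EXISTS.
The real distinction between IN and EXISTS is the change in driving order, which is the key to the performance difference. With EXISTS, the outer table is the driving table and is accessed first; with IN, the subquery is executed first. Since the goal is for the driving table to return quickly, the choice comes down to the indexes and the sizes of the result sets.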
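The two driving orders can be modelled in plain Python. This is only a toy model under stated assumptions: a set stands in for an index on t2.y, a dict stands in for an index on t1.x, the rows are made up, and NULL handling is ignored. It shows that both orders yield the same rows while doing their work on different sides.

```python
from collections import defaultdict

def exists_style(t1_rows, t2_y_index):
    """Outer table drives: scan t1 and probe the t2.y index once per row."""
    return [row for row in t1_rows if row["x"] in t2_y_index]

def in_style(t1_x_index, t2_rows):
    """Subquery drives: collect the distinct y values, then look each one
    up in the t1.x index."""
    result = []
    for y in sorted({r["y"] for r in t2_rows}):
        result.extend(t1_x_index.get(y, []))
    return result

# Made-up sample data.
t1 = [{"x": 1, "name": "a"}, {"x": 2, "name": "b"},
      {"x": 2, "name": "c"}, {"x": 7, "name": "d"}]
t2 = [{"y": 2}, {"y": 2}, {"y": 7}, {"y": 9}]

t2_y_index = {r["y"] for r in t2}          # stand-in for an index on t2.y
t1_x_index = defaultdict(list)             # stand-in for an index on t1.x
for row in t1:
    t1_x_index[row["x"]].append(row)

via_exists = exists_style(t1, t2_y_index)
via_in = in_style(t1_x_index, t2)

def key(row):
    return (row["x"], row["name"])

print(sorted(via_exists, key=key) == sorted(via_in, key=key))  # True
```

The cost of exists_style grows with the size of t1, while the cost of in_style grows with the number of distinct values in t2, which is the intuition behind the rule of thumb above.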

Also, IN never matches a NULL.
For example: select 1 from dual where null in (0,1,2,null) returns no rows.
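The same behaviour can be observed with Python's sqlite3 module. SQLite has no dual table, so the sketch uses a SELECT without a FROM clause instead (an adaptation, not the original Oracle statement).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL IN (...) is unknown, so the WHERE clause filters the row out.
rows = conn.execute("SELECT 1 WHERE NULL IN (0, 1, 2, NULL)").fetchall()

# Looking at the raw value of the IN expression (None means SQL NULL):
null_in = conn.execute("SELECT NULL IN (0, 1, 2, NULL)").fetchone()[0]
one_in = conn.execute("SELECT 1 IN (0, 1, 2, NULL)").fetchone()[0]
three_in = conn.execute("SELECT 3 IN (0, 1, 2, NULL)").fetchone()[0]

print(rows)      # []
print(null_in)   # None
print(one_in)    # 1
print(three_in)  # None
```

A value that is present in the list still matches; a value that is absent yields unknown rather than false once the list contains a NULL, which is what breaks NOT IN.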

2. NOT IN and NOT EXISTS:
How NOT EXISTS executes
select ..... from rollup R where not exists ( select 'Found' from title T where R.source_id = T.Title_ID);
This can be understood as:
for x in ( select * from rollup ) loop
    if ( not exists ( that query ) ) then
        OUTPUT
    end if;
end loop;

Note: NOT EXISTS and NOT IN are not fully interchangeable; which one to use depends on the requirement. If the selected column can be NULL, one cannot replace the other.

For example, compare the following statements:
select x,y from t;

The x and y data are:
x      y
------ ------
1      3
3      1
1      2
1      1
3      1
5      NULL

The NOT IN and NOT EXISTS queries give these results:
select * from t where x not in (select y from t t2);
Result: no rows

select * from t where not exists (select null from t t2 where t2.y = t.x);

Result:
x      y
------ ------
5      NULL

So the choice depends on the specific requirement.
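The example above can be replayed with Python's sqlite3 module, using the same six rows. This is a sketch on SQLite, which is an assumption about the engine; the NULL semantics shown follow standard SQL three-valued logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (x INT, y INT);
    INSERT INTO t VALUES (1, 3), (3, 1), (1, 2), (1, 1), (3, 1), (5, NULL);
""")

# The subquery returns a NULL y, so NOT IN can never evaluate to true.
not_in = conn.execute(
    "SELECT * FROM t WHERE x NOT IN (SELECT y FROM t t2)"
).fetchall()

# NOT EXISTS keeps the row whose x has no matching y.
not_exists = conn.execute(
    "SELECT * FROM t WHERE NOT EXISTS "
    "(SELECT NULL FROM t t2 WHERE t2.y = t.x)"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(5, None)]
```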

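Performance difference between NOT IN and NOT EXISTS:
Use NOT IN only when the column after the select keyword in the subquery has a NOT NULL constraint or is otherwise known to be non-null. Also, if the main query's table is large and the subquery's table is small but has many rows, use NOT IN together with a hash anti-join.
If the main query's table has few rows and the subquery's table has many rows and an index, NOT EXISTS is a good choice. With NOT IN, it is also best to use /*+ HASH_AJ */ or an outer join plus IS NULL.
NOT IN works better under the cost-based optimizer.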

For example:
select .....
from rollup R
where not exists ( select 'Found' from title T
                   where R.source_id = T.Title_ID);

can be rewritten as (better):

select ......
from title T, rollup R
where R.source_id = T.Title_id(+)
and T.Title_id is null;

or (better):
select /*+ HASH_AJ */ ...
from rollup R
where source_id NOT IN ( select Title_ID
                         from title T
                         where Title_ID IS NOT NULL )
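The three anti-join formulations can be compared for equivalence with Python's sqlite3 module. Several things here are assumptions: the sample rows are made up, the Oracle (+) outer join is written as an ANSI LEFT JOIN because SQLite does not support (+), the HASH_AJ hint is omitted because it is Oracle-specific, and the table name rollup is quoted in case an engine reserves that word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "rollup" (source_id INT);
    CREATE TABLE title (title_id INT);
    INSERT INTO "rollup" VALUES (1), (2), (3), (4);   -- made-up rows
    INSERT INTO title VALUES (2), (4), (NULL);        -- includes a NULL
""")

queries = {
    "not_exists": (
        'SELECT R.source_id FROM "rollup" R WHERE NOT EXISTS '
        "(SELECT 'Found' FROM title T WHERE R.source_id = T.title_id) "
        "ORDER BY 1"
    ),
    # ANSI equivalent of: R.source_id = T.Title_id(+) AND T.Title_id IS NULL
    "outer_join": (
        'SELECT R.source_id FROM "rollup" R '
        "LEFT JOIN title T ON R.source_id = T.title_id "
        "WHERE T.title_id IS NULL ORDER BY 1"
    ),
    # NOT IN is only safe because the subquery filters out NULLs.
    "not_in": (
        'SELECT R.source_id FROM "rollup" R WHERE R.source_id NOT IN '
        "(SELECT T.title_id FROM title T WHERE T.title_id IS NOT NULL) "
        "ORDER BY 1"
    ),
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

All three return the rollup rows with no matching title; which one runs fastest is engine-specific and should be judged from the execution plan.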

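Discussion of IN and EXISTS.
select * from t1 where x in ( select y from t2 )
can in effect be understood as:
select *
from t1, ( select distinct y from t2 ) t2
where t1.x = t2.y;
With some SQL tuning experience, it follows naturally from this form that t2 must not be a large table, because all of t2 has to go through a "unique sort"; if t2 is very large, the cost of that sort is unacceptable. t1, however, can be large. Why? The simplest explanation is that t1.x = t2.y can use an index, but that is not a very good explanation. Consider: if both t1.x and t2.y are indexed, and an index is an ordered structure, then the best plan between t1 and t2 is a merge join. In addition, an index on t2.y greatly speeds up the sort of t2.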
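To see why the DISTINCT matters in that rewrite, the sketch below (Python's sqlite3 module, made-up rows with duplicates in t2) compares the IN query with the join against the distinct subquery and with a plain join. The alias d is a hypothetical name chosen to avoid reusing t2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INT);
    CREATE TABLE t2 (y INT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (2), (2), (2), (3);   -- duplicates on purpose
""")

with_in = conn.execute(
    "SELECT x FROM t1 WHERE x IN (SELECT y FROM t2) ORDER BY x"
).fetchall()

join_distinct = conn.execute(
    "SELECT t1.x FROM t1, (SELECT DISTINCT y FROM t2) d "
    "WHERE t1.x = d.y ORDER BY t1.x"
).fetchall()

# Without DISTINCT the join repeats a t1 row once per matching t2 row.
join_plain = conn.execute(
    "SELECT t1.x FROM t1, t2 WHERE t1.x = t2.y ORDER BY t1.x"
).fetchall()

print(with_in)        # [(2,), (3,)]
print(join_distinct)  # [(2,), (3,)]
print(join_plain)     # [(2,), (2,), (2,), (3,)]
```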
select * from t1 where exists ( select null from t2 where y = x )
can be understood as:
for x in ( select * from t1 )
loop
    if ( exists ( select null from t2 where y = x.x ) )
    then
        OUTPUT THE RECORD!
    end if
end loop
This one is easier to understand: t1 is always a table scan! So t1 must not be a large table, while t2 can be very large, because y = x.x can use the index on t2.y.
Combining the discussion of IN and EXISTS above gives a general rule: IN suits a large outer table with a small inner table; EXISTS suits a small outer table with a large inner table.
Optimize according to the actual situation. Neither form is categorically more efficient than the other; it is all relative.
