程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
 程式師世界 >> 數據庫知識 >> 其他數據庫知識 >> MSSQL >> 多列復合索引的應用 繞過微軟sql server的一個缺點

多列復合索引的應用 繞過微軟sql server的一個缺點

編輯:MSSQL

多列復合索引的應用 繞過微軟sql server的一個缺點。本站提示廣大學習愛好者:(多列復合索引的應用 繞過微軟sql server的一個缺點)文章只能為提供參考,不一定能成為您想要的結果。以下是多列復合索引的應用 繞過微軟sql server的一個缺點正文


但是,微軟sql server在處置這類索引時,有個主要的缺點,那就是把本該編譯成索引seek的操作編成了索引掃描,這能夠招致嚴重機能降低

舉個例子來講明成績,假定某個表T有索引 ( cityid, sentdate, userid), 如今有個分頁列表功效,要取得年夜於某個多列復合索引V0的若干個記載的查詢,用最簡略表意的方法寫出來就是 V >= V0, 假如分化開來,就是:
cityid > @cityid0 or (cityid = @cityid0 and (sentdate > @sentdate0 or (sentdate = @sentdate0 and userid >= @userid0))),

當你寫出上述查詢時,你會等待sql server會主動的把上述辨認為V >= V0類型的界限前提,並應用index seek操作來實行該查詢。但是,微軟的sql server (2005版)有一個主要缺點(其他的sql server若何還不得知), 當它碰到如許sql時,sql server就會采取index scan來實行,成果是您樹立好的索引基本就沒有被應用,假如這個表的數據量很年夜,那所形成的機能降低長短常年夜的。
關於這個成績,我已經提交給微軟的有關人士,他們進一步請求我去一個正式的網站上去提交這個缺點,我懶得去做。

不外,對這個缺點,照樣有個方法可以或許繞曩昔的,只需把下面給出的前提變變形,sql server照樣可以或許變回到是用index seek, 而不是低機能的index scan. 詳細請看我的英文原文吧(對不起了, 我一旦寫了中文,就不想翻成英文,反過去也一樣, 估量年夜家英文都還可以,其實不可的就看黑體部門吧, ):
The seek predicate of the form "x > bookmark_of_x" is needed in paging related query. The compiler has no difficulty to parse it correctly if x is a single column index, or two columns index, however, if x is a three columns index or more, then the compiler will have a hard time to recognize it. This failure will result in that the seek predicate ended up in residue predicate, which results in a much worse execution plan.
To illustrate the point, take a example,
Create table A( a int, b int, c int, d float, primary key (a, b, c))
now check the plan for the query:
select c, d from A where (a> 111 or a= 111 and
(b > 222 or b = 222 and c > 333))
you can see a table scan op is used, and the Where clause ended up in residue predicate.
However, if you rewrite the query in an equivalent form:
select c, d from A where a> 111 or a= 111 and b > 222 or a= 111 and b= 222 and c >333
Then the compiler can choose an index seek op, which is desired.
The problem is, the compiler should be able to recognize the first form of seek predicate on multiple columns index, it saves the user from having to pay extra time to figure out a get-around, not to mention the first form is a more efficient form of same expression.
下面的成績,可以說是部門的繞曩昔了,然則,也有繞不外的時刻,接著看上面一段:
It looks like that sql server lacks a consept of vector bookmark, or vector comparison or whatever you like to call it.
The workaround is not a perfect workaround. If sql server were to understand the concept of vector bookmark, then the following two would be the same in execution plan and performance:
1. select top(n) * from A where vectorIndex >= @vectorIndex
2. select * from A where vectorIndex >= @vectorIndex and vectorIndex <=@vectorIndexEnd
-- @vectorIndexEnd corresponds to the last row of 1.
However, test has shown that, the second statement takes far more time than the first statement, and sql server actually only seek to the begining of the vector range and scan to the end of the whole Index, instead of stop at the end of the vector range.
Not only sql server compile badly when the vector bookmark has 3 columns, test has shown that even with as few as 2 columns, sql serer still can not correctly recognize this is actually a vector range, example:
3. select top (100) a, b, c, d from A where a> 60 or a= 60 and b > 20
4. select a, b, c, d from A where (a> 60 or a= 60 and b > 20) and
(a< 60 or a= 60 and b <= 21),

下面兩個查詢本質雷同(表中的數據恰好如斯),而且給出同業的成果集,然則,3比4的速度要快的多,假如去看execution plan也證實3確切應該比4快.
也就是說, 即便在索引vectorIndex只含兩列的情形下, sql server也沒法准確的懂得規模表達式 @vectorIndex0 < vectorIndex < @vectorIndex1, 它能把前半部門准確的解讀為seek, 然則, 後半部門沒法准確解讀, 招致, sql server會一向掃描到全部表的末尾, 而不是在@vectorIndex1處停上去.
以下測試代碼, 有興致的人可以拿去本身玩:


CREATE TABLE [dbo].[A](
[a] [int] NOT NULL,
[b] [int] NOT NULL,
[c] [int] NOT NULL,
[d] [float] NULL,
PRIMARY KEY CLUSTERED ([a] ASC, [b] ASC, [c] ASC)
)
declare @a int, @b int, @c int
set @a =1
while @a <= 100
begin
set @b = 1
begin tran
while @b <= 100
begin
set @c = 1
while @c <= 100
begin
INSERT INTO A (a, b, c, d)
VALUES (@a,@b,@c,@a+@b+@c)
set @c = @c + 1
end
set @b = @b + 1
end
commit
set @a = @a + 1
end
SET STATISTICS PROFILE ON
SET STATISTICS time ON
SET STATISTICS io ON

select top (10) a, b, c, d from A where (a> 60 or a= 60 and
(b > 20 or b = 20 and c >= 31))
select a, b, c, d from A where (a> 60 or a= 60 and
(b > 20 or b = 20 and c >= 31)) and (a< 60 or a= 60 and
(b < 20 or b = 20 and c <= 40))

select top (10) a, b, c, d from A where a> 60 or a= 60 and b > 20 or a= 60 and b= 20 and c >= 31
select a, b, c, d from A where (a> 60 or a= 60 and b > 20 or a= 60 and b= 20 and c >= 31) and
(a< 60 or a= 60 and b < 20 or a= 60 and b= 20 and c <= 40)
select top (100) a, b, c, d from A where a> 60 or a= 60 and b > 20
select a, b, c, d from A where (a> 60 or a= 60 and b > 20) and (a< 60 or a= 60 and b <= 21)
select top (100) a, b, c, d from A where a> 60 or a= 60 and b > 20
select a, b, c, d from A where (a> 60 or a= 60 and b > 20) and (a< 60 or a= 60 and b <= 21)
  1. 上一頁:
  2. 下一頁:
Copyright © 程式師世界 All Rights Reserved