Problem:
Differences between index scan vs index seek?
Solution:
Just a quick reference of what
indexes means to a person who has no knowledge, I am going to brief a little
bit about index before jumping into the actual topic, if you feel this as off
topic. You can jump directly into the differences.
What is an Index:
If you have a reading habit and
most books have a page called contents where the author or editor will specify
the page numbers for a chapter and sub chapters to get quick access of the
contents you are looking for. Similarly, the data in a table needs some sorting
required for retrieving data as quickly as possible.
Index in SQL Server
means the physical order of storing the data in a table. An Index can be used
to speed up the performance of a query by reducing the number of pages used to
store the data.
Differences
Index Scan: if you are familiar
with execution plans, you have some idea that there are sometimes index scans
on the plan. Which means it is going through all the rows in the table, and
send back the results to the search criteria. When an Index scan or table scan
is performed by the query optimizer all the leaf level pages are scanned.
Usually when the table has only few rows it’s better to scan the table instead
of looking up in the index.
Index Seek: Usually Index Seek is
going to look for specific leaf pages which are inside the search criteria. In
a table where large number of rows and the percentage of selection criteria is less
than 15%.
This is a low-cost operator and gives a better execution plans by the
query optimizer.
Sometimes depending on the pages
in the index statistics and the density in the ranges.
The query optimizer
chooses to prefer index seek or index scan. To know more about the details of
the statistics please go through this article. how
statistics affect query execution plans
Here is a table of difference for
easy and quick comparison:
Rows
|
Index
Scan
|
Index
Seek
|
1
|
If the query searches for all the rows in the table, it uses table
scan or index scan.
|
If a query chooses to select a few rows and goes through page leaf’s.
It uses Index Seek
|
2
|
Index scan is better if you select 50% or more data from table
|
Index seek is better if you select less than 15 percent of data
|
3
|
It reads every row in the table whether it qualifies
|
It reads rows that qualify and pages that contain these
qualifying rows
|
4
|
If Scan is table scan means table doesn’t have index
|
Seek is always on index only
|
5
|
When SQL Server does a scan, it loads the complete object which
it wants to read from disk into memory, then reads through that object from
top to bottom looking for the records that it needs.
|
It knows where in the index data is going to be, so goes
directly to the part of the index that it needs and load to memory
|
6
|
You never want to convert seek to scan but if you just want
it, then drop index
|
By adding column in where clause and select option in index, we
can convert a scan into seek operation.
|
7
|
Scan is of 3 types clustered index scan, non-clustered index scan,
table scan
|
Seek is of two types clustered index seek and non-clustered
index seek
|
8
|
Non-Clustered index scan is a complete scan of all the leaf
pages in B tree to find index key of cluster index
|
Non-clustered index seek is a seek through the B-tree structure
of a non-clustered index to find index key of clustered index leaf page, from
the root down to the leaf
|
9
|
A clustered index scan is a complete scan of all data pages at
leaf in a clustered index
|
A clustered index seek is a seek through the b-tree structure of
a clustered index, from the root down to the leaf
|
10
|
Non-clustered index scan means lot fewer pages then in clustered
index scan
|
Non-clustered index seek means only pages required for data
address, where as in clustered index seek it only reads data pages.
|
11
|
Order of data does not impact much
|
Search is fast because data is stored in order based on the
clustered index key
|
12
|
Fragmented data affects scan most as whole data need to be read
from disk
|
fragmented data affects but not as compare to scan, as SQL
engine reads minimal required data.
|
13
|
Table scan only appears for a heap i.e. table without a
clustered index. The first page in the heap is located based on info in the
system tables, and then the pages are read one by one, using the next and, if
necessary, previous pointers in the page headers. This is generally an
expensive operation and should be avoided where ever possible
|
Seek is not possible without index
|
14
|
Clustered index scan is like table scan, just on a table that
has clustered index. This operation reads the leaf pages of the clustered
index, using the next and previous page pointers. Like with the table scan,
this can be an expensive operation and should, wherever possible be avoided
|
Clustered index seek uses the clustered index’s b-tree
structure. The seek starts at the root of the tree and navigates down the
levels of the index until it reached the leaf page(s) with the desired data.
This operation also appears when a partial scan of the table is done, when
the index’s tree is used to locate a page, and the index is scanned from that
point until another point in the table (possibly the end).
|
15
|
This sometimes comes with lookups (Row \ Bookmark), a part of scanning,
another index is used to over result of scan
|
In a seek not all the index is considered. Only one used index
is enough
|
Solution:
Example:
Example 1:
In this example, I
would like to show how the execution plans differ and the cost of query varies
when the index scan happens or index seek happens.
INDEX SCAN:
INDEX SEEK
As we can see based on the selection criteria the optimizer chooses to select scan or seek and this can be more explained if you read this article where statistics play a key role in the execution plans.
0 comments:
Post a Comment