INDEXES AND THE LIKE CLAUSEOften times, people will make the generalization that when you have LIKE in your WHERE clause it won't use indexes. The real answer is a little more complicated than that, and I think it's important to understand SQL's behavior so you can make good decisions with your queries.
(BTW,
I'm talking from the SQL 2005 SP2 and above perspective, only because it's
what we're using right now)
First of all, there are multiple levels of index usage as seen when you look at an execution plan. These
main operators below can come right out of the execution plan when you hit CTRL+L on a
given query from within SQL Management Studio:
- CLUSTERED INDEX SEEK - Very fast
and efficient - the table is physically ordered according to specified
column(s) and SQL can go find it AND since all of the data is phsically
ordered right, can pull the data sequentially
- INDEX SEEK - which is very
fast and efficient. The SQL server knows pretty much where the data
is and can go directly to it and seek out the rows it needs. The data isn't ordered in the DB by the fields in the index so it's likely not pulling the data sequentially like it is in the CLUSTERED INDEX SEEK, but it can still translate to a massive improvement in query execution cost/speed.
- INDEX SCAN - which
as you can see still implies using the index, but it doesn't
know exactly where the data is, so it may scan the whole index or a
partial range of the index to find it's data. Can be very slow and costly,
but still scans through the index as opposed to scanning the physical
table.
- CLUSTERED INDEX SCAN - a clustered index is really not much more than a table with the data physically ordered by specified columns. So if you see a CLUSTERED INDEX SCAN, it's pretty much the same as a TABLE SCAN for performance because SQL has to physically search through every single row in the clustered index (which really IS the table).
- TABLE SCAN - slow,
inefficient. SQL has to physically look at every single row in the
table. Shouldn't see this often, if you do, you may need to rewrite
something or add indexes.
So,
knowing that little background, the rules for index usage with LIKE are
loosely like this:
- If your filter criteria uses equals
= and the field is indexed, then most likely it will use
an INDEX/CLUSTERED INDEX SEEK
- If your filter criteria
uses LIKE, with no wildcards (like if you had a parameter in a web
report that COULD have a % but you instead use the full
string), it is about as likely as #1 to use the index. The
increased cost is almost nothing.
- If your filter criteria uses
LIKE, but with a wildcard at the beginning (as in Name0 LIKE '%UTER') it's
much less likely to use the index, but it still may
at least perform an INDEX SCAN on a full or partial range of the index.
- HOWEVER, if your filter criteria
uses LIKE, but starts with a STRING FIRST and has wildcards
somewhere AFTER that
(as in Name0
LIKE 'COMP%ER'),
then SQL may just use an INDEX SEEK to quickly find rows that have the
same first starting characters, and then look through those rows for an exact
match.
(Also
keep in mind, the SQL engine still might not use an index the way
you're expecting, depending on what else is going on in your query and what
tables you're joining to. The SQL engine reserves the right to rewrite
your query a little to get the data in a way that it thinks is most
efficient and that may include an INDEX SCAN instead of an INDEX SEEK)
Consider
the following queries:
select *
from dbo.v_R_System
where
NETBIOS_NAME0 = 'COMPUTER'
select *
from dbo.v_R_System
where
NETBIOS_NAME0 LIKE 'COMPUTER'
select *
from dbo.v_R_System
where
NETBIOS_NAME0 LIKE 'COMPUT%'
select *
from dbo.v_R_System
where
NETBIOS_NAME0 LIKE '%COMPUT%'
When
you hit CTRL+L to get the execution plan, for each of these, and mouse-over the operator on the far left, you'll see the following:

Notice, the first query uses = instead of LIKE. It's got the lowest cost and uses an INDEX SEEK. Very fast and efficient..What we would expect.
The second query uses LIKE, and no wildcards. The cost is still pretty low and it uses an INDEX SEEK.
The third query uses LIKE, and a wildcard, but doesn't START with a wildcard. The cost is the same as the second and still uses INDEX SEEK.
The fourth query however, uses LIKE and starts with a wildcard. It still uses an index, but the cost is higher and it's an INDEX SCAN so it's going to be less efficient and slower.
So you can see, it's not as simple as LIKE means no indexes. You can still use LIKE and get pretty darn good use out of your indexes if you just know the limitations and do what you can to query it accordingly.
Hope this stuff is informative!
Number2 (John Nelson)
MyITForum - Forum Posts
MyITForum - Blog
