jump to navigation

SQL 2008 Fulltext Query Performance issue using FREETEXT and CONTAINS June 6, 2010

Posted by Chirag in Full Text Search, SQL 2005, SQL 2008, SQL 2008 R2, SQL Server 2008.
Tags: , , , ,
trackback

Of all the things that have improved in SQL 2008 Fulltext seems to have taken a step backwards. We were helping one of our clients upgrade from SQL 2005 to SQL 2008. They are a 24×7 shop with a framework that supports various websites for their customers. Fulltext is heavily used for text based searches on the websites. We had set up a test rig for them on SQL 2008 to test their applications. The testing process revealed sever performance issues with the text based searches. We investigated and identified the cause of the performance degradation to be the FREETEXT clause used in a multiple OR condition.

Issue:
A fulltext query using the FREETEXT predicate in a multiple OR combination works magnitudes slower in SQL 2008 than in SQL 2005 like for like vanilla installs. We tested this in SQL 2008/R2 and the same issue persists so it pointed to an engine issue in the way fulltext query plans were generated and handled in SQL 2008.

Eg.

SELECT A.Col1,  A.Col2, B.Col1, B.Col1….
FROM TableA AS A LEFT JOIN TableB AS B ON A.ID = B.A_ID
WHERE FREETEXT((A.Name, A.Description), ‘SomeSearchText’)
OR FREETEXT((B.Name, B.Description),’SomeSearchText’)

The fulltext catalog encapsulates multiple tables which each have a single fulltext index.

Workaround:
After thorough investigation we identified a workaround for this issue, it is to separate the query using a UNION clause which has extremely good performance on SQL 2008/R2.

Eg.

SELECT A.Col1,  A.Col2, B.Col1, B.Col1….
FROM TableA AS A LEFT JOIN TableB AS B ON A.ID = B.A_ID
WHERE FREETEXT((A.Name, A.Description), ‘SomeSearchText’)

UNION

SELECT A.Col1,  A.Col2, B.Col1, B.Col1….
FROM TableA AS A LEFT JOIN TableB AS B ON A.ID = B.A_ID
WHERE FREETEXT((B.Name, B.Description),’SomeSearchText’)

Unfortunately, this leads to code bloat and may not be the best approach to resolve this issue as it could mean many man hours to update the related code which could be hundreds of stored procedures as was in our clients case.

We raised a PSS case with Microsoft to identify the root cause and possibly provide a solution for the same.

I have attached the repro scripts I provided Microsoft to help them recreate the issue:

Instructions to Reproduce the issue(Run in SQL 2005 and SQL 2008 instances preferably similar hardware/software settings):

  • Run the FREETEXT_Issue_CreateSchema.sql Script => this will create the schema and populate the tables and the fulltext catatlog
  • Wait for the fulltext catalog to finish population
  • In a separate window run the FREETEXT_Issue_Query.sql script, switch to the messages tab and compare the CPU times in the output.

Microsoft confirmed that this was a BUG in the SQL 2008/R2 and infact affected the CONATINS clause too. They confirmed that it would take a substantial design change to fix this issue. After much persistence they have confirmed that a fix will be released as a Cumulative update “soon”. I will update this blog with the appropriate link for the Cumulative update as soon as it is released.

UPDATE: (24th August 2010)
Microsoft has released a patch which resolves this issue in the SQL 2008 SP1 CU9 rollout. You can obtain it from here: http://support.microsoft.com/kb/2083921. I am still awaiting confirmation for SQL 2008 R2. Will update as soon as I have more news.

UPDATE: (27th January 2011)
Microsoft has released a patch which resolves this issue in SQL 2008 R2 CU4 rollout. You can obtain it from here: http://support.microsoft.com/kb/2264562. Although many readers have informed me that the workaround still works faster. Test it and let me know.

About these ads

Comments»

1. Richard - June 11, 2010

Fantastic! My full-text query has now gone from 90 seconds to ~3 seconds. Thanks for posting this – I’ll keep an eye out for the cumulative update.

2. Richard - June 28, 2010

Doesn’t look like it’s been fixed in CU2:
http://support.microsoft.com/kb/2072493/en-us

3. Steph - July 6, 2010

This is the exact problem I’ve been running into as well (using CONTAINS) but its not just SQL 2008…I see it also in SQL 2005 (more specifically SP2 on a Windows 2003 server…very standard install). Are you certain about SQL2005 not having the same issue?

Also, I’m curious if there is any official Microsoft links related to this issue that I can use for documentation? Thanks.

4. Edrich - August 18, 2010

Hello Chirag,
Many thanks for this post.
We are facing a similar problem with full text searches with “OR” clauses.
I will appreciate if you can let us know about the cumulative update.
Regards,
Ed.

5. Rui - August 24, 2010

“Soon” isn’t soon enough – We expeienced the same issue today when we upgraded to SQL 2008. A query that took 6 seconds to resolve on SQL 2005 took over 5 minutes to complete natively on SQL 2008. To say its a bug is an understatement.
We’re not going to re-write hundreds of SP’s either. I’ll use SQL 2005 on our new server instead!

I’ll keep my eye on your post for a patch.

Chirag - August 25, 2010

Hi Rui

Good news! Just checked with Microsoft and seems they had released a patch for this with the SQL 2008 SP1 CU9 rollout. You can get it from here: http://support.microsoft.com/kb/2083921

I am still awaiting conformation for SQL 2008 R2.

Cheers
Chirag

6. Aaron Bertrand - August 26, 2010

Weird, in http://support.microsoft.com/kb/2264562/ it says in the title “SQL Server 2008″ and it only mentions 2008 / SP1 in the fix. However at the bottom of the article, in the “Applies To” section, it only lists 2008 R2, not 2008. Yet there is no fix to be found in any of the cumulative updates for R2.

Chirag - August 26, 2010

Hi Aaron, I have already raised this with the Microsoft PSS Engineer who was handling the case. Seems a bit misleading on the KB. Will provide update on this as soon as I hear back. Many Thanks.

Regards
Chirag

7. Nabeel Herzalla - October 5, 2010

Hello,
any updates from Microsoft, I face the same problem in R2

8. Barry - January 18, 2011

There appears to be a fix in CU4 for SQL2008 R2 and whilst it’s made some improvement in that it has reduced a simple FT search (using ‘contains’ and ‘OR’) from over 3 minutes down to 1min30, the union solution above runs in 4secs the first time it’s runs and 0secs for subsequent runs!

So thank you very much; once I’ve refactored my app that creates the where clauses, my users are going to be very much happier than they are at the moment.

Best wishes

Barry

Chirag - January 27, 2011

Hi Barry

Thanks for the update on CU4. Glad at least the workaround works for you.

Regards
Chirag

Chirag - January 27, 2011

Hi Nabeel

There is a fix in CU4 which can be downloaded from http://support.microsoft.com/kb/2264562

Though some have told me the workaround still produces faster results.

Regards
Chirag


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: