jump to navigation

SQL 2008 Fulltext Query Performance issue using FREETEXT and CONTAINS June 6, 2010

Posted by Chirag in Full Text Search, SQL 2005, SQL 2008, SQL 2008 R2, SQL Server 2008.
Tags: , , , ,
12 comments

Of all the things that have improved in SQL 2008 Fulltext seems to have taken a step backwards. We were helping one of our clients upgrade from SQL 2005 to SQL 2008. They are a 24×7 shop with a framework that supports various websites for their customers. Fulltext is heavily used for text based searches on the websites. We had set up a test rig for them on SQL 2008 to test their applications. The testing process revealed sever performance issues with the text based searches. We investigated and identified the cause of the performance degradation to be the FREETEXT clause used in a multiple OR condition.

Issue:
A fulltext query using the FREETEXT predicate in a multiple OR combination works magnitudes slower in SQL 2008 than in SQL 2005 like for like vanilla installs. We tested this in SQL 2008/R2 and the same issue persists so it pointed to an engine issue in the way fulltext query plans were generated and handled in SQL 2008.

Eg.

SELECT A.Col1,  A.Col2, B.Col1, B.Col1….
FROM TableA AS A LEFT JOIN TableB AS B ON A.ID = B.A_ID
WHERE FREETEXT((A.Name, A.Description), ‘SomeSearchText’)
OR FREETEXT((B.Name, B.Description),’SomeSearchText’)

The fulltext catalog encapsulates multiple tables which each have a single fulltext index.

Workaround:
After thorough investigation we identified a workaround for this issue, it is to separate the query using a UNION clause which has extremely good performance on SQL 2008/R2.

Eg.

SELECT A.Col1,  A.Col2, B.Col1, B.Col1….
FROM TableA AS A LEFT JOIN TableB AS B ON A.ID = B.A_ID
WHERE FREETEXT((A.Name, A.Description), ‘SomeSearchText’)

UNION

SELECT A.Col1,  A.Col2, B.Col1, B.Col1….
FROM TableA AS A LEFT JOIN TableB AS B ON A.ID = B.A_ID
WHERE FREETEXT((B.Name, B.Description),’SomeSearchText’)

Unfortunately, this leads to code bloat and may not be the best approach to resolve this issue as it could mean many man hours to update the related code which could be hundreds of stored procedures as was in our clients case.

We raised a PSS case with Microsoft to identify the root cause and possibly provide a solution for the same.

I have attached the repro scripts I provided Microsoft to help them recreate the issue:

Instructions to Reproduce the issue(Run in SQL 2005 and SQL 2008 instances preferably similar hardware/software settings):

  • Run the FREETEXT_Issue_CreateSchema.sql Script => this will create the schema and populate the tables and the fulltext catatlog
  • Wait for the fulltext catalog to finish population
  • In a separate window run the FREETEXT_Issue_Query.sql script, switch to the messages tab and compare the CPU times in the output.

Microsoft confirmed that this was a BUG in the SQL 2008/R2 and infact affected the CONATINS clause too. They confirmed that it would take a substantial design change to fix this issue. After much persistence they have confirmed that a fix will be released as a Cumulative update “soon”. I will update this blog with the appropriate link for the Cumulative update as soon as it is released.

UPDATE: (24th August 2010)
Microsoft has released a patch which resolves this issue in the SQL 2008 SP1 CU9 rollout. You can obtain it from here: http://support.microsoft.com/kb/2083921. I am still awaiting confirmation for SQL 2008 R2. Will update as soon as I have more news.

UPDATE: (27th January 2011)
Microsoft has released a patch which resolves this issue in SQL 2008 R2 CU4 rollout. You can obtain it from here: http://support.microsoft.com/kb/2264562. Although many readers have informed me that the workaround still works faster. Test it and let me know.

Advertisements

Multithreaded DBCC Checkdb when MAXDOP = 1 March 4, 2010

Posted by Chirag in Database Maintenance Plans, SQL 2005, SQL 2008, SQL Server 2008, T-SQL.
Tags: , , , , ,
1 comment so far

I know I am posting after a long time, so have just decided to wipe of the dust off my blogging keyboard, bring out my rusty creative writing skills and start posting again.

So here goes again… with a post to request your support and vote on a connect item if you feel my pain.

We have a customer who has several servers with many VLDB’s hosted on each of them. We run a nightly maintenance plan for general housekeeping and this tends to take a very long time to finish. An average run of CHECKDB with the PHYSICAL_ONLY option on weekdays takes over 4 hours and full CHECKDB on Sunday’s can take up to 6-8 hours to finish. This causes the job to overshoot the nightly maintenance window and in turn translates to performance problems for query response times as it is a 24×7 shop.

Besides all the necessary checks that CHECKDB does to check the integrity of the database it can run single threaded or multithreaded if more  processors are available. For reasons of query response and runtime predictability requirements the client has MAXDOP = 1 configured at the SQL Server instance level. This causes CHECKDB to run single threaded and contributes to extending the overall runtime of CHECKDB.

If like me you feel that there should be a setting to allow us to force CHECKDB to run multi threaded even if MAXDOP = 1 then please vote on this connect item: https://connect.microsoft.com/SQLServer/feedback/details/538754/introduce-setting-to-force-dbcc-checkdb-to-run-multi-threaded-when-maxdop-1

I have requested this to be implemented as:

– Trace Flag to allow global setting

– Switch on the DBCC CHECKDB command

Many thanks in advance if you vote and hopefully Microsoft introduce this soon!