Speaking in Code

Thursday, September 20, 2007

Many-to-Many Merge Join

I came across an unexpected performance problem in a relatively small update statement.

I was comparing a table variable with a short list of rows (1-3) to a table with 60,000+ rows.
The catch is it was an outer join on a two-column compound primary key.
The result, as shown by the execution plan, was a many-to-many merge join that caused 128,000+ reads and took 0.894 seconds.

The solution I came up with was to add a predicate to the WHERE clause limiting one of the columns to the values in the table variable, that is, "WHERE val1 IN (SELECT val1 FROM @tablevar)".
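The original statement isn't shown here, but the shape was roughly the following. This is a sketch only; the table, column, and variable names (BigTable, val1, val2, matched, @tablevar) are hypothetical stand-ins, not from the actual query:

```sql
-- Hypothetical schema: dbo.BigTable has 60,000+ rows with a compound
-- primary key on (val1, val2); @tablevar holds 1-3 rows with the same keys.
UPDATE bt
SET    bt.matched = CASE WHEN tv.val1 IS NULL THEN 0 ELSE 1 END
FROM   dbo.BigTable AS bt
LEFT OUTER JOIN @tablevar AS tv
       ON  tv.val1 = bt.val1
       AND tv.val2 = bt.val2
-- The seemingly redundant predicate: val1 is already a join column,
-- but restating it here let the optimizer seek on the primary key
-- instead of merge-joining every row in both inputs.
WHERE  bt.val1 IN (SELECT val1 FROM @tablevar);
```

Note that with a plain inner join the extra predicate really would be redundant; it only changes behavior here because the outer join otherwise has to preserve (and therefore visit) every row on the outer side.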

That small, seemingly redundant predicate brought the query down to 40 reads and 0 seconds.
It took me a while to try that because it's redundant; I had already joined on that column. But I think that since I was using an outer join and testing for NULL, the plan had to check every row in each table.

It's just like Magellan (I think it was him), who was sent out on a mission to prove there was no great undiscovered continent in the southern hemisphere. A negative search, one that sets out to prove that nothing is there, is the most costly type of search. I got that concept from Daniel Boorstin in his book "The Discoverers".