SQL Oddity - OUTER JOINs with criteria act like INNER JOIN

SYMPTOMS

You have a query with an OUTER JOIN (aka LEFT JOIN, RIGHT JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN), and you need to put filter criteria on the table or view that is outer joined.  But when you do that, the query acts like an inner join.

Consider the following query: 
It's designed to show all computers that are a member of collectionID 'SMS00004' but we want to have an indicator of whether a machine has OUTLOOK.EXE on it or not, so we want to see all machines in that collection.

SELECT sys.Netbios_Name0 as [Computer Name],

       sf.fileName

FROM dbo.v_R_System as sys

     INNER JOIN dbo.v_FullCollectionMembership as fcm

         ON fcm.ResourceID = sys.ResourceID

     LEFT JOIN dbo.v_GS_SoftwareFile as sf

         ON sf.resourceID = sys.resourceID

WHERE fcm.CollectionID = 'SMS00004'

  AND sf.fileName = 'outlook.exe'

 

There's a LEFT JOIN on the dbo.V_GS_SoftwareFile view, so we would expect to see OUTLOOK.EXE in that second column if it has the file, and a NULL if it does not. 

Like this: 

 

OUTPUT #1  - what we expect

 

Computer Name      FileName

COMP1                    OUTLOOK.EXE

COMP2                     

COMP3                    OUTLOOK.EXE

COMP4     

COMP5                    OUTLOOK.EXE 

 

But that's not what we're seeing.  Instead we see this:

 

OUTPUT #2  - what we see

 

Computer Name      FileName

COMP1                    OUTLOOK.EXE

COMP3                    OUTLOOK.EXE

COMP5                    OUTLOOK.EXE 

 

 

THE CAUSE

 

Consider what the query is saying:  If you put something in the WHERE clause, you're telling SQL to only pull records that meet those criteria, regardless of your INNER or OUTER joins, so the LEFT JOIN on V_GS_SoftwareFile is overridden and acts like an INNER JOIN.  (FYI, LEFT JOIN and LEFT OUTER JOIN can be used interchangably, as can RIGHT JOIN and RIGHT OUTER JOIN...the OUTER is implied if you exclude it...also, INNER JOIN is the same as JOIN.)

 

TEACHING MOMENT:  Because there's a wide range of skills that read these articles, I'd like to make sure we don't leave anyone confused or unsure and explain the INNER/OUTER joins.  If you know this already, just skip to the solution. 

 

INNER JOIN - joins the rows of two tables/views together, returning only rows where the join criteria of the table on the left = the join criteria of the table on the right.  Basically, it only returns rows where they intersect.

 

EXAMPLE: 

 

SELECT sys.Netbios_Name0 as [Computer Name],

       sf.fileName

FROM dbo.v_R_System as sys

     INNER JOIN dbo.v_GS_SoftwareFile as sf

       ON sf.resourceID = sys.resourceID

 


LEFT OUTER JOIN - joins the rows of two tables/views together, but returns ALL of the rows of the table/view that is on the left side of the join statement, and ONLY the rows of the table/view on the right side of the join statement that matches the join criteria.  Any columns from the table on the right will be NULL if they don't match the join criteria.

 

EXAMPLE: 

 

SELECT sys.Netbios_Name0 as [Computer Name],

       sf.fileName

FROM dbo.v_R_System as sys

     LEFT JOIN dbo.v_GS_SoftwareFile as sf

       ON sf.resourceID = sys.resourceID

 

 

RIGHT OUTER JOIN - joins the rows of two tables/views together, but returns ALL of the rows of the table/view that is on the RIGHT side of the join statement, and ONLY the rows of the table/view on the left side of the join statement that matches the join criteria.  Any columns from the table on the left will be NULL if they don't match the join criteria. 

 

EXAMPLE: 

 

SELECT sys.Netbios_Name0 as [Computer Name],

       sf.fileName

FROM dbo.v_R_System as sys

     RIGHT JOIN dbo.v_GS_SoftwareFile as sf

       ON sf.resourceID = sys.resourceID

 

 

 

THE SOLUTION

 

The solution is actually quite simple,

DON'T PUT THE FILTER CRITERIA IN THE WHERE CLAUSE, PUT IT IN THE FROM STATEMENT AFTER  'JOIN ON <blah> = <blah>'

 

Let's rewrite that above query and move the outlook.exe criteria into the join criteria:

 

SELECT sys.Netbios_Name0 as [Computer Name],

       sf.fileName

FROM dbo.v_R_System as sys

     INNER JOIN dbo.v_FullCollectionMembership as fcm

         ON fcm.ResourceID = sys.ResourceID

     LEFT JOIN dbo.v_GS_SoftwareFile as sf

         ON sf.resourceID = sys.resourceID

         AND sf.fileName = 'outlook.exe' --<Check it out>

WHERE fcm.CollectionID = 'SMS00004'


Now when you run this, you'll see what you expected from OUTPUT #1 above. 

What we're telling SQL now is to do a LEFT OUTER JOIN on dbo.V_GS_SoftwareFile where the resourceID is the same as in V_R_System BUT only for rows where the fileName = 'OUTLOOK.EXE'

 

SUMMARY & NEXT STEPS


So, to sum this up, we learned:

  • A brief (and quite incomplete) explanation of INNER/OUTER joins
  • If you have an OUTER JOIN and you need to include filter criteria that would normally go in the WHERE clause, put it after JOIN ON <blah> = <blah> in the FROM clause.

Going forward, if you want to make this more useful for other files and other collections, put this into a web report, replace 'OUTLOOK.EXE' and 'SMS00004' with @FILENAME and @COLLECTIONID, and create parameters for those.

 

And as always, highlight the query and hit CTRL+L to get the query execution plan. Mouse-over the operator on the far left and look at the cost.  In this case our query appears to use mostly index seeks or clustered index seeks (fast) and has a full cost of 3.58.  That's super. 

 

Hope this stuff is helpful!

 

Number2 (John Nelson)
MyITForum - Forum Posts
MyITForum - Blog
Add to Google

Comments

No Comments
Powered by Community Server (Commercial Edition), by Telligent Systems