Monday, 2 January 2012

Self-join in sql with example

Sql equivalent queries in Linq
Set all check boxes inside a div 
Get number of weekend between two dates
Bulk insert in Sql 2008
Get nth highest salary in a table
Use of Temp Table and Split Function

For self join in sql you can try the following example:
Create table employees:
emp_id emp_name emp_manager_id
1 John Null
2 Tom 1
3 Smith 1
4 Albert 2
5 David 2
6 Murphy 5
7 Petra 5
Now to get the names of managers from the above single table you can use sub queries or simply the self join.
Self Join SQL Query to get the names of manager and employees:
select e1.emp_name 'manager',e2.emp_name 'employee'
from employees e1 join employees e2
on e1.emp_id=e2.emp_manager_id
manager employee
John Tom
John Smith
Tom Albert
Tom David
David Murphy
David Petra

In the above self join query, employees table is joined with itself using table aliases e1 and e2. This creates the two views of a single table.
from employees e1 join employees e2
on e1.emp_id=e2.emp_manager_id
Here e.emp_manager_id passes the manager id from the 2nd view to the first aliased e1 table to get the names of managers.

 Explain it with another example

 Let’s illustrate the need for a self join with an example. Suppose we have the following table called emp_location, with both the employee name and their location:
New York
New York
Now, suppose we want to find out which employees are from the same location as the employee named Joe. What we could do is write a nested SQL query (basically a query within another query) like this:
SELECT employee_name
FROM employee
WHERE employee_location in
( SELECT employee_location
FROM employee
WHERE employee_name = "Joe")
A nested subquery for such a simple question is inefficient. Is there a more efficient and elegant solution to this problem?
It turns out that there is – we can use something called a self join. A self join is basically when a join is done on the same table – the best way to think of it is that we have 2 identical copies of the table, and we want to join them based on some predicate. If you need a refresher on join predicates (or just joins in general) then check this link out: Inner vs Outer joins.
Now, the key question is what would be our join predicate in this example? Well, we want to find all the employees who have the same location as Joe – so if we are doing a join we would want to make sure that the location is the same and that the employee name is Joe. So, our join predicate would be where e1.employee_location = e2.employee_location AND employee_name = "Joe". Note that e1 and e2 will represnt the 2 employee tables that we are doing a self join on. Now, here is what the SQL for a self join would look like to solve this problem:
SELECT e1.employee_name
FROM employee e1, employee e2
WHERE e1.employee_location = e2.employee_location
AND e2.employee_name="Joe";
This query will return the names Joe and Jack – since Jack is the only other person who lives in New York like Joe.
Generally, queries that refer to the same table can be greatly simplified by re-writing the queries as self joins. And, there is definitely a performance benefit for this as well.


  1. SRY......u might have explained bit easy....

  2. Actually, i think the last query could have been explained, in bit easy manner like, in self join the table e2's one row is actually fixed while doing a scan for inner join. When the value of e1.employee_location from the first table is being compared with the e2.employee_location, here actually e2.employee_location gets fixed to 'New York', because we are doing an AND operation on table e2 and saying that,we need employee_location value, whose employee_name must always be 'Joe'.

    Hope it makes this query bit easy to understand......

  3. I think you have expalined in reverse way...
    e1 must me employee and e2 must be manager her... but you have explained in reverse......

  4. not better explanation