You are on page 1of 13

wmfexcel

Work smarter by Mastering Functions


in Excel

Power Query Challenge – Joining two tables fully


Posted on March 8, 2020 by MF

Left Outer + Right Anti = A better Full Join

Power Query - Full Outer Join Revisit (Cantonese VO)

A sample file is available below.

If you prefer reading to watching, please continue to read this post.

The situation:

I have two tables.  One is Sales Table; One is Traffic Table.  What we need to do is to combine them into a single
fact table.

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


With Power Query, it is an easy job.  We can merge the two tables with Left Outer Join!

BUT… If we do that with Left Outer Join, we will miss some data points.  The following screenshot depicts the
situation:

Those traffic records with “?” do not have a matching record in the Sales table.  As a result, those traffic records
are missed in the output, and most of the time without any notice/warning…  

In ideal world, every store should have sales every day.  However in real world, there may be zero-sales days… sad
but true.  In those days, we had visitors (Traffic In) but no sales, which is no good.  In this case, if we use Left
Outer Join, we will miss out those data with Traffic but no sales.  

You may suggest we use the Traffic table as the anchor table to merge sales data to it, it will solve the problem (by
using Right Outer Join; or swap the order of the tables).  Well, that’s only true when all stores have traffic data. 
Nevertheless, only a few stores (in our example, 3 out of 8) have traffic data.

Thus we need a Full Join to return a result like this:

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


This, however, cannot be achieved directly with Full Outer Join in Power Query.  We need to achieve this with
extra steps.  Let’s see how we do it with Power Query!

You may download a sample file to follow along.

Sample File - Start

Sample File - Finish

Load the two tables into Power Query

Select any cell in the Sales table

1. Go to Data tab
2. Select From Table/Range

In the Power Query Editor

1. Under
Privacy Home
& Cookies: Tab
This site –> Close
uses cookies. & Load
By continuing to use this website, you agree to their use.

2. Close & Load to…


To find out more, including how to control cookies, see here:
Our Cookie Policy

3. Only Create Connection Close and accept


4. OK

Repeat the steps above for Traffic table.

Edit the Queries

Now we should have the two queries loaded to connection only, as you can see in the Queries and Connections
pane.  Right click the Sales query and Edit…

First (failed) attempt – Merge Queries with Full Outer Join

If you have experience with merging queries in Power Query, Full Outer Join should be your top-of-mind solution
for this problem.  Nevertheless, it’s not a convenient way as the layout of the output will not be what you want…

Let’s have a look!

Select the Sales query

1. Go to the Home tab of the Power Query Editor


2. Merge Queries
3. Merge Queries (Note: This step is not necessary if we are working on the same query)
4. On the first (Sales) table, select the matching columns 
5. Select Traffic as the second table
6. Select the matching columns (in the right order)
7. Select Full Outer Join (all rows from both)
8. OK
Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


Now we should have a new column called Traffic…

Click the double arrows icon in the column header, and then

1. Check Traffic In only


2. Un-check the box
3. OK

Here we go!

We have the Traffic correctly mapped to the Sales table.  From row 53 to 56, the four records with traffic but no
sales are displayed there.  However nothing is under the Date and Store columns…
Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


Wait… what if we go back one step and check also the “Date” and “Store“?

Well, although we get the Date and Store information as expected, we’d also got two extra columns across the
entire merged table…

This is obviously not an ideal layout.  Who wants two columns for the same attributes?

We will need many tedious steps to convert this layout back to normal.  Therefore I am going to do it with another
approach, which is exactly the topic of today.

Second attempt – Left Out Join + Right Anti Join followed by Append

A. Merge Queries with Left Outer Join

Repeat the Merge Step described above.  But this time we do a Left Outer Join:

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


Note: Expand the “Traffic In” column only

B. Duplicate the Sales Query

1. Right click the Sales query on the Queries pane


2. Select Duplicate

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


On the Query Settings on the right,

1. Rename it to “TrafficWithNoSales“
2. Delete the steps “Expanded Traffic” and “Merged Queries” by clicking the x

C. Doing a Right Anti Join

1. Select the TrafficWithNoSales query


2. Go to Home tab
3. Merge Queries

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


On the Merge queries dialog box:

1. Select the matching columns


2. Select Traffic
3. Select the matching columns (in order)
4. Select Right Anti (rows only in second)
5. OK

The result we have should contain only one row!  All the TrafficWithNoSales records are stored in the “Table”
under the Traffic column.

Click the “Table” to open it.

Yeah… We’ve got a table showing the records where we have Traffic but no sales.
Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept


 

D. Append the “Sales” and “TrafficWithNoSales” queries

1. Select the Sales query


2. Go to Home tab
3. Append Queries
4. Select “TrafficWithNoSales“
5. OK

Note: The columns “Date” and “Store” in the two tables must be exactly the same.  Otherwise the Appended
result may put them into different columns.  You may need to rename the column headers if required.

Almost there…

Now we have the TrafficWithNoSales records nicely appended to the Sales table merged with traffic data.

E. Replace null with 0

1. Go to the Transform tab


2. Replace Values
3. Replace
Privacy & Cookies: Values
This site uses(Note: this
cookies. By steptoisusenot
continuing thisnecessary)
website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy
4. Replace null with 0
5. OK Close and accept
 

F. Final touch – Sort the columns

Tip: We may also rename the query to something more meaningful, e.g. Sales with Traffic data

Here we go!  A fully joined table as a result:

Load it to your workbook as you wish.


Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy
How do you solve this problem?
Close and accept
Would you do it differently? Please leave your comments below.

About MF
An Excel nerd who just transition into a role related to data analytics at current company......
Recently in love with Power
Query and Power BI.
Keep learning new Excel and Power BI stuffs and be amazed by all the new discoveries.
View all posts by MF →

This entry was posted in Power Query and tagged Append Query, Merge Query. Bookmark the permalink.

2 Responses to Power Query Challenge – Joining two tables fully

mma173 says:
March 24, 2020 at ^.^

The way I would resolve this is different. I read both of your solutions.

• I combined (appended) both tables (Sales and Traffic).

• Removed all columns except for Date and Store.

• Removed duplicates based on both columns.

• Merged the result twice again based on both columns: first time with the Sales Table then with the Traffic Table.

• Expanded the needed columns.

Done

 Like
Reply

MF says:
March 24, 2020 at ^.^

Cool trick. Thanks for sharing.

 Like
Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

wmfexcel
Blog at WordPress.com.

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.

To find out more, including how to control cookies, see here:


Our Cookie Policy

Close and accept

You might also like