
Connecting to and transforming data
• Getting data
• Transforming data
• Merging, copying, and appending queries
• Verifying and loading data
Creating your first query
• To determine whether Personal Time Off (PTO) should be approved, it is important to
understand where the company stands relative to its budgets and forecasts. Whether the company,
department, or location is on target against its budget can be an important consideration
when approving or denying time off.
• We will import data into our data model using a query. A query is simply a series of recorded
steps for connecting to and transforming data.
Creating your first query
• All: This lists all of the available connectors.
• File: File connectors, including Excel, text/CSV, XML, JSON, folder, PDF, and
SharePoint folders.
• Database: The Database section lists sources such as SQL Server, Access, Oracle,
IBM DB2, IBM Informix, IBM Netezza, MySQL, PostgreSQL, Sybase, Teradata,
SAP, Impala, Google BigQuery, Vertica, Snowflake, Essbase, and AtScale.
• Power Platform: Power Platform includes Power BI datasets and dataflows,
as well as Dataverse.
• Azure: Azure lists many different services, such as Azure SQL Database, Azure
Synapse Analytics, Azure Analysis Services, Azure Blob Storage, Azure Table
Storage, Azure Cosmos DB, Azure Data Lake Storage, Azure HDInsight (HDFS),
Azure HDInsight Spark, Azure Data Explorer (Kusto), Azure Databricks, and
Azure Cost Management.
• Online Services: There is a substantial collection of online services, including
Microsoft technologies such as SharePoint Online, Exchange Online, Dynamics
365, Common Data Service, DevOps, and GitHub, as well as third parties such
as Salesforce, Google, Adobe, QuickBooks, Smartsheet, Twilio, Zendesk, and
many others.
• Other: Other connectors include Web, OData, Spark, Hadoop (HDFS), ODBC, R,
Python, and OLE DB.
Getting additional data
Transforming data
• Power Query Editor can be launched from the Home tab by choosing Transform data in the
Queries section of the Ribbon. Once launched, the Power Query Editor window is displayed.
Formula bar
• The Formula bar allows the user to view, enter, and modify the Power Query (M)
code. The Power Query formula language, commonly called M, is a functional
programming language composed of functions, operators, and values. M is the
underlying data connection and transformation technology for Microsoft Power
Automate, Power Apps, Power BI Desktop, and Power Query in Excel.
• M is the language behind queries in Power BI. As you build a query in
Power Query Editor, the editor builds an M script behind the scenes that executes
to connect to, transform, and import your data. Each of the applied steps
in a query is a line of Power Query M language code. You do not need to worry
about that just yet, but we will explore this in the Merging, copying, and
appending queries section.
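The idea that a query is an ordered series of applied steps, each working on the result of the previous one, can be sketched outside Power BI. The following is a minimal Python illustration, not M code and not how Power Query Editor is implemented; the sample rows, the two step functions, and the run_query helper are all hypothetical names chosen for this example.

```python
from functools import reduce

# Hypothetical sample rows standing in for a source table.
source_rows = [
    {"Location": "Cleveland", "Budget": "100"},
    {"Location": None, "Budget": "250"},
    {"Location": "Charlotte", "Budget": "175"},
]


def remove_null_locations(rows):
    """Step 1: keep only rows that have a Location value."""
    return [row for row in rows if row["Location"] is not None]


def budget_to_int(rows):
    """Step 2: change the Budget column from text to a whole number."""
    return [{**row, "Budget": int(row["Budget"])} for row in rows]


# The "query" is the ordered list of recorded steps.
applied_steps = [remove_null_locations, budget_to_int]


def run_query(rows, steps):
    """Apply each step to the result of the previous step."""
    return reduce(lambda table, step: step(table), steps, rows)


result = run_query(source_rows, applied_steps)
```

Because each step returns a new table rather than modifying its input, removing or reordering a step only changes what later steps receive, which mirrors how steps can be deleted or reordered in the applied steps list.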
Transforming budget and forecast data
• Text (ABC)
• Whole number (123)
• Text or number (ABC123)
• Decimal (1.2)
Cleaning up extraneous bottom rows
Filtering rows
• Note that all of the distinct values that appear in the column are listed, including (null),
Charlotte, Cleveland, and Nashville. As datasets become larger, you may see a "List may be
incomplete" warning message. This occurs because Power Query Editor samples the first 1,000
rows of data. If you see this message, you can click on the Load more link to have Power
Query Editor analyze all the rows of data.
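The effect of sampling on the filter list can be illustrated with a small Python sketch. It is only an analogy for the behavior described above, not Power Query's implementation; the distinct_values function and the sample column are hypothetical.

```python
from itertools import islice


def distinct_values(values, sample_size=None):
    """Return distinct values in first-seen order.

    When sample_size is given, only the first sample_size rows are scanned.
    """
    scanned = values if sample_size is None else islice(values, sample_size)
    seen = set()
    ordered = []
    for value in scanned:
        if value not in seen:
            seen.add(value)
            ordered.append(value)
    return ordered


# Hypothetical column in which Nashville first appears after row 1,000.
column = ["Charlotte", "Cleveland", None] * 400 + ["Nashville"]

sampled = distinct_values(column, sample_size=1000)  # like the default list
complete = distinct_values(column)                   # like clicking Load more
```

Here the sampled list misses Nashville because that value only occurs in row 1,201, while the complete scan finds it.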
Unpivoting data
Using Fill
Changing data types
Transforming People, Tasks, and January data
Merging queries
Join Kind
• A fuzzy merge allows similar but not identical items to be matched during a merge. The optional
Similarity threshold is a number between 0.00 and 1.00: a value of 0.00 causes all values to
match, while a value of 1.00 causes only exact values to match. The default is 0.80. Additional
options include Ignore case and Match by combining text parts. For example, by ignoring case,
mIcrOSoft could match Microsoft, and by combining text parts, Micro and soft could be combined
to match Microsoft. A fuzzy merge can return multiple matches for a single value, so the
optional Maximum number of matches setting controls how many matching rows are returned for each
input row; it ranges from 1 to 2,147,483,647 (the default). Finally, the Transformation table
option lets you specify a table with From and To columns that is applied during the merge. For
example, the transformation table might contain USA in the From column, mapped to United States
in the To column.
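To make the interaction of these options concrete, here is a minimal Python sketch of a fuzzy match with a similarity threshold, case handling, a match limit, and a transformation table. It is an assumption-laden stand-in: it scores similarity with difflib.SequenceMatcher from the Python standard library, which is not the similarity measure Power Query uses, and it does not implement Match by combining text parts. The function names and parameters are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b, ignore_case=True):
    """Score two strings from 0.0 (nothing in common) to 1.0 (identical)."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_merge(left, right, threshold=0.80, ignore_case=True,
                max_matches=None, transformation=None):
    """For each left value, list the right values scoring at or above threshold.

    transformation maps From values to To values before comparison.
    Matches are ordered best first; max_matches caps how many are kept.
    """
    transformation = transformation or {}
    merged = {}
    for item in left:
        key = transformation.get(item, item)
        scored = [(similarity(key, cand, ignore_case), cand) for cand in right]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        matches = [cand for score, cand in scored if score >= threshold]
        merged[item] = matches if max_matches is None else matches[:max_matches]
    return merged


# Ignoring case makes mIcrOSoft an exact match for Microsoft.
case_match = fuzzy_merge(["mIcrOSoft"], ["Microsoft"], threshold=1.0)

# A transformation table maps USA to United States before matching.
mapped = fuzzy_merge(["USA"], ["United States", "Canada"],
                     transformation={"USA": "United States"})
```

With a threshold of 0.0 every candidate qualifies, and with 1.0 only identical strings (after optional case folding) do, which matches the two extremes described for the Similarity threshold.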
Expanding tables
Disabling queries from being loaded
Copying queries
Changing sources
Another way?
Appending queries
• We have used multiple worksheets in a single Excel workbook for the Hours query. In a
production scenario, this would likely involve multiple Excel workbooks, one for each month.
If you have many files in the same format, consider using a Combine Binaries (Folder) query:
https://docs.microsoft.com/en-us/power-bi/desktop-combine-binaries
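Conceptually, appending stacks the rows of several tables that share the same columns into one table. The following Python sketch illustrates that idea only; it is not the Combine Binaries feature and does not read Excel files. The append_tables function and the monthly sample tables are hypothetical, and filling a missing column with None is an assumption made for this sketch.

```python
def append_tables(tables):
    """Stack rows from several tables, aligning columns by name.

    A column that a table lacks is filled with None in that table's rows.
    """
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]


# Hypothetical monthly tables in the same format.
january = [{"Employee": "A", "Hours": 8}, {"Employee": "B", "Hours": 6}]
february = [{"Employee": "A", "Hours": 7}]

hours = append_tables([january, february])
```

The result keeps the rows in the order the tables were supplied, so the January rows come before the February rows.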
Verifying and loading data
• The Query Dependencies window is useful for understanding how queries are related to one
another and how queries are organized.
Organizing queries
Checking column quality, distribution, and profiles
Loading the data
Vietnam
https://future.ueh.edu.vn/

Thank You
