Chapter 3 of 12

Power Query & Data Transformation

Clean, transform, and shape your data using Power Query Editor — merge, append, pivot, unpivot, and M language basics.

Meritshot47 min read
Power BIPower QueryETLData TransformationM Language
All Power BI Chapters

Power Query & Data Transformation

Raw data is rarely clean enough to analyze directly. It arrives with missing values, inconsistent formats, duplicate rows, wrong data types, and structures that do not fit your model. Power Query is the ETL (Extract, Transform, Load) engine built into Power BI that lets you clean, reshape, and transform data before it reaches your data model — all through an intuitive, visual interface.

In this chapter, you will master the Power Query Editor, learn dozens of transformation techniques, understand how to merge and append queries, work with pivot and unpivot operations, and get started with the M language that powers Power Query behind the scenes.


What is Power Query?

Definition

Power Query is a data transformation and data preparation engine available in Power BI Desktop, Excel, Dataflows, and other Microsoft products. It provides a graphical interface for connecting to data sources, applying transformations, and loading the result into your data model.

Key Characteristics

  • No-code / low-code — Most transformations are done through the ribbon and context menus without writing any code
  • Repeatable — Every step you apply is recorded and replayed automatically during data refresh
  • Non-destructive — Transformations never modify the original data source. They are applied as a pipeline of steps on a copy of the data.
  • Extensible — Advanced users can write custom transformations using the M language (also known as Power Query Formula Language)

How Power Query Fits into the Power BI Workflow

┌──────────────┐    ┌──────────────────┐    ┌──────────────┐    ┌──────────────┐
│  Data Source  │───▶│  Power Query     │───▶│  Data Model  │───▶│  Report      │
│  (Raw Data)  │    │  (Transform/ETL) │    │  (DAX/Model) │    │  (Visuals)   │
└──────────────┘    └──────────────────┘    └──────────────┘    └──────────────┘

Power Query sits between your raw data sources and the data model. Its job is to ensure that data arriving in the model is clean, properly typed, correctly structured, and ready for analysis.

The Applied Steps Concept

One of Power Query's most powerful features is the Applied Steps panel. Every transformation you perform — renaming a column, filtering rows, changing a data type — is recorded as a named step. These steps form a sequential pipeline:

Source                → Connect to data source
Navigation           → Select the table/sheet
Promoted Headers     → Use first row as headers
Changed Type         → Set column data types
Removed Columns      → Drop unnecessary columns
Filtered Rows        → Keep only relevant rows
Renamed Columns      → Give columns meaningful names

Benefits of Applied Steps:

  • You can click on any step to see the data at that point in the pipeline
  • You can insert, delete, reorder, or modify any step
  • Steps are replayed automatically during every data refresh
  • You can rename steps to make the pipeline self-documenting

The Power Query Editor Interface

To open the Power Query Editor, click Home → Transform Data in Power BI Desktop.

Layout Overview

The Power Query Editor window has five main areas:

AreaLocationPurpose
RibbonTopButtons for all transformation operations, organized by tabs
Queries PaneLeft sidebarLists all queries (tables) in your project; organize into groups
Preview PaneCenterShows a preview of the data for the currently selected query and step
Query SettingsRight sidebarShows the query name and Applied Steps for the selected query
Formula BarBelow ribbonDisplays the M formula for the currently selected step

Ribbon Tabs

Home Tab

The most frequently used commands:

Button GroupCommands
QueryClose & Apply, Refresh Preview, Data source settings
New QueryNew Source, Recent Sources, Enter Data, Blank Query
ManageManage Parameters, Advanced Editor
TransformUse First Row as Headers, Replace Values, Group By
CombineMerge Queries, Append Queries
Data TypeDetect Data Type, column type dropdown
RemoveRemove Columns, Remove Rows, Remove Duplicates

Transform Tab

Operations that modify existing columns:

Button GroupCommands
TableTranspose, Reverse Rows, Count Rows, Group By
Any ColumnData Type, Rename, Move, Unpivot, Fill
Text ColumnFormat (Uppercase, Lowercase, Trim, Clean), Extract, Split Column
Number ColumnStatistics (Sum, Min, Max, Average), Standard, Scientific, Rounding
Date/Time ColumnDate (Year, Month, Day, Day of Week), Time (Hour, Minute, Second), Duration

Add Column Tab

Operations that create new columns while keeping the originals:

Button GroupCommands
GeneralCustom Column, Conditional Column, Index Column, Column from Examples
From TextFormat, Extract, Split Column, Merge Columns
From NumberStatistics, Standard, Scientific, Rounding
From Date/TimeDate, Time, Duration

View Tab

Controls what is visible in the editor:

CommandPurpose
Formula BarToggle the M formula bar visibility
MonospacedSwitch to a monospaced font for better data readability
Column QualityShow valid, error, and empty percentages for each column
Column DistributionShow value distribution histograms for each column
Column ProfileShow detailed statistics for the selected column
Query DependenciesVisual diagram showing how queries reference each other

Tip: Always enable Column Quality, Column Distribution, and Column Profile during data exploration. They reveal data quality issues at a glance.


Basic Transformations

These are the transformations you will use in virtually every Power Query project.

Remove Columns

Why: Removing unnecessary columns reduces model size and simplifies your data.

How to remove columns:

  1. Method 1: Select and remove — Select the column(s) you want to remove, right-click, and choose Remove Columns
  2. Method 2: Select and keep — Select the column(s) you want to keep, right-click, and choose Remove Other Columns (removes everything except selected columns)
  3. Method 3: Ribbon — Select columns, then Home → Remove Columns

Best Practice: Use "Remove Other Columns" when you only need a few columns from a wide table. This is more resilient to source schema changes because it specifies what to keep rather than what to remove.

Rename Columns

Why: Give columns clear, business-friendly names.

How:

  1. Double-click the column header and type the new name
  2. Or right-click the column header → Rename

Best Practice: Use consistent naming conventions:

  • PascalCase: OrderDate, CustomerName, TotalAmount
  • Or spaces: Order Date, Customer Name, Total Amount

Change Data Types

Why: Correct data types are essential for proper sorting, filtering, aggregation, and DAX calculations.

How:

  1. Click the data type icon in the column header (e.g., "ABC" for text, "1.2" for decimal)
  2. Select the correct type from the dropdown
  3. Or right-click → Change Type → Select the type
IconTypeDescription
ABCTextString values
123Whole NumberIntegers (no decimals)
1.2Decimal NumberNumbers with decimals (floating point)
1.20Fixed Decimal NumberCurrency-like precision (4 decimal places)
Date iconDateDate only (no time)
Clock iconTimeTime only (no date)
Calendar+ClockDate/TimeCombined date and time
Calendar+Clock+ZoneDate/Time/TimezoneDate, time, and timezone offset
DurationDurationTime spans (days, hours, minutes, seconds)
TRUE/FALSETrue/FalseBoolean values
BinaryBinaryBinary data (images, files)

Common Issue: When you change a data type, Power BI may add a "Changed Type" step. If you already have a previous "Changed Type" step, it will ask whether to Replace current or Add new step. Choose Replace current to keep your steps clean unless there is a specific reason to separate them.

Reorder Columns

Why: Organize columns in a logical order for readability.

How:

  1. Click and drag column headers to the desired position
  2. Or right-click a column → Move → Left / Right / To Beginning / To End

Remove Duplicates

Why: Eliminate duplicate rows that can inflate aggregations.

How:

  1. Select the column(s) that define uniqueness
  2. Right-click → Remove Duplicates
  3. Power Query removes rows where the selected column combination is not unique, keeping the first occurrence

Example: If you have a customer table with duplicate CustomerID values, select the CustomerID column and remove duplicates.

Remove Errors

Why: Error values in columns can break visualizations and DAX calculations.

How:

  1. Right-click the column with errors → Remove Errors
  2. All rows where the selected column contains an error value are removed

Alternative: Replace errors instead of removing them:

  • Right-click → Replace Errors → Enter a replacement value (e.g., 0, "Unknown", null)

Remove Blank Rows

Why: Blank rows add noise and inflate row counts.

How:

  • Go to Home → Remove Rows → Remove Blank Rows
  • This removes rows where ALL columns are null/blank

Column Operations

Split Column

Split a single column into multiple columns based on a pattern.

By Delimiter

Use case: A "Full Name" column that contains "First Last" needs to be split into separate columns.

Steps:

  1. Select the column to split
  2. Go to Transform → Split Column → By Delimiter (or right-click → Split Column → By Delimiter)
  3. Choose the delimiter (comma, semicolon, space, tab, custom, etc.)
  4. Choose where to split:
    • Each occurrence — Split at every delimiter (e.g., "A,B,C" → three columns)
    • Left-most delimiter — Split only at the first delimiter
    • Right-most delimiter — Split only at the last delimiter
  5. Click OK

Example:

Before: FullAddressAfter: StreetCityState
123 Main St, Springfield, IL123 Main StSpringfieldIL
456 Oak Ave, Portland, OR456 Oak AvePortlandOR

By Number of Characters

Use case: Fixed-width data where the first 3 characters represent a code.

Steps:

  1. Select the column
  2. Transform → Split Column → By Number of Characters
  3. Enter the number of characters (e.g., 3)
  4. Choose: Once (at position), Repeatedly (every N chars)
  5. Click OK

By Positions

Use case: Split at specific character positions (e.g., positions 0, 5, 10).

Steps:

  1. Select the column
  2. Transform → Split Column → By Positions
  3. Enter comma-separated positions (e.g., 0, 5, 10)
  4. Click OK

Merge Columns

Combine two or more columns into one.

Steps:

  1. Select multiple columns (Ctrl+click)
  2. Right-click → Merge Columns (or Transform → Merge Columns)
  3. Choose a separator (space, comma, dash, custom, or none)
  4. Enter a name for the new column
  5. Click OK

Example:

FirstNameLastName→ Merged: FullName
JohnSmithJohn Smith
JaneDoeJane Doe

Extract

Extract part of a text value from a column.

Available extractions (Transform → Extract):

ExtractionResult Example
First Characters (N)"Microsoft" → "Mic" (first 3)
Last Characters (N)"Microsoft" → "oft" (last 3)
Range (start, count)"Microsoft" → "cros" (start 2, count 4)
Text Before Delimiter"John-Smith" → "John" (before "-")
Text After Delimiter"John-Smith" → "Smith" (after "-")
Text Between Delimiters"A[B]C" → "B" (between "[" and "]")
Length"Microsoft" → 9

Column from Examples

Power Query can infer the transformation you want based on examples you provide.

Steps:

  1. Go to Add Column → Column from ExamplesFrom Selection or From All Columns
  2. A new column appears with empty cells
  3. Type the desired output for one or two rows
  4. Power Query detects the pattern and fills the rest
  5. Review the results and click OK

Example: Given a column with email addresses, type "gmail.com" for the first row that has "user@gmail.com". Power Query infers that you want to extract the domain and fills the rest automatically.


Row Operations

Filter Rows

Filtering rows is one of the most common operations. Click the dropdown arrow on any column header to access filter options.

Text Filters

FilterDescriptionExample
EqualsExact matchCity equals "New York"
Does Not EqualExclude exact matchCategory does not equal "Discontinued"
Begins WithStarts with textName begins with "Mc"
Ends WithEnds with textEmail ends with "@gmail.com"
ContainsText appears anywhereDescription contains "premium"
Does Not ContainText does not appearNotes does not contain "cancelled"
Is nullValue is nullPhone is null (find missing phone numbers)
Is Not nullValue is not nullEmail is not null (only customers with email)

Number Filters

FilterDescription
Equals / Does Not EqualExact numeric match
Greater Than / Greater Than or Equal ToValues above a threshold
Less Than / Less Than or Equal ToValues below a threshold
BetweenValues within a range
Is Even / Is OddEven or odd numbers

Date Filters

FilterDescription
Is Earliest / Is LatestThe minimum or maximum date
Before / AfterDates before or after a specific date
BetweenDates within a range
In the Previous / In the NextRelative date filters (last N days/months/years)
Year / Quarter / Month / DayFilter by date component

Keep / Remove Rows

Home → Remove Rows provides options:

OperationDescription
Remove Top RowsRemove the first N rows (useful for header metadata rows)
Remove Bottom RowsRemove the last N rows (useful for footer/summary rows)
Remove Alternate RowsRemove every Nth row (useful for cleaning semi-structured data)
Remove Blank RowsRemove rows where all columns are null
Remove ErrorsRemove rows containing error values
Keep Top RowsKeep only the first N rows
Keep Bottom RowsKeep only the last N rows
Keep Range of RowsKeep rows within a specified range (start position, count)
Keep DuplicatesKeep only rows that have duplicates (useful for finding repeated entries)
Keep ErrorsKeep only rows with errors (useful for error investigation)

Data Type Conversions

Correct data types are foundational. Here is a comprehensive reference for Power Query data types.

Available Data Types

TypeDescriptionStorageExample
TextAny string valueVariable"Hello World", "12345"
Whole Number64-bit integer8 bytes42, -100, 1000000
Decimal Number64-bit floating point (double precision)8 bytes3.14159, -0.001
Fixed Decimal128-bit fixed-point (4 decimal places)16 bytes1234.5678 (currency)
PercentageDecimal displayed as percentage8 bytes0.25 (displayed as 25%)
DateDate without time8 bytes2025-06-15
TimeTime without date8 bytes14:30:00
Date/TimeCombined date and time8 bytes2025-06-15 14:30:00
Date/Time/TimezoneDate, time, and timezone10 bytes2025-06-15 14:30:00 +05:30
DurationTime span8 bytes3.05:30:00 (3 days, 5 hrs, 30 min)
True/FalseBoolean1 bytetrue, false
BinaryRaw binary dataVariableImages, files

Common Conversion Issues

IssueSymptomCauseFix
Numbers as text"1234" instead of 1234; no aggregation availableSource stored numbers as stringsChange type to Whole Number or Decimal Number
Dates as numbers45000 instead of 2023-03-01Excel serial date formatChange type to Date
Dates as text"03/15/2025" shown as textSource formattingChange type to Date; set locale if format differs
Locale mismatch"1.234,56" not recognized as 1234.56European number format (comma as decimal)Use "Using Locale" option: Right-click → Change Type → Using Locale → Select appropriate locale
Boolean as text"Yes"/"No" or "1"/"0" instead of TRUE/FALSESource uses custom boolean valuesAdd a conditional column or replace values first, then convert
Null handlingErrors when converting nullsNull values cannot be converted to some typesReplace nulls first, then convert

Converting with Locale

For international data where number and date formats differ:

  1. Right-click the column header
  2. Select Change Type → Using Locale
  3. Choose the Data Type (e.g., Decimal Number)
  4. Choose the Locale (e.g., German (Germany) for comma-decimal format)
  5. Click OK

Adding Custom Columns

Custom Column

Create a new column using an M formula.

Steps:

  1. Go to Add Column → Custom Column
  2. Enter a column name
  3. Write an M expression in the formula box
  4. Available columns are listed on the right — double-click to insert them

Common Examples:

// Profit calculation
[Revenue] - [Cost]

// Full name from parts
[FirstName] & " " & [LastName]

// Tax calculation (10%)
[Amount] * 0.10

// Year extraction from date
Date.Year([OrderDate])

// Conditional logic
if [Amount] > 1000 then "High" else "Low"

// Text with number conversion
"Order-" & Text.From([OrderID])

Conditional Column

Create a column with if/then/else logic through a visual interface (no coding required).

Steps:

  1. Go to Add Column → Conditional Column
  2. Enter a column name
  3. Define conditions:
Column NameOperatorValueOutput
Amountis greater than10000"High"
Amountis greater than5000"Medium"
Amountis greater than0"Low"
  1. Set an Else value (e.g., "Unknown")
  2. Click OK

Example Result:

Amount→ Category
15000High
7500Medium
3200Low
-100Unknown

Note: Conditions are evaluated top to bottom. The first matching condition is used. Order matters!

Index Column

Add a sequential number to each row.

Steps:

  1. Go to Add Column → Index Column
  2. Choose starting value:
    • From 0 — First row is 0
    • From 1 — First row is 1
    • Custom — Specify start value and increment

Use cases:

  • Create a surrogate key for tables without a natural key
  • Maintain original row order after sorting
  • Number rows for ranking or reference

Pivoting and Unpivoting

Pivoting and unpivoting are essential operations for reshaping data between wide and long formats.

Understanding the Difference

ConceptDirectionDescriptionWhen to Use
UnpivotWide → LongConverts multiple columns into attribute-value pairsWhen you have months as columns and need a single "Month" column
PivotLong → WideConverts row values into column headersWhen you need to spread categories across columns

Unpivot Example

Scenario: You have a sales table where each month is a separate column:

Before (Wide Format):

ProductJanFebMarApr
Laptop10012095130
Phone200180210195
Tablet50657055

Goal: Convert to a long format with "Month" and "Sales" columns.

Steps:

  1. Select the columns to unpivot (Jan, Feb, Mar, Apr) — hold Ctrl and click each column header
  2. Right-click → Unpivot Columns (or Transform → Unpivot Columns)
  3. Two new columns are created:
    • Attribute — Contains the original column names (Jan, Feb, Mar, Apr)
    • Value — Contains the corresponding values
  4. Rename "Attribute" to "Month" and "Value" to "Sales"

After (Long Format):

ProductMonthSales
LaptopJan100
LaptopFeb120
LaptopMar95
LaptopApr130
PhoneJan200
PhoneFeb180
PhoneMar210
PhoneApr195
TabletJan50
TabletFeb65
TabletMar70
TabletApr55

Unpivot Options:

OptionDescription
Unpivot ColumnsUnpivot the selected columns
Unpivot Other ColumnsUnpivot all columns EXCEPT the selected ones (more resilient to schema changes)
Unpivot Only Selected ColumnsStrictly unpivot only the selected columns

Best Practice: Use "Unpivot Other Columns" whenever possible. If new month columns are added to the source (e.g., May, Jun), they are automatically included. With "Unpivot Columns", new columns would be ignored.

Pivot Example

Scenario: You have a long-format table and need to convert it to wide format.

Before (Long Format):

ProductMetricValue
LaptopRevenue50000
LaptopCost35000
LaptopUnits250
PhoneRevenue40000
PhoneCost22000
PhoneUnits400

Steps:

  1. Select the column whose values will become new column headers ("Metric" column)
  2. Go to Transform → Pivot Column
  3. In the dialog:
    • Values Column — Select "Value" (the column whose values will populate the new columns)
    • Advanced options — Choose the aggregate function (Sum, Count, Min, Max, Don't Aggregate)
  4. Click OK

After (Wide Format):

ProductRevenueCostUnits
Laptop5000035000250
Phone4000022000400

Merging Queries (Joins)

Merging queries is Power Query's equivalent of SQL JOINs. It combines columns from two tables based on matching values in one or more columns.

Step-by-Step: Merge Two Queries

  1. Select the primary query in the Queries pane
  2. Go to Home → Merge Queries (or Merge Queries as New to create a new query)
  3. In the Merge dialog:
    • The top table is already selected (your current query)
    • Select the bottom table from the dropdown
    • Click the matching column(s) in both tables (click to highlight)
    • For multi-column joins, Ctrl+click additional columns in order
  4. Choose the Join Kind
  5. Review the match indicator (e.g., "The selection has matched 950 of 1000 rows from the first table")
  6. Click OK
  7. A new column appears containing the merged table data (nested tables)
  8. Click the expand icon (double arrow ↔) on the new column header
  9. Select which columns to include from the merged table
  10. Choose whether to use the original column name as a prefix
  11. Click OK

Join Types Comparison

Join TypeReturnsSQL EquivalentDescription
Left OuterAll rows from left + matching from rightLEFT JOINKeep all left table rows; add matched right data (nulls for non-matches)
Right OuterAll rows from right + matching from leftRIGHT JOINKeep all right table rows; add matched left data (nulls for non-matches)
Full OuterAll rows from both tablesFULL OUTER JOINKeep all rows from both tables; nulls where no match
InnerOnly matching rowsINNER JOINOnly rows that have a match in both tables
Left AntiLeft rows with NO match in rightNOT EXISTSOnly left table rows that have no match in the right table
Right AntiRight rows with NO match in leftNOT EXISTSOnly right table rows that have no match in the left table

Visual Description of Joins

Imagine two overlapping circles representing Table A (left) and Table B (right):

  • Left Outer — The entire left circle plus the overlap
  • Right Outer — The entire right circle plus the overlap
  • Full Outer — Both entire circles
  • Inner — Only the overlap area
  • Left Anti — Only the left circle's non-overlapping area
  • Right Anti — Only the right circle's non-overlapping area

Practical Example

Scenario: You have an Orders table and a Customers table. You want to add customer details to each order.

Orders Table:

OrderIDCustomerIDOrderDateAmount
1001C012025-01-15250.00
1002C032025-01-16180.50
1003C012025-01-17320.00
1004C052025-01-1895.00

Customers Table:

CustomerIDCustomerNameCity
C01Alice JohnsonNew York
C02Bob SmithChicago
C03Carol WilliamsHouston
C04Dave BrownPhoenix

Using Left Outer Join (on CustomerID):

OrderIDCustomerIDOrderDateAmountCustomerNameCity
1001C012025-01-15250.00Alice JohnsonNew York
1002C032025-01-16180.50Carol WilliamsHouston
1003C012025-01-17320.00Alice JohnsonNew York
1004C052025-01-1895.00nullnull

Notice that OrderID 1004 (CustomerID C05) has null values for CustomerName and City because C05 does not exist in the Customers table.

Using Left Anti Join: Would return only OrderID 1004, because it is the only order without a matching customer.

Merge Queries vs Merge Queries as New

OptionBehavior
Merge QueriesAdds the merged columns to the current query
Merge Queries as NewCreates a brand-new query with the merge result, leaving original queries unchanged

Best Practice: Use "Merge Queries as New" when you want to preserve the original queries as staging tables and create a clean merged output.


Appending Queries

Appending stacks rows from two or more tables on top of each other. This is equivalent to SQL's UNION ALL.

Step-by-Step: Append Queries

  1. Select the primary query
  2. Go to Home → Append Queries (or Append Queries as New)
  3. Choose:
    • Two tables — Select the second table from the dropdown
    • Three or more tables — Add multiple tables from the "Available tables" list to the "Tables to append" list
  4. Click OK

How Column Matching Works

Power Query matches columns by name (case-insensitive):

ScenarioResult
Both tables have column "Sales"Values are combined into one "Sales" column
Table A has "Revenue", Table B has "Sales"Two separate columns; each has nulls where the column did not exist in the source table
Table A has "Date" (type: Date), Table B has "Date" (type: Text)Column is created; type conflict may cause errors

Example: Appending Monthly Sales

January Sales:

ProductAmount
Laptop5000
Phone3000

February Sales:

ProductAmount
Laptop6200
Phone2800
Tablet1500

After Appending:

ProductAmount
Laptop5000
Phone3000
Laptop6200
Phone2800
Tablet1500

Tip: Add an index column or source identifier column before appending so you can trace which source each row came from.


Grouping / Aggregation

Group By consolidates rows that share common values and calculates aggregates.

Step-by-Step: Group By

  1. Go to Transform → Group By (or Home → Group By)
  2. Choose Basic or Advanced:
    • Basic — Group by one column with one aggregation
    • Advanced — Group by multiple columns with multiple aggregations
  3. Configure:
    • Group By — Select the column(s) to group by
    • New Column Name — Name for the aggregated result column
    • Operation — Sum, Count Rows, Average, Min, Max, Count Distinct Values, All Rows
    • Add more aggregations by clicking Add aggregation
  4. Click OK

Available Aggregation Operations

OperationDescriptionResult Type
SumTotal of all valuesNumber
AverageMean of all valuesDecimal
MedianMiddle valueNumber
MinSmallest valueSame as source
MaxLargest valueSame as source
Count RowsNumber of rows in each groupWhole number
Count Distinct ValuesNumber of unique valuesWhole number
All RowsA nested table containing all rows for each groupTable

Example: Summarizing Sales Data

Source Data:

RegionProductAmount
NorthLaptop5000
NorthLaptop6200
NorthPhone3000
SouthLaptop4500
SouthPhone2800
SouthPhone3100

Group By (Advanced): Region and Product, with Sum of Amount and Count of Rows

RegionProductTotalAmountOrderCount
NorthLaptop112002
NorthPhone30001
SouthLaptop45001
SouthPhone59002

The "All Rows" Operation

Selecting "All Rows" creates a nested table column instead of an aggregate. This is useful when you want to keep the detailed rows accessible.

After grouping with "All Rows", you can expand the nested table or perform additional operations on it using custom M code.


Date/Time Transformations

Power Query provides extensive date and time operations.

Extracting Date Components

Select a date column, then go to Transform → Date or Add Column → Date:

ExtractionResult Example (from 2025-06-15)
Year2025
Month6
Month NameJune
Day15
Day of WeekSunday (number: 0)
Day of Week NameSunday
Day of Year166
Week of Year25
Quarter2
Start of Year2025-01-01
End of Year2025-12-31
Start of Month2025-06-01
End of Month2025-06-30
Start of Week2025-06-09 (Monday)
End of Week2025-06-15 (Sunday)
Start of Quarter2025-04-01
End of Quarter2025-06-30

Calculating Age from Birth Date

Use a Custom Column with this M formula:

let
    today = DateTime.LocalNow(),
    birthDate = [BirthDate],
    age = Duration.Days(today - birthDate) / 365.25
in
    Number.RoundDown(age, 0)

Calculating Date Differences

To calculate the number of days between two dates:

Duration.Days([EndDate] - [StartDate])

To calculate months (approximate):

(Date.Year([EndDate]) - Date.Year([StartDate])) * 12
+ (Date.Month([EndDate]) - Date.Month([StartDate]))

Creating a Fiscal Year Column

If your fiscal year starts in April:

if Date.Month([OrderDate]) >= 4
then "FY" & Text.From(Date.Year([OrderDate])) & "-" & Text.From(Date.Year([OrderDate]) + 1)
else "FY" & Text.From(Date.Year([OrderDate]) - 1) & "-" & Text.From(Date.Year([OrderDate]))

Text Transformations

Power Query offers numerous text operations.

Common Text Operations

OperationPathDescriptionBefore → After
TrimTransform → Format → TrimRemove leading/trailing spaces" Hello " → "Hello"
CleanTransform → Format → CleanRemove non-printable characters"Hello\x00World" → "HelloWorld"
LowercaseTransform → Format → lowercaseConvert to lowercase"HELLO" → "hello"
UPPERCASETransform → Format → UPPERCASEConvert to uppercase"hello" → "HELLO"
Proper CaseTransform → Format → Capitalize Each WordCapitalize first letter of each word"john doe" → "John Doe"
Add PrefixTransform → Format → Add PrefixPrepend text"123" → "ID-123"
Add SuffixTransform → Format → Add SuffixAppend text"report" → "report_v2"
Replace ValuesRight-click → Replace ValuesFind and replace"NY" → "New York"
Trim + Clean comboApply bothBest practice for cleaning textRemoves both spaces and hidden characters

Replace Values

Replace specific values in a column:

  1. Select the column
  2. Right-click → Replace Values (or Transform → Replace Values)
  3. Enter the Value to Find and the Replace With value
  4. Click OK

Common uses:

  • Standardize abbreviations: "CA" → "California"
  • Fix typos: "Recieved" → "Received"
  • Replace nulls: null → "Not Specified"

Extract Operations

The Transform → Extract menu provides:

OperationDescriptionExample
LengthNumber of characters"Hello" → 5
First CharactersFirst N characters"Hello World" (5) → "Hello"
Last CharactersLast N characters"Hello World" (5) → "World"
RangeCharacters from position"Hello World" (6, 5) → "World"
Text Before DelimiterText before first delimiter"user@email.com" (@) → "user"
Text After DelimiterText after first delimiter"user@email.com" (@) → "email.com"
Text Between DelimitersText between two delimiters"[ERROR] message" ([ , ]) → "ERROR"

M Language Basics

Every step you create in Power Query generates M code behind the scenes. Understanding M allows you to go beyond the GUI and write custom transformations.

What is M?

M (officially "Power Query Formula Language") is a functional, case-sensitive programming language designed for data transformation. It is:

  • Functional — Based on expressions and functions, not procedural statements
  • Case-sensitiveTable.SelectRows is valid; table.selectrows is not
  • Lazy-evaluated — Expressions are only computed when their results are needed
  • Dynamically typed — Variable types are inferred at runtime

Viewing and Editing M Code

Formula Bar: Each step shows its M expression in the formula bar. You can edit it directly.

Advanced Editor: To see the complete M code for a query:

  1. Go to Home → Advanced Editor
  2. The full M query is displayed, showing all steps

The let ... in Structure

Every Power Query query follows this structure:

let
    // Step 1: Connect to data source
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),

    // Step 2: Navigate to the sheet
    Sheet1 = Source{[Item="Sheet1", Kind="Sheet"]}[Data],

    // Step 3: Promote headers
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1, [PromoteAllScalars=true]),

    // Step 4: Change types
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers", {
        {"OrderDate", type date},
        {"Amount", type number},
        {"Quantity", Int64.Type}
    }),

    // Step 5: Filter rows
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Amount] > 100)
in
    // Return the final step
    #"Filtered Rows"

Key points:

  • let defines a series of named steps (variables)
  • Each step references the previous step
  • Steps are separated by commas
  • in specifies which step's result to return (usually the last step)
  • Step names with spaces or special characters must be enclosed in #"..."
  • Comments use // for single-line or /* ... */ for multi-line

Common M Functions

Text Functions

FunctionDescriptionExample
Text.Upper(text)Convert to uppercaseText.Upper("hello") → "HELLO"
Text.Lower(text)Convert to lowercaseText.Lower("HELLO") → "hello"
Text.Proper(text)Capitalize each wordText.Proper("john doe") → "John Doe"
Text.Trim(text)Remove leading/trailing whitespaceText.Trim(" hi ") → "hi"
Text.Clean(text)Remove non-printable charactersText.Clean("hi\x00") → "hi"
Text.Start(text, count)First N charactersText.Start("Hello", 3) → "Hel"
Text.End(text, count)Last N charactersText.End("Hello", 3) → "llo"
Text.Range(text, start, count)Substring from positionText.Range("Hello", 2, 2) → "ll"
Text.Contains(text, value)Check if text contains valueText.Contains("Hello", "ell") → true
Text.Replace(text, old, new)Replace occurrencesText.Replace("Hi Bob", "Bob", "Alice") → "Hi Alice"
Text.Length(text)Character countText.Length("Hello") → 5
Text.From(value)Convert value to textText.From(42) → "42"
Text.Combine(list, sep)Join list with separatorText.Combine({"A","B","C"}, "-") → "A-B-C"
Text.Split(text, sep)Split text into listText.Split("A-B-C", "-")

Number Functions

FunctionDescriptionExample
Number.Round(num, digits)Round to N digitsNumber.Round(3.14159, 2) → 3.14
Number.RoundDown(num, digits)Round down (floor)Number.RoundDown(3.9, 0) → 3
Number.RoundUp(num, digits)Round up (ceiling)Number.RoundUp(3.1, 0) → 4
Number.Abs(num)Absolute valueNumber.Abs(-42) → 42
Number.Mod(num, divisor)Remainder (modulo)Number.Mod(10, 3) → 1
Number.Power(num, exp)ExponentiationNumber.Power(2, 10) → 1024
Number.Sqrt(num)Square rootNumber.Sqrt(144) → 12
Number.From(value)Convert to numberNumber.From("42") → 42

Date Functions

FunctionDescriptionExample
Date.Year(date)Extract yearDate.Year(#date(2025, 6, 15)) → 2025
Date.Month(date)Extract month numberDate.Month(#date(2025, 6, 15)) → 6
Date.Day(date)Extract dayDate.Day(#date(2025, 6, 15)) → 15
Date.DayOfWeek(date)Day of week (0=Sun)Date.DayOfWeek(#date(2025, 6, 15)) → 0
Date.DayOfYear(date)Day of yearDate.DayOfYear(#date(2025, 6, 15)) → 166
Date.WeekOfYear(date)Week numberDate.WeekOfYear(#date(2025, 6, 15)) → 25
Date.QuarterOfYear(date)Quarter (1-4)Date.QuarterOfYear(#date(2025, 6, 15)) → 2
Date.StartOfMonth(date)First day of monthDate.StartOfMonth(#date(2025, 6, 15)) → 2025-06-01
Date.EndOfMonth(date)Last day of monthDate.EndOfMonth(#date(2025, 6, 15)) → 2025-06-30
Date.AddDays(date, n)Add N daysDate.AddDays(#date(2025, 6, 15), 10) → 2025-06-25
Date.AddMonths(date, n)Add N monthsDate.AddMonths(#date(2025, 6, 15), 3) → 2025-09-15
Date.AddYears(date, n)Add N yearsDate.AddYears(#date(2025, 6, 15), 1) → 2026-06-15
DateTime.LocalNow()Current date/timeReturns the current local timestamp

Table Functions

FunctionDescription
Table.SelectRows(table, condition)Filter rows based on a condition
Table.RemoveColumns(table, columns)Remove specified columns
Table.RenameColumns(table, renames)Rename columns
Table.AddColumn(table, name, formula)Add a calculated column
Table.TransformColumnTypes(table, types)Change column data types
Table.PromoteHeaders(table)Use first row as column headers
Table.Sort(table, comparisons)Sort the table
Table.Distinct(table)Remove duplicate rows
Table.Group(table, keys, aggregations)Group by and aggregate
Table.NestedJoin(table1, key1, table2, key2, name, joinKind)Merge/join tables
Table.Combine(tables)Append tables together
Table.Pivot(table, pivotColumn, valueColumn, aggregation)Pivot operation
Table.Unpivot(table, columns, attributeName, valueName)Unpivot operation
Table.RowCount(table)Count number of rows
Table.ColumnNames(table)Get list of column names

List Functions

FunctionDescriptionExample
List.Sum(list)Sum all valuesList.Sum({1, 2, 3}) → 6
List.Average(list)Average of valuesList.Average({10, 20, 30}) → 20
List.Min(list)Minimum valueList.Min({5, 3, 8}) → 3
List.Max(list)Maximum valueList.Max({5, 3, 8}) → 8
List.Count(list)Number of elementsList.Count({1, 2, 3}) → 3
List.Distinct(list)Unique valuesList.Distinct({1, 2, 2, 3})
List.Contains(list, value)Check membershipList.Contains({1, 2, 3}, 2) → true
List.Sort(list)Sort valuesList.Sort({3, 1, 2})
List.Transform(list, func)Apply function to eachList.Transform({1,2,3}, each _ * 2)

Writing Custom M in the Advanced Editor

Example: Create a custom function that cleans text by trimming, cleaning, and converting to proper case:

let
    CleanText = (inputText as text) as text =>
        let
            trimmed = Text.Trim(inputText),
            cleaned = Text.Clean(trimmed),
            proper = Text.Proper(cleaned)
        in
            proper
in
    CleanText

You can then invoke this function from other queries to reuse the logic.

M Language Comments

// This is a single-line comment

/* This is a
   multi-line comment */

let
    // Step 1: Load data from the source file
    Source = Excel.Workbook(File.Contents("C:\Data\file.xlsx"), null, true),

    /* Step 2: Apply transformations
       - Remove unnecessary columns
       - Filter to active records only */
    Transformed = Table.SelectRows(Source, each [Status] = "Active")
in
    Transformed

Query Folding

Query folding is one of the most important performance concepts in Power Query.

What is Query Folding?

Query folding is the process by which Power Query translates its transformation steps into native queries (e.g., SQL) that are executed at the data source rather than in Power BI's memory.

Without query folding: Power BI downloads all data first, then applies transformations locally (slow for large datasets).

With query folding: The data source handles filtering, sorting, and transformations, and only the result is sent to Power BI (fast and efficient).

Example

If you apply these steps to a SQL Server table:

  1. Filter rows where OrderDate >= 2025-01-01
  2. Remove the Notes column
  3. Sort by OrderDate

With query folding, Power Query generates this SQL:

SELECT [OrderID], [CustomerID], [OrderDate], [Amount]
FROM [dbo].[Orders]
WHERE [OrderDate] >= '2025-01-01'
ORDER BY [OrderDate]

The SQL Server processes the filter, column selection, and sort. Only the matching rows are sent to Power BI.

How to Check if Query Folding is Active

  1. In Power Query Editor, right-click on any step in the Applied Steps panel
  2. Look for "View Native Query":
    • Available (not grayed out) — Query folding is active for this step. Click to see the generated SQL.
    • Grayed out — Query folding has broken at or before this step.

Steps That Support Query Folding

Step TypeFoldable?
Filter rows (basic conditions)Yes
Remove columnsYes
Rename columnsYes
Sort rowsYes
Change data types (basic)Yes
Group ByYes
Merge (join) with another folded queryYes
Keep/Remove top/bottom rowsYes
Add conditional column (simple)Sometimes

Steps That Break Query Folding

Step TypeFoldable?Reason
Add Index ColumnNoNo SQL equivalent
Merge ColumnsNoCustom string operation
Pivot/Unpivot (in some sources)NoComplex transformation
Custom Column (complex M logic)NoCannot translate to SQL
Replace Values (with complex logic)NoCustom logic
Column from ExamplesNoPattern-based transformation
Change type using LocaleNoLocal conversion function
Buffer table (Table.Buffer)NoForces local evaluation
Added column referencing other queriesNoCross-query reference

Why Query Folding Matters

AspectWith FoldingWithout Folding
Data TransferOnly needed data is transferredAll data transferred, then filtered locally
Refresh TimeFastSlow (especially for large datasets)
Memory UsageLowHigh (all data loaded into memory)
Source LoadSource handles efficiently via indexesMinimal source work; heavy local processing
Incremental RefreshWorks correctlyCannot partition data properly

Best Practices for Maintaining Query Folding

  1. Apply filters early — Filter rows in the first few steps, before any non-foldable operations
  2. Remove columns early — Reduce the data volume before non-foldable steps
  3. Avoid custom columns before filters — Add custom columns after all foldable operations
  4. Use native SQL — For complex logic, write it as a custom SQL query in the source connection
  5. Check regularly — Right-click steps periodically to verify "View Native Query" is available
  6. Keep foldable steps together — Group all foldable operations at the top of the step list

Best Practices

Follow these best practices to create efficient, maintainable, and performant Power Query transformations.

1. Name Your Steps

Rename Applied Steps to describe what they do:

Default NameBetter Name
Changed TypeSet Column Data Types
Filtered RowsFilter to Active Customers Only
Removed ColumnsRemove Audit Columns
Merged QueriesJoin with Product Table
Grouped RowsSummarize Sales by Region

Right-click any step → Rename to change its name.

2. Remove Unnecessary Columns Early

Drop columns you do not need as soon as possible. This:

  • Reduces memory usage during transformations
  • Improves performance
  • Keeps the preview clean and readable
  • Maintains query folding (column removal folds)

3. Filter Rows Early

Apply row filters before heavy transformations. This reduces the number of rows processed in subsequent steps and maintains query folding.

4. Use Query Folding

Structure your steps to maintain query folding as long as possible. Place all foldable operations (filters, column removal, type changes) before non-foldable ones (custom columns, pivots, M functions).

5. Organize Queries into Groups

Right-click in the Queries pane → New Group to create folders:

📁 Staging (raw data, not loaded to model)
   ├── Raw_Sales
   ├── Raw_Customers
   └── Raw_Products
📁 Transformations
   ├── Clean_Sales
   ├── Clean_Customers
   └── Clean_Products
📁 Output (loaded to model)
   ├── Fact_Sales
   ├── Dim_Customer
   └── Dim_Product

6. Disable Load for Staging Queries

Right-click a staging query → uncheck Enable Load. This prevents intermediate/helper queries from being loaded into the data model, reducing model size.

7. Reference vs Duplicate Queries

ActionBehavior
ReferenceCreates a new query that starts from the result of the referenced query. Changes to the source query automatically flow to the referencing query.
DuplicateCreates an independent copy. Changes to the original do not affect the duplicate.

Best Practice: Use Reference when you want multiple queries to share the same initial transformations (e.g., a staging query referenced by multiple output queries). Use Duplicate when you need an independent starting point.

8. Use Parameters for Dynamic Values

Replace hardcoded values (file paths, server names, date ranges) with parameters. This makes your queries flexible and easy to maintain.

9. Document Complex Logic

Add comments in the Advanced Editor for complex M expressions. Future you (and your colleagues) will thank you.

10. Test with a Subset

When developing transformations on a large dataset, use Keep Top Rows (e.g., top 1000) during development for faster preview times. Remove the step before publishing.


Practice Exercises

Exercise 1: Basic Cleaning

Objective: Practice fundamental data cleaning operations.

Instructions:

  1. In Power BI Desktop, click Home → Enter Data and create a table called "RawEmployees" with this data:
| ID  | Full Name       | Department  | Salary  | Start Date |
|-----|-----------------|-------------|---------|------------|
| 1   |   John Smith    | SALES       | 55000   | 2020-03-15 |
| 2   | jane doe        | Marketing   | 62000   | 2019-08-01 |
| 3   | BOB WILLIAMS    | sales       | 48000   | 2021-01-10 |
| 4   | Alice Johnson   | MARKETING   | 71000   | 2018-06-20 |
| 5   |   Charlie Brown | Engineering | 85000   | 2022-02-28 |
| 6   | Diana Prince    | engineering | 92000   | 2017-11-05 |
| 7   | john smith      | Sales       | 55000   | 2020-03-15 |
|     |                 |             |         |            |
  1. Open Power Query Editor (Transform Data)
  2. Perform these cleaning steps:
    • Remove the blank row at the bottom
    • Trim whitespace from Full Name
    • Convert Full Name to Proper Case
    • Standardize Department to Proper Case
    • Change Salary to Whole Number type
    • Change Start Date to Date type
    • Remove the duplicate row (ID 7 duplicates ID 1)
  3. Rename each Applied Step to describe what it does
  4. Click Close & Apply

Exercise 2: Split and Merge Columns

Objective: Practice splitting and merging column operations.

Instructions:

  1. Create a table called "Contacts" with this data:
| FullName      | Email              | Phone         |
|---------------|--------------------|---------------|
| John Smith    | john@company.com   | 212-555-1234  |
| Jane Doe      | jane@company.com   | 310-555-5678  |
| Bob Williams  | bob@other.org      | 415-555-9012  |
| Alice Johnson | alice@company.com  | 617-555-3456  |
  1. In Power Query:
    • Split FullName by space delimiter into FirstName and LastName columns
    • Split Email by "@" to extract the domain name
    • Split Phone by "-" to extract area code, exchange, and number
    • Merge FirstName, LastName, and domain into a new column: "FirstName.LastName@domain"
  2. Close & Apply

Exercise 3: Unpivot Monthly Data

Objective: Transform wide-format data into long format.

Instructions:

  1. Create a table called "MonthlySales" with this data:
| Store    | Jan  | Feb  | Mar  | Apr  | May  | Jun  |
|----------|------|------|------|------|------|------|
| Store A  | 1200 | 1350 | 1100 | 1500 | 1400 | 1600 |
| Store B  | 900  | 1050 | 980  | 1100 | 1200 | 1150 |
| Store C  | 1500 | 1600 | 1450 | 1700 | 1800 | 1650 |
  1. In Power Query:
    • Select the Store column
    • Use Unpivot Other Columns to convert month columns into rows
    • Rename "Attribute" to "Month" and "Value" to "Sales"
    • Change Sales to Whole Number type
  2. Close & Apply and create a line chart showing Sales by Month with Store as the Legend

Exercise 4: Merge Queries (Join)

Objective: Practice joining two tables.

Instructions:

  1. Create two tables:

Orders:

| OrderID | CustomerID | Product  | Amount |
|---------|-----------|----------|--------|
| 101     | C01       | Laptop   | 1200   |
| 102     | C02       | Phone    | 800    |
| 103     | C01       | Tablet   | 500    |
| 104     | C03       | Laptop   | 1200   |
| 105     | C04       | Monitor  | 350    |

Customers:

| CustomerID | Name           | City        |
|-----------|----------------|-------------|
| C01       | Alice Johnson  | New York    |
| C02       | Bob Smith      | Chicago     |
| C03       | Carol Williams | Houston     |
| C05       | Eve Davis      | Seattle     |
  1. In Power Query:
    • Merge the Orders query with Customers using a Left Outer join on CustomerID
    • Expand the merged column to include Name and City
    • Note which orders have null customer information (and why)
    • Create a new merge using Left Anti join to find orders without matching customers
  2. Close & Apply

Exercise 5: Group By with Multiple Aggregations

Objective: Summarize data using Group By.

Instructions:

  1. Using the Orders table from Exercise 4 (after joining with Customers):
    • Open Power Query Editor
    • Use Group By (Advanced) to group by City
    • Add these aggregations:
      • Total Amount (Sum of Amount)
      • Order Count (Count Rows)
      • Average Order Value (Average of Amount)
      • Max Single Order (Max of Amount)
  2. Close & Apply and create a table visual showing the summary

Exercise 6: Write M Code

Objective: Practice writing M language formulas.

Instructions:

  1. Open Power Query Editor with any table containing text and number columns
  2. Open the Advanced Editor and study the existing M code
  3. Create the following Custom Columns using M formulas:
    • A column that concatenates two text columns with a dash separator
    • A column that rounds a number column to 2 decimal places
    • A column that extracts the year from a date column
    • A column that uses if/then/else logic to categorize a numeric column into "High", "Medium", and "Low"
  4. After creating each column, look at the formula bar to see the M expression
  5. Try modifying the formula directly in the formula bar

Exercise 7: Verify Query Folding

Objective: Understand which steps support query folding.

Instructions:

  1. Connect to any SQL Server database (or use a DirectQuery-compatible source)
  2. Load a table and apply these steps in order:
    • Filter rows (e.g., where Year > 2024)
    • Remove three columns
    • Change data types
    • Right-click each step and check if "View Native Query" is available
  3. Now add these additional steps:
    • Add an Index Column
    • Add a Custom Column with an M formula
    • Right-click these new steps and check "View Native Query" — it should be grayed out
  4. Rearrange steps to put foldable operations before non-foldable ones
  5. Document which steps folded and which did not

Summary

In this chapter, you mastered Power Query — the essential data transformation engine in Power BI:

  • Power Query is the ETL engine that sits between your data sources and the data model, cleaning and shaping data through a visual interface of recorded steps
  • The Power Query Editor has four key areas: Ribbon, Queries Pane, Preview Pane, and Query Settings (Applied Steps)
  • Basic transformations include removing columns, renaming columns, changing data types, removing duplicates, removing errors, and removing blank rows
  • Column operations let you split columns by delimiters, merge columns, extract substrings, and create columns from examples
  • Row operations include filtering by text, number, and date conditions, keeping/removing top/bottom rows, and handling duplicates and errors
  • Data type conversions require attention to locale settings, especially for international number and date formats
  • Custom columns can be created through the Custom Column dialog, Conditional Column wizard, or Index Column button
  • Pivoting converts long-format data to wide format; Unpivoting converts wide-format data to long format — both are essential for reshaping data
  • Merging queries joins two tables using six join types: Left Outer, Right Outer, Full Outer, Inner, Left Anti, and Right Anti
  • Appending queries stacks rows from multiple tables based on matching column names
  • Group By aggregates data using Sum, Count, Average, Min, Max, and other operations
  • Date/Time transformations let you extract year, month, day, quarter, week, and calculate date differences
  • Text transformations include trim, clean, case changes, replace values, and extract operations
  • M Language is the functional language behind Power Query, using a let ... in structure with functions for text, numbers, dates, tables, and lists
  • Query Folding translates Power Query steps into native source queries (like SQL) for dramatically better performance — always check that it is active for your critical steps
  • Best practices include naming steps, removing columns early, filtering early, organizing queries into groups, disabling load for staging queries, and using parameters

In the next chapter, you will learn about Data Modeling — how to structure your tables with relationships, create DAX measures, and build an efficient star schema for your reports.