Power Query & Data Transformation
Raw data is rarely clean enough to analyze directly. It arrives with missing values, inconsistent formats, duplicate rows, wrong data types, and structures that do not fit your model. Power Query is the ETL (Extract, Transform, Load) engine built into Power BI that lets you clean, reshape, and transform data before it reaches your data model — all through an intuitive, visual interface.
In this chapter, you will master the Power Query Editor, learn dozens of transformation techniques, understand how to merge and append queries, work with pivot and unpivot operations, and get started with the M language that powers Power Query behind the scenes.
What is Power Query?
Definition
Power Query is a data transformation and data preparation engine available in Power BI Desktop, Excel, Dataflows, and other Microsoft products. It provides a graphical interface for connecting to data sources, applying transformations, and loading the result into your data model.
Key Characteristics
- No-code / low-code — Most transformations are done through the ribbon and context menus without writing any code
- Repeatable — Every step you apply is recorded and replayed automatically during data refresh
- Non-destructive — Transformations never modify the original data source. They are applied as a pipeline of steps on a copy of the data.
- Extensible — Advanced users can write custom transformations using the M language (also known as Power Query Formula Language)
How Power Query Fits into the Power BI Workflow
┌──────────────┐ ┌──────────────────┐ ┌──────────────┐ ┌──────────────┐
│ Data Source │───▶│ Power Query │───▶│ Data Model │───▶│ Report │
│ (Raw Data) │ │ (Transform/ETL) │ │ (DAX/Model) │ │ (Visuals) │
└──────────────┘ └──────────────────┘ └──────────────┘ └──────────────┘
Power Query sits between your raw data sources and the data model. Its job is to ensure that data arriving in the model is clean, properly typed, correctly structured, and ready for analysis.
The Applied Steps Concept
One of Power Query's most powerful features is the Applied Steps panel. Every transformation you perform — renaming a column, filtering rows, changing a data type — is recorded as a named step. These steps form a sequential pipeline:
Source → Connect to data source
Navigation → Select the table/sheet
Promoted Headers → Use first row as headers
Changed Type → Set column data types
Removed Columns → Drop unnecessary columns
Filtered Rows → Keep only relevant rows
Renamed Columns → Give columns meaningful names
Benefits of Applied Steps:
- You can click on any step to see the data at that point in the pipeline
- You can insert, delete, reorder, or modify any step
- Steps are replayed automatically during every data refresh
- You can rename steps to make the pipeline self-documenting
The Power Query Editor Interface
To open the Power Query Editor, click Home → Transform Data in Power BI Desktop.
Layout Overview
The Power Query Editor window has five main areas:
| Area | Location | Purpose |
|---|---|---|
| Ribbon | Top | Buttons for all transformation operations, organized by tabs |
| Queries Pane | Left sidebar | Lists all queries (tables) in your project; organize into groups |
| Preview Pane | Center | Shows a preview of the data for the currently selected query and step |
| Query Settings | Right sidebar | Shows the query name and Applied Steps for the selected query |
| Formula Bar | Below ribbon | Displays the M formula for the currently selected step |
Ribbon Tabs
Home Tab
The most frequently used commands:
| Button Group | Commands |
|---|---|
| Query | Close & Apply, Refresh Preview, Data source settings |
| New Query | New Source, Recent Sources, Enter Data, Blank Query |
| Manage | Manage Parameters, Advanced Editor |
| Transform | Use First Row as Headers, Replace Values, Group By |
| Combine | Merge Queries, Append Queries |
| Data Type | Detect Data Type, column type dropdown |
| Remove | Remove Columns, Remove Rows, Remove Duplicates |
Transform Tab
Operations that modify existing columns:
| Button Group | Commands |
|---|---|
| Table | Transpose, Reverse Rows, Count Rows, Group By |
| Any Column | Data Type, Rename, Move, Unpivot, Fill |
| Text Column | Format (Uppercase, Lowercase, Trim, Clean), Extract, Split Column |
| Number Column | Statistics (Sum, Min, Max, Average), Standard, Scientific, Rounding |
| Date/Time Column | Date (Year, Month, Day, Day of Week), Time (Hour, Minute, Second), Duration |
Add Column Tab
Operations that create new columns while keeping the originals:
| Button Group | Commands |
|---|---|
| General | Custom Column, Conditional Column, Index Column, Column from Examples |
| From Text | Format, Extract, Split Column, Merge Columns |
| From Number | Statistics, Standard, Scientific, Rounding |
| From Date/Time | Date, Time, Duration |
View Tab
Controls what is visible in the editor:
| Command | Purpose |
|---|---|
| Formula Bar | Toggle the M formula bar visibility |
| Monospaced | Switch to a monospaced font for better data readability |
| Column Quality | Show valid, error, and empty percentages for each column |
| Column Distribution | Show value distribution histograms for each column |
| Column Profile | Show detailed statistics for the selected column |
| Query Dependencies | Visual diagram showing how queries reference each other |
Tip: Always enable Column Quality, Column Distribution, and Column Profile during data exploration. They reveal data quality issues at a glance.
Basic Transformations
These are the transformations you will use in virtually every Power Query project.
Remove Columns
Why: Removing unnecessary columns reduces model size and simplifies your data.
How to remove columns:
- Method 1: Select and remove — Select the column(s) you want to remove, right-click, and choose Remove Columns
- Method 2: Select and keep — Select the column(s) you want to keep, right-click, and choose Remove Other Columns (removes everything except selected columns)
- Method 3: Ribbon — Select columns, then Home → Remove Columns
Best Practice: Use "Remove Other Columns" when you only need a few columns from a wide table. This is more resilient to source schema changes because it specifies what to keep rather than what to remove.
Rename Columns
Why: Give columns clear, business-friendly names.
How:
- Double-click the column header and type the new name
- Or right-click the column header → Rename
Best Practice: Use consistent naming conventions:
- PascalCase:
OrderDate,CustomerName,TotalAmount - Or spaces:
Order Date,Customer Name,Total Amount
Change Data Types
Why: Correct data types are essential for proper sorting, filtering, aggregation, and DAX calculations.
How:
- Click the data type icon in the column header (e.g., "ABC" for text, "1.2" for decimal)
- Select the correct type from the dropdown
- Or right-click → Change Type → Select the type
| Icon | Type | Description |
|---|---|---|
| ABC | Text | String values |
| 123 | Whole Number | Integers (no decimals) |
| 1.2 | Decimal Number | Numbers with decimals (floating point) |
| 1.20 | Fixed Decimal Number | Currency-like precision (4 decimal places) |
| Date icon | Date | Date only (no time) |
| Clock icon | Time | Time only (no date) |
| Calendar+Clock | Date/Time | Combined date and time |
| Calendar+Clock+Zone | Date/Time/Timezone | Date, time, and timezone offset |
| Duration | Duration | Time spans (days, hours, minutes, seconds) |
| TRUE/FALSE | True/False | Boolean values |
| Binary | Binary | Binary data (images, files) |
Common Issue: When you change a data type, Power BI may add a "Changed Type" step. If you already have a previous "Changed Type" step, it will ask whether to Replace current or Add new step. Choose Replace current to keep your steps clean unless there is a specific reason to separate them.
Reorder Columns
Why: Organize columns in a logical order for readability.
How:
- Click and drag column headers to the desired position
- Or right-click a column → Move → Left / Right / To Beginning / To End
Remove Duplicates
Why: Eliminate duplicate rows that can inflate aggregations.
How:
- Select the column(s) that define uniqueness
- Right-click → Remove Duplicates
- Power Query removes rows where the selected column combination is not unique, keeping the first occurrence
Example: If you have a customer table with duplicate CustomerID values, select the CustomerID column and remove duplicates.
Remove Errors
Why: Error values in columns can break visualizations and DAX calculations.
How:
- Right-click the column with errors → Remove Errors
- All rows where the selected column contains an error value are removed
Alternative: Replace errors instead of removing them:
- Right-click → Replace Errors → Enter a replacement value (e.g., 0, "Unknown", null)
Remove Blank Rows
Why: Blank rows add noise and inflate row counts.
How:
- Go to Home → Remove Rows → Remove Blank Rows
- This removes rows where ALL columns are null/blank
Column Operations
Split Column
Split a single column into multiple columns based on a pattern.
By Delimiter
Use case: A "Full Name" column that contains "First Last" needs to be split into separate columns.
Steps:
- Select the column to split
- Go to Transform → Split Column → By Delimiter (or right-click → Split Column → By Delimiter)
- Choose the delimiter (comma, semicolon, space, tab, custom, etc.)
- Choose where to split:
- Each occurrence — Split at every delimiter (e.g., "A,B,C" → three columns)
- Left-most delimiter — Split only at the first delimiter
- Right-most delimiter — Split only at the last delimiter
- Click OK
Example:
| Before: FullAddress | After: Street | City | State |
|---|---|---|---|
| 123 Main St, Springfield, IL | 123 Main St | Springfield | IL |
| 456 Oak Ave, Portland, OR | 456 Oak Ave | Portland | OR |
By Number of Characters
Use case: Fixed-width data where the first 3 characters represent a code.
Steps:
- Select the column
- Transform → Split Column → By Number of Characters
- Enter the number of characters (e.g., 3)
- Choose: Once (at position), Repeatedly (every N chars)
- Click OK
By Positions
Use case: Split at specific character positions (e.g., positions 0, 5, 10).
Steps:
- Select the column
- Transform → Split Column → By Positions
- Enter comma-separated positions (e.g., 0, 5, 10)
- Click OK
Merge Columns
Combine two or more columns into one.
Steps:
- Select multiple columns (Ctrl+click)
- Right-click → Merge Columns (or Transform → Merge Columns)
- Choose a separator (space, comma, dash, custom, or none)
- Enter a name for the new column
- Click OK
Example:
| FirstName | LastName | → Merged: FullName |
|---|---|---|
| John | Smith | John Smith |
| Jane | Doe | Jane Doe |
Extract
Extract part of a text value from a column.
Available extractions (Transform → Extract):
| Extraction | Result Example |
|---|---|
| First Characters (N) | "Microsoft" → "Mic" (first 3) |
| Last Characters (N) | "Microsoft" → "oft" (last 3) |
| Range (start, count) | "Microsoft" → "cros" (start 2, count 4) |
| Text Before Delimiter | "John-Smith" → "John" (before "-") |
| Text After Delimiter | "John-Smith" → "Smith" (after "-") |
| Text Between Delimiters | "A[B]C" → "B" (between "[" and "]") |
| Length | "Microsoft" → 9 |
Column from Examples
Power Query can infer the transformation you want based on examples you provide.
Steps:
- Go to Add Column → Column from Examples → From Selection or From All Columns
- A new column appears with empty cells
- Type the desired output for one or two rows
- Power Query detects the pattern and fills the rest
- Review the results and click OK
Example: Given a column with email addresses, type "gmail.com" for the first row that has "user@gmail.com". Power Query infers that you want to extract the domain and fills the rest automatically.
Row Operations
Filter Rows
Filtering rows is one of the most common operations. Click the dropdown arrow on any column header to access filter options.
Text Filters
| Filter | Description | Example |
|---|---|---|
| Equals | Exact match | City equals "New York" |
| Does Not Equal | Exclude exact match | Category does not equal "Discontinued" |
| Begins With | Starts with text | Name begins with "Mc" |
| Ends With | Ends with text | Email ends with "@gmail.com" |
| Contains | Text appears anywhere | Description contains "premium" |
| Does Not Contain | Text does not appear | Notes does not contain "cancelled" |
| Is null | Value is null | Phone is null (find missing phone numbers) |
| Is Not null | Value is not null | Email is not null (only customers with email) |
Number Filters
| Filter | Description |
|---|---|
| Equals / Does Not Equal | Exact numeric match |
| Greater Than / Greater Than or Equal To | Values above a threshold |
| Less Than / Less Than or Equal To | Values below a threshold |
| Between | Values within a range |
| Is Even / Is Odd | Even or odd numbers |
Date Filters
| Filter | Description |
|---|---|
| Is Earliest / Is Latest | The minimum or maximum date |
| Before / After | Dates before or after a specific date |
| Between | Dates within a range |
| In the Previous / In the Next | Relative date filters (last N days/months/years) |
| Year / Quarter / Month / Day | Filter by date component |
Keep / Remove Rows
Home → Remove Rows provides options:
| Operation | Description |
|---|---|
| Remove Top Rows | Remove the first N rows (useful for header metadata rows) |
| Remove Bottom Rows | Remove the last N rows (useful for footer/summary rows) |
| Remove Alternate Rows | Remove every Nth row (useful for cleaning semi-structured data) |
| Remove Blank Rows | Remove rows where all columns are null |
| Remove Errors | Remove rows containing error values |
| Keep Top Rows | Keep only the first N rows |
| Keep Bottom Rows | Keep only the last N rows |
| Keep Range of Rows | Keep rows within a specified range (start position, count) |
| Keep Duplicates | Keep only rows that have duplicates (useful for finding repeated entries) |
| Keep Errors | Keep only rows with errors (useful for error investigation) |
Data Type Conversions
Correct data types are foundational. Here is a comprehensive reference for Power Query data types.
Available Data Types
| Type | Description | Storage | Example |
|---|---|---|---|
| Text | Any string value | Variable | "Hello World", "12345" |
| Whole Number | 64-bit integer | 8 bytes | 42, -100, 1000000 |
| Decimal Number | 64-bit floating point (double precision) | 8 bytes | 3.14159, -0.001 |
| Fixed Decimal | 128-bit fixed-point (4 decimal places) | 16 bytes | 1234.5678 (currency) |
| Percentage | Decimal displayed as percentage | 8 bytes | 0.25 (displayed as 25%) |
| Date | Date without time | 8 bytes | 2025-06-15 |
| Time | Time without date | 8 bytes | 14:30:00 |
| Date/Time | Combined date and time | 8 bytes | 2025-06-15 14:30:00 |
| Date/Time/Timezone | Date, time, and timezone | 10 bytes | 2025-06-15 14:30:00 +05:30 |
| Duration | Time span | 8 bytes | 3.05:30:00 (3 days, 5 hrs, 30 min) |
| True/False | Boolean | 1 byte | true, false |
| Binary | Raw binary data | Variable | Images, files |
Common Conversion Issues
| Issue | Symptom | Cause | Fix |
|---|---|---|---|
| Numbers as text | "1234" instead of 1234; no aggregation available | Source stored numbers as strings | Change type to Whole Number or Decimal Number |
| Dates as numbers | 45000 instead of 2023-03-01 | Excel serial date format | Change type to Date |
| Dates as text | "03/15/2025" shown as text | Source formatting | Change type to Date; set locale if format differs |
| Locale mismatch | "1.234,56" not recognized as 1234.56 | European number format (comma as decimal) | Use "Using Locale" option: Right-click → Change Type → Using Locale → Select appropriate locale |
| Boolean as text | "Yes"/"No" or "1"/"0" instead of TRUE/FALSE | Source uses custom boolean values | Add a conditional column or replace values first, then convert |
| Null handling | Errors when converting nulls | Null values cannot be converted to some types | Replace nulls first, then convert |
Converting with Locale
For international data where number and date formats differ:
- Right-click the column header
- Select Change Type → Using Locale
- Choose the Data Type (e.g., Decimal Number)
- Choose the Locale (e.g., German (Germany) for comma-decimal format)
- Click OK
Adding Custom Columns
Custom Column
Create a new column using an M formula.
Steps:
- Go to Add Column → Custom Column
- Enter a column name
- Write an M expression in the formula box
- Available columns are listed on the right — double-click to insert them
Common Examples:
// Profit calculation
[Revenue] - [Cost]
// Full name from parts
[FirstName] & " " & [LastName]
// Tax calculation (10%)
[Amount] * 0.10
// Year extraction from date
Date.Year([OrderDate])
// Conditional logic
if [Amount] > 1000 then "High" else "Low"
// Text with number conversion
"Order-" & Text.From([OrderID])
Conditional Column
Create a column with if/then/else logic through a visual interface (no coding required).
Steps:
- Go to Add Column → Conditional Column
- Enter a column name
- Define conditions:
| Column Name | Operator | Value | Output |
|---|---|---|---|
| Amount | is greater than | 10000 | "High" |
| Amount | is greater than | 5000 | "Medium" |
| Amount | is greater than | 0 | "Low" |
- Set an Else value (e.g., "Unknown")
- Click OK
Example Result:
| Amount | → Category |
|---|---|
| 15000 | High |
| 7500 | Medium |
| 3200 | Low |
| -100 | Unknown |
Note: Conditions are evaluated top to bottom. The first matching condition is used. Order matters!
Index Column
Add a sequential number to each row.
Steps:
- Go to Add Column → Index Column
- Choose starting value:
- From 0 — First row is 0
- From 1 — First row is 1
- Custom — Specify start value and increment
Use cases:
- Create a surrogate key for tables without a natural key
- Maintain original row order after sorting
- Number rows for ranking or reference
Pivoting and Unpivoting
Pivoting and unpivoting are essential operations for reshaping data between wide and long formats.
Understanding the Difference
| Concept | Direction | Description | When to Use |
|---|---|---|---|
| Unpivot | Wide → Long | Converts multiple columns into attribute-value pairs | When you have months as columns and need a single "Month" column |
| Pivot | Long → Wide | Converts row values into column headers | When you need to spread categories across columns |
Unpivot Example
Scenario: You have a sales table where each month is a separate column:
Before (Wide Format):
| Product | Jan | Feb | Mar | Apr |
|---|---|---|---|---|
| Laptop | 100 | 120 | 95 | 130 |
| Phone | 200 | 180 | 210 | 195 |
| Tablet | 50 | 65 | 70 | 55 |
Goal: Convert to a long format with "Month" and "Sales" columns.
Steps:
- Select the columns to unpivot (Jan, Feb, Mar, Apr) — hold Ctrl and click each column header
- Right-click → Unpivot Columns (or Transform → Unpivot Columns)
- Two new columns are created:
- Attribute — Contains the original column names (Jan, Feb, Mar, Apr)
- Value — Contains the corresponding values
- Rename "Attribute" to "Month" and "Value" to "Sales"
After (Long Format):
| Product | Month | Sales |
|---|---|---|
| Laptop | Jan | 100 |
| Laptop | Feb | 120 |
| Laptop | Mar | 95 |
| Laptop | Apr | 130 |
| Phone | Jan | 200 |
| Phone | Feb | 180 |
| Phone | Mar | 210 |
| Phone | Apr | 195 |
| Tablet | Jan | 50 |
| Tablet | Feb | 65 |
| Tablet | Mar | 70 |
| Tablet | Apr | 55 |
Unpivot Options:
| Option | Description |
|---|---|
| Unpivot Columns | Unpivot the selected columns |
| Unpivot Other Columns | Unpivot all columns EXCEPT the selected ones (more resilient to schema changes) |
| Unpivot Only Selected Columns | Strictly unpivot only the selected columns |
Best Practice: Use "Unpivot Other Columns" whenever possible. If new month columns are added to the source (e.g., May, Jun), they are automatically included. With "Unpivot Columns", new columns would be ignored.
Pivot Example
Scenario: You have a long-format table and need to convert it to wide format.
Before (Long Format):
| Product | Metric | Value |
|---|---|---|
| Laptop | Revenue | 50000 |
| Laptop | Cost | 35000 |
| Laptop | Units | 250 |
| Phone | Revenue | 40000 |
| Phone | Cost | 22000 |
| Phone | Units | 400 |
Steps:
- Select the column whose values will become new column headers ("Metric" column)
- Go to Transform → Pivot Column
- In the dialog:
- Values Column — Select "Value" (the column whose values will populate the new columns)
- Advanced options — Choose the aggregate function (Sum, Count, Min, Max, Don't Aggregate)
- Click OK
After (Wide Format):
| Product | Revenue | Cost | Units |
|---|---|---|---|
| Laptop | 50000 | 35000 | 250 |
| Phone | 40000 | 22000 | 400 |
Merging Queries (Joins)
Merging queries is Power Query's equivalent of SQL JOINs. It combines columns from two tables based on matching values in one or more columns.
Step-by-Step: Merge Two Queries
- Select the primary query in the Queries pane
- Go to Home → Merge Queries (or Merge Queries as New to create a new query)
- In the Merge dialog:
- The top table is already selected (your current query)
- Select the bottom table from the dropdown
- Click the matching column(s) in both tables (click to highlight)
- For multi-column joins, Ctrl+click additional columns in order
- Choose the Join Kind
- Review the match indicator (e.g., "The selection has matched 950 of 1000 rows from the first table")
- Click OK
- A new column appears containing the merged table data (nested tables)
- Click the expand icon (double arrow ↔) on the new column header
- Select which columns to include from the merged table
- Choose whether to use the original column name as a prefix
- Click OK
Join Types Comparison
| Join Type | Returns | SQL Equivalent | Description |
|---|---|---|---|
| Left Outer | All rows from left + matching from right | LEFT JOIN | Keep all left table rows; add matched right data (nulls for non-matches) |
| Right Outer | All rows from right + matching from left | RIGHT JOIN | Keep all right table rows; add matched left data (nulls for non-matches) |
| Full Outer | All rows from both tables | FULL OUTER JOIN | Keep all rows from both tables; nulls where no match |
| Inner | Only matching rows | INNER JOIN | Only rows that have a match in both tables |
| Left Anti | Left rows with NO match in right | NOT EXISTS | Only left table rows that have no match in the right table |
| Right Anti | Right rows with NO match in left | NOT EXISTS | Only right table rows that have no match in the left table |
Visual Description of Joins
Imagine two overlapping circles representing Table A (left) and Table B (right):
- Left Outer — The entire left circle plus the overlap
- Right Outer — The entire right circle plus the overlap
- Full Outer — Both entire circles
- Inner — Only the overlap area
- Left Anti — Only the left circle's non-overlapping area
- Right Anti — Only the right circle's non-overlapping area
Practical Example
Scenario: You have an Orders table and a Customers table. You want to add customer details to each order.
Orders Table:
| OrderID | CustomerID | OrderDate | Amount |
|---|---|---|---|
| 1001 | C01 | 2025-01-15 | 250.00 |
| 1002 | C03 | 2025-01-16 | 180.50 |
| 1003 | C01 | 2025-01-17 | 320.00 |
| 1004 | C05 | 2025-01-18 | 95.00 |
Customers Table:
| CustomerID | CustomerName | City |
|---|---|---|
| C01 | Alice Johnson | New York |
| C02 | Bob Smith | Chicago |
| C03 | Carol Williams | Houston |
| C04 | Dave Brown | Phoenix |
Using Left Outer Join (on CustomerID):
| OrderID | CustomerID | OrderDate | Amount | CustomerName | City |
|---|---|---|---|---|---|
| 1001 | C01 | 2025-01-15 | 250.00 | Alice Johnson | New York |
| 1002 | C03 | 2025-01-16 | 180.50 | Carol Williams | Houston |
| 1003 | C01 | 2025-01-17 | 320.00 | Alice Johnson | New York |
| 1004 | C05 | 2025-01-18 | 95.00 | null | null |
Notice that OrderID 1004 (CustomerID C05) has null values for CustomerName and City because C05 does not exist in the Customers table.
Using Left Anti Join: Would return only OrderID 1004, because it is the only order without a matching customer.
Merge Queries vs Merge Queries as New
| Option | Behavior |
|---|---|
| Merge Queries | Adds the merged columns to the current query |
| Merge Queries as New | Creates a brand-new query with the merge result, leaving original queries unchanged |
Best Practice: Use "Merge Queries as New" when you want to preserve the original queries as staging tables and create a clean merged output.
Appending Queries
Appending stacks rows from two or more tables on top of each other. This is equivalent to SQL's UNION ALL.
Step-by-Step: Append Queries
- Select the primary query
- Go to Home → Append Queries (or Append Queries as New)
- Choose:
- Two tables — Select the second table from the dropdown
- Three or more tables — Add multiple tables from the "Available tables" list to the "Tables to append" list
- Click OK
How Column Matching Works
Power Query matches columns by name (case-insensitive):
| Scenario | Result |
|---|---|
| Both tables have column "Sales" | Values are combined into one "Sales" column |
| Table A has "Revenue", Table B has "Sales" | Two separate columns; each has nulls where the column did not exist in the source table |
| Table A has "Date" (type: Date), Table B has "Date" (type: Text) | Column is created; type conflict may cause errors |
Example: Appending Monthly Sales
January Sales:
| Product | Amount |
|---|---|
| Laptop | 5000 |
| Phone | 3000 |
February Sales:
| Product | Amount |
|---|---|
| Laptop | 6200 |
| Phone | 2800 |
| Tablet | 1500 |
After Appending:
| Product | Amount |
|---|---|
| Laptop | 5000 |
| Phone | 3000 |
| Laptop | 6200 |
| Phone | 2800 |
| Tablet | 1500 |
Tip: Add an index column or source identifier column before appending so you can trace which source each row came from.
Grouping / Aggregation
Group By consolidates rows that share common values and calculates aggregates.
Step-by-Step: Group By
- Go to Transform → Group By (or Home → Group By)
- Choose Basic or Advanced:
- Basic — Group by one column with one aggregation
- Advanced — Group by multiple columns with multiple aggregations
- Configure:
- Group By — Select the column(s) to group by
- New Column Name — Name for the aggregated result column
- Operation — Sum, Count Rows, Average, Min, Max, Count Distinct Values, All Rows
- Add more aggregations by clicking Add aggregation
- Click OK
Available Aggregation Operations
| Operation | Description | Result Type |
|---|---|---|
| Sum | Total of all values | Number |
| Average | Mean of all values | Decimal |
| Median | Middle value | Number |
| Min | Smallest value | Same as source |
| Max | Largest value | Same as source |
| Count Rows | Number of rows in each group | Whole number |
| Count Distinct Values | Number of unique values | Whole number |
| All Rows | A nested table containing all rows for each group | Table |
Example: Summarizing Sales Data
Source Data:
| Region | Product | Amount |
|---|---|---|
| North | Laptop | 5000 |
| North | Laptop | 6200 |
| North | Phone | 3000 |
| South | Laptop | 4500 |
| South | Phone | 2800 |
| South | Phone | 3100 |
Group By (Advanced): Region and Product, with Sum of Amount and Count of Rows
| Region | Product | TotalAmount | OrderCount |
|---|---|---|---|
| North | Laptop | 11200 | 2 |
| North | Phone | 3000 | 1 |
| South | Laptop | 4500 | 1 |
| South | Phone | 5900 | 2 |
The "All Rows" Operation
Selecting "All Rows" creates a nested table column instead of an aggregate. This is useful when you want to keep the detailed rows accessible.
After grouping with "All Rows", you can expand the nested table or perform additional operations on it using custom M code.
Date/Time Transformations
Power Query provides extensive date and time operations.
Extracting Date Components
Select a date column, then go to Transform → Date or Add Column → Date:
| Extraction | Result Example (from 2025-06-15) |
|---|---|
| Year | 2025 |
| Month | 6 |
| Month Name | June |
| Day | 15 |
| Day of Week | Sunday (number: 0) |
| Day of Week Name | Sunday |
| Day of Year | 166 |
| Week of Year | 25 |
| Quarter | 2 |
| Start of Year | 2025-01-01 |
| End of Year | 2025-12-31 |
| Start of Month | 2025-06-01 |
| End of Month | 2025-06-30 |
| Start of Week | 2025-06-09 (Monday) |
| End of Week | 2025-06-15 (Sunday) |
| Start of Quarter | 2025-04-01 |
| End of Quarter | 2025-06-30 |
Calculating Age from Birth Date
Use a Custom Column with this M formula:
let
today = DateTime.LocalNow(),
birthDate = [BirthDate],
age = Duration.Days(today - birthDate) / 365.25
in
Number.RoundDown(age, 0)
Calculating Date Differences
To calculate the number of days between two dates:
Duration.Days([EndDate] - [StartDate])
To calculate months (approximate):
(Date.Year([EndDate]) - Date.Year([StartDate])) * 12
+ (Date.Month([EndDate]) - Date.Month([StartDate]))
Creating a Fiscal Year Column
If your fiscal year starts in April:
if Date.Month([OrderDate]) >= 4
then "FY" & Text.From(Date.Year([OrderDate])) & "-" & Text.From(Date.Year([OrderDate]) + 1)
else "FY" & Text.From(Date.Year([OrderDate]) - 1) & "-" & Text.From(Date.Year([OrderDate]))
Text Transformations
Power Query offers numerous text operations.
Common Text Operations
| Operation | Path | Description | Before → After |
|---|---|---|---|
| Trim | Transform → Format → Trim | Remove leading/trailing spaces | " Hello " → "Hello" |
| Clean | Transform → Format → Clean | Remove non-printable characters | "Hello\x00World" → "HelloWorld" |
| Lowercase | Transform → Format → lowercase | Convert to lowercase | "HELLO" → "hello" |
| UPPERCASE | Transform → Format → UPPERCASE | Convert to uppercase | "hello" → "HELLO" |
| Proper Case | Transform → Format → Capitalize Each Word | Capitalize first letter of each word | "john doe" → "John Doe" |
| Add Prefix | Transform → Format → Add Prefix | Prepend text | "123" → "ID-123" |
| Add Suffix | Transform → Format → Add Suffix | Append text | "report" → "report_v2" |
| Replace Values | Right-click → Replace Values | Find and replace | "NY" → "New York" |
| Trim + Clean combo | Apply both | Best practice for cleaning text | Removes both spaces and hidden characters |
Replace Values
Replace specific values in a column:
- Select the column
- Right-click → Replace Values (or Transform → Replace Values)
- Enter the Value to Find and the Replace With value
- Click OK
Common uses:
- Standardize abbreviations: "CA" → "California"
- Fix typos: "Recieved" → "Received"
- Replace nulls: null → "Not Specified"
Extract Operations
The Transform → Extract menu provides:
| Operation | Description | Example |
|---|---|---|
| Length | Number of characters | "Hello" → 5 |
| First Characters | First N characters | "Hello World" (5) → "Hello" |
| Last Characters | Last N characters | "Hello World" (5) → "World" |
| Range | Characters from position | "Hello World" (6, 5) → "World" |
| Text Before Delimiter | Text before first delimiter | "user@email.com" (@) → "user" |
| Text After Delimiter | Text after first delimiter | "user@email.com" (@) → "email.com" |
| Text Between Delimiters | Text between two delimiters | "[ERROR] message" ([ , ]) → "ERROR" |
M Language Basics
Every step you create in Power Query generates M code behind the scenes. Understanding M allows you to go beyond the GUI and write custom transformations.
What is M?
M (officially "Power Query Formula Language") is a functional, case-sensitive programming language designed for data transformation. It is:
- Functional — Based on expressions and functions, not procedural statements
- Case-sensitive —
Table.SelectRowsis valid;table.selectrowsis not - Lazy-evaluated — Expressions are only computed when their results are needed
- Dynamically typed — Variable types are inferred at runtime
Viewing and Editing M Code
Formula Bar: Each step shows its M expression in the formula bar. You can edit it directly.
Advanced Editor: To see the complete M code for a query:
- Go to Home → Advanced Editor
- The full M query is displayed, showing all steps
The let ... in Structure
Every Power Query query follows this structure:
let
// Step 1: Connect to data source
Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
// Step 2: Navigate to the sheet
Sheet1 = Source{[Item="Sheet1", Kind="Sheet"]}[Data],
// Step 3: Promote headers
#"Promoted Headers" = Table.PromoteHeaders(Sheet1, [PromoteAllScalars=true]),
// Step 4: Change types
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers", {
{"OrderDate", type date},
{"Amount", type number},
{"Quantity", Int64.Type}
}),
// Step 5: Filter rows
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Amount] > 100)
in
// Return the final step
#"Filtered Rows"
Key points:
letdefines a series of named steps (variables)- Each step references the previous step
- Steps are separated by commas
inspecifies which step's result to return (usually the last step)- Step names with spaces or special characters must be enclosed in
#"..." - Comments use
//for single-line or/* ... */for multi-line
Common M Functions
Text Functions
| Function | Description | Example |
|---|---|---|
Text.Upper(text) | Convert to uppercase | Text.Upper("hello") → "HELLO" |
Text.Lower(text) | Convert to lowercase | Text.Lower("HELLO") → "hello" |
Text.Proper(text) | Capitalize each word | Text.Proper("john doe") → "John Doe" |
Text.Trim(text) | Remove leading/trailing whitespace | Text.Trim(" hi ") → "hi" |
Text.Clean(text) | Remove non-printable characters | Text.Clean("hi\x00") → "hi" |
Text.Start(text, count) | First N characters | Text.Start("Hello", 3) → "Hel" |
Text.End(text, count) | Last N characters | Text.End("Hello", 3) → "llo" |
Text.Range(text, start, count) | Substring from position | Text.Range("Hello", 2, 2) → "ll" |
Text.Contains(text, value) | Check if text contains value | Text.Contains("Hello", "ell") → true |
Text.Replace(text, old, new) | Replace occurrences | Text.Replace("Hi Bob", "Bob", "Alice") → "Hi Alice" |
Text.Length(text) | Character count | Text.Length("Hello") → 5 |
Text.From(value) | Convert value to text | Text.From(42) → "42" |
Text.Combine(list, sep) | Join list with separator | Text.Combine({"A","B","C"}, "-") → "A-B-C" |
Text.Split(text, sep) | Split text into list | Text.Split("A-B-C", "-") → |
Number Functions
| Function | Description | Example |
|---|---|---|
Number.Round(num, digits) | Round to N digits | Number.Round(3.14159, 2) → 3.14 |
Number.RoundDown(num, digits) | Round down (floor) | Number.RoundDown(3.9, 0) → 3 |
Number.RoundUp(num, digits) | Round up (ceiling) | Number.RoundUp(3.1, 0) → 4 |
Number.Abs(num) | Absolute value | Number.Abs(-42) → 42 |
Number.Mod(num, divisor) | Remainder (modulo) | Number.Mod(10, 3) → 1 |
Number.Power(num, exp) | Exponentiation | Number.Power(2, 10) → 1024 |
Number.Sqrt(num) | Square root | Number.Sqrt(144) → 12 |
Number.From(value) | Convert to number | Number.From("42") → 42 |
Date Functions
| Function | Description | Example |
|---|---|---|
Date.Year(date) | Extract year | Date.Year(#date(2025, 6, 15)) → 2025 |
Date.Month(date) | Extract month number | Date.Month(#date(2025, 6, 15)) → 6 |
Date.Day(date) | Extract day | Date.Day(#date(2025, 6, 15)) → 15 |
Date.DayOfWeek(date) | Day of week (0=Sun) | Date.DayOfWeek(#date(2025, 6, 15)) → 0 |
Date.DayOfYear(date) | Day of year | Date.DayOfYear(#date(2025, 6, 15)) → 166 |
Date.WeekOfYear(date) | Week number | Date.WeekOfYear(#date(2025, 6, 15)) → 25 |
Date.QuarterOfYear(date) | Quarter (1-4) | Date.QuarterOfYear(#date(2025, 6, 15)) → 2 |
Date.StartOfMonth(date) | First day of month | Date.StartOfMonth(#date(2025, 6, 15)) → 2025-06-01 |
Date.EndOfMonth(date) | Last day of month | Date.EndOfMonth(#date(2025, 6, 15)) → 2025-06-30 |
Date.AddDays(date, n) | Add N days | Date.AddDays(#date(2025, 6, 15), 10) → 2025-06-25 |
Date.AddMonths(date, n) | Add N months | Date.AddMonths(#date(2025, 6, 15), 3) → 2025-09-15 |
Date.AddYears(date, n) | Add N years | Date.AddYears(#date(2025, 6, 15), 1) → 2026-06-15 |
DateTime.LocalNow() | Current date/time | Returns the current local timestamp |
Table Functions
| Function | Description |
|---|---|
Table.SelectRows(table, condition) | Filter rows based on a condition |
Table.RemoveColumns(table, columns) | Remove specified columns |
Table.RenameColumns(table, renames) | Rename columns |
Table.AddColumn(table, name, formula) | Add a calculated column |
Table.TransformColumnTypes(table, types) | Change column data types |
Table.PromoteHeaders(table) | Use first row as column headers |
Table.Sort(table, comparisons) | Sort the table |
Table.Distinct(table) | Remove duplicate rows |
Table.Group(table, keys, aggregations) | Group by and aggregate |
Table.NestedJoin(table1, key1, table2, key2, name, joinKind) | Merge/join tables |
Table.Combine(tables) | Append tables together |
Table.Pivot(table, pivotColumn, valueColumn, aggregation) | Pivot operation |
Table.Unpivot(table, columns, attributeName, valueName) | Unpivot operation |
Table.RowCount(table) | Count number of rows |
Table.ColumnNames(table) | Get list of column names |
List Functions
| Function | Description | Example |
|---|---|---|
List.Sum(list) | Sum all values | List.Sum({1, 2, 3}) → 6 |
List.Average(list) | Average of values | List.Average({10, 20, 30}) → 20 |
List.Min(list) | Minimum value | List.Min({5, 3, 8}) → 3 |
List.Max(list) | Maximum value | List.Max({5, 3, 8}) → 8 |
List.Count(list) | Number of elements | List.Count({1, 2, 3}) → 3 |
List.Distinct(list) | Unique values | List.Distinct({1, 2, 2, 3}) → |
List.Contains(list, value) | Check membership | List.Contains({1, 2, 3}, 2) → true |
List.Sort(list) | Sort values | List.Sort({3, 1, 2}) → |
List.Transform(list, func) | Apply function to each | List.Transform({1,2,3}, each _ * 2) → |
Writing Custom M in the Advanced Editor
Example: Create a custom function that cleans text by trimming, cleaning, and converting to proper case:
let
CleanText = (inputText as text) as text =>
let
trimmed = Text.Trim(inputText),
cleaned = Text.Clean(trimmed),
proper = Text.Proper(cleaned)
in
proper
in
CleanText
You can then invoke this function from other queries to reuse the logic.
M Language Comments
// This is a single-line comment
/* This is a
multi-line comment */
let
// Step 1: Load data from the source file
Source = Excel.Workbook(File.Contents("C:\Data\file.xlsx"), null, true),
/* Step 2: Apply transformations
- Remove unnecessary columns
- Filter to active records only */
Transformed = Table.SelectRows(Source, each [Status] = "Active")
in
Transformed
Query Folding
Query folding is one of the most important performance concepts in Power Query.
What is Query Folding?
Query folding is the process by which Power Query translates its transformation steps into native queries (e.g., SQL) that are executed at the data source rather than in Power BI's memory.
Without query folding: Power BI downloads all data first, then applies transformations locally (slow for large datasets).
With query folding: The data source handles filtering, sorting, and transformations, and only the result is sent to Power BI (fast and efficient).
Example
If you apply these steps to a SQL Server table:
- Filter rows where
OrderDate >= 2025-01-01 - Remove the
Notescolumn - Sort by
OrderDate
With query folding, Power Query generates this SQL:
SELECT [OrderID], [CustomerID], [OrderDate], [Amount]
FROM [dbo].[Orders]
WHERE [OrderDate] >= '2025-01-01'
ORDER BY [OrderDate]
The SQL Server processes the filter, column selection, and sort. Only the matching rows are sent to Power BI.
How to Check if Query Folding is Active
- In Power Query Editor, right-click on any step in the Applied Steps panel
- Look for "View Native Query":
- Available (not grayed out) — Query folding is active for this step. Click to see the generated SQL.
- Grayed out — Query folding has broken at or before this step.
Steps That Support Query Folding
| Step Type | Foldable? |
|---|---|
| Filter rows (basic conditions) | Yes |
| Remove columns | Yes |
| Rename columns | Yes |
| Sort rows | Yes |
| Change data types (basic) | Yes |
| Group By | Yes |
| Merge (join) with another folded query | Yes |
| Keep/Remove top/bottom rows | Yes |
| Add conditional column (simple) | Sometimes |
Steps That Break Query Folding
| Step Type | Foldable? | Reason |
|---|---|---|
| Add Index Column | No | No SQL equivalent |
| Merge Columns | No | Custom string operation |
| Pivot/Unpivot (in some sources) | No | Complex transformation |
| Custom Column (complex M logic) | No | Cannot translate to SQL |
| Replace Values (with complex logic) | No | Custom logic |
| Column from Examples | No | Pattern-based transformation |
| Change type using Locale | No | Local conversion function |
| Buffer table (Table.Buffer) | No | Forces local evaluation |
| Added column referencing other queries | No | Cross-query reference |
Why Query Folding Matters
| Aspect | With Folding | Without Folding |
|---|---|---|
| Data Transfer | Only needed data is transferred | All data transferred, then filtered locally |
| Refresh Time | Fast | Slow (especially for large datasets) |
| Memory Usage | Low | High (all data loaded into memory) |
| Source Load | Source handles efficiently via indexes | Minimal source work; heavy local processing |
| Incremental Refresh | Works correctly | Cannot partition data properly |
Best Practices for Maintaining Query Folding
- Apply filters early — Filter rows in the first few steps, before any non-foldable operations
- Remove columns early — Reduce the data volume before non-foldable steps
- Avoid custom columns before filters — Add custom columns after all foldable operations
- Use native SQL — For complex logic, write it as a custom SQL query in the source connection
- Check regularly — Right-click steps periodically to verify "View Native Query" is available
- Keep foldable steps together — Group all foldable operations at the top of the step list
Best Practices
Follow these best practices to create efficient, maintainable, and performant Power Query transformations.
1. Name Your Steps
Rename Applied Steps to describe what they do:
| Default Name | Better Name |
|---|---|
| Changed Type | Set Column Data Types |
| Filtered Rows | Filter to Active Customers Only |
| Removed Columns | Remove Audit Columns |
| Merged Queries | Join with Product Table |
| Grouped Rows | Summarize Sales by Region |
Right-click any step → Rename to change its name.
2. Remove Unnecessary Columns Early
Drop columns you do not need as soon as possible. This:
- Reduces memory usage during transformations
- Improves performance
- Keeps the preview clean and readable
- Maintains query folding (column removal folds)
3. Filter Rows Early
Apply row filters before heavy transformations. This reduces the number of rows processed in subsequent steps and maintains query folding.
4. Use Query Folding
Structure your steps to maintain query folding as long as possible. Place all foldable operations (filters, column removal, type changes) before non-foldable ones (custom columns, pivots, M functions).
5. Organize Queries into Groups
Right-click in the Queries pane → New Group to create folders:
📁 Staging (raw data, not loaded to model)
├── Raw_Sales
├── Raw_Customers
└── Raw_Products
📁 Transformations
├── Clean_Sales
├── Clean_Customers
└── Clean_Products
📁 Output (loaded to model)
├── Fact_Sales
├── Dim_Customer
└── Dim_Product
6. Disable Load for Staging Queries
Right-click a staging query → uncheck Enable Load. This prevents intermediate/helper queries from being loaded into the data model, reducing model size.
7. Reference vs Duplicate Queries
| Action | Behavior |
|---|---|
| Reference | Creates a new query that starts from the result of the referenced query. Changes to the source query automatically flow to the referencing query. |
| Duplicate | Creates an independent copy. Changes to the original do not affect the duplicate. |
Best Practice: Use Reference when you want multiple queries to share the same initial transformations (e.g., a staging query referenced by multiple output queries). Use Duplicate when you need an independent starting point.
8. Use Parameters for Dynamic Values
Replace hardcoded values (file paths, server names, date ranges) with parameters. This makes your queries flexible and easy to maintain.
9. Document Complex Logic
Add comments in the Advanced Editor for complex M expressions. Future you (and your colleagues) will thank you.
10. Test with a Subset
When developing transformations on a large dataset, use Keep Top Rows (e.g., top 1000) during development for faster preview times. Remove the step before publishing.
Practice Exercises
Exercise 1: Basic Cleaning
Objective: Practice fundamental data cleaning operations.
Instructions:
- In Power BI Desktop, click Home → Enter Data and create a table called "RawEmployees" with this data:
| ID | Full Name | Department | Salary | Start Date |
|-----|-----------------|-------------|---------|------------|
| 1 | John Smith | SALES | 55000 | 2020-03-15 |
| 2 | jane doe | Marketing | 62000 | 2019-08-01 |
| 3 | BOB WILLIAMS | sales | 48000 | 2021-01-10 |
| 4 | Alice Johnson | MARKETING | 71000 | 2018-06-20 |
| 5 | Charlie Brown | Engineering | 85000 | 2022-02-28 |
| 6 | Diana Prince | engineering | 92000 | 2017-11-05 |
| 7 | john smith | Sales | 55000 | 2020-03-15 |
| | | | | |
- Open Power Query Editor (Transform Data)
- Perform these cleaning steps:
- Remove the blank row at the bottom
- Trim whitespace from Full Name
- Convert Full Name to Proper Case
- Standardize Department to Proper Case
- Change Salary to Whole Number type
- Change Start Date to Date type
- Remove the duplicate row (ID 7 duplicates ID 1)
- Rename each Applied Step to describe what it does
- Click Close & Apply
Exercise 2: Split and Merge Columns
Objective: Practice splitting and merging column operations.
Instructions:
- Create a table called "Contacts" with this data:
| FullName | Email | Phone |
|---------------|--------------------|---------------|
| John Smith | john@company.com | 212-555-1234 |
| Jane Doe | jane@company.com | 310-555-5678 |
| Bob Williams | bob@other.org | 415-555-9012 |
| Alice Johnson | alice@company.com | 617-555-3456 |
- In Power Query:
- Split FullName by space delimiter into FirstName and LastName columns
- Split Email by "@" to extract the domain name
- Split Phone by "-" to extract area code, exchange, and number
- Merge FirstName, LastName, and domain into a new column: "FirstName.LastName@domain"
- Close & Apply
Exercise 3: Unpivot Monthly Data
Objective: Transform wide-format data into long format.
Instructions:
- Create a table called "MonthlySales" with this data:
| Store | Jan | Feb | Mar | Apr | May | Jun |
|----------|------|------|------|------|------|------|
| Store A | 1200 | 1350 | 1100 | 1500 | 1400 | 1600 |
| Store B | 900 | 1050 | 980 | 1100 | 1200 | 1150 |
| Store C | 1500 | 1600 | 1450 | 1700 | 1800 | 1650 |
- In Power Query:
- Select the Store column
- Use Unpivot Other Columns to convert month columns into rows
- Rename "Attribute" to "Month" and "Value" to "Sales"
- Change Sales to Whole Number type
- Close & Apply and create a line chart showing Sales by Month with Store as the Legend
Exercise 4: Merge Queries (Join)
Objective: Practice joining two tables.
Instructions:
- Create two tables:
Orders:
| OrderID | CustomerID | Product | Amount |
|---------|-----------|----------|--------|
| 101 | C01 | Laptop | 1200 |
| 102 | C02 | Phone | 800 |
| 103 | C01 | Tablet | 500 |
| 104 | C03 | Laptop | 1200 |
| 105 | C04 | Monitor | 350 |
Customers:
| CustomerID | Name | City |
|-----------|----------------|-------------|
| C01 | Alice Johnson | New York |
| C02 | Bob Smith | Chicago |
| C03 | Carol Williams | Houston |
| C05 | Eve Davis | Seattle |
- In Power Query:
- Merge the Orders query with Customers using a Left Outer join on CustomerID
- Expand the merged column to include Name and City
- Note which orders have null customer information (and why)
- Create a new merge using Left Anti join to find orders without matching customers
- Close & Apply
Exercise 5: Group By with Multiple Aggregations
Objective: Summarize data using Group By.
Instructions:
- Using the Orders table from Exercise 4 (after joining with Customers):
- Open Power Query Editor
- Use Group By (Advanced) to group by City
- Add these aggregations:
- Total Amount (Sum of Amount)
- Order Count (Count Rows)
- Average Order Value (Average of Amount)
- Max Single Order (Max of Amount)
- Close & Apply and create a table visual showing the summary
Exercise 6: Write M Code
Objective: Practice writing M language formulas.
Instructions:
- Open Power Query Editor with any table containing text and number columns
- Open the Advanced Editor and study the existing M code
- Create the following Custom Columns using M formulas:
- A column that concatenates two text columns with a dash separator
- A column that rounds a number column to 2 decimal places
- A column that extracts the year from a date column
- A column that uses
if/then/elselogic to categorize a numeric column into "High", "Medium", and "Low"
- After creating each column, look at the formula bar to see the M expression
- Try modifying the formula directly in the formula bar
Exercise 7: Verify Query Folding
Objective: Understand which steps support query folding.
Instructions:
- Connect to any SQL Server database (or use a DirectQuery-compatible source)
- Load a table and apply these steps in order:
- Filter rows (e.g., where Year > 2024)
- Remove three columns
- Change data types
- Right-click each step and check if "View Native Query" is available
- Now add these additional steps:
- Add an Index Column
- Add a Custom Column with an M formula
- Right-click these new steps and check "View Native Query" — it should be grayed out
- Rearrange steps to put foldable operations before non-foldable ones
- Document which steps folded and which did not
Summary
In this chapter, you mastered Power Query — the essential data transformation engine in Power BI:
- Power Query is the ETL engine that sits between your data sources and the data model, cleaning and shaping data through a visual interface of recorded steps
- The Power Query Editor has four key areas: Ribbon, Queries Pane, Preview Pane, and Query Settings (Applied Steps)
- Basic transformations include removing columns, renaming columns, changing data types, removing duplicates, removing errors, and removing blank rows
- Column operations let you split columns by delimiters, merge columns, extract substrings, and create columns from examples
- Row operations include filtering by text, number, and date conditions, keeping/removing top/bottom rows, and handling duplicates and errors
- Data type conversions require attention to locale settings, especially for international number and date formats
- Custom columns can be created through the Custom Column dialog, Conditional Column wizard, or Index Column button
- Pivoting converts long-format data to wide format; Unpivoting converts wide-format data to long format — both are essential for reshaping data
- Merging queries joins two tables using six join types: Left Outer, Right Outer, Full Outer, Inner, Left Anti, and Right Anti
- Appending queries stacks rows from multiple tables based on matching column names
- Group By aggregates data using Sum, Count, Average, Min, Max, and other operations
- Date/Time transformations let you extract year, month, day, quarter, week, and calculate date differences
- Text transformations include trim, clean, case changes, replace values, and extract operations
- M Language is the functional language behind Power Query, using a
let ... instructure with functions for text, numbers, dates, tables, and lists - Query Folding translates Power Query steps into native source queries (like SQL) for dramatically better performance — always check that it is active for your critical steps
- Best practices include naming steps, removing columns early, filtering early, organizing queries into groups, disabling load for staging queries, and using parameters
In the next chapter, you will learn about Data Modeling — how to structure your tables with relationships, create DAX measures, and build an efficient star schema for your reports.