Get & Transform: An Alternative to Reformat Macros

Excel 2016 includes a set of features called Get & Transform. In previous versions of Excel, these capabilities were included in the Power Query Add-In. In this post, we’ll see how a Get & Transform Query can be used as an alternative to a VBA macro.

Overview

Here is the scenario. We export data out of some system, and save it in a CSV file. We then need to prepare it for use. Perhaps to import into another system, or perhaps to use in a PivotTable or formula-based report. We basically need to clean up the data, remove some columns, change the headers, and so on.

Back in the old days, we could automate such a task with a VBA macro that reformatted the data. Yes, it was difficult to write initially, but, it felt great when we got it working. And life was good, until something in the data changed….like a new column. Some type of change like this could break the macro. Then, we would need to crack open the Visual Basic Editor and troubleshoot. It took some time to figure out how to resolve the issue, but, it felt great once we got it working again. Until something else changed. Argh.

The good news is that a Get & Transform Query is an easy alternative to building such a VBA macro. Best of all, modifications are easy to make when something changes.

Objective

For the purposes of this post, here is what we’d like to automate without a VBA macro:

  • Retrieve data into Excel from a CSV file
  • Put all email addresses into lower case
  • Put all state codes into UPPER case
  • Capitalize the first word of all other names
  • Remove a column
  • Order the columns
  • Change the column headers
  • Filter out certain rows

And, we want to be able to refresh next month with a single click, or better yet, no click.

Sound like a tall order? It is easy these days. All we need is a Get & Transform Query.

Note: if you are working along in a version of Excel other than Excel 2016 for Windows, you may not have the Get & Transform tools, or, you may need to download the Power Query add-in.

Details

Let’s just jump right in. We’ll basically take the steps in the same order as the bullets above.

Retrieve CSV data

We’ll retrieve orders from a CSV file that was exported from our ecommerce system. But, if you are working with data in some other format, you’ll be glad to know that Get & Transform works with tons of data sources.

Note: if you’d like to work along with these steps, feel free to download the sample data file using the link at the end of this post.

To begin, we create a new workbook and then select the following Ribbon command:

  • Data > New Query > From File > From CSV

This opens the Import Data dialog, where we simply browse to and select the CSV file. Once we do, we are presented with a dialog that allows us to preview a sample of the data, as shown below.

Data Preview by Jeff Lenning

Since our data needs to be cleaned up a bit, we’ll click the Edit button. This opens the Query Editor dialog, as shown below.

Query Editor by Jeff Lenning

Now, it is time to clean up, or transform, our data.

Before we jump in, it is important to realize that the following transformations are done inside of Excel, and are not being made to the source data. The source data, in our case the CSV file, is left unaltered. The preview just shows what our data will look like when it arrives in Excel.

Lower case

The first task is to clean up the email address column. Since the customers enter their information online, the email addresses are inconsistent. Some customers enter their address in all caps ([email protected]), some in lower case ([email protected]), and some in proper case ([email protected]). Since we like our data to be nice and tidy, we’ll convert all email addresses into lower case. To do so, we begin by selecting the email column as shown below.

Email before

To transform into lower case, we can either right-click the column header and select Transform > lowercase, or, click the following Ribbon icon:

  • Text Columns > Format > lowercase

Either way, Excel updates the data as shown below.

Email after by Jeff Lenning

Note: to undo a step in the Query Editor, you click the corresponding x in the Applied Steps list box.

Upper case

Also sloppy is the State code. We have upper, lower, and mixed case. So, we right-click the State column header, and transform to UPPERCASE. The results are shown below.

State code by Jeff Lenning

Capitalize

Actually, the City, StreetAddress, GiveName, and Surname columns are all sloppy as well. So, we simply transform each of them to Capitalize Each Word. The clean results are shown below.

Capitalize Each Word by Jeff Lenning

Remove columns

We don’t need the MiddleName column, so, we’ll remove it. We can do it with the Remove Columns Ribbon command, or, we can simply right-click the column header and select Remove.

The update is shown below.

Remove column by Jeff Lenning

Order the columns

We need to change the order of the columns. To do so, we can just click-and-drag the column headers, or, right-click a column header and use the Move option.

In our case, we need the email address column to be the first column, so, we just right-click the Email column header, and select Move > To Beginning.

The update is shown below.

Move column by Jeff Lenning

Change column headers

Now we’d like to update the column headers. Doing so is straightforward. We can right-click the column header and select Rename, or, use the following Ribbon command:

  • Transform > Rename

We update the GivenName header to First Name, Surname to Last Name, and StreetAddress to Address. These updates are shown below.

Rename columns by Jeff Lenning

Filter

Finally, we want to exclude any rows where the amount is zero. To do this, we simply select the drop-down in the Amount column header, and select Number Filters > Does Not Equal…0.

The update is shown below.

Filter by Jeff Lenning

With our data cleaned up and looking good, we are ready to bring it into Excel.

Load

To return the data to Excel, we use the Close & Load command. If we have a specific destination in mind, such as existing worksheet, we can click Close & Load To. If we just want to pull the data into a new worksheet, we can click Close & Load. In our case, we click Close & Load, and bam…the transformed data flows into an Excel table as shown below.

Clean data by Jeff Lenning

And…we didn’t use a single line of VBA code to do it!

Updating

What about updating next period? Well, it is easy. There are actually two issues to consider. The first is updating the Excel file with newly exported CSV data. The second is updating the query to perform different steps. Let’s talk about both issues.

New Data

When you export new data next period, if you save it in the same folder and with the same file name as the old CSV, all you need to do in Excel is right-click any data cell and select Refresh, or, use the following Ribbon command:

  • Query Tools > Refresh

That will update the table with a single click. Can you have Excel update the data without any clicks? Yes, you can ask Excel to refresh the query automatically when you open the workbook. To do this, right-click any data cell, and select Table > External Data Properties. In the resulting External Data Properties dialog, click the little Connection Properties icon to the right of the Name field. In the resulting Connection Properties dialog, check the Refresh data when opening the file checkbox, as shown below.

Connection properties by Jeff Lenning

Now, you just export the new CSV file, save it to the same folder with the same name, and then open your Excel file. Excel will automatically apply the transformations and retrieve the updated data into the table. Wow.

If you use a different file name, then you’ll need to edit the query. Which also happens to be how you add, edit, or delete steps. So, let’s talk about how to edit the query.

Edit the Query

To edit the query, you can select any data cell and then double-click the query in the Workbook Queries panel, or, click the following Ribbon command:

  • Query Tools > Edit

This will display the Query Editor dialog. Here, you can change all kinds of things. To edit an existing step, click the gear in the Applied Step panel. For example, to change the data source, click the gear on the Source applied step. To change the order of a step, right-click it and select Move Up or Move Down.

To delete a step, click the x in the Applied Step dialog.

To add a new step, select the last step and then perform the task. To add a step in the middle of the current steps, select any step and then perform the task. Be careful when inserting an intermediate step because it may affect subsequent steps and break your query.

Conclusion

If you haven’t yet played with the Get & Transform features, they are worth digging in to. I don’t want to sound overly dramatic here, but, they are a game changer. They provide us new ways to solve old issues and allow us to easily accomplish tasks that were previously really hard or impractical. I have a series of Get & Transform blog posts topics lined up that will demonstrate some of the capabilities. If you have any Get & Transform tips, please share by posting a comment below…thanks!

Additional Resources

  • If you’d like to practice the steps above, open a new Excel workbook and then use this sample CSV data file: Data

 

Posted in , ,

Jeff Lenning

I love sharing the things I've learned about Excel, and I built Excel University to help me do that. My motto is: Learn Excel. Work Faster.

Excel is not what it used to be.

You need the Excel Proficiency Roadmap now. Includes 6 steps for a successful journey, 3 things to avoid, and weekly Excel tips.

Want to learn Excel?

Our training programs start at $29 and will help you learn Excel quickly.

9 Comments

  1. Miranda Parks on June 17, 2016 at 6:11 am

    WOW! This is seriously a game changer!! Thanks for the tips and the easy walk though instructions! Will be following many more of your post!

    • jefflenning on June 17, 2016 at 9:59 am

      Game changer indeed my friend! G&T is my favorite built-in enhancement in 2016 🙂

  2. David Blumberg on June 20, 2016 at 1:02 pm

    Great stuff, thanks for sharing your skill and expertise with us.
    I always look forward to your posts.

    David

    • jefflenning on June 20, 2016 at 1:03 pm

      My pleasure 🙂

  3. Jim Heetderks on November 9, 2017 at 8:32 am

    OK, I took the 45 minute intro to Power Query yesterday and I wasn’t quite sure this is a feature I’d use very much. BUT – I just found my first application of it (pulling in a list of names and addresses from an online database), and I love it! Good stuff. I’m already thinking of other uses. Thanks again.

  4. Will on March 9, 2018 at 3:14 pm

    Is there a way to build into the query to ensure zip codes that start in 0 to add the 0 in?

    • Jeff Lenning on March 15, 2018 at 1:46 pm

      We have a couple of options. The easiest is to simply force a leading zero to be displayed in Excel. To do so, select the results table zip column and Format Cells > Custom > “00000”. Or, inside Power Query, you can create a conditional column that checks to see if the length of the zip code is 4, and if so, adds a leading zero. This works when the data type of the zip code is Text. Create the conditional column, and then modify the formula to include an argument like this: each if Text.Length([ZipCode]) = 4 then “0” & [ZipCode] else [ZipCode]
      Hope it helps!
      Thanks
      Jeff

  5. Triet Luong on April 11, 2021 at 5:34 am

    I love your lessons, all are helpful and easy to understand. Thanks so much, Jeff

Leave a Comment