{"id":7169,"date":"2014-03-07T00:00:06","date_gmt":"2014-03-07T00:00:06","guid":{"rendered":"http:\/\/craftydba.com\/?p=7169"},"modified":"2016-04-19T20:29:04","modified_gmt":"2016-04-19T20:29:04","slug":"power-query-part-4","status":"publish","type":"post","link":"https:\/\/craftydba.com\/?p=7169","title":{"rendered":"Power Query \u2013 Part 4"},"content":{"rendered":"<p>There has been a big push for self service business intelligence tools by Microsoft in recent months.  The center of this new universe is Microsoft Excel.  <\/p>\n<p>I will carry on with talks about Power Query which is a self service ETL add-in for MS Excel. This tool was <a href=\"http:\/\/blogs.msdn.com\/b\/powerbi\/archive\/2013\/08\/29\/try-the-latest-power-query-update.aspx\">released <\/a>in August of 2013. In addition to my blog, the <a href=\"http:\/\/office.microsoft.com\/en-us\/excel-help\/microsoft-power-query-for-excel-help-HA104003813.aspx\">online help<\/a> is a good place to start learning about this tool.<\/p>\n<p>Last time, I demonstrated how the &#8220;reduce&#8221; section of ribbon has buttons that can refine the number of rows and columns in the source data set.  Today, I am going to start covering the &#8220;sort&#8221; and &#8220;transform&#8221; sections of the ribbon. <\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F01-power-query-toolbar-ribbon.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F01-power-query-toolbar-ribbon-1024x107.jpg\" alt=\"F01-power-query-toolbar-ribbon\" width=\"665\" height=\"69\" class=\"aligncenter size-large wp-image-7171\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F01-power-query-toolbar-ribbon-1024x107.jpg 1024w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F01-power-query-toolbar-ribbon-300x31.jpg 300w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F01-power-query-toolbar-ribbon.jpg 1101w\" sizes=\"auto, (max-width: 665px) 100vw, 665px\" \/><\/a><\/p>\n<p>Again, I will be working with the US <a href=\"http:\/\/en.wikipedia.org\/wiki\/List_of_U.S._state_abbreviations\">state abbreviations<\/a> data set from Wikipedia.  <\/p>\n<p>The &#8220;sort&#8221; operation allows you to order the data in ascending or descending order.  The picture below shows states in descending order of abbreviation.<\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F02-power-query-transformsort-data.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F02-power-query-transformsort-data.png\" alt=\"F02-power-query-transform~sort-data\" width=\"581\" height=\"358\" class=\"aligncenter size-full wp-image-7176\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F02-power-query-transformsort-data.png 581w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F02-power-query-transformsort-data-300x184.png 300w\" sizes=\"auto, (max-width: 581px) 100vw, 581px\" \/><\/a><\/p>\n<p>Many times the data we get for analysis is not in the correct format.  The core MS Excel application has a &#8220;text to columns&#8221; operation to parse data that might be in a fixed width or delimited format.  Power Query has the &#8220;split column&#8221; operation instead.<\/p>\n<p>One way to split columns is on fixed length (position).  <\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformsplit-column-by-position.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformsplit-column-by-position.jpg\" alt=\"F03-power-query-transform~split-column-by-position\" width=\"702\" height=\"323\" class=\"aligncenter size-full wp-image-7180\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformsplit-column-by-position.jpg 702w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformsplit-column-by-position-300x138.jpg 300w\" sizes=\"auto, (max-width: 702px) 100vw, 702px\" \/><\/a><\/p>\n<p>I see this option less flexible than the one built into the core product that allows you to place arrows where divisions occur in a sample preview of the data.  In real life, a fixed length file will not have columns with the same number of characters.<\/p>\n<p>The image below shows the core product converting fixed length data from the [Employee] table in the [HumanResources] schema in the [AdventureWorks2012] database into columns.<\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformtext-to-columns.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformtext-to-columns.jpg\" alt=\"F03-power-query-transform~text-to-columns\" width=\"523\" height=\"396\" class=\"aligncenter size-full wp-image-7212\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformtext-to-columns.jpg 523w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F03-power-query-transformtext-to-columns-300x227.jpg 300w\" sizes=\"auto, (max-width: 523px) 100vw, 523px\" \/><\/a><\/p>\n<p>Another way to split columns is on a known character (delimiter).  A very common format is comma separated values (CSV) shown below. <\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F04-power-query-transformsplit-column-by-delimited.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F04-power-query-transformsplit-column-by-delimited.jpg\" alt=\"F04-power-query-transform~split-column-by-delimited\" width=\"702\" height=\"325\" class=\"aligncenter size-full wp-image-7182\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F04-power-query-transformsplit-column-by-delimited.jpg 702w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F04-power-query-transformsplit-column-by-delimited-300x138.jpg 300w\" sizes=\"auto, (max-width: 702px) 100vw, 702px\" \/><\/a><\/p>\n<p>Sometimes we get data for analysis that needs to be aggregated before being used.  That is where the &#8220;group by&#8221; operation comes to play.<\/p>\n<p>For instance, our US state abbreviations data set can be categorized by type.  We might be interested in the number of mail parcels that go out to the islands.<\/p>\n<p>First, the raw data has a type column that we can group by.<\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformdata-before-group-by.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformdata-before-group-by.jpg\" alt=\"F05-power-query-transform~data-before-group-by\" width=\"601\" height=\"360\" class=\"aligncenter size-full wp-image-7186\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformdata-before-group-by.jpg 601w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformdata-before-group-by-300x179.jpg 300w\" sizes=\"auto, (max-width: 601px) 100vw, 601px\" \/><\/a><\/p>\n<p>Second, we need to decide on what aggregation function we want to use.  All the standard SQL functions are available:  count, avg, sum, min, and max.<\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformgroup-by-choices.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformgroup-by-choices.jpg\" alt=\"F05-power-query-transform~group-by-choices\" width=\"702\" height=\"308\" class=\"aligncenter size-full wp-image-7188\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformgroup-by-choices.jpg 702w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/F05-power-query-transformgroup-by-choices-300x131.jpg 300w\" sizes=\"auto, (max-width: 702px) 100vw, 702px\" \/><\/a><\/p>\n<p>Third, applying the group by operation to our data produces our expected result.<\/p>\n<p><a href=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/power-query-state-data-before-group-by-3.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/power-query-state-data-before-group-by-3.jpg\" alt=\"power-query-state-data-before-group-by-3\" width=\"311\" height=\"151\" class=\"aligncenter size-full wp-image-7191\" srcset=\"https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/power-query-state-data-before-group-by-3.jpg 311w, https:\/\/craftydba.com\/wp-content\/uploads\/2014\/03\/power-query-state-data-before-group-by-3-300x145.jpg 300w\" sizes=\"auto, (max-width: 311px) 100vw, 311px\" \/><\/a><\/p>\n<p>Summing-up this article, there are many operations in the &#8220;transform&#8221; section of the ribbon that allow you to mold your data into a format that you want.  Next time, we will finish exploring the rest of the operators in this section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There has been a big push for self service business intelligence tools by Microsoft in recent months. The center of this new universe is Microsoft Excel. I will carry on with talks about Power Query which is a self service ETL add-in for MS Excel. This tool was released in August of 2013. In addition to my blog, the online help is a good place to start learning about this tool. Last time, I demonstrated how the &#8220;reduce&#8221; section of ribbon has buttons that can refine the number of rows&hellip;<\/p>\n","protected":false},"author":1,"featured_media":7070,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1042],"tags":[93,15,1043,1045,811,1063],"class_list":["post-7169","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-power-query","tag-group-by","tag-john-f-miner-iii","tag-power-query","tag-self-service-etl","tag-sort","tag-split-column"],"_links":{"self":[{"href":"https:\/\/craftydba.com\/index.php?rest_route=\/wp\/v2\/posts\/7169","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/craftydba.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/craftydba.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/craftydba.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/craftydba.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7169"}],"version-history":[{"count":0,"href":"https:\/\/craftydba.com\/index.php?rest_route=\/wp\/v2\/posts\/7169\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/craftydba.com\/index.php?rest_route=\/wp\/v2\/media\/7070"}],"wp:attachment":[{"href":"https:\/\/craftydba.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7169"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/craftydba.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7169"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/craftydba.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7169"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}