Empowering Fabric Lakehouse Users Microsoft Fabric allows developers to create delta tables in the Lakehouse. However, automating the copying, updating, and deleting of data files in the Lakehouse might not be possible. How can we empower end users to make these changes? Business Problem Today, we are going to cover two new tools from Microsoft that are in preview. First, the OneLake file explorer allows users to access files in the Lakehouse as if they were in Windows Explorer. Second, the Data Wrangler extension for Visual Studio Code…
Thread 05 – Data Engineering with Fabric
Data Presentation Layer Microsoft Fabric allows the developer to create delta tables in the lakehouse. The bronze tables contain multiple versions of the truth, and the silver tables are a cleaned-up, single version of the truth. How can we combine the silver tables into a relational model for consumption from the gold layer? Business Problem Our manager at Adventure Works has asked us to use a metadata-driven solution to ingest CSV files from external storage into Microsoft Fabric. A typical medallion architecture will be used in the…
Thread 04 – Data Engineering with Fabric
Metadata Driven Pipelines What is a metadata-driven pipeline? Wikipedia defines metadata as “data that provides information about other data”. As developers, we can create a non-parameterized pipeline and/or notebook to solve a business problem. However, if we have to solve the same problem a hundred times, the amount of code can get unwieldy. A better way to solve this problem is to store metadata in the delta lake. This data will drive how Azure Data Factory pipelines and Spark notebooks execute. Business Problem Our manager has asked…
Thread 03 – Data Engineering with Fabric
Full versus Incremental Loads The loading of data from a source system to a target system has been well documented over the years. My first introduction to an Extract, Transform and Load program was DTS for SQL Server 7.0 in 1998. In a data lake, we have a bronze quality zone that is supposed to represent the raw data in a delta file format. This might include versions of the files for auditing. In the silver quality zone, we have a single version of the truth. The data is de-duplicated and cleaned up.…
Thread 02 – Data Engineering with Fabric
Managing Files and Folders What is a data lake? It is just a bunch of files organized into folders. Keeping these files organized prevents your data lake from becoming a data swamp. Today, we are going to learn about a Python library that can help you. Business Problem Our manager has given us weather data to load into Microsoft Fabric. We need to create folders in the landing zone to organize these files by both full and incremental loads. How can we accomplish this task? Technical Solution This use case…
Thread 01 – Data Engineering with Fabric
Managed vs. Unmanaged Tables Microsoft Fabric was released to general availability on November 15th, 2023. I will be writing a quick post periodically in 2024 to get you up to speed on how to manipulate data in the lakehouse using Spark. I really like the speed of the starter pools in Microsoft Fabric. A one-to-ten-node pool will be available for consumption in less than 10 seconds. Read all about this new compute on this Learn page. Business Problem Our manager has given us weather data…
New England – Microsoft Developers Group
I am enthusiastic about presenting to the New England Microsoft Developers group on June 14th, 2018. As an added benefit, I am looking forward to catching up with my friend and chapter leader, Andy Novick. The meeting venue is at the following address. Microsoft Technology Center 5 Wayside Road Burlington, MA 01803 I hope to see you at the event. Topic: Introduction to Cosmos Database for the C# developer Abstract: Azure Cosmos database is Microsoft’s new globally distributed, multi-model database service. Right now, there is support for five different application…
MONTREAL CANADA – SQL SAT #758
I am traveling to the city of saints to present at SQL Saturday #748 in Montreal, Canada. It will be a good time to meet old friends, make new ones and learn something new. I am really glad that my friend Dennis has decided to go on the road trip with me. It is really boring driving 375+ miles by yourself! I hope you have time to attend this awesome free event on June 2, 2018. I will be talking in a morning session on the following topic. Topic:…
New York City – SQL SATURDAY #716
I am enthusiastic about presenting at SQL Saturday #716 at the Microsoft office in New York City on May 19, 2018. Of course, I am looking forward to seeing old friends, making new acquaintances, and learning something new. The meeting venue is at the following address. Microsoft Technology Center 11 Times Square New York, NY 10036 Details about the presentations are below. Topic: Standard and Custom Auditing of Azure SQL Database Abstract: The process of classifying a company into an industry segment has been around since the…
Philadelphia – SQL SATURDAY #714
I am traveling to the city of brotherly love to present at SQL Saturday #714 in Blue Bell, PA. It will be a good time to meet old friends, make new ones and learn something new. I am really glad that my daughter has decided to go on the road trip with me. It is really boring driving 350+ miles by yourself! I hope you have time to attend this awesome free event on April 21, 2018. I will be talking in a morning session on the following topic. Topic:…