Basic Training – Data Types – Part 6


I am continuing my series of talks on fundamental topics like data types. I am proud to be a United States Army Reservist (USAR) Veteran.

Just like boot camp I went to so long ago, I am going nick name the series BASIC TRAINING.

The most basic unit of any database is a TABLE which consists of COLUMNS. A vital decision during the initial database design is to choose the data types that will capture the information you want in the least amount of space.

Today, I am exploring binary strings by extending the sample database named [BASIC] that contains a schema named [TRAINING] and table named [BINARY_STRINGS].

Books online describes three data types that are categorized as binary. I created a field for each type in our new table as well as a surrogate key.

Many people use binary fields to store documents such as PDF files. Due to the size of these files, I am going to make adjustments to increase the data and log files associated with the [BASIC] database.

We should test the newly created table by loading some binary data.

To make this test more complex, I found a E-book version of Arthur Conan Doyle’s Sherlock Holmes adventures.

To complete this example, what would a database be if you could not extract the PDF document when it was needed? We will first need to enable to command shell extended stored procedure using the code below.

The only utility that can export binary objects is the bcp.exe program. The TSQL code below will create and execute a command shell script to extract the document.

The screen shot below shows the execution of the BCP program exporting the binary document to disk.

As a database designer, you should always question the components that make up your database.

One question that you might have is ‘What is the maximum number of bytes that a row can have?’. This is important because data is stored in the *.MDF or *.NDF files as pages ~ 8k. Since a page can only save 8060 bytes, you can figure out how many records can fit on a page and how many bytes are wasted space.

The following code uses the sys.columns table to count the number of fields and calculate the maximum row size.

We can see that 5 columns in the table have a maximum record length of 28 bytes. This information is misleading since it does not account for the out of row data. It only counts the first three columns. The [sp_spaceused] stored procedure shows us that 1 index page and many data pages have been allocated for the table.

We can query the dynamic management view named index physical stats to the an accurate number of data pages in the heap. A heap is a table without an index. In our example, there are 1297 pages allocated for the documents.

Last but not least, the sp_help stored procedure displays the details of the table. This includes many different settings that can be choosen as a DDL designer such as computed column, field length, nullabilty, and collation to mention a few.

In summary, binary strings can be used to save bit patterns from your application specific integrated circuit or your favorite music file (MP3) in the database. Next time, I will be going over types classified as other.

Related posts

Leave a Comment