Import & Export Data – Part 4

The BCP command line utility is the Cadillac of ETL programs for text based data files. It is REALLY FAST. It can perform data imports, data exports and generate format files. What it lacks in the transformation step can be crafted with custom TSQL.

Today, I am going to concentrate on how BCP can be used to export data to text files. The format of the output text files can be either character, native, or unicode. A quick demonstration of how to create character and xml format files will also be covered during the discussion.

I will be using the Adventure Works Sample database supplied by Microsoft during this talk. The xp_cmdshell extended stored procedure will be used execute BCP jobs from a query window inside of SQL Server Management Studio (SSMS).

There are many arguements that can be used to change how the statement executes. Here are some used that are used in the examples below.

BCP SWITCH ACTION
-T use a trusted connection.
-U supply user name for standard security.
-P supply user password for standard security.
-c character based output.
-w unicode based output.
-n native code based output.
-q use quoted identifiers during query execution.
-f output location for format file.
-x xml based format file output.

It is very important to select the correct version of the BCP utility. I ran into issues when creating the examples because I had both SQL Server 2000 and SQL Server 2008 R2 client tools. Therefore, the examples below use a XP DOS commands to change to the correct binn directory before executing the BCP utility. The process wait command, &, is used to make the programs are executed in a serial manor in the DOS command prompt.

The first example creates a comma seperated value character based file called DEPTLIST1.CSV. It is important to note that the out command is used. It takes the name of a table or view as input. The next three examples use the Department table as the data source.

The second example creates a comma seperated value, unicode based file called DEPTLIST2.TXT. It is important to note that standard security is used in which the user name and user password are passed to the BCP utility.

The third example creates a comma seperated value, native code based file called DEPTLIST3.TXT.

The fourth example creates a pipe delimited character based file called PEOPLE.TXT. It is important to note that the queryout command is used. It takes any valid query as input. A hexidecimal value is supplied for the field delimiter. Using the | in a CMD shell is not wise since it is a reserved process control character. I suggest using the -q option if the BCP utility complains about quotes.

The fifth example creates a character based format file called CURRENCY.FMT. It is important to note that the format command is used. The contentts of the format file is based on the Currency table. This file can be used as is to import data.

The sixth example creates a XML based format file called CURRENCY.XML. It is important to note that the format command is used with the -x switch. The contents of the format file is based on the Currency table which can be modified to perform advance processing.

Exporting data by using the BCP command line utility is the best choice for moving large amounts of data. There are many options (switches) that can be specified that change the behavior of the program. Target files can be either a character, native, or unicode format.

A format file can be created to skip columns or define additional mappings. The BCP utility is a great choice for creating ETL processes as long as the input files or export tables are relatively error free. Think about using staging tables to validate the incoming data before adding to your warehouse. This will save you time in the long run.

Next time, I am going to start to talk about distributed queries which opens the door to new database and file formats that can be selected as import targets or export destinations.

Related posts

Leave a Comment