I have been recently devoting a-lot of time answering questions on STACKOVERFLOW.
Many times, you are just helping and/or teaching younger SQL Server developers and/or administrators the tricks of the trade. Sometimes you come across a GEM, a question that you can blog about so that when they are asked again, you have a well constructed answer.
One classic questions asked by new database developers is whether or not the user input is a number?
I thought it was an excellent opportunity to review what tools are available given a particular version of the database engine.
If I was new to SQL Server, my first thought would be to use the ISNUMERIC() function to determine if the text is really a number. This function has been part of the core engine since version 2005.
The TSQL snippet below calls this function with valid and invalid data.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
/* Sample calls to isnumeric() */ -- Test our old function select 1 as test_no, isnumeric('1.0') as is_numeric union all select 2 as test_no, isnumeric('.5') as is_numeric union all select 3 as test_no, isnumeric('1.') as is_numeric union all select 4 as test_no, isnumeric(NULL) as is_numeric union all select 5 as test_no, isnumeric('.') as is_numeric union all select 6 as test_no, isnumeric('2.1.0') as is_numeric go |
It is interesting to note that a period is considered a number. Please see screen shot below.
In version 2012, a bunch of new routines allow you to TRY a conversion. Check out TRY_CAST(), TRY_CONVERT() and TRY_PARSE() from books on-line. Instead of returning a error during a failed conversion, they return a NULL value. Let’s see how these routines convert a string that could be considered a decimal into a integer.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
/* Sample calls to try functions */ -- returns null declare @a varchar(4) = '1.0'; select try_cast(@a as int); -- returns null declare @b varchar(4) = '1.0'; select try_convert(int, @b); -- returns 1 declare @c varchar(4) = '1.0'; select try_parse(@c as int); |
Two of the three functions return NULL which is what we do not want. All three functions fail to ignore a period in the input. On the other hand, all three functions work if the target data type is float. We could always convert to float; then round or truncate to an integer?
However, I want a better solution. Let’s write our own function!
Here are the business rules.
- detect non-numeric data, return 0
- detect multiple periods, return 0
- whole number data, return 1
- decimal data, return -1
Let’s think about solving each business case separately.
To detect non-numeric data, we can use the LIKE function with a regular expression. The TSQL snippet below solves our first problem.
1 2 3 4 5 6 7 8 9 10 |
/* Testing for non-numeric data */ declare @d varchar(8) = '1a.0b'; select case when @d like '%[^0-9.]%' then 'non-numeric' else 'numeric' end as test; |
To detect multiple periods, we have to traverse the string looking for them. There has to be a better way?
Jeff Moden made the concept of a tally table popular. Here is my own version that is used to split the string at the single character level. The results are a table with index position and character value.
While this is not the fastest function, it is good enough for small strings and very tiny in code length. See article by Aaron Bertrand on a comparison of string splitting functions for performance.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
/* usp_split_string() function */ -- Use msdb use msdb; go -- Drop existing function if object_id('usp_split_string') > 0 drop function dbo.usp_split_string go -- Create new function create function dbo.usp_split_string(@value varchar(4000)) returns table as return ( with tally as ( select row_number() over (order by (select 1)) as n from sys.all_columns x cross apply sys.all_columns y ) select top 100 n as idx, substring(@value, n, 1) as val from tally where n <= len(@value) order by n ); go -- Sample call select * from dbo.usp_split_string('This is my simple string'); go |
Here are the results of a sample call.
The next step is to create a function using the split string function that counts the occurrence of any given character in a string.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
/* usp_count_chars() function */ -- Use msdb use msdb; go -- Drop existing function if object_id('usp_count_chars') > 0 drop function dbo.usp_count_chars go -- Create new function create function dbo.usp_count_chars(@value varchar(4000), @char varchar(1)) returns table as return ( select count(val) as cnt from dbo.usp_split_string (@value) where val = @char ); go -- Sample call select * from dbo.usp_count_chars('This is my simple string', 'i'); go |
The sample call correctly detects 4 i’s in the phrase.
You might have not noticed, but I am creating In-line table valued functions. They are faster than regular functions. See Wayne Sheffield’s blog article that attests to this fact.
I am also installing these functions in the [msdb] database so that you can re-use them in your programming adventures.
Last but not least, we need to pull it all together to write are improved function. We will use a case statement so that we can handle issues like null values, multiple periods or just a period. Please note, I choose a scalar function so that the new function can be used like the old one, ISNUMERIC().
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
/* usp_is_numeric() function */ -- Drop existing function if object_id('usp_is_numeric') > 0 drop function dbo.usp_is_numeric go -- Create new function create function dbo.usp_is_numeric(@value varchar(4000)) returns int as begin return ( select case when @value is null then 0 when rtrim(ltrim(@value)) = '.' then 0 when @value like '%[^0-9.]%' then 0 when period.cnt > 1 then 0 when period.cnt = 1 then -1 else 1 end from dbo.usp_count_chars(@value, '.') as period ) end; go -- Test our new function select 1 as test_no, msdb.dbo.usp_is_numeric('1') as is_numeric union all select 2 as test_no, msdb.dbo.usp_is_numeric('.5') as is_numeric union all select 3 as test_no, msdb.dbo.usp_is_numeric('1.') as is_numeric union all select 4 as test_no, msdb.dbo.usp_is_numeric(NULL) as is_numeric union all select 5 as test_no, msdb.dbo.usp_is_numeric('.') as is_numeric union all select 6 as test_no, msdb.dbo.usp_is_numeric('2.1.0') as is_numeric go |
The output shows that the new function ignores bad data, detects numeric values and supports decimal values. Given this output, you can correctly make decisions on how to convert the string to a variable.
In summary, this was a great opportunity to explore tally tables, split string functions, inline table value functions and regular scalar functions. Three new functions have been created that you can use to solve all sorts of programming tasks.
I hope you enjoyed this article as much as I did writing it.
You could rephrase business rule 2, into “if #periods >= 2 return 0”
Which translates into the following:
sign(charindex(‘.’,yoursearchstring)) + sign(coalesce(charindex(‘.’,substr(yoursearchstring,charindex(‘.’,yoursearchstring) + 1, len(yoursearchstring))).0))
sign(charindex(‘.’,yoursearchstring)) –> first period
sign(coalesce(charindex(‘.’,substr(yoursearchstring,charindex(‘.’,yoursearchstring) + 1, len(yoursearchstring))).0)) –> second period
(I originally was looking for info on PowerBI, but thought your posts quite good/interesting)
There is a typo:
sign(coalesce(charindex(‘.’,substr(yoursearchstring,charindex(‘.’,yoursearchstring) + 1, len(yoursearchstring))),0)).