You can control the sort order of SQL query results using the ORDER BY clause. When
sorting on a numeric column, the resulting order typically makes intuitive sense, but when
sorting on a string column, you might be surprised by the resulting order. This is especially
true when the strings include numbers, or a mix of numbers and letters or other characters
within a value.
Unfortunately, there isn't a simple explanation to tell you how SQL will sort your results,
because it depends on what collation you are using.
Collations have different options associated with them, and many can be customized
depending on the system you are using. For English, case sensitivity is a major one to
consider—should "A" and "a" be considered the same character for the purposes of
ordering? Others include accent sensitivity (for example, should "a" and "á" be considered
the same), Kana sensitivity (which distinguishes between the two types of Japanese
characters), and script order (for example, which should be ordered first: Hebrew, Greek,
or Cyrillic). See "Customization".
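To make the case-sensitivity option concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite and its built-in BINARY and NOCASE collations, which may not match the collations available on your system; the table name words and its contents are hypothetical.

```python
import sqlite3

# In-memory SQLite database; the table and its rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (val TEXT)")
conn.executemany(
    "INSERT INTO words (val) VALUES (?)",
    [("banana",), ("Apple",), ("apple",), ("Banana",), ("cherry",)],
)

# BINARY compares raw bytes, so every uppercase ASCII letter sorts
# before every lowercase one.
binary_order = [
    row[0]
    for row in conn.execute("SELECT val FROM words ORDER BY val COLLATE BINARY")
]

# NOCASE treats ASCII 'A' and 'a' as equal. The second sort term only
# breaks ties, so the result is deterministic.
nocase_order = [
    row[0]
    for row in conn.execute(
        "SELECT val FROM words ORDER BY val COLLATE NOCASE, val COLLATE BINARY"
    )
]

print(binary_order)
print(nocase_order)
```

The same rows come back in two different orders purely because of the collation named in the ORDER BY clause.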
When using Unicode, an industry standard that assigns a number to each character or
symbol, SQL will most likely follow the Unicode ordering to decide the order of two
characters, while taking customizations into account. Non-Unicode data may have a
different order:
When you use a SQL collation you might see different results for comparisons of the same
characters, depending on the underlying data type. For example, if you are using the SQL
collation "SQL_Latin1_General_CP1_CI_AS", the non-Unicode string 'a-c' is less than the
string 'ab' because the hyphen ("-") is sorted as a separate character that comes before "b".
However, if you convert these strings to Unicode and you perform the same comparison,
the Unicode string N'a-c' is considered to be greater than N'ab' because the Unicode sorting
rules use a "word sort" that ignores the hyphen. (4)
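The two outcomes in the quoted passage can be imitated in a few lines of Python. This is only a rough emulation, not SQL Server's actual sorting rules: the first sort compares raw code points, so the hyphen counts as a character, and the second uses a hypothetical word_sort_key helper that simply drops hyphens before comparing, as a stand-in for a "word sort".

```python
def word_sort_key(text):
    """Hypothetical stand-in for a "word sort": ignore hyphens entirely."""
    return text.replace("-", "")

values = ["ab", "a-c"]

# Code-point comparison: '-' (code point 45) comes before 'b' (98),
# so 'a-c' sorts ahead of 'ab'.
codepoint_order = sorted(values)

# Hyphen ignored: the keys are 'ab' and 'ac', so 'ab' sorts first.
word_order = sorted(values, key=word_sort_key)

print(codepoint_order)
print(word_order)
```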
When it comes to numbers represented within strings, you must remember that string
sorting is done on a character-by-character basis. For example:
'42' < '71': This compares only the first characters, '4' < '7'. The order is now
established, and any remaining characters can be ignored.
'42' < '45': The first characters are the same, '4' = '4', so the sort then compares the
next characters, '2' < '5'. So '42' < '45'.
'42' < '7': Although numerically 42 > 7, the sort compares the first characters, '4' and
'7'. Since '4' < '7', the order is established and any remaining characters are ignored.
For this string sort, '42' < '7'.
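The comparisons above can be reproduced with a short sketch. The compare_strings function is a hypothetical helper that spells out the character-by-character rule using code points, and the query runs the same values through SQLite's default BINARY collation via Python's sqlite3 module. Other databases and collations may order some characters differently.

```python
import sqlite3

def compare_strings(left, right):
    """Return -1, 0, or 1, comparing one character at a time by code point."""
    for l_char, r_char in zip(left, right):
        if l_char != r_char:
            # The first differing position decides; the rest is ignored.
            return -1 if ord(l_char) < ord(r_char) else 1
    # Every shared position matched, so the shorter string sorts first.
    return (len(left) > len(right)) - (len(left) < len(right))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (as_text TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO nums (as_text) VALUES (?)",
    [("42",), ("71",), ("45",), ("7",)],
)

# The column holds text, so ORDER BY uses string rules, not numeric ones.
text_order = [
    row[0] for row in conn.execute("SELECT as_text FROM nums ORDER BY as_text")
]

print(compare_strings("42", "7"))
print(text_order)
```

Note that '7' lands between '45' and '71' in the sorted output, which is exactly the result that surprises people who expect numeric order.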
You can sometimes find ways to customize the sort, when necessary. For example, "Use
SQL Server to Sort Alphanumeric Values" (5) provides a method, usable with Microsoft SQL
Server, to sort values with a mixture of letters and numerals that would consider
'7' < '42'.
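As an illustration of what such a customization can look like, here is a sketch that registers a custom collation with SQLite through Python's sqlite3 module. It is not the method from the cited article, which targets SQL Server. The collation name natsort, the natural_key and natural_compare helpers, and the table are all hypothetical, and the rule that digit runs sort before letters is a choice made for this example.

```python
import re
import sqlite3

def natural_key(text):
    """Split text into digit runs (compared as integers) and other runs."""
    key = []
    # With a capturing group, re.split puts the digit runs at odd indexes.
    for index, part in enumerate(re.split(r"([0-9]+)", text)):
        if part == "":
            continue
        if index % 2 == 1:
            key.append((0, int(part), ""))  # numeric chunk
        else:
            key.append((1, 0, part))        # text chunk
    return key

def natural_compare(left, right):
    """Collation callback: negative, zero, or positive, like strcmp."""
    left_key, right_key = natural_key(left), natural_key(right)
    return (left_key > right_key) - (left_key < right_key)

conn = sqlite3.connect(":memory:")
conn.create_collation("natsort", natural_compare)
conn.execute("CREATE TABLE items (val TEXT)")
conn.executemany(
    "INSERT INTO items (val) VALUES (?)",
    [("42",), ("7",), ("a10",), ("a9",), ("71",)],
)

default_order = [r[0] for r in conn.execute("SELECT val FROM items ORDER BY val")]
natural_order = [
    r[0] for r in conn.execute("SELECT val FROM items ORDER BY val COLLATE natsort")
]

print(default_order)
print(natural_order)
```

With the custom collation, '7' sorts before '42' and 'a9' before 'a10', while the default order keeps the character-by-character result.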
Spaces, especially leading spaces, often cause confusion as well. The space character is
typically considered to come before any number or letter, and some punctuation as well.
Again, sorting is done character by character. For example:
'no one' < 'nobody': The first characters are equivalent, 'n' = 'n', so the sort moves to
the second characters. These are also equivalent, 'o' = 'o', so the sort moves to the third
characters. These are ' ' and 'b', and ' ' < 'b', so 'no one' < 'nobody'.
' start' < 'begin': Notice that the first character in the string on the left is a space.
While 'begin' < 'start' because 'b' < 's', these strings sort as ' start' < 'begin'
because ' ' < 'b'.
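The effect of spaces can be checked the same way. This sketch again assumes SQLite's default BINARY collation, accessed through Python's sqlite3 module, with a hypothetical phrases table. Under that collation the space character (code point 32) sorts before every letter and digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phrases (val TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO phrases (val) VALUES (?)",
    [("nobody",), ("start",), ("no one",), ("begin",), (" start",)],
)

# The leading space puts ' start' ahead of 'begin', and the embedded
# space puts 'no one' ahead of 'nobody'.
space_order = [
    row[0] for row in conn.execute("SELECT val FROM phrases ORDER BY val")
]

# Code points behind the comparison: space, then 'b', then 's'.
code_points = [ord(" "), ord("b"), ord("s")]

print(space_order)
print(code_points)
```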