Please see my other Database Development articles.
Ranking Functions
There are several impressive advancements in the latest version of T-SQL; this article will focus on “Ranking” functions.
Prior to the release of SQL Server 2005 with its T-SQL enhancements, working with blocks of related data was clunky at best. T-SQL authors mostly relied on “GROUP BY” statements and then had to perform “CURSOR” and/or “ORDER BY” acrobatics in order to return contiguous blocks of data in any meaningful manner.
Ranking functions overcome the previous limitations on working with subsets of data within implicit groups.
In other words, SQL Server developers are now able, in essence, to perform “GROUP BY” operations within existing groups.
We do this by choosing the ranking function that fits the desired result, optionally dividing the rows into groups with “PARTITION BY,” and adding the familiar “ORDER BY” finishing touches.
To begin these examples, first I’ll show you the source data.
It’s simply a series of records reflecting ordinary data about people, such as would be common to most databases.
For all of our examples, we will be working with the following fields: FirstName, Age, and Gender.
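As a stand-in for the source data, here is a minimal sketch of a table variable holding the three fields. The table name “@People,” the column types, and every row value are assumptions invented for illustration; they are not the article’s original data.

```sql
-- Hypothetical sample data: names, ages, and genders are made up.
DECLARE @People TABLE (
    FirstName VARCHAR(50) NOT NULL,
    Age       INT         NOT NULL,
    Gender    CHAR(1)     NOT NULL
);

-- INSERT ... SELECT ... UNION ALL works on SQL Server 2005,
-- which does not support multi-row VALUES lists.
INSERT INTO @People (FirstName, Age, Gender)
SELECT 'Alice', 25, 'F' UNION ALL
SELECT 'Beth',  25, 'F' UNION ALL
SELECT 'Carol', 30, 'F' UNION ALL
SELECT 'Dave',  25, 'M' UNION ALL
SELECT 'Ed',    30, 'M' UNION ALL
SELECT 'Frank', 30, 'M' UNION ALL
SELECT 'Gary',  41, 'M';

-- Show the raw rows before any ranking is applied.
SELECT FirstName, Age, Gender
FROM @People
ORDER BY Gender, Age, FirstName;
```

Several people deliberately share the same age so that ties show up once a ranking function is applied.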
RANK()
This function is similar to “ROW_NUMBER()” except that, instead of simply numbering records sequentially, it assigns the sequence according to a field’s data, in this case “Age.”
In the very first example, you saw the first three rows numbered “1, 2, 3,” according to the sequence of the rows.
However, that told me nothing of the “Age” groupings naturally occurring within the result set.
In this example, records sharing the same “Age” receive the same number, and the sequence only changes when a new value is encountered. Therefore, I get a more meaningful sequential order when I instruct the ranking function to “RANK()” “OVER,” that is, “Using,” the “Age” column.
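The ranking by “Age” described above might be sketched as follows. The inline “People” data and the “AgeRank” alias are hypothetical, chosen only so the query runs on its own.

```sql
-- Hypothetical sample data supplied through a CTE (SQL Server 2005+).
WITH People (FirstName, Age, Gender) AS (
    SELECT 'Alice', 25, 'F' UNION ALL
    SELECT 'Beth',  25, 'F' UNION ALL
    SELECT 'Carol', 30, 'F' UNION ALL
    SELECT 'Dave',  25, 'M' UNION ALL
    SELECT 'Ed',    30, 'M' UNION ALL
    SELECT 'Frank', 30, 'M' UNION ALL
    SELECT 'Gary',  41, 'M'
)
SELECT FirstName,
       Age,
       Gender,
       -- Rows with the same Age share a rank; the next distinct Age
       -- gets a rank equal to its position in the ordered set.
       RANK() OVER (ORDER BY Age) AS AgeRank
FROM People
ORDER BY Age, FirstName;
```

With this sample data, the three 25-year-olds all receive 1, the three 30-year-olds all receive 4, and the 41-year-old receives 7.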
RANK() with PARTITION
In my last example, you may have noticed I lost the partitioning that follows the natural grouping by each person’s “Gender.”
This can easily be reapplied by adding a “PARTITION BY” clause inside “OVER.”
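A minimal sketch of restoring the “Gender” grouping, again using hypothetical inline data and a made-up “AgeRankInGender” alias:

```sql
-- Hypothetical sample data supplied through a CTE (SQL Server 2005+).
WITH People (FirstName, Age, Gender) AS (
    SELECT 'Alice', 25, 'F' UNION ALL
    SELECT 'Beth',  25, 'F' UNION ALL
    SELECT 'Carol', 30, 'F' UNION ALL
    SELECT 'Dave',  25, 'M' UNION ALL
    SELECT 'Ed',    30, 'M' UNION ALL
    SELECT 'Frank', 30, 'M' UNION ALL
    SELECT 'Gary',  41, 'M'
)
SELECT FirstName,
       Age,
       Gender,
       -- PARTITION BY restarts the ranking for each Gender;
       -- ORDER BY Age decides the rank within that partition.
       RANK() OVER (PARTITION BY Gender ORDER BY Age) AS AgeRankInGender
FROM People
ORDER BY Gender, Age, FirstName;
```

With this sample data, the “F” rows are ranked 1, 1, 3 and the “M” rows are ranked 1, 2, 2, 4, so the numbering starts over for each gender.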
DENSE_RANK() with PARTITION
You may have noticed that in the two previous examples, the sequence occasionally skips a number.
These gaps occur because “RANK()” gives every record sharing the same value the same number and then skips ahead by the number of tied records.
To avoid breaks in our sequence, we’ll simply use “DENSE_RANK(),” which instructs the result set sequence to be more “dense” in its numbering.
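To compare the two behaviors side by side, this sketch puts “RANK()” and “DENSE_RANK()” in the same query over the same partitions. The inline data and both column aliases are hypothetical.

```sql
-- Hypothetical sample data supplied through a CTE (SQL Server 2005+).
WITH People (FirstName, Age, Gender) AS (
    SELECT 'Alice', 25, 'F' UNION ALL
    SELECT 'Beth',  25, 'F' UNION ALL
    SELECT 'Carol', 30, 'F' UNION ALL
    SELECT 'Dave',  25, 'M' UNION ALL
    SELECT 'Ed',    30, 'M' UNION ALL
    SELECT 'Frank', 30, 'M' UNION ALL
    SELECT 'Gary',  41, 'M'
)
SELECT FirstName,
       Age,
       Gender,
       -- RANK leaves gaps after ties; DENSE_RANK does not.
       RANK()       OVER (PARTITION BY Gender ORDER BY Age) AS AgeRank,
       DENSE_RANK() OVER (PARTITION BY Gender ORDER BY Age) AS AgeDenseRank
FROM People
ORDER BY Gender, Age, FirstName;
```

With this sample data, the “M” rows show 1, 2, 2, 4 under “RANK()” but 1, 2, 2, 3 under “DENSE_RANK(),” and the “F” rows show 1, 1, 3 versus 1, 1, 2.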