Remove duplicate records in two or more columns

If we wanna have a primary key or unique index on two or more columns, using CTE and DERIVED TABLE with ROW_NUMBER is very useful to remove dup records. In fact that’s CTE usage is because we cannot use RANKING FUNCTIONS in WHERE clause.

For select dup records, we use CTE like this:

-- This is just an example
-- Don't worry about sample table structure

;with  T ( RN, GroupId, UserName )
       as ( select ROW_NUMBER() over ( partition by GroupId, UserName order by GroupId ) as RN
                 , GroupId
                 , UserName
                 from Tbl )
   select GroupId
        , UserName
        from T
        where T.RN > 1 ;

For delete those dup records, just change CTE to a DERIVED TABLE like this:

delete T
     from ( select ROW_NUMBER() over ( partition by GroupId, UserName order by GroupId ) as RN
                 , GroupId
                 , UserName
                 from Tbl ) T
     where T.RN > 1 ;

Advertisements