A co-worker and I were chatting about a code problem he was having that was likely due to duplicate entries in a database table. He thought that records in the table were unique based on two columns, but that didn’t seem to agree with what was happening in code. He wanted to write a SQL query to identify any duplicates, but he didn’t know how to do it. He doesn’t write a lot of SQL and wasn’t comfortable with it. He uses LINQ everyday, though, so I suggested he do it through one of my favorite tools: LINQPad.
Regardless of whether you’re doing it in SQL or LINQPad, the approach is the same: group by the fields and filter to show only groups with more than one item.
So, let’s do that in LINQPad. Here are the quick-steps to get you caught up to the point of writing your query:
- Open LINQPad
- Configure connection
- Configure query to use the connection
- Write query
There are two things to keep in mind when writing queries in LINQPad. First, the table names will be pluralized. “SomeObject” becomes “SomeObjects.” Second, field names will be case-sensitive and always start with a capital letter. Now let’s get to business…
SomeObjects.GroupBy(x => x.FirstColumn + "|" + x.SecondColumn) .Where(x => x.Count() > 1)
Boom! That’s all there is to it. I concatenate the columns to create a group key, and filter the results to only show groups with more than one item.