Problem Statement
How do DISTINCT and GROUP BY differ when removing duplicates, and when would you prefer one over the other?
Explanation
DISTINCT removes duplicate rows from the result set, comparing all columns in the SELECT list together. It is concise when you only need unique combinations of columns and no aggregates.
GROUP BY collapses rows that share the same key values into one row per group and lets you compute aggregates in the same step. For plain deduplication on the same columns, the two return the same rows. Prefer GROUP BY when you also need counts or sums per unique combination. Prefer DISTINCT when you only need uniqueness and no metrics.
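As a minimal sketch of the distinction, the queries below run against a small hypothetical logins table. The device column and the sample rows are assumptions added for illustration and are not part of the original question. The syntax is plain SQL of the kind accepted by PostgreSQL and SQLite; other engines may need small adjustments.

```sql
-- Hypothetical sample data; the device column is an assumption for illustration.
CREATE TABLE logins (user_id INTEGER, device TEXT);
INSERT INTO logins (user_id, device) VALUES
  (1, 'web'), (1, 'web'), (1, 'mobile'), (2, 'web');

-- Uniqueness only: one row per (user_id, device) combination.
SELECT DISTINCT user_id, device FROM logins;

-- The same unique combinations, written with GROUP BY and no aggregates.
SELECT user_id, device FROM logins GROUP BY user_id, device;

-- Uniqueness plus a metric: GROUP BY computes the per-group count in the same step.
SELECT user_id, device, COUNT(*) AS login_count
FROM logins
GROUP BY user_id, device;
```

With this data, the first two queries each return three rows, since the repeated (1, 'web') pair collapses to one. The third returns the same three combinations with counts of 2, 1, and 1. Row order is not guaranteed without an ORDER BY.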
Code Solution
SELECT DISTINCT user_id FROM logins; -- uniqueness only
SELECT user_id, COUNT(*) AS login_count FROM logins GROUP BY user_id; -- uniqueness plus a per-user count