Megan Taylor

front-end dev, volunteacher, news & data junkie, bibliophile, Flyers fan, sci-fi geek and kitteh servant

Intro to Databases: SQL Aggregation

Aggregation functions in SELECT clause over values in multiple rows: min, max, sum, avg, count

new clauses: Group By (partition relations into groups), Having (test filters on results of aggregate values)

Demo: simple college admissions database
College (cName, state, enrollment)
Student (sID, sName, GPA, sizeHS)
Apply (sID, cName, major, decision)

compute avg GPA of students in database
select avg(GPA)
from Student

find lowest GPA of students applying to CS
select min(GPA)
from Student, Apply
where Student.sID = Apply.sID and major = ‘CS’;

find average GPA of students applying to CS
select avg(GPA)
from Student
where sID in (select sID from Apply where major = ‘CS’);

count the number of colleges with enrollment greater than 15000
select count(*)
from College
where enrollment > 15000;

count number of students applying to cornell
select count(distinct sID)
from Apply
where cName = ‘Cornell’;

students such that number of other students with same GPA is equal to number of other students with same sizeHS
select *
from Student S1
where (select count(*) from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA) =
(select count(*) from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS);

amount by which the average GPA of students applying to CS exceeds the average GPA of students not applying to CS
select CS.avgGPA – NonCS.avgGPA
from (select avg(GPA) as avgGPA from Student
where sID in (
select sID from Apply where major = ‘CS’)) as CS,
(select avg(GPA) as avgGPA from Student
where sID not in (
select sID from Apply where major = ‘CS’)) as nonCS;

same query with subqueries in select clause
select distinct (select avg(GPA) as avgGPA from Student
where sID in (
select sID from Apply where major = ‘CS’)) –
(select avg(GPA) as avgGPA from Student
where sID not in (
select sID from Apply where major = ‘CS’)) as d
from Student;

Number of applications to each college
select cName, count(*)
from Apply
group by cName;

find total enrollment of students for each state
select state, sum(enrollment)
from College
group by state;

for each college and major combination, find min and max GPAs for students applying
select cName, major, min(GPA), max(GPA)
from Student, Apply
where Student.sID = Apply.sID
group by cName, major;

find the difference between min and max GPAs of students applying for each college and major combination
select mx-mn
from (select cName, major, min(GPA) as mn, max(GPA) as mx
from Student, Apply
where Student.sID = Apply.sID
group by cName, major) M;

find max spread between GPAs of students applying for each college and major
select max(mx-mn)
from (select cName, major, min(GPA) as mn, max(GPA) as mx
from Student, Apply
where Student.sID = Apply.sID
group by cName, major) M;

number of colleges applied to by each student
select Student.sID, count(distinct cName)
from Student, Apply
where Student.sID = Apply.sID
group by Student.sID
union
select sID, 0
from Student
where sID not in (select sID from Apply);

colleges with fewer than 5 applications
select cName
from Apply
group by cName
having count(*) < 5; same query without having or group by clauses select distinct cName from Apply A1 where 5 > (select count(*) from Apply A2 where A2.cName = A1.cName);

Every query with a group by or having clause can be written without them.

colleges with fewer than 5 distinct students applying
select cName
from Apply
group by cName
having count(distinct sID) < 5; all majors represented where max GPA of students applying for that major is lower than avg GPA of all students select major from Student, Apply where Student.sID = Apply.sID group by major having max(GPA) < (select avg(GPA) from Student);

November 5, 2011 | Comments Off on Intro to Databases: SQL Aggregation | Categories: Posts | Permalink

Comments are closed.

%d bloggers like this: