The result of an arithmetic expression (involving, for example +, −, ∗,or / ) is null if any of the input values is null.
For example, if a query has an expression r.A + 5, and r.A is null for a particular tuple, then the expression result must also be null for that tuple.
SQL therefore treats as unknown the result of any comparison involving a null value. This creates a third logical value in addition to true and false. if r.A is null, then "1 < r.A" as well as "not (1 < r.A)" evaluate to unknown.
Since the predicate in a where clause can involve Boolean operations such as and, or, and not on the results of comparisons, the definitions of the Boolean operations are extended to deal with the value unknown.
and: The result of true and unknown is unknown, false and unknown is false, while unknown and unknown is unknown.
or: The result of true or unknown is true, false or unknown is unknown, while unknown or unknown is unknown.
not: The result of not unknown is unknown.
If the where clause predicate evaluates to either false or unknown for a tuple, that tuple is not added to the result.
When a query uses the select distinct clause, duplicate tuples must be eliminated.
For this purpose, when comparing values of corresponding attributes from two tuples, the values are treated as identical if either both are non-null and equal in value, or both are null.
Thus two copies of a tuple, such as {(’A’,null), (’A’,null)}, are treated as being identical, even if some of the attributes have a null value.
Using the distinct clause then retains only one copy of such identical tuples.
Note that the treatment of null above is different from the way nulls are treated in predicates, where a comparison “null=null” would return unknown, rather than true.
The approach of treating tuples as identical if they have the same values for all attributes, even if some of the values are null, is also used for the set operations union, intersection and except.
Q . To find all instructors who appear in the instructor relation with null values for salary
select name
from instructor
where salary is null;
Aggregate functions are functions that take a collection (a set or multiset) of values as input and return a single value.
SQL offers five built-in aggregate functions:
Average: avg
Minimum: min
Maximum: max
Total : sum
Count: count
The input to sum and avg must be a collection of numbers, but the other operators can operate on collections of non-numeric data types, such as strings, as well.
Q . Find the average salary of instructors in the Computer Science department
select avg (salary)
from instructor
where deptname = 'Comp. Sci.';
We can give a meaningful name to the result relation attribute that is generated by aggregation.
select avg (salary) as avg_salary
from instructor
where deptname = 'Comp. Sci.';
Q . Find the total number of instructors who teach a course in the Spring 2010 semester.
teaches(ID, course_id, sec_id, semester, year)
select count (distinct ID)
from teaches
where semester = ’Spring’ and year = 2010;
Q . To find the number of tuples in the course relation
select count (*)
from course;
SQL does not allow the use of distinct with count (*). It is legal to use distinct with max and min, even though the result does not change. We can use the keyword all in place of distinct to specify duplicate retention, but, since all is the default, there is no need to do so.
To apply the aggregate function not only to a single set of tuples, but also to a group of sets of tuples; we use the group by clause.
The attribute or attributes given in the group by clause are used to form groups.
Tuples with the same value on all attributes in the group by clause are placed in one group.
When an SQL query uses grouping, it is important to ensure that the only attributes that appear in the select statement without being aggregated are those that are present in the group by clause.
Thus, any attribute that is not present in the group by clause must appear only inside an aggregate function if it appears in the select clause, otherwise the query is treated as erroneous.
Q . Find the average salary in each department.
instructor(ID, name, dept_name, salary)
select deptname, avg (salary) as avg salary
from instructor
group by deptname;
Q . Find the average salary of all instructors
instructor(ID, name, dept_name, salary)
select avg (salary)
from instructor;
Q . Find the number of instructors in each department who teach a course in the Spring 2010 semester.
instructor(ID, name, dept_name, salary)
teaches(ID, course_id, sec_id, semester, year)
select deptname, count (distinct ID) as instr_count
from instructor natural join teaches
where semester = ’Spring’ and year = 2010
group by deptname;
To state a condition that applies to groups rather than to tuples we use the having clause of SQL.
SQL applies predicates in the having clause after groups have been formed, so aggregate functions may be used.
Any attribute that is present in the having clause without being aggregated must appear in the group by clause, otherwise the query is treated as erroneous.
Q . Find only those departments where the average salary of the instructors is more than $42,000
instructor(ID, name, dept_name, salary)
select deptname, avg (salary) as avg salary
from instructor
group by deptname
having avg (salary) > 42000;
The from clause is first evaluated to get a relation (This is true even for non aggregation based queries).
If a where clause is present, the predicate in the where clause is applied on the result relation of the from clause.
Tuples satisfying the where predicate are then placed into groups by the group by clause if it is present. If the group by clause is absent, the entire set of tuples satisfying the where predicate is treated as being in one group
The having clause, if it is present, is applied to each group; the groups that do not satisfy the having clause predicate are removed
The select clause uses the remaining groups to generate tuples of the result of the query, applying the aggregate functions to get a single result tuple for each group.
Q . For each course section offered in 2009, find the average total credits (tot_cred) of all students enrolled in the section, if the section had at least 2 students.
student (ID, name, dept_name, tot_cred)
takes (ID, course_id, sec_id, semester, year, grade)
select course_id, semester, year, sec_id, avg (tot cred)
from takes natural join student
where year = 2009
group by course_id, semester, year, sec_id
having count (ID) > 2;
In general, aggregate functions treat nulls according to the following rule:
All aggregate functions except count (*) ignore null values in their input collection.
As a result of null values being ignored, the collection of values may be empty.
The count of an empty collection is defined to be 0, and all other aggregate operations return a value of null when applied on an empty collection.
SQL provides a mechanism for nesting subqueries. A subquery is a select-from-where expression that is nested within another query.
A common use of subqueries is to perform tests for :
set membership
make set comparisons
determine set cardinality
Set Membership
SQL allows testing tuples for membership in a relation.
The in connective tests for set membership, where the set is a collection of values produced by a select clause. The not in connective tests for the absence of set membership.
Q. Find all the courses taught in the both the Fall 2009 and Spring 2010 semesters.
section(course_id, sec_id, semester, year, building, room_no, time_slot_id)
select distinct courseid
from section
where semester = ’Fall’ and year= 2009 and
courseid in (select course id
from section
where semester = ’Spring’ and year= 2010);
Q. To find all the courses taught in the Fall 2009 semester but not in the Spring 2010 semester.
section(course_id, sec_id, semester, year, building, room_no, time_slot_id)
select distinct courseid
from section
where semester = ’Fall’ and year= 2009 and
courseid not in (select course id
from section
where semester = ’Spring’ and year= 2010);
Q. find the total number of (distinct) students who have taken course sections taught by the instructor with ID 110011.
teaches(ID, course_id, sec_id, semester, year)
takes(ID, course_id, sec_id, semester, year, grade)
select count (distinct ID)
from takes
where (courseid, sec id, semester, year)
in (select course id, sec id, semester, year
from teaches
where teaches.ID = 10101);
Q. Find the names of all instructors whose salary is greater than at least one instructor in the Biology department
instructor(ID, name, dept_name, salary)
select name
from instructor
where salary > some (select salary
from instructor
where deptname = ’Biology’);
The phrase “greater than at least one” is represented in SQL by > some
Q. find the names of all instructors that have a salary value greater than that of each instructor in the Biology department.
instructor(ID, name, dept_name, salary)
select name
from instructor
where salary > all (select salary
from instructor
where deptname = ’Biology’);
Q. Find the departments that have the highest average salary.
instructor(ID, name, dept_name, salary)
select deptname
from instructor
group by deptname
having avg (salary) ≥ all (select avg (salary)
from instructor
group by deptname);
Q. The 'exists' construct returns the value true if the argument subquery is nonempty. Using the exists construct, write the query “Find all courses taught in both the Fall 2009 semester and in the Spring 2010 semester”
section(course_id, sec_id, semester, year, building, room_no, time_slot_id)
select courseid
from section as S
where semester = ’Fall’ and year= 2009 and
exists (select *
from section as T
where semester = ’Spring’ and year= 2010 and
S.courseid = T.courseid);
Q. Find all students who have taken all courses offered in the Biology department.
select distinct S.ID, S.name
from student as S
where not exists ((select courseid
from course
where deptname = ’Biology’)
except
(select T.courseid
from takes as T
where S.ID = T.ID));
First subquery finds the set of all courses offered in the Biology department.
(select courseid
from course
where deptname = ’Biology’)
Second subquery finds all the courses that student S.ID has taken
(select T.courseid
from takes as T
where S.ID = T.ID)
Q. SQL includes a boolean function for testing whether a subquery has duplicate tuples in its result. The unique construct returns the value true if the argument subquery contains no duplicate tuples. Using the unique construct, write the query “Find all courses that were offered at most once in 2009”
section(course_id, sec_id, semester, year, building, room_no, time_slot_id)
course (course_id, title, dept_name, credits)
select T.courseid
from course as T
where unique (select R.courseid
from section as R
where T.courseid = R.course id and R.year = 2009);
Note : The unique test on a relation is defined to fail if and only if the relation contains two tuples t1 and t2 such that t1 = t2 .Since the test fails if any of the fields of t1 or t2 are null, it is possible for unique to be true even if there are multiple copies of a tuple, as long as at least one of the attributes of the tuple is null.