Lec3 SQL语句
阅读信息
1042 字 10 分钟 本页总访问量 加载中...
DDL
创建表,示例:
SQL CREATE TABLE branch (
branch_name char ( 15 ) not null , -- char表示固定长度
branch_city varchar ( 30 ) , -- varchar表示变长
assets numeric ( 8 , 2 ) ,
primary key ( branch_name ),
check ( assets >= 0 )
)
SQL中domain的类型:
char(n):定长字符串,长度固定为 n,不足部分通常用空格填充。
varchar(n):变长字符串,最大长度为 n,只占实际使用的字符空间。
int:整数类型,表示机器允许范围内的整数。
smallint:较小范围的整数类型,占用存储空间比 int 更小。
numeric(p, d):定点数类型,总共有 p 位数字,其中小数部分有 d 位,常用于精确计算(如金额)。
real / double precision:浮点数类型,分别表示单精度和双精度浮点数,精度依赖具体机器实现。
float(n):浮点数类型,至少具有 n 位有效数字的精度。
NULL 值:所有 domain 类型都允许 NULL,表示“未知”或“缺失”的值;若声明为NOT NULL则不允许NULL。
date:日期类型,包含年、月、日,例如date '2007-2-27'
time:时间类型,表示一天中的时间(时、分、秒),例如time '11:18:16' 或 time '11:18:16.28'
timestamp:时间戳类型,同时包含日期和时间,例如timestamp '2011-3-17 11:18:16.28'
完整性约束:
not null:不允许用null填
primary key (K):规定主键,默认为not null
Check (P):监测P值是否满足某些要求
删除表:DROP TABLE branch2
修改表:
增加列:ALTER TABLE r ADD column_name,自动用null填充
删除列:ALTER TABLE r DROP column_name
修改列的类型:ALTER TABLE r MODIFY (branch_name char(40), assets not null),只要类型兼容即可修改
创建index:CREATE INDEX index_name ON table_name
创建candidate key:CREATE UNIQUE INDEX index_name ON table_name
DML
选择语句
SELECT结构:
SQL SELECT column_list FROM table_list WHERE predicate
用*可表示所有列。SELECT对应泛化投影,可以对列进行运算操作
table_list为不同表的笛卡尔积,而不是自然连接【为什么?】。若要表示自然连接,需要在约束中加上对应属性相等的约束。
predicate为约束条件,可用and, or, not, between等词表示
SELECT返回结果不默认去重。在返回的列的前面加distinct关键字则表示去重,即返回的整个元组无重复。e.g.SELECT DISTINCT branch_name, balance FROM branch。在列前加ALL表示允许重复,即默认加ALL。
用LIKE表示模糊匹配,%表示任意多个字符,_表示任意单个字符。e.g.SELECT customer_name FROM customer WHERE customer_name LIKE '%a%'
用||表示返回值的拼接,e.g.SELECT 'name=' || customer_name FROM customer,返回显示格式为name=Abies
用ORDER BY表示对表排序,默认升序排序。如果排序的列有多列,则优先级依次递减。e.g.SELECT branch_name FROM loan ORDER BY customer_name, loan DESC
集合运算
UNION表示并,INTERSECT表示交,EXCEPT表示差
聚合函数
返回时可重命名,e.g.SELECT AVG(balance)avg_bal FROM account
除聚合函数中的属性,其他属性只能出现在GROUP BY中。e.g.SELECT branch_name, AVG(balance)avg_bal FROM account GROUP BY brach_name
distinct可用在COUNT中,表示相同的记录只计数一次。
需要按照聚合函数结果筛选时,用HAVING关键字,不能用WHERE(因为WHRER是一行一行判断,而不是整体判断)。e.g.:
SQL -- Find the names of all branches located in city Brooklyn where the average account balance is more than $1,200.
SELECT A . branch_name , avg ( balance ) FROM account A , branch B
WHERE A . branch_name = B . branch_name and branch_city = 'Brooklyn'
GROUP BY A . branch_name
HAVING avg ( balance ) > 1200
执行顺序:From → where → group (aggregate) → having → select → distinct → order by
null值
null值判断时结果为unknown。CHECK中,unknown也满足要求,只有false为不满足。
【??待补充】
嵌套子查询
E.g.1,用嵌套子查询的in表示UNION:
SQL SELECT distinct customer_name FROM borrower
WHERE customer_name in ( SELECT customer_name FROM depositor )
E.g.2,用嵌套子查询的not in表示EXCEPT:
SQL SELECT distinct customer_name FROM borrower
WHERE customer_name not in ( SELECT customer_name FROM depositor )
加some表示子查询中存在某些值满足;加all表示子查询中的任意值都满足。
自比较时,写成嵌套子查询或自连接。
视图view
创建视图:CREATE VIEW view_name AS ...
使用视图:将view当成一张表操作
派生表 :嵌套子查询中,可用AS将子查询返回内容也视为一张表,称为派生表(local view或derived relations);外层查询时,列名为派生表中的属性、WHERE中用派生表的数据判断。派生表必须重命名(即使没有用到)e.g.:
SQL SELECT branch_name , avg_bal
FROM ( SELECT branch_name , avg ( balance )
FROM account
GROUP BY branch_name )
AS result ( branch_name , avg_bal )
WHERE avg_bal > 500
WITH从句 :通过WITH语句创建的派生表只在当前语句有效,而不是全局有效。e.g.:
SQL WITH branch_total ( branch_name , a_bra_total ) AS
SELECT branch_name , sum ( balance )
FROM account
GROUP BY branch_name
WITH total_avg ( value ) AS
SELECT avg ( a_bra_total )
FROM branch_total
SELECT branch_name , a_bra_total
FROM branch_total A , total_avg B
WHERE A . a_bra_total >= B . value
修改数据库
DELETE、UPDATE、INSERT只能对单表操作。要注意先后顺序和外键约束。
DELETE和INSERT中,在同一SQL语句内,除非外层查询的元组变量引入内层查询,否则层查询只进行一次。e.g.,下面的SQL语句中,AVG只在最开始执行一次,而不是每次删除后都执行。
SQL DELETE FROM account
WHERE balance < (
SELECT AVG ( balance )
FROM account
)
UPDATE时可使用case语句:
SQL UPDATE account
SET balance = CASE
WHEN balance <= 10000
THEN balance * 1 . 05
ELSE balance * 1 . 06
END
建立在单个基本表上的视图,且视图的列对应表的列,称为“行列视图”。所有视图中,只有行列视图允许插入(插入仍然需要到底层的物理表上执行)。
事务transaction
事务是一系列数据库操作(查询、更新等)的集合,作为一个逻辑单元整体执行。
Transaction结束方式有两种:
COMMIT WORK:提交事务,所有修改永久生效
ROLLBACK WORK:回滚事务,撤销该事务中的所有修改
JOIN语句
FROM后写多张表,为多表的笛卡尔积;JOIN为多张表的连接
JOIN condition:
natural:自然连接,共有属性相同时连接
ON :按一定条件连接
USING (...):括号中列出的属性名相等时连接
JOIN type:
INNER JOIN:只返回两个表中连接条件匹配的行,不匹配的行直接丢弃。
LEFT JOIN:返回左表的全部行,右表只返回匹配的行,右表无匹配时用 NULL 填充。
RIGHT JOIN:返回右表的全部行,左表只返回匹配的行,左表无匹配时用 NULL 填充。
FULL JOIN:返回左右两个表的全部行,任一侧无匹配时都用 NULL 填充。
自然连接和USING条件中,同名属性只有一列;而用ON表示条件时,同名属性有多列。
示例
输出时字符串操作
CONCAT()表示字符串拼接,LOWER()和UPPER()表示统一大小写
SQL SELECT CONCAT ( 'Customer name: ' , UPPER ( customer_name ))
FROM customer
LENGTH()表示字符串长度
SUBSTRING(字符串,起始位置,长度)表示提取子串
TRIM()表示去除两端空格
LTRIM()表示去除左侧空格
RTRIM()表示去除右侧空格
REPLACE(字符串,被替换,替换)表示替换字符串
INSTR()表示查找子串位置
表示 同时有
e.g.要输出同时有A和B的name:用FROM A表示有A,用嵌套自查询+IN表示有B
SQL SELECT DISTINCT name
FROM A
WHERE name IN (
SELECT name
FROM B
)
同理,可用NOT IN表示有A但没有B
如果输出的和条件中信息不在一个表中,可以用多项IN表示,如
SQL SELECT name
FROM A
WHERE ( need1 , need2 ) IN (
SELECT need1 , need2
FROM B
)
聚合函数注意
如果有GROUP BY,则SELECT中只能有两类,否则会数量不匹配报错:
GROUP BY对应的属性
其他属性的聚合函数
最值查询
子查询得到最值,外层查询用最值筛选,注意内外层中要用某些一致信息连接。e.g.找同一支行中最大存款的账户:
SQL SELECT account_number
FROM account A
WHERE balance >= (
SELECT MAX ( balance )
FROM account B
WHERE A . branch_name = B . branch_name
)
利用SOME/ALL从句,外层遍历每个元组,内层检查是否满足要求。用重命名的方法,对同一个表检查。e.g.
SQL SELECT account_number
FROM account A
WHERE balance >= ALL (
SELECT balance
FROM account B
WHERE A . branch_name = B . branch_name
)
最多一个 问题
谓词UNIQUE用于判断子查询的结果中,是否不存在重复元组。e.g.找出所有在Perriridge银行至多有一个账户的客户:
Text Only branch(branch-name, branch-city, assets)
account(account-number, branch-name, balance) depositor(customer-name, account-number)
SQL SELECT customer_name
FROM depositor AS T
WHERE UNIQUE (
SELECT R . customer_name
FROM account , depositor AS R
WHERE T . customer_name = R . customer_name
AND R . account_number = account . account_number
AND account . branch_name = 'Perryridge'
);
同理,可用NOT UNIQUE表示“至少两个”问题。
WITH从句
WITH从句和子查询的区别为:
WITH从句必须重命名
WITH从句返回结果的作用范围为整个SQL语句,可在语句中多次使用,而子查询只能用一次
当返回多个属性时,可类似表命名CTE_name(attribute1, attribute2),类似创建临时的VIEW
一个聚合多个输出 问题
Text Only Student(sno, sname, ssex, sage, sdept)
Enrolled(sno, cno, grade)
Course(cno, cname, credits)
要求:找出至少注册10门课的学生,并输出选择的门数
直接将Student和Enrolled自然连接,用sname聚合筛选(为什么用sname筛选?因为输出必须为筛选项,要输出sname)
SQL SELECT sname , COUNT ( cno )
FROM Student
JOIN Enrolled USING ( sno )
GROUP BY sname
HAVING COUNT ( cno ) > 10 ;
如果要同时输出sno和sname,可以先统计再连接。注意导出的表必须命名,输出时要指明是哪个表里的信息。
SQL SELECT s1 . sno , s1 . sname , s2 . cnt -- 输出信息
FROM Student s1
JOIN (
SELECT sno , COUNT ( cno ) AS cnt -- 选出选课大于10的sno
FROM Enrolled
GROUP BY sno
HAVING COUNT ( cno ) > 10
) AS s2 ON s1 . sno = s2 . sno ;
表中表 问题
Text Only employee(id, name, age, gender, salary, boss)
属性boss即boss这个employee元组对应的id。
找出比上级更年轻、且和上级性别不同的员工。
用重命名的方式,从同一张表中多次筛选。
SQL SELECT e2 . name
FROM employee AS e1 , employee AS e2
WHERE e1 . id = e2 . boss
AND e1 . age < e2 . age
AND e1 . gender != e2 . gender
SQL SELECT id
FROM employee a
WHERE a . age < (
SELECT b . age
FROM employee b
WHERE b . id = a . boss
)
AND a . gender != (
SELECT b . gender
FROM emplyee b
WHERE b . id = a . boss
)
找出下属数量最多的员工
用聚合操作“反向”找出boss-下属的对应关系,用LIMIT找出最值。
SQL SELECT boss
FROM employee
GROUP BY boss
ORDER BY COUNT ( id ) DESC
LIMIT 1
一个例子
Consider the student database below:
Text Only student(student-no, student-name, sex, age, dept-name)
course(course-no, course-name, credit)
study(student-no, course-no, score)
Find the names of students who have studied course ‘Database System’ and sort results by ascending score.
SQL SELECT student_name
FROM student
JOIN study ON student . student_no = study . student_no
JOIN course ON course . course_no = study . course_no
WHERE course . course_name = 'Database System'
ORDER BY study . score
Find the names of students who get the best score in course ‘Database System’.
SQL -- SELECT student_name
-- FROM student
-- JOIN study ON student.student_no = study.student_no
-- JOIN course ON course.course_no = study.course_no
-- WHERE course.course_name = 'Database System'
-- ORDER BY study.score
-- LIMIT 1; -- 获得最大值的学生可能不止一个,不能只输出一个人
SQL -- SELECT student_name
-- FROM student
-- JOIN study ON student.student_no = study.student_no
-- JOIN course ON course.course_no = study.course_no
-- WHERE course.course_name = 'Database System'
-- AND study.score >= (
-- SELECT MAX(score)
-- FROM study
-- WHERE course_no = (
-- SELECT course_no
-- FROM course
-- WHERE course_name = 'Database System'
-- )
-- );
SQL SELECT s . student_name
FROM student AS s
JOIN study AS st ON s . student_no = st . student_no
JOIN course AS c ON c . course_no = st . course_no
WHERE c . course_name = 'Database System'
AND st . score = (
SELECT MAX ( st2 . score )
FROM study AS st2
JOIN course AS c2 ON c2 . course_no = st2 . course_no
WHERE c2 . course_name = 'Database System'
);
Find the names of courses that have maximum average score.
SQL -- SELECT course_name -- 根据编号返回课程名称
-- FROM course
-- WHERE course_no IN (
-- SELECT course_no -- 返回平均值最大的课程的编号
-- FROM study
-- GROUP BY course_no
-- HAVING AVG(score) >= (
-- SELECT MAX(AVG(score)) -- 返回课程平均值的最大值
-- FROM study
-- GROUP BY course_no
-- )
-- )
SQL WITH t AS ( -- 返回各门课程平均值
SELECT course_no , AVG ( score ) AS avg_score
FROM study
GROUP BY course_no
)
SELECT course_name -- 输出课程名称
FROM course
JOIN t USING ( course_no )
WHERE t . avg_score >= (
SELECT MAX ( avg_score ) -- 返回最大平均值
FROM t
)
又一个例子
Text Only Student(Sid, N ame, Sex, Age)
Participate(P rojectN ame, Sid, Score)
Project(P rojectN ame, Requirement, Starttime, Endtime)
Find the names of students who are in ‘MiniSQL’ project.
SQL SELECT Name
FROM Student
JOIN Participate USING ( Sid )
WHERE Participate . ProjectName = 'MiniSQL'
Find the Sid of students who have not been in any project yet.
SQL SELECT Sid
FROM Student
WHERE Sid NOT IN (
SELECT Sid
FROM Participate
)
更规范的写法(NOT IN在判断null时不返回,可能有问题):
SQL SELECT s . Sid
FROM Student s
WHERE NOT EXISTS (
SELECT 1
FROM Participate p
WHERE p . Sid = s . Sid
)
For each project, find the names of students who have the highest score
SQL WITH t AS (
SELECT ProjectName , MAX ( Score ) AS max_score -- 返回每个项目的最高分数
FROM Participate
GROUP BY ProjectName
)
SELECT Participate . ProjectName , Name
FROM Student
JOIN Participate ON Student . Sid = Participate . Sid
JOIN t ON t . ProjectName = Participate . ProjectName
WHERE Participate . Score >= t . max_score
Find Top-2 projects which have the most female participants.
SQL SELECT ProjectName
FROM Participate
JOIN Student USING ( Sid )
WHERE Sex = 'female'
GROUP BY ProjectName
ORDER BY COUNT ( * ) DESC
LIMIT 2
2026-04-07
2025-09-09
GitHub