ch4 数据库系统概念(第6版)第四章中间SQL
- 格式:pdf
- 大小:3.26 MB
- 文档页数:74
sql 第四范式-概述说明以及解释1.引言1.1 概述第四范式是关系数据库设计中的一个重要概念,它是指在数据库设计中,将非主属性间的关系通过引入新的实体进行拆分,达到消除数据冗余和提高数据完整性的目的。
本文将围绕第四范式展开讨论,并探讨其在实际应用中的挑战。
在传统关系数据库设计中,我们常常会遇到冗余数据的问题。
冗余数据不仅浪费了存储空间,还容易导致数据的不一致性和更新异常。
为了解决这个问题,提出了规范化的概念,其中第四范式就是规范化的最高级别。
第四范式要求数据库中每个非主属性都完全依赖于键,并且不存在非主属性之间的传递依赖。
换句话说,第四范式要求数据库中的每个非主属性都是直接依赖于键的,而不是间接依赖于其他非主属性。
第四范式的优点是显而易见的。
首先,它能够消除数据冗余,减少存储空间的占用。
其次,由于数据的一致性得到了保证,更新异常的风险也大大降低。
此外,第四范式还能够提高查询的效率,因为数据的拆分使得数据的访问更加快速和高效。
然而,第四范式在实际应用中也会面临一些挑战。
首先,拆分数据可能导致查询的复杂性增加。
由于数据被分散存储在不同的表中,查询的时候需要进行多次联结操作,增加了查询的成本。
其次,第四范式对于数据一致性的要求较高,需要在应用层面进行更加复杂的控制和约束,这可能带来额外的开发和维护成本。
最后,第四范式需要根据具体业务需求进行合理的实体拆分,这对于数据库设计师来说可能是一项具有挑战性的任务。
综上所述,第四范式是关系数据库设计中一个重要的概念,它可以消除数据冗余、提高数据完整性和查询效率。
然而,在实际应用中,我们需要权衡其优点和挑战,并根据具体业务需求进行合理的设计和实施。
在下文中,我们将详细探讨第四范式的相关概念和优点,以及在实践中可能遇到的挑战。
1.2文章结构1.2 文章结构本文将按照以下结构展开讨论第四范式的相关内容:1. 引言:首先,我们会对整篇文章进行一个概述,明确我们要讨论的问题和目的,引起读者对文章的兴趣。
Intermediate SQLPractice Exercises4.1Write the following queries in SQL:a.Display a list of all instructors,showing their ID,name,and the num-ber of sections that they have taught.Make sure to show the numberof sections as0for instructors who have not taught any section.Yourquery should use an outerjoin,and should not use scalar subqueries.b.Write the same query as above,but using a scalar subquery,withoutouterjoin.c.Display the list of all course sections offered in Spring2010,alongwith the names of the instructors teaching the section.If a section hasmore than one instructor,it should appear as many times in the resultas it has instructors.If it does not have any instructor,it should stillappear in the result with the instructor name set to“—”.d.Display the list of all departments,with the total number of instructorsin each department,without using scalar subqueries.Make sure tocorrectly handle departments with no instructors.Answer:a.Display a list of all instructors,showing their ID,name,and the num-ber of sections that they have taught.Make sure to show the numberof sections as0for instructors who have not taught any section.Yourquery should use an outerjoin,and should not use scalar subqueries.select ID,name,count(course id,section id,year,semester)as’Number of sections’from instructor natural left outer join teachesgroup by ID,nameThe above query should not be written using count(*)since count*counts null values also.It could be written using count(section id),or1920Chapter4Intermediate SQLany other attribute from teaches which does not occur in instructor,which would be correct although it may be confusing to the reader.(Attributes that occur in instructor would not be null even if the in-structor has not taught any section.)b.Write the same query as above,but using a scalar subquery,withoutouterjoin.select ID,name,(select count(*)as’Number of sections’from teaches T where T.id=I.id)from instructor Ic.Display the list of all course sections offered in Spring2010,alongwith the names of the instructors teaching the section.If a section hasmore than one instructor,it should appear as many times in the resultas it has instructors.If it does not have any instructor,it should stillappear in the result with the instructor name set to“−”.select course id,section id,ID,decode(name,NULL,’−’,name)from(section natural left outer join teaches)natural left outer join instructorwhere semester=’Spring’and year=2010The query may also be written using the coalesce operator,by re-placing decode(..)by coalesce(name,’−’).A more complex versionof the query can be written using union of join result with anotherquery that uses a subquery tofind courses that do not match;refer toexercise4.2.d.Display the list of all departments,with the total number of instructorsin each department,without using scalar subqueries.Make sure tocorrectly handle departments with no instructors.select dept name,count(ID)from department natural left outer join instructorgroup by dept name4.2Outer join expressions can be computed in SQL without using the SQLouter join operation.To illustrate this fact,show how to rewrite each of thefollowing SQL queries without using the outer join expression.a.select*from student natural left outer join takesb.select*from student natural full outer join takesAnswer:a.select*from student natural left outer join takescan be rewritten as:Exercises21 select*from student natural join takesunionselect ID,name,dept name,tot cred,NULL,NULL,NULL,NULL,NULLfrom student S1where not exists(select ID from takes T1where T1.id=S1.id)b.select*from student natural full outer join takescan be rewritten as:(select*from student natural join takes)union(select ID,name,dept name,tot cred,NULL,NULL,NULL,NULL,NULLfrom student S1where not exists(select ID from takes T1where T1.id=S1.id))union(select ID,NULL,NULL,NULL,course id,section id,semester,year,gradefrom takes T1where not exists(select ID from student S1where T1.id=S1.id))4.3Suppose we have three relations r(A,B),s(B,C),and t(B,D),with allattributes declared as not null.Consider the expressions•r natural left outer join(s natural left outer join t),and•(r natural left outer join s)natural left outer join ta.Give instances of relations r,s and t such that in the result of thesecond expression,attribute C has a null value but attribute D has anon-null value.b.Is the above pattern,with C null and D not null possible in the resultof thefirst expression?Explain why or why not.Answer:a.Consider r=(a,b),s=(b1,c1),t=(b,d).The second expression wouldgive(a,b,NULL,d).b.It is not possible for D to be not null while C is null in the result of thefirst expression,since in the subexpression s natural left outer join t,it is not possible for C to be null while D is not null.In the overallexpression C can be null if and only if some r tuple does not have amatching B value in s.However in this case D will also be null.4.4Testing SQL queries:To test if a query specified in English has been cor-rectly written in SQL,the SQL query is typically executed on multiple test22Chapter4Intermediate SQLdatabases,and a human checks if the SQL query result on each test databasematches the intention of the specification in English.a.In Section Section3.3.3The Natural Joinsubsection.3.3.3we saw an ex-ample of an erroneous SQL query which was intended tofind whichcourses had been taught by each instructor;the query computed thenatural join of instructor,teaches,and course,and as a result uninten-tionally equated the dept name attribute of instructor and course.Givean example of a dataset that would help catch this particular error.b.When creating test databases,it is important to create tuples in refer-enced relations that do not have any matching tuple in the referencingrelation,for each foreign key.Explain why,using an example queryon the university database.c.When creating test databases,it is important to create tuples with nullvalues for foreign key attributes,provided the attribute is nullable(SQL allows foreign key attributes to take on null values,as long asthey are not part of the primary key,and have not been declared asnot null).Explain why,using an example query on the universitydatabase.Hint:use the queries from Exercise Exercise4.1Item.138.Answer:a.Consider the case where a professor in Physics department teaches anElec.Eng.course.Even though there is a valid corresponding entryin teaches,it is lost in the natural join of instructor,teaches and course,since the instructors department name does not match the departmentname of the course.A dataset corresponding to the same is:instructor={(12345,’Guass’,’Physics’,10000)}teaches={(12345,’EE321’,1,’Spring’,2009)}course={(’EE321’,’Magnetism’,’Elec.Eng.’,6)}b.The query in question0.a is a good example for this.Instructors whohave not taught a single course,should have number of sections as0in the query result.(Many other similar examples are possible.)c.Consider the queryselect*from teaches natural join instructor;In the above query,we would lose some sections if teaches.ID is al-lowed to be NULL and such tuples exist.If,just because teaches.ID isa foreign key to instructor,we did not create such a tuple,the error inthe above query would not be detected.4.5Show how to define the view student grades(ID,GP A)giving the grade-point average of each student,based on the query in Exercise??;recallthat we used a relation grade points(grade,points)to get the numeric pointsExercises23 associated with a letter grade.Make sure your view definition correctly handles the case of null values for the grade attribute of the takes relation.Answer:We should not add credits for courses with a null grade;further to to correctly handle the case where a student has not completed any course, we should make sure we don’t divide by zero,and should instead return a null value.We break the query into a subquery thatfinds sum of credits and sum of credit-grade-points,taking null grades into account The outer query divides the above to get the average,taking care of divide by0.create view student grades(ID,GP A)asselect ID,credit points/decode(credit sum,0,NULL,credit sum)from((select ID,sum(decode(grade,NULL,0,credits))as credit sum,sum(decode(grade,NULL,0,credits*points))as credit pointsfrom(takes natural join course)natural left outer join grade pointsgroup by ID)unionselect ID,NULLfrom studentwhere ID not in(select ID from takes))The view defined above takes care of NULL grades by considering the creditpoints to be0,and not adding the corresponding credits in credit sum.The query above ensures that if the student has not taken any course with non-NULL credits,and has credit sum=0gets a gpa of NULL.This avoid the division by0,which would otherwise have resulted.An alternative way of writing the above query would be to use student natural left outer join gpa,in order to consider students who have not taken any course.4.6Complete the SQL DDL definition of the university database of Figure Fig-ure4.8Referential Integrityfigcnt.50to include the relations student,takes, advisor,and prereq.Answer:create table student(ID varchar(5),name varchar(20)not null,dept name varchar(20),tot cred numeric(3,0)check(tot cred>=0),primary key(ID),foreign key(dept name)references departmenton delete set null);24Chapter4Intermediate SQLcreate table takes(ID varchar(5),course id varchar(8),section id varchar(8),semester varchar(6),year numeric(4,0),grade varchar(2),primary key(ID,course id,section id,semester,year),foreign key(course id,section id,semester,year)references sectionon delete cascade,foreign key(ID)references studenton delete cascade);create table advisor(i id varchar(5),s id varchar(5),primary key(s ID),foreign key(i ID)references instructor(ID)on delete set null,foreign key(s ID)references student(ID)on delete cascade);create table prereq(course id varchar(8),prereq id varchar(8),primary key(course id,prereq id),foreign key(course id)references courseon delete cascade,foreign key(prereq id)references course);4.7Consider the relational database of Figure Figure4.11figcnt.53.Give an SQLDDL definition of this database.Identify referential-integrity constraintsthat should hold,and include them in the DDL definition.Answer:create table employee(person name char(20),street char(30),city char(30),primary key(person name))Exercises25create table works(person name char(20),company name char(15),salary integer,primary key(person name),foreign key(person name)references employee,foreign key(company name)references company)create table company(company name char(15),city char(30),primary key(company name))pp create table manages(person name char(20),manager name char(20),primary key(person name),foreign key(person name)references employee,foreign key(manager name)references employee)Note that alternative datatypes are possible.Other choices for not nullattributes may be acceptable.4.8As discussed in Section Section4.4.7Complex Check Conditions and Assertionssubsection.4.4we expect the constraint“an instructor cannot teach sections in two differ-ent classrooms in a semester in the same time slot”to hold.a.Write an SQL query that returns all(instructor,section)combinationsthat violate this constraint.b.Write an SQL assertion to enforce this constraint(as discussed in Sec-tion Section4.4.7Complex Check Conditions and Assertionssubsection.4.4.7,current generation database systems do not support such assertions,although they are part of the SQL standard).Answer:a.select ID,name,section id,semester,year,time slot id,count(distinct building,room number)from instructor natural join teaches natural join sectiongroup by(ID,name,section id,semester,year,time slot id)having count(building,room number)>1Note that the distinct keyword is required above.This is to allow twodifferent sections to run concurrently in the same time slot and are26Chapter4Intermediate SQLtaught by the same instructor,without being reported as a constraintviolation.b.create assertion check not exists(select ID,name,section id,semester,year,time slot id,count(distinct building,room number)from instructor natural join teaches natural join sectiongroup by(ID,name,section id,semester,year,time slot id)having count(building,room number)>1)4.9SQL allows a foreign-key dependency to refer to the same relation,as in thefollowing example:create table manager(employee name char(20),manager name char(20),primary key employee name,foreign key(manager name)references manageron delete cascade)Here,employee name is a key to the table manager,meaning that each em-ployee has at most one manager.The foreign-key clause requires that everymanager also be an employee.Explain exactly what happens when a tuplein the relation manager is deleted.Answer:The tuples of all employees of the manager,at all levels,getdeleted as well!This happens in a series of steps.The initial deletion willtrigger deletion of all the tuples corresponding to direct employees ofthe manager.These deletions will in turn cause deletions of second levelemployee tuples,and so on,till all direct and indirect employee tuples aredeleted.4.10SQL-92provides an n-ary operation called coalesce,which is defined asfollows:coalesce(A1,A2,...,A n)returns thefirst nonnull A i in the listA1,A2,...,A n,and returns null if all of A1,A2,...,A n are null.Let a and b be relations with the schemas A(name,address,title)and B(name,address,salary),respectively.Show how to express a natural full outer joinb using the full outer-join operation with an on condition and the coalesceoperation.Make sure that the result relation does not contain two copiesof the attributes name and address,and that the solution is correct even ifsome tuples in a and b have null values for attributes name or address.Answer:Exercises27 select coalesce(,)as name,coalesce(a.address,b.address)as address,a.title,b.salaryfrom a full outer join b on = anda.address=b.address4.11Some researchers have proposed the concept of marked nulls.A markednull⊥i is equal to itself,but if i=j,then⊥i=⊥j.One application of marked nulls is to allow certain updates through views.Consider the view instructor info(Section Section4.2Viewssection.4.2).Show how you can use marked nulls to allow the insertion of the tuple(99999,“Johnson”,“Music”) through instructor info.Answer:To insert the tuple(99999,“(”Johnson),“Music”)into the view instructor info,we can do the following:instructor←(99999,“Johnson”,⊥k,⊥)∪instructordepartment←(⊥k,“Music′′,⊥)∪departmentsuch that⊥k is a new marked null not already existing in the database.Note:“Music”here is the name of a building and may or may not be related to Music department.。
第四版数据库系统概论课后答案(全)第3章关系数据库标准语言SQL1 .试述sQL 语言的特点。
答:(l)综合统一。
sQL 语言集数据定义语言DDL 、数据操纵语言DML 、数据控制语言DCL 的功能于一体。
(2)高度非过程化。
用sQL 语言进行数据操作,只要提出“做什么”,而无需指明“怎么做”,因此无需了解存取路径,存取路径的选择以及sQL 语句的操作过程由系统自动完成。
(3)面向集合的操作方式。
sQL 语言采用集合操作方式,不仅操作对象、查找结果可以是元组的集合,而且一次插入、删除、更新操作的对象也可以是元组的集合。
(4)以同一种语法结构提供两种使用方式。
sQL 语言既是自含式语言,又是嵌入式语言。
作为自含式语言,它能够独立地用于联机交互的使用方式;作为嵌入式语言,它能够嵌入到高级语言程序中,供程序员设计程序时使用。
(5)语言简捷,易学易用。
2 .试述sQL 的定义功能。
sQL 的数据定义功能包括定义表、定义视图和定义索引。
SQL 语言使用cREATE TABLE 语句建立基本表,ALTER TABLE 语句修改基本表定义,DROP TABLE 语句删除基本表;使用CREATE INDEX 语句建立索引,DROP INDEX 语句删除索引;使用CREATE VIEW 语句建立视图,DROP VIEW 语句删除视图。
3 .用sQL 语句建立第二章习题5 中的4 个表。
答:对于S 表:S ( SNO , SNAME , STATUS , CITY ) ; 建S 表:CREATE TABLE S ( Sno C(2) UNIQUE,Sname C(6) ,Status C(2),City C(4));对于P 表:P ( PNO , PNAME , COLOR , WEIGHT );建P 表:CREATE TABLE P(Pno C(2) UNIQUE,Pname C(6),COLOR C(2),WEIGHT INT);对于J 表:J ( JNO , JNAME , CITY);建J 表:CREATE TABLE J(Jno C(2) UNlQUE,JNAME C(8),CITY C(4))对于sPJ 表:sPJ ( sNo , PNo , JNo , QTY);建SPJ 表:SPJ(SNO,PNO,JNO,QTY)CREATE TABLE SPJ(Sno C(2),PnoC(2),JNO C(2),QTY INT))4.针对上题中建立的 4 个表试用sQL 语言完成第二章习题 5 中的查询。
《数据库系统概论》复习笔记期末复习顺便总结下,书本为⾼等教育出版社的《数据库系统概论》。
第⼀章知识点数据库是长期储存之计算机内的、有组织的、可共享的⼤量数据的集合。
1,数据库数据特点 P4永久存储,有组织,可共享。
2,数据独⽴性及其如何保证 P10,P34逻辑独⽴性:⽤户的应⽤程序与数据库的逻辑结构互相独⽴。
(内模式保证)物理独⽴性:⽤户的应⽤程序与存储在磁盘上的数据库中的数据相互(外模式保证)3,数据模型的组成要素 P13数据结构、数据操作、完整性约束。
4,⽤ER图来表⽰概念模型 P17实体、联系和属性。
联系本⾝也是⼀种实体型,也可以有属性。
第⼆章1,关系的相关概念(如关系、候选码、主属性、⾮主属性) P42-P44单⼀的数据结构---- 关系。
现实世界的实体以及实体间的各种联系均⽤关系来表⽰。
域是⼀组具有相同数据类型的值的集合。
若关系中的某⼀属性组的值能唯⼀地标识⼀个元组,则称该属性组为候选码关系模式的所有属性组是这个关系模式的候选码,称为全码若⼀个关系有多个候选码,则选定其中⼀个为主码候选码的诸属性称为主属性不包含在任何侯选码中的属性称为⾮主属性2,关系代数运算符 P52⾃然连接是在⼴义笛卡尔积R×S中选出同名属性上符合相等条件元组,再进⾏投影,去掉重复的同名属性,组成新的关系。
3,关系代数表达式第三章操作对象操作⽅式创建删除修改模式CREATE SCHEMA DROP SCHEMA表CREATE TABLE DROP TABLE ALTER TABLE 视图CREATE VIEW DROP VIEW索引CREATE INDEX DROP INDEX1,SQL的特点P79-P801. 综合统⼀2. ⾼度⾮过程化3. ⾯向集合的操作⽅式4.以同⼀种语法结构提供多种使⽤⽅式5. 语⾔简洁,易学易⽤2,基本表的定义、删除和修改P84-P87PRIMARY KEYPRIMARY KEY (Sno,Cno)UNIQUEFOREIGN KEY (Cpno) REFERENCES Course(Cno)ALTER TABLE <表名>[ ADD <新列名> <数据类型> [ 完整性约束 ] ][ DROP <完整性约束名> ][ ALTER COLUMN<列名> <数据类型> ];DROP TABLE <表名>[ RESTRICT| CASCADE];3,索引的建⽴与删除P89-P90CREATE [UNIQUE] [CLUSTER] INDEX <索引名>ON <表名>(<列名>[<次序>][,<列名>[<次序>] ]…);唯⼀索引 UNIQUE、⾮唯⼀索引或聚簇索引 CLUSTERDROP INDEX <索引名>;4,数据查询P91-P114唯⼀ DISTINCT确定范围 BETWEEN AND,NOT BETWEEN AND确定集合 IN,NOT IN字符匹配 LIKE,NOT LIKE空值 IS NULL,IS NOT NULL多重条件(逻辑运算) AND,OR,NOTORDER BY⼦句升序: ASC;降序: DESC;缺省值为升序聚集函数:计数COUNT([DISTINCT|ALL] *)COUNT([DISTINCT|ALL] <列名>)计算总和SUM([DISTINCT|ALL] <列名>)计算平均值AVG([DISTINCT|ALL] <列名>)最⼤最⼩值MAX([DISTINCT|ALL] <列名>)MIN([DISTINCT|ALL] <列名>)左外连接 LEFT OUT JOIN XXX ON (XX.A = XXX.A)5,数据更新P115-P118INSERTINTO <表名> [(<属性列1>[,<属性列2 >…)]VALUES (<常量1> [,<常量2>] … )/或⼦查询UPDATE <表名>SET <列名>=<表达式>[,<列名>=<表达式>]…[ WHERE <条件>];DELETEFROM <表名>[ WHERE <条件>];6,视图的P118-126CREATE VIEW<视图名> [(<列名> [,<列名>]…)]AS <⼦查询> --⼦查询不允许含有ORDER BY⼦句和DISTINCT短语 [ WITH CHECK OPTION];DROP VIEW <视图名>;第四章、第五章12,数据库⾓⾊P142-P1433,数据库的三类完整性及其实现P152-P158实体完整性CREATE TABLE中⽤PRIMARY KEY定义参照完整性在CREATE TABLE中⽤FOREIGN KEY短语定义哪些列为外码⽤REFERENCES短语指明这些外码参照哪些表的主码⽤户定义的完整性CREATE TABLE时定义列值⾮空(NOT NULL)列值唯⼀(UNIQUE)检查列值是否满⾜⼀个布尔表达式(CHECK)CONSTRAINT 约束CONSTRAINT <完整性约束条件名>[PRIMARY KEY短语|FOREIGN KEY短语|CHECK短语]使⽤ALTER TABLE语句修改表中的完整性限制可以先删除原来的约束条件,再增加新的约束条件ALTER TABLE StudentDROP CONSTRAINT C1;ALTER TABLE StudentADD CONSTRAINT C1 CHECK (Sno BETWEEN 900000 AND 999999)第六章关系模式是⼀个五元组: R(U, D, DOM, F)12,1NF,2NF,3NF P175-P176如果⼀个关系模式R的所有属性都是不可分的基本数据项,则R∈ 1NF第⼀范式是对关系模式的最起码的要求若R∈1NF,且每⼀个⾮主属性完全函数依赖于码,则R∈ 2NF。
C H A P T E R8Relational Database DesignExercises8.1Suppose that we decompose the schema R=(A,B,C,D,E)into(A,B,C)(A,D,E).Show that this decomposition is a lossless-join decomposition if thefollowing set F of functional dependencies holds:A→BCCD→EB→DE→AAnswer:A decomposition{R1,R2}is a lossless-join decomposition ifR1∩R2→R1or R1∩R2→R2.Let R1=(A,B,C),R2=(A,D,E),and R1∩R2= A.Since A is a candidate key(see PracticeExercise8.6),Therefore R1∩R2→R1.8.2List all functional dependencies satis?ed by the relation of Figure8.17.Answer:The nontrivial functional dependencies are:A→B andC→B,and a dependency they logically imply:AC→ B.There are19trivial functional dependencies of the form?→?,where???.Cdoes not functionally determine A because the?rst and third tuples havethe same C but different A values.The same tuples also show B does notfunctionally determine A.Likewise,A does not functionally determineC because the?rst two tuples have the same A value and different Cvalues.The same tuples also show B does not functionally determine C.8.3Explain how functional dependencies can be used to indicate the fol-lowing:910Chapter8Relational Database DesignA one-to-one relationship set exists between entity sets student andinstructor.A many-to-one relationship set exists between entity sets studentand instructor.Answer:Let Pk(r)denote the primary key attribute of relation r.The functional dependencies Pk(student)→Pk(instructor)andPk(instructor)→Pk(student)indicate a one-to-one relationshipbecause any two tuples with the same value for student must havethe same value for instructor,and any two tuples agreeing oninstructor must have the same value for student.The functional dependency Pk(student)→Pk(instructor)indicates amany-to-one relationship since any student value which is repeatedwill have the same instructor value,but many student values mayhave the same instructor value.8.4Use Armstrong’sa xioms to prove the soundness of the union rule.(Hint:Use the augmentation rule to show that,if?→?,then?→??.Apply theaugmentation rule again,using?→?,and then apply the transitivityrule.)Answer:To prove that:if?→?and?→?then?→??Following the hint,we derive:→?given→??augmentation rule→??union of identical sets→?given→??augmentation rule→??transitivity rule and set union commutativity8.5Use Armstrong’sa xioms to prove the soundness of the pseudotransitiv-ity rule.a xioms of the Pseudotransitivity Rule:Answer:Proof using Armstrong’sif?→?and??→?,then??→?.→?given→??augmentation rule and set union commutativity ?→?given→?transitivity rule8.6Compute the closure of the following set F of functional dependenciesfor relation schema R=(A,B,C,D,E).Exercises11A→BCCD→EB→DE→AList the candidate keys for R.Answer:Note:It is not reasonable to expect students to enumerate all of F+.Some shorthand representation of the result should be acceptable as long as the nontrivial members of F+are found.Starting with A→BC,we can conclude:A→B and A→ C.Since A→B and B→D,A→D(decomposition,transitive) Since A→C D and C D→E,A→E(union,decom-position,transi-tive)Since A→A,we have(re?exive)A→ABC DE from the above steps(union)Since E→A,E→ABC DE(transitive)Since C D→E,C D→ABC DE(transitive)Since B→D and BC→C D,BC→ABC DE (augmentative, transitive)Also,C→C,D→D,B D→D,etc.Therefore,any functional dependency with A,E,BC,or C D on the left hand side of the arrow is in F+,no matter which other attributes appear in the FD.Allow*to represent any set of attributes in R,then F+isB D→B,B D→D,C→C,D→D,B D→B D,B→D,B→B,B→B D,and all FDs of the form A?→?,BC?→?,C D?→?,E?→?where?is any subset of{A,B,C,D,E}.Thecandidate keys are A,BC,C D,and E.8.7Using the functional dependencies of Practice Exercise8.6,compute thecanonicalcover F c.Answer:The given set of FDs F is:-A→BCCD→EB→DE→AThe left side of each FD in F is unique.Also none of the attributes in the left side or right side of any of the FDs is extraneous.Therefore the canonical cover F c is equal to F.12Chapter8Relational Database Design8.8Consider the algorithm in Figure8.18to compute?+.Show that thisalgorithm is more ef?cient than the one presented in Figure8.8(Sec-tion8.4.2)and that it computes?+correctly.Answer:The algorithm is correct because:If A is added to result then there is a proof that?→ A.To see this,observe that?→?trivially so?is correctly part of result.IfA∈?is added to result there must be some FD?→?such thatA∈?and?is already a subset of result.(Otherwise f dcountwould be nonzero and the if condition would be false.)A full proofcan be given by induction on the depth of recursion for an executionof addin,but such a proof can be expected only from students witha good mathematical background.If A∈?+,then A is eventually added to result.We prove this byinduction on the length of the proof of?→A using Armstrong’saxioms.First observe that if procedure addin is called with someargument?,all the attributes in?will be added to result.Also if aparticular FD’sfdcount becomes0,all the attributes in its tail willde?nitely be added to result.The base case of the proof,A∈??A∈?+,is obviously true because the?rst call to addinhas the argument?.The inductive hypotheses is that if?→A canbe proved in n steps or less then A∈result.If there is a proof inn+1steps that?→A,then the last step was an application ofeither re?exivity,augmentation or transitivity on a fact?→?proved in n or fewer steps.If re?exivity or augmentation was usedin the(n+1)st step,A must have been in result by the end of the n thstep itself.Otherwise,by the inductive hypothesis??result.Therefore the dependency used in proving?→?,A∈?willhave f dcount set to0by the end of the n th step.Hence A will beadded to result.To see that this algorithm is more ef?cient than the one presented inthe chapter note that we scan each FD once in the main program.Theresulting array appears has size proportional to the size of the givenFDs.The recursive calls to addin result in processing linear in the sizeof appears.Hence the algorithm has time complexity which is linear inthe size of the given FDs.On the other hand,the algorithm given in thetext has quadratic time complexity,as it may perform the loop as manytimes as the number of FDs,in each loop scanning all of them once.8.9Given the database schema R(a,b,c),and a relation r on the schema R,write an SQL query to test whether the functional dependency b→cholds on relation r.Also write an SQL assertion that enforces the func-tional dependency.Assume that no null values are present.(Althoughpart of the SQL standard,such assertions are not supported by anydatabase implementation currently.)Answer:Exercises13a.The query is given below.Its result is non-empty if and only ifb→c does not hold on r.select bfrom rgroup by bhaving count(distinct c)>1b.create assertion b to c check(not exists(select bfrom rgroup by bhaving count(distinct c)>1))8.10Our discussion of lossless-join decomposition implicitly assumed thatattributes on the left-hand side of a functional dependency cannot take on null values.What could go wrong on decomposition,if this property is violated?Answer:The natural join operator is de?ned in terms of the cartesian product and the selection operator.The selection operator,gives unknown for any query on a null value.Thus,the natural join excludes all tuples with null values on the common attributes from the?nal result.Thus, the decomposition would be lossy(in a manner different from the usual case of lossy decomposition),if null values occur in the left-hand side of the functional dependency used to decompose the relation.(Null values in attributes that occur only in the right-hand side of the functionaldependency do not cause any problems.)8.11In the BCNF decomposition algorithm,suppose you use a functional de-pendency?→?to decompose a relation schema r(?,?,?)into r1(?,?) and r2(?,?).a.What primary and foreign-key constraint do you expect to holdon the decomposed relations?b.Give an example of an inconsistency that can arise due to anerroneous update,if the foreign-key constraint were not enforcedon the decomposed relations above.c.When a relation is decomposed into3NF using the algorithm inSection8.5.2,what primary and foreign key dependencies wouldyou expect will hold on the decomposed schema?14Chapter8Relational Database DesignAnswer:a.?should be a primary key for r1,and?should be the foreign keyfrom r2,referencing r1.b.If the foreign key constraint is not enforced,then a deletion of atuple from r1would not have a corresponding deletion from thereferencing tuples in r2.Instead of deleting a tuple from r,thiswould amount to simply setting the value of?to null in sometuples.c.For every schema r i(??)added to the schema because of a rule→?,?should be made the primary key.Also,a candidate key?for the original relation is located in some newly created relationr k,and is a primary key for that relation.Foreign key constraints are created as follows:for each relationr i created above,if the primary key attributes of r i also occur inany other relation r j,then a foreign key constraint is created fromthose attributes in r j,referencing(the primary key of)r i.8.12Let R1,R2,...,R n be a decomposition of schema U.Let u(U)be a rela-(u).Show thattion,and let r i=RIu?r11r21···1r nAnswer:Consider some tuple t in u.(u)implies that t[R i]∈r i,1≤i≤n.Thus,Note that r i=Rit[R1]1t[R2]1...1t[R n]∈r11r21...1r nBy the de?nition of natural join,t[R1]1t[R2]1...1t[R n]=(??(t[R1]×t[R2]×...×t[R n])) where the condition?is satis?ed if values of attributes with the samename in a tuple are equal and where?=U.The cartesian productof single tuples generates one tuple.The selection process is satis?edbecause all attributes with the same name must have the same valuesince they are projections from the same tuple.Finally,the projectionclause removes duplicate attribute names.By the de?nition of decomposition,U=R1∪R2∪...∪R n,which meansthat all attributes of t are in t[R1]1t[R2]1...1t[R n].That is,t is equalto the result of this join.Since t is any arbitrary tuple in u,u?r11r21...1r n8.13Show that the decomposition in Practice Exercise8.1is not a dependency-preserving decomposition.Answer:The dependency B→D is not preserved.F1,the restrictionof F to(A,B,C)is A→ABC,A→AB,A→AC,A→BC,Exercises15 A→B,A→C,A→A,B→B,C→C,AB→AC,AB→ABC,AB→BC,AB→AB,AB→A,AB→B,AB→C,AC(same as AB),BC(same as AB),ABC(same as AB).F2,the restriction of F to (C,D,E)is A→ADE,A→AD,A→AE,A→DE,A→A, A→D,A→E,D→D,E(same as A),AD,AE,DE,ADE(same as A).(F1∪F2)+is easily seen not to contain B→D since the only FD in F1∪F2with B as the left side is B→B,a trivial FD.We shall see in Practice Exercise8.15that B→D is indeed in F+.Thus B→D is not preserved.Note that C D→ABC DE is also not preserved.A simpler argument is as follows:F1contains no dependencies with D on the right side of the arrow.F2contains no dependencies withB on the left side of the arrow.Therefore for B→D to be preserved there mustbe an FD B→?in F+1and?→D in F+2(so B→D would followby transitivity).Since the intersection of the two schemes is A,?= A.Observe that B→A is not in F+1since B+=B D.8.14Show that it is possible to ensure that a dependency-preserving decom-position into3NF is a lossless-join decomposition by guaranteeing that at least one schema contains a candidate key for the schema being decom-posed.(Hint:Show that the join of all the projections onto the schemas of the decomposition cannot have more tuples than the original relation.)Answer:Let F be a set of functional dependencies that hold on a schema R.Let?={R1,R2,...,R n}be a dependency-preserving3NF decompo-sition of R.Let X be a candidate key for R.Consider a legal instance r of R.Let j=X(r)1R1(r)1R2(r) (1)R n(r).We want to prove that r=j.We claim that if t1and t2are two tuples in j such that t1[X]=t2[X],then t1=t2.To prove this claim,we use the following inductive argument–Let F′=F1∪F2∪...∪F n,where each F i is the restriction of F to the schema R i in?.Consider the use of the algorithm given in Figure8.8to compute the closure of X under F′.We use induction on the number of times that the f or loop in this algorithm is executed.Basis:In the?rst step of the algorithm,result is assigned to X,and hence given that t1[X]=t2[X],we know that t1[result]=t2[result] is true.Induction Step:Let t1[result]=t2[result]be true at the end of thek th execution of the f or loop.Suppose the functional dependency considered in the k+1thexecution of the f or loop is?→?,and that??result.??result implies that t1[?]=t2[?]is true.The facts that?→?holds forsome attribute set R i in?,and that t1[R i]and t2[R i]are in R i(r)imply that t1[?]=t2[?]is also true.Since?is now added to result by the algorithm,we know that t1[result]=t2[result]is true at the end of the k+1th execution of the f or loop.16Chapter8Relational Database DesignSince?is dependency-preserving and X is a key for R,all attributes in Rare in result when the algorithm terminates.Thus,t1[R]=t2[R]is true,that is,t1=t2–as claimed earlier.Our claim implies that the size of X(j)is equal to the size of j.Notealso that X(j)=X(r)=r(since X is a key for R).Thus we haveproved that the size of j equals that of ing the result of PracticeExercise8.12,we know that r?j.Hence we conclude that r=j.Note that since X is trivially in3NF,?∪{X}is a dependency-preservinglossless-join decomposition into3NF.8.15Give an example of a relation schema R′and set F′of functional depen-dencies such that there are at least three distinct lossless-join decompo-sitions of R′into BCNF.Answer:Given the relation R′=(A,B,C,D)the set of functionaldependencies F′=A→B,C→D,B→C allows three distinctBCNF decompositions.R1={(A,B),(C,D),(B,C)}is in BCNF as isR2={(A,B),(C,D),(A,C)}R2={(A,B),(C,D),(A,C)}R3={(B,C),(A,D),(A,B)}8.16Let a prime attribute be one that appears in at least one candidate key.Let?and?be sets of attributes such that?→?holds,but?→?does not hold.Let A be an attribute that is not in?,is not in?,and forwhich?→A holds.We say that A is transitively dependent on?.Wecan restate our de?nition of3NF as follows:A relation schema R is in3NF with respect to a set F of functional dependencies if there are nononprime attributes A in R for which A is transitively dependent on akey for R.Show that this new de?nition is equivalent to the original one.Answer:Suppose R is in3NF according to the textbook de?nition.Weshow that it is in3NF according to the de?nition in the exercise.Let A bea nonprime attribute in R that is transitively dependent on a key?forR.Then there exists??R such that?→A,?→?,A∈?,A∈,and?→?does not hold.But then?→A violates the textbookde?nition of3NF sinceA∈?implies?→A is nontrivialSince?→?does not hold,?is not a superkeyA is not any candidate key,since A is nonprimeExercises17Now we show that if R is in3NF according to the exercise de?nition,it is in3NF according to the textbook de?nition.Suppose R is not in3NF according the the textbook de?nition.Then there is an FD?→?that fails all three conditions.Thus?→?is nontrivial.?is not a superkey for R.Some A in?-?is not in any candidate key.This implies that A is nonprime and?→ A.Let?be a candidate key for R.Then?→?,?→?does not hold(since?is not a superkey), A∈?,and A∈?(since A is nonprime).Thus A is transitivelydependent on?,violating the exercise de?nition.8.17A functional dependency?→?is called a partial dependency if thereis a proper subset?of?such that?→?.We say that?is partially dependent on?.A relation schema R is in second normal form(2NF)if each attribute A in R meets one of the following criteria:It appears in a candidate key.It is not partially dependent on a candidate key.Show that every3NF schema is in2NF.(Hint:Show that every partial dependency is a transitive dependency.)Answer:Referring to the de?nitions in Practice Exercise8.16,a relation schema R is said to be in3NF if there is no non-prime attribute A in R for which A is transitively dependent on a key for R.We can also rewrite the de?nition of2NF given here as:“Arelation schema R is in2NF if no non-prime attribute A is partiallydependent on any candidate key for R.”To prove that every3NF schema is in2NF,it suf?ces to show that if a non-prime attribute A is partially dependent on a candidate key?,thenA is also transitively dependent on the key?.Let A be a non-prime attribute in R.Let?be a candidate key for R.Suppose A is partially dependent on?.From the de?nition of a partial dependency,we know that for someproper subset?of?,?→ A.Since???,?→?.Also,?→?does not hold,since?is acandidate key.Finally,since A is non-prime,it cannot be in either?or?.Thus we conclude that?→A is a transitive dependency.Hence we have proved that every3NF schema is also in2NF.8.18Give an example of a relation schema R and a set of dependencies suchthat R is in BCNF but is not in4NF.18Chapter8Relational Database DesignAnswer:R(A,B,C)A→→BExercises19result:=?;/*fdcount is an array whose ith element contains the number of attributes on the left side of the ith FD that arenot yet known to be in?+*/for i:=1to|F|dobeginlet?→?denote the ith FD;fdcount[i]:=|?|;end/*appears is an array with one entry for each attribute.The entry for attribute A is a list of integers.Each integeri on the list indicates that A appears on the left sideof the i th FD*/for each attribute A dobeginappears[A]:=NI L;for i:=1to|F|dobeginlet?→?denote the ith FD;if A∈?then add i to appears[A];endendaddin(?);return(result);procedure addin(?);for each attribute A in?dobeginif A∈result thenbeginresult:=result∪{A};for each element i of appears[A]dobeginfdcount[i]:=fdcount[i]-1;if fdcount[i]:=0thenbeginlet?→?denote the ith FD;addin(?);endendendendFigure8.18.An algorithm to compute?+.。
数据库系统概念第六版课后习题部分答案2sC H A P T E R2Introduction to the Relational ModelPractice Exercises2.1Consider the relational database of Figure??.What are the appropriateprimary keys?Answer:The answer is shown in Figure2.1,with primary keys under-lined.2.2Consider the foreign key constraint from the dept name attribute of in-structor to the department relation.Give examples of inserts and deletes tothese relations,which can cause a violation of the foreign key constraint.Answer:Inserting a tuple:(10111,Ostrom,Economics,110,000)into the instructor table,where the department table does not have thedepartment Economics,would violate the foreign key constraint.Deleting the tuple:(Biology,Watson,90000)from the department table,where at least one student or instructortuple has dept name as Biology,would violate the foreign key con-straint.employee(person name,street,city)works(person name company name,salary)company(company name,city)Figure2.1Relational database for Practice Exercise2.1.12Chapter2Introduction to the Relational Model2.3Consider the time slot relation.Given that a particular time slot can meetmore than once in a week,explain why day and start time are part of theprimary key of this relation,while end time is not.Answer:The attributes day and start time are part of the primary keysince a particular class will most likely meet on several different days,and may even meet more than once in a day.However,end time is notpart of the primary key since a particular class that starts at a particulartime on a particular day cannot end at more than one time.2.4In the instance of instructor shown in Figure??,no two instructors havethe same name.From this,can we conclude that name can be used as asuperkey(or primary key)of instructor?Answer:No.For this possible instance of the instructor table the namesare unique,but in general this may not be always the case(unless theuniversity has a rule that two instructors cannot have the same name,which is a rather unlikey scenario).2.5What is the result of?rst performing the cross product of student andadvisor,and then performing a selection operation on the result with thepredicate s id=ID?(Using the symbolic notation of relational algebra,this query can be written as?s id=I D(student×advisor).)Answer:The result attributes include all attribute values of studentfollowed by all attributes of advisor.The tuples in the result are asfollows.For each student who has an advisor,the result has a rowcontaining that students attributes,followed by an s id attribute identicalto the students ID attribute,followed by the i id attribute containing theID of the students advisor.Students who do not have an advisor will not appear in the result.Astudent who has more than one advisor will appear a correspondingnumber of times in the result.2.6Consider the following expressions,which use the result ofa relationalalgebra operation as the input to another operation.For each expression,explain in words what the expression does.a.?year≥2009(takes)1studentb.?year≥2009(takes1student)c. ID,name,course id(student1takes)Answer:a.For each student who takes at least one course in2009,displaythe students information along with the information about whatcourses the student took.The attributes in the result are:ID,name,dept name,tot cred,course id,section id,semester,year,gradeb.Same as(a);selection can be done before the join operation.c.Provide a list of consisting ofExercises3ID,name,course idof all students who took any course in the university.2.7Consider the relational database of Figure??.Give an expression in therelational algebra to express each of the following queries:a.Find the names of all employees who live in city“Miami”.b.Find the names of all employees whose salary is greater than$100,000.c.Find the names of all employees who live in“Miami”and whosesalary is greater than$100,000.Answer:a. name(?city=“Miami”(employee))b. name(?salary>100000(employee))c. name(?city=“Miami”∧salary>100000(employee))2.8Consider the bank database of Figure??.Give an expression in therelational algebra for each of the following queries.a.Find the names of all branches located in“Chicago”.b.Find the names of all borrowers who have a loan in branch“Down-town”.Answer:a. branch name(?branch city=“Chicago”(branch))b. customer name(?branch name=“Downtown”(borro w er1loan))。