0%

CSCI3170 Midterm Review

ER model

a collection of entities and relationships

Entity

  • an object in the real world that is distinguishable from other objects. ( can be same category)
  • described using a set of attributes.

Entity set is a collection of entities of the same type. All entities in a given entity set have the same attributes ( the values may be different).

Domain: domain of possible values of each attribute in an entity set.

Key

superkey: any set of attributes which can uniquely identify an entity

key (candidate key): a minimal set of attributes whose values can uniquely identify an entity in the set. (should depend on the real life possibility 即使还有entity 加入/删除, 这个key永远identify an entity)

primary key: a candidate key chosen to serve as the key for the entity set

Relationships

an association among two or more entities.

eg. Teach: (John,CENG4567) ,(David, CSCI1234)

relationship set: a set of similar relationships: {(John,CENG4567) ,(David, CSCI1234)}

relationships can also have descriptive attributes

Recursive Relationship: relationship两边entity其实一样

One to many

图片2

Tenary:aggregation把3个变成2个

One to One

similar with one to many, but have two arrows

Key for relationship set
  • Many-many union of the primary keys for E1,…Ek
  • One-many 找many的那个primary key就行。(知道一个孩子的ID就能确定是哪个‘mother-of’)
  • One-one 随便找个primary key就行。
Participation constraint
  • total –each entity in the entity set must participate in at least one relationship. (对应的entity——relation的线加粗)
  • partial- 不需要

Weak entities

strong entity , an entity which has a super key

weak entity 相反

若需要identify一个weak entity: –by considering some of its attributes in conjunction with the primary key of employee (identifying owner).

partial key 例如:教授家属需要identify就需要(教授id,家属名字)

图片3

Class hierarchies

Overlap constraint 两个subclass能否包含相同entity (default:no)

Cover constraint 是否subclass包含了superclass里所有entity ( default:no)

Aggregation

A relationship between a collection of entities and relationships

图片3

Relational Model

main construct: a set of relations

relation consists of relation schema and relation instance

Relation instance

Simply a table with rows and columns

  • row = tuple/record, no two rows are identical
  • column = field/attribute

Relation schema

specifies relation’s name, name of each field(primary key should be underlined), domain of each field

Degree/arity of a relation: number of fields

Cardinality of a relation instance: number of tuples

Relational database : a collection of relations

Instance: a collection of relation instances (one per relation schema).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
CREATE TABLE Students
(sid CHAR(20),
name CHAR(30),
login CHAR(20),
age INTEGER,
gpa REAL)

INSERT
INTO Students (sid, name, login, age, gpa)
VALUES (‘53688’, ‘Smith’, ‘smith@ee’, 18, 3.2)

DELETE
FROM Students S
WHERE S.name = ‘Smith’

UPDATE Students S
SET S.age = S.age + 1, S.gpa = S.gpa – 1
WHERE S.sid =53688

Integrity constraints

restricts the data that can be stored in an instance of the database on a database schema

  • Specified when the schema is defined.
  • **Checked **when relations are modified.

Key constraints

Certain minimal subset of the fields (candidate key) of a relation is a unique identifier for a tuple (instance).

设计者可以自己identify a primary key(比如index)

1
2
3
4
5
6
7
8
CREATE TABLE Students
(sid CHAR(20),
name CHAR(30),
login CHAR(20),
age INTEGER,
gpa REAL,
UNIQUE (name, age),//declare a key
CONSTRAINT StudentsKey PRIMARY KEY ( sid)) //StudentKey: Constraint name(opt)

Foreign key constraints

Sometimes, the information stored in a relation is linked to the information stored in another relation.

Enrolled(sid: string, cid: string, grade: string) sid就是Enrolled这个relation的 foreign key and refer to Students

Formally, a foreign key is a set of fields (the primary key of s) in one relation r that is used to “refer” to a tuple in another relation s.

1
2
3
4
5
6
CREATE TABLE Enrolled (
sid CHAR(20),
cid CHAR(20),
grade CHAR(10),
PRIMARY KEY (sid, cid),
FOREIGN KEY (sid) REFERENCES Students)

four options on DELETE and UPDATE.

  • NO ACTION (DEFAULT) rejected
  • CASCADE(级联) 如果被删除,那么相关联的所有row也全被删除
  • SET DEFAULT 如果被删除,那么相关联的所有row设置为DEFAULT
  • SET NULL

ER to Relational

  • entity sets -> tables

  • relationship sets(without constraints) -> tables,attributes包括

    • 每个entity的primary key
    • relationship set的descriptive attributes图片1
  • with constraints

    • arrow: arrow对应的m个entity,因此又m个candidate key,随便选一个作为primary key
  • weak entity

    将weak entity及对应relation 统一为一个table 并且(ON DELETE CASCADE)

  • Class hierarchies

    • different relations (父母孩子都有)

      more general。

    • (仅有孩子)

      not always possible。can be accessible easily

  • Aggregation

Relational Algebra

Baisc Operators

selection $\sigma$

a boolean combination (an expression using logical connectives $\wedge \vee$)

图片1

Projection $\Pi(fileds)$

extract columns from a relation

duplicated row will be eliminated

图片2

Union $U$

must be union-compatible:

  • same number of fields
  • corresponding fields have the same domains
Set Difference $-$

also union-compatible

Intersection $\cap$

also union-compatible

$R\cap S = R-(R-S)$

Rename $\rho$

$\rho (R(F),E) \ or\ \rho(R,E)$

E: arbitrary relation algebra ,F:renaming list, R= E 除了F rename的东西

Cartesian product (cross product) $\times$

图片3

Join $\Join$

Join can be defined as a cross-product followed by selections and sometimes with projections.

  • Join condition $\Join_c =\sigma_c(R\times S)$
  • Equi join c is R.name1 = S.name2, S.name2 will be droped in the final resulting relation
  • Natural join: C 是R S所有fields相同的

图片4

Example

图片5

Division /

图片6

Example

图片7

SQL

图片1

  1. 先计算 relation list的cross product
  2. 考虑是否满足 qualifications
  3. 选取target list里面的attributes
  4. 如果有 DISTINCT, 把duplicated rows删掉(默认不删)

Natural Join

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
SELECT  S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
=
SELECT S.sname
FROM Sailors S NATURAL JOIN Reserves R
WHERE R.bid=103

#IN operator
SELECT B.bname
FROM Boats B
WHERE B.color IN (‘red’, ‘blue’, ’green’)

#Nested Queries
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
FROM Reserves R
WHERE R.bid=103) #inner function can use the relation in outer function
> ANY : 比某一些大就行

Expressions and string

LIKE for string matching. _ = any one character, % = 0 or more arbitrary characters

Aggregate Operators

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
COUNT([DISTINCT] A)
SUM ( [DISTINCT] A)
AVG ([DISTINCT] A)
MAX (A)
MIN (A) A: any column of a relation
# cannot be netsted! MIN(AVG(A)) is not allowed!
eg.Find the name and age of the oldest sailor
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age =
(SELECT MAX (S2.age)
FROM Sailors S2)
eg.Find the names of sailors who are older than the oldest sailor with a rating of 10.
SELECT S.name
FROM Sailors S
WHERE S.age > ( SELECT MAX (S2.age)
FROM Sailors S2
WHERE S2.rating = 10)

GROUP BY and Having

1
2
3
4
5
6
7
8
9
10
eg Find the age of the youngest sailor for each rating level.
SELECT S.rating, MIN (S.age)
FROM Sailors S
GROUP BY S.rating

SELECT [DISTINCT] target-list STEP 2 //target list = attribute names + terms with ahhregate operations, The attribute names must be a subset of grouping-list.
FROM relation-list STEP 1
WHERE qualification STEP 2
GROUP BY grouping-list STEP3
HAVING group-qualification STEP 4

图片2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SELECT  R.day
FROM Reserves R
GROUP BY R.day
HAVING COUNT(DISTINCT R.sid) = 1

Example: Find the average age of sailors for each rating level that has at least two sailors.
SELECT S.rating, AVG(S.age) AS avgage
FROM Sailor S
GROUP BY S.rating
HAVING COUNT (*) > 1

Example:Find those ratings for which the average age is the minimum over all ratings
SELECT Temp.rating, Temp.avgage
FROM (SELECT S.rating, AVG (S.age) AS avgage
FROM Sailors S
GROUP BY S.rating) AS Temp
WHERE Temp.avgage = (SELECT MIN (Temp.avgage)
FROM Temp)

Division

图片3

CREATE VIEW

图片4

Delete view:DROP VIEW Temp

Outer Join

S LEFT OUTER JOIN R: S 中不对应 R的也会加进去。 只是R的field就是 NULL

FULL OUTER JOIN: both