ER model
a collection of entities and relationships
Entity
- an object in the real world that is distinguishable from other objects. ( can be same category)
- described using a set of attributes.
Entity set is a collection of entities of the same type. All entities in a given entity set have the same attributes ( the values may be different).
Domain: domain of possible values of each attribute in an entity set.
Key
superkey: any set of attributes which can uniquely identify an entity
key (candidate key): a minimal set of attributes whose values can uniquely identify an entity in the set. (should depend on the real life possibility 即使还有entity 加入/删除, 这个key永远identify an entity)
primary key: a candidate key chosen to serve as the key for the entity set
Relationships
an association among two or more entities.
eg. Teach: (John,CENG4567) ,(David, CSCI1234)
relationship set: a set of similar relationships: {(John,CENG4567) ,(David, CSCI1234)}
relationships can also have descriptive attributes
Recursive Relationship: relationship两边entity其实一样
One to many
Tenary:aggregation把3个变成2个
One to One
similar with one to many, but have two arrows
Key for relationship set
- Many-many union of the primary keys for E1,…Ek
- One-many 找many的那个primary key就行。(知道一个孩子的ID就能确定是哪个‘mother-of’)
- One-one 随便找个primary key就行。
Participation constraint
- total –each entity in the entity set must participate in at least one relationship. (对应的entity——relation的线加粗)
- partial- 不需要
Weak entities
strong entity , an entity which has a super key
weak entity 相反
若需要identify一个weak entity: –by considering some of its attributes in conjunction with the primary key of employee (identifying owner).
partial key 例如:教授家属需要identify就需要(教授id,家属名字)
Class hierarchies
Overlap constraint 两个subclass能否包含相同entity (default:no)
Cover constraint 是否subclass包含了superclass里所有entity ( default:no)
Aggregation
A relationship between a collection of entities and relationships
Relational Model
main construct: a set of relations
relation consists of relation schema and relation instance
Relation instance
Simply a table with rows and columns
- row = tuple/record, no two rows are identical
- column = field/attribute
Relation schema
specifies relation’s name, name of each field(primary key should be underlined), domain of each field
Degree/arity of a relation: number of fields
Cardinality of a relation instance: number of tuples
Relational database : a collection of relations
Instance: a collection of relation instances (one per relation schema).
1 | CREATE TABLE Students |
Integrity constraints
restricts the data that can be stored in an instance of the database on a database schema
- Specified when the schema is defined.
- **Checked **when relations are modified.
Key constraints
Certain minimal subset of the fields (candidate key) of a relation is a unique identifier for a tuple (instance).
设计者可以自己identify a primary key(比如index)
1 | CREATE TABLE Students |
Foreign key constraints
Sometimes, the information stored in a relation is linked to the information stored in another relation.
Enrolled(sid: string, cid: string, grade: string)
sid就是Enrolled这个relation的 foreign key and refer to Students
Formally, a foreign key is a set of fields (the primary key of s) in one relation r that is used to “refer” to a tuple in another relation s.
1 | CREATE TABLE Enrolled ( |
four options on DELETE and UPDATE.
- NO ACTION (DEFAULT) rejected
- CASCADE(级联) 如果被删除,那么相关联的所有row也全被删除
- SET DEFAULT 如果被删除,那么相关联的所有row设置为DEFAULT
- SET NULL
ER to Relational
entity sets -> tables
relationship sets(without constraints) -> tables,attributes包括
- 每个entity的primary key
- relationship set的descriptive attributes
with constraints
- arrow: arrow对应的m个entity,因此又m个candidate key,随便选一个作为primary key
weak entity
将weak entity及对应relation 统一为一个table 并且(ON DELETE CASCADE)
Class hierarchies
different relations (父母孩子都有)
more general。
(仅有孩子)
not always possible。can be accessible easily
Aggregation
Relational Algebra
Baisc Operators
selection $\sigma$
a boolean combination (an expression using logical connectives $\wedge \vee$)
Projection $\Pi(fileds)$
extract columns from a relation
duplicated row will be eliminated
Union $U$
must be union-compatible:
- same number of fields
- corresponding fields have the same domains
Set Difference $-$
also union-compatible
Intersection $\cap$
also union-compatible
$R\cap S = R-(R-S)$
Rename $\rho$
$\rho (R(F),E) \ or\ \rho(R,E)$
E: arbitrary relation algebra ,F:renaming list, R= E 除了F rename的东西
Cartesian product (cross product) $\times$
Join $\Join$
Join can be defined as a cross-product followed by selections and sometimes with projections.
- Join condition $\Join_c =\sigma_c(R\times S)$
- Equi join c is R.name1 = S.name2, S.name2 will be droped in the final resulting relation
- Natural join: C 是R S所有fields相同的
Example
Division /
Example
SQL
- 先计算 relation list的cross product
- 考虑是否满足 qualifications
- 选取target list里面的attributes
- 如果有 DISTINCT, 把duplicated rows删掉(默认不删)
Natural Join
1 | SELECT S.sname |
Expressions and string
LIKE for string matching. _
= any one character, %
= 0 or more arbitrary characters
Aggregate Operators
1 | COUNT([DISTINCT] A) |
GROUP BY and Having
1 | eg Find the age of the youngest sailor for each rating level. |
1 | SELECT R.day |
Division
CREATE VIEW
Delete view:DROP VIEW Temp
Outer Join
S LEFT OUTER JOIN R: S 中不对应 R的也会加进去。 只是R的field就是 NULL
FULL OUTER JOIN: both