0% found this document useful (0 votes)
3 views44 pages

Dbms Notes

A database is a structured collection of inter-related data managed by a Database Management System (DBMS), which facilitates efficient data operations such as retrieval, insertion, and deletion. DBMS provides various functionalities including data definition, updating, retrieval, and user administration, while also ensuring data security and consistency. The document also discusses the differences between DBMS and traditional file systems, various data models, and the structure of DBMS, highlighting the importance of data abstraction and independence.

Uploaded by

gunigantibhanu
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views44 pages

Dbms Notes

A database is a structured collection of inter-related data managed by a Database Management System (DBMS), which facilitates efficient data operations such as retrieval, insertion, and deletion. DBMS provides various functionalities including data definition, updating, retrieval, and user administration, while also ensuring data security and consistency. The document also discusses the differences between DBMS and traditional file systems, various data models, and the structure of DBMS, highlighting the importance of data abstraction and independence.

Uploaded by

gunigantibhanu
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 44

What is Database

The database is a collection of inter-related data which is used to retrieve, insert and delete the
data efficiently. It is also used to organize the data in the form of a table, schema, views, and
reports, etc.

For example: The college Database organizes the data about the admin, staff, students and
faculty etc.

Using the database, you can easily retrieve, insert, and delete the information.

Database Management System


o Database management system is a software which is used to manage the database. For
example: MySQL, Oracle, etc are a very popular commercial database which is used in different
applications.
o DBMS provides an interface to perform various operations like database creation, storing data in
it, updating data, creating a table in the database and a lot more.
o It provides protection and security to the database. In the case of multiple users, it also maintains
data consistency.

DBMS allows users the following tasks:

o Data Definition: It is used for creation, modification, and removal of definition that defines the
organization of data in the database.
o Data Updation: It is used for the insertion, modification, and deletion of the actual data in the
database.
o Data Retrieval: It is used to retrieve the data from the database which can be used by
applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintain data integrity,
enforcing data security, dealing with concurrency control, monitoring performance and
recovering information corrupted by unexpected failure.

Characteristics of DBMS
o It uses a digital repository established on a server to store and manage the information.
o It can provide a clear and logical view of the process that manipulates data.
o DBMS contains automatic backup and recovery procedures.
o It contains ACID properties which maintain data in a healthy state in case of failure.
o It can reduce the complex relationship between data.
o It is used to support manipulation and processing of data.
o It is used to provide security of data.
o It can view the database from different viewpoints according to the requirements of the user.

Advantages of DBMS
o Controls database redundancy: It can control data redundancy because it stores all the data in
one single database file and that recorded data is placed in the database.
o Data sharing: In DBMS, the authorized users of an organization can share the data among
multiple users.
o Easily Maintenance: It can be easily maintainable due to the centralized nature of the database
system.
o Reduce time: It reduces development time and maintenance need.
o Backup: It provides backup and recovery subsystems which create automatic backup of data
from hardware and software failures and restores the data if required.
o multiple user interface: It provides different types of user interfaces like graphical user
interfaces, application program interfaces

Disadvantages of DBMS
o Cost of Hardware and Software: It requires a high speed of data processor and large memory size
to run DBMS software.
o Size: It occupies a large space of disks and large memory to run them efficiently.
o Complexity: Database system creates additional complexity and requirements.

Higher impact of failure: Failure is highly impacted the database because in most
of the organization, all the data stored in a single database and if the database is
damaged due to electric failure or database corruption then the data may be lost
forever. Difference between File System and DBMS

File System Approach


File based systems were an early attempt to computerize the manual system. It is
also called a traditional based approach in which a decentralized approach was
taken where each department stored and controlled its own data with the help of a
data processing specialist. The main role of a data processing specialist was to
create the necessary computer file structures, and also manage the data within
structures and design some application programs that create reports based on file
data.

In the above figure:

Consider an example of a student's file system. The student file will contain information
regarding the student (i.e. roll no, student name, course etc.). Similarly, we have a subject file
that contains information about the subject and the result file which contains the information
regarding the result.

Some fields are duplicated in more than one file, which leads to data redundancy. So to
overcome this problem, we need to create a centralized system, i.e. DBMS approach.

DBMS:
A database approach is a well-organized collection of data that are related in a meaningful way
which can be accessed by different users but stored only once in a system. The various
operations performed by the DBMS system are: Insertion, deletion, selection, sorting etc.
In the above figure,

In the above figure, duplication of data is reduced due to centralization of data.

There are the following differences between DBMS and File systems:

Basis DBMS Approach File System Approach

Meaning DBMS is a collection of data. In DBMS, The file system is a collection of data. In th
the user is not required to write the system, the user has to write the procedure
procedures. for managing the database.

Sharing of data Due to the centralized approach, data Data is distributed in many files, and it ma
sharing is easy. be of different formats, so it isn't easy t
share data.

Data Abstraction DBMS gives an abstract view of data that The file system provides the detail of the dat
hides the details. representation and storage of data.

Security and DBMS provides a good protection It isn't easy to protect a file under the fil
Protection mechanism. system.

Recovery DBMS provides a crash recovery The file system doesn't have a cras
Mechanism mechanism, i.e., DBMS protects the user mechanism, i.e., if the system crashes whil
from system failure. entering some data, then the content of th
file will be lost.

Manipulation DBMS contains a wide variety of The file system can't efficiently store an
Techniques sophisticated techniques to store and retrieve the data.
retrieve the data.

Concurrency DBMS takes care of Concurrent access of In the File system, concurrent access ha
Problems data using some form of locking. many problems like redirecting the file whil
deleting some information or updating som
information.

Where to use Database approach used in large systems File system approach used in large system
which interrelate many files. which interrelate many files.

Cost The database system is expensive to design. The file system approach is cheaper t
design.

Data Redundancy Due to the centralization of the database, In this, the files and application programs ar
and Inconsistency the problems of data redundancy and created by different programmers so tha
inconsistency are controlled. there exists a lot of duplication of data whic
may lead to inconsistency.

Structure The database structure is complex to The file system approach has a simpl
design. structure.

Data In this system, Data Independence exists, In the File system approach, there exists n
Independence and it can be of two types. Data Independence.
o Logical Data Independence
o Physical Data Independence

Integrity Integrity Constraints are easy to apply. Integrity Constraints are difficult t
Constraints implement in file system.

Data Models In the database approach, 3 types of data In the file system approach, there is n
models exist: concept of data models exists.
o Hierarchal data models
o Network data models
o Relational data models

Flexibility Changes are often a necessity to the content The flexibility of the system is less a
of the data stored in any system, and these compared to the DBMS approach.
changes are more easily with a database
approach.

Examples Oracle, SQL Server, Sybase etc. Cobol, C++ etc.

Data Models
Data Model is the modeling of the data description, data semantics, and consistency constraints
of the data. It provides the conceptual tools for describing the design of a database at each level
of data abstraction. Therefore, there are following four data models used for understanding the
structure of the database:

Relational Model
The most popular data model in DBMS is the Relational Model. It is more scientific a model
than others. This model is based on first-order predicate logic and defines a table as an n-ary
relation.

The main highlights of this model are −

 Data is stored in tables called relations.


 Relations can be normalized.
 In normalized relations, values saved are atomic values.
 Each row in a relation contains a unique value.
 Each column in a relation contains values from a same domain.
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. While formulating real-world scenario into the database model, the ER Model
creates entity set, relationship set, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
 Entities and their attributes.
 Relationships among entities.
These concepts are explained below.

 Entity − An entity in an ER Model is a real-world entity having properties


called attributes. Every attribute is defined by its set of values called domain. For
example, in a school database, a student is considered as an entity. Student has various
attributes like name, age, class, etc.
 Relationship − The logical association among entities is called relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities define the
number of association between two entities.
Mapping cardinalities −
o one to one
o one to many
o many to one
o many to many

hierarchical model,

In a hierarchical model, data are viewed as a collection of tables, or we can say


segments that form a hierarchical relation. In this, the data is organized into a tree-like
structure where each record consists of one parent record and many children. Even if
the segments are connected as a chain-like structure by logical associations, then the
instant structure can be a fan structure with multiple branches. We call the illogical
associations as directional associations.

2. Network Model

The Network Model was formalized by the Database Task group in the 1960s. This
model is the generalization of the hierarchical model. This model can consist of
multiple parent segments and these segments are grouped as levels but there exists a
logical association between the segments belonging to any level. Mostly, there exists a
many-to-many logical association between any of the two segments.

3. Object-Oriented Data Model

Object oriented data model is based upon real world situations. These situations are represented
as objects, with different attributes. All these object have multiple relationships between them.
Elements of Object oriented data model
Objects
The real world entities and situations are represented as objects in the Object oriented database
model.
Attributes and Method
Every object has certain characteristics. These are represented using Attributes. The behaviour
of the objects is represented using Methods.
Class
Similar attributes and methods are grouped together using a class. An object can be called as an
instance of the class.
Inheritance
A new class can be derived from the original class. The derived class contains attributes and
methods of the original class as well as its own.

Example
An Example of the Object Oriented data model is −

Shape, Circle, Rectangle and Triangle are all objects in this model.
Circle has the attributes Center and Radius.
Rectangle has the attributes Length and Breath
Triangle has the attributes Base and Height.
The objects Circle, Rectangle and Triangle inherit from the object Shape.

Levels of abstraction
Data Abstraction is a process of hiding unwanted or irrelevant details from the end user. It
provides a different view and helps in achieving data independence which is used to enhance the
security of data.
The database systems consist of complicated data structures and relations. For users to access
the data easily, these complications are kept hidden, and only the relevant part of the database is
made accessible to the users through data abstraction.

Levels of abstraction for DBMS


Database systems include complex data-structures. In terms of retrieval of data, reduce
complexity in terms of usability of users and in order to make the system efficient, developers
use levels of abstraction that hide irrelevant details from the users. Levels of abstraction simplify
database design.
Mainly there are three levels of abstraction for DBMS, which are as follows −

 Physical or Internal Level


 Logical or Conceptual Level
 View or External Level
These levels are shown in the diagram below –
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers characteristic of being able to modify the schema at one level of the
database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers characteristic of being able to change the conceptual schema
without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we do any changes in the conceptual view of the data, then the user view of the data would not
be affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the internal schema without
having to change the conceptual schema.
o If we do any changes in the storage size of the database system server, then the Conceptual
structure of the database will not be affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
o Physical data independence occurs at the logical interface level.
Structure of DBMS
DBMS
DBMS means Database Management System, which is a tool or software used to create the
database or delete or manipulate the database. A software programme created to store, retrieve,
query, and manage data is known as a Database Management System (DBMS). Data can be
generated, read, updated, and destroyed by authorized entities thanks to user interfaces (UIs).

Because they give programmers, Database Managers, and end users a consolidated view of the
data, Database Management Systems are crucial because they relieve applications and end users
of the need to comprehend the physical location of the data. Application Programme Interfaces
(APIs) manage internet requests and responses for particular sorts of data.

In marketing materials, the phrase "database as a service" (DBaaS) may be used to refer to both
relational and non-relational DBMS components that are given via the internet.

Users of DBMSs include application programmers, Database Administrators (DBAs), and end
users.

Database Administrators are typically the only people who work directly with a DBMS. Today,
end users read and write to databases using front-end interfaces made by programmers, while
programmers use cloud APIs to connect with DBMSs.
hree Parts that make up the Database System are:

o Query Processor
o Storage Manager
o Disk Storage

The explanations for these are provided below:

1. Query Processor
The query processing is handled by the query processor, as the name implies. It executes the
user's query, to put it simply. In this way, the query processor aids the database system in making
data access simple and easy. The query processor's primary duty is to successfully execute the
query. The Query Processor transforms (or interprets) the user's application program-provided
requests into instructions that a computer can understand.

Components of the Query Processor


o DDL Interpreter:

Data Definition Language is what DDL stands for. As implied by the name, the DDL Interpreter
interprets DDL statements like those used in schema definitions (such as create, remove, etc.).
This interpretation yields a set of tables that include the meta-data (data of data) that is kept in
the data dictionary. Metadata may be stored in a data dictionary. In essence, it is a part of the disc
storage that will be covered in a later section of this article.

o DML Compiler:

Compiler for DML Data Manipulation Language is what DML stands for. In keeping with its
name, the DML Compiler converts DML statements like select, update, and delete into low-level
instructions or simply machine-readable object code, to enable execution. The optimization of
queries is another function of the DML compiler. Since a single question can typically be
translated into a number of evaluation plans. As a result, some optimization is needed to select
the evaluation plan with the lowest cost out of all the options. This process, known as query
optimization, is exclusively carried out by the DML compiler. Simply put, query optimization
determines the most effective technique to carry out a query.

o Embedded DML Pre-compiler:

Before the query evaluation, the embedded DML commands in the application program (such as
SELECT, FROM, etc., in SQL) must be pre-compiled into standard procedural calls (program
instructions that the host language can understand). Therefore, the DML statements which are
embedded in an application program must be converted into routine calls by the Embedded DML
Pre-compiler.

o Query Optimizer:
It starts by taking the evaluation plan for the question, runs it, and then returns the result. Simply
said, the query evaluation engine evaluates the SQL commands used to access the database's
contents before returning the result of the query. In a nutshell, it is in charge of analyzing the
queries and running the object code that the DML Compiler produces. Apache Drill, Presto, and
other Query Evaluation Engines are a few examples.

2. Storage Manager:
An application called Storage Manager acts as a conduit between the queries made and the data
kept in the database. Another name for it is Database Control System. By applying the
restrictions and running the DCL instructions, it keeps the database's consistency and integrity. It
is in charge of retrieving, storing, updating, and removing data from the database.

Components of Storage Manager


Following are the components of Storage Manager:

o Integrity Manager:

Whenever there is any change in the database, the Integrity manager will manage the integrity
constraints.

o Authorization Manager:

Authorization manager verifies the user that he is valid and authenticated for the specific query
or request.

o File Manager:

All the files and data structure of the database are managed by this component.

o Transaction Manager:

It is responsible for making the database consistent before and after the transactions. Concurrent
processes are generally controlled by this component.

o Buffer Manager:

The transfer of data between primary and main memory and managing the cache memory is done
by the buffer manager.

3. Disk Storage
A DBMS can use various kinds of Data Structures as a part of physical system implementation in
the form of disk storage.

omponents of Disk Storage


Following are the components of Disk Manager:

o Data Dictionary:

It contains the metadata (data of data), which means each object of the database has some
information about its structure. So, it creates a repository which contains the details about the
structure of the database object.

o Data Files:

o This component stores the data in the files.


o Indices: These indices are used to access and retrieve the data in a very fast and efficient way.

Database Designing
The next step involves designing the database considering the user-based requirements and
splitting them out into various models so that load or heavy dependencies on a single aspect are
not imposed. Therefore, there has been some model-centric approach and that's where logical
and physical models play a crucial role.

Physical Model - The physical model is concerned with the practices and implementations of the
logical model.

Logical Model - This stage is primarily concerned with developing a model based on the
proposed requirements. The entire model is designed on paper without any implementation or
adopting DBMS considerations.

Implementation
The last step covers the implementation methods and checking out the behavior that matches our
requirements. It is ensured with continuous integration testing of the database with different data
sets and conversion of data into machine understandable language. The manipulation of data is
primarily focused on these steps where queries are made to run and check if the application is
designed satisfactorily or not.

Data conversion and loading - This section is used to import and convert data from the old to
the new system.
Testing - This stage is concerned with error identification in the newly implemented system.
Testing is a crucial step because it checks the database directly and compares the requirement
specifications.

Database Design Process


The process of designing a database carries various conceptual approaches that are needed to be
kept in mind. An ideal and well-structured database design must be able to:

1. Save disk space by eliminating redundant data.


2. Maintains data integrity and accuracy.
3. Provides data access in useful ways.
4. Comparing Logical and Physical data models.

Logical
A logical data model generally describes the data in as many details as possible, without having
to be concerned about the physical implementations in the database. Features of logical data
model might include:

1. All the entities and relationships amongst them.


2. Each entity has well-specified attributes.
3. The primary key for each entity is specified.
4. Foreign keys which are used to identify a relationship between different entities are specified.
5. Normalization occurs at this level.

A logical model can be designed using the following approach:

1. Specify all the entities with primary keys.


2. Specify concurrent relationships between different entities.
3. Figure out each entity attributes
4. Resolve many-to-many relationships.
5. Carry out the process of normalization.

Also, one important factor after following the above approach is to critically examine the design
based on requirement gathering. If the above steps are strictly followed, there are chances of
creating a highly efficient database design that follows the native approach.
To understand these points, see the image below to get a clear picture.

If we compare the logical data model as shown in the figure above with some sample data in the
diagram, we can come up with facts that in a conceptual data model there are no presence of a
primary key whereas a logical data model has primary keys for all of its attributes. Also, logical
data model the cover relationship between different entities and carries room for foreign keys to
establish relationships among them.

Physical
A Physical data mode generally represents how the approach or concept of designing the
database. The main purpose of the physical data model is to show all the structures of the table
including the column name, column data type, constraints, keys(primary and foreign), and
the relationship among tables. The following are the features of a physical data model:

1. Specifies all the columns and tables.


2. Specifies foreign keys that usually define the relationship between tables.
3. Based on user requirements, de-normalization might occur.
4. Since the physical consideration is taken into account so there will straightforward reasons for
difference than a logical model.
5. Physical models might be different for different RDBMS. For example, the data type column may
be different in MySQL and SQL Server.

While designing a physical data model, the following points should be taken into consideration:

1. Convert the entities into tables.


2. Convert the defined relationships into foreign keys.
3. Convert the data attributes into columns.
4. Modify the data model constraints based on physical requirements.

Comparing this physical data model with the logical with the previous logical model, we might conclude
the differences that in a physical database entity names are considered table names and attributes are
considered column names. Also, the data type of each column is defined in the physical model depending
on the actual database used.
ER (Entity Relationship) Diagram in DBMS
o ER model stands for an Entity-Relationship model. It is a high-level data model. This model is
used to define the data elements and relationship for a specified system.
o It develops a conceptual design for the database. It also develops a very simple and easy to design
view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship
diagram.

For example, Suppose we design a school database. In this database, the student will be an
entity with attributes like address, name, id, age, etc. The address can be another entity with
attributes like city, street name, pin code, etc and there will be a relationship between them.

Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity can be
represented as rectangles.

Consider an organization as an example- manager, product, employee, department etc. can be


taken as an entity.
a. Weak Entity

An entity that depends on another entity called a weak entity. The weak entity doesn't contain
any key attribute of its own. The weak entity is represented by a double rectangle.

2. Attribute
The attribute is used to describe the property of an entity. Eclipse is used to represent an attribute.

For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute is used to represent the main characteristics of an entity. It represents a primary
key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute

An attribute that composed of many other attributes is known as a composite attribute. The
composite attribute is represented by an ellipse, and those ellipses are connected with an ellipse.

c. Multivalued Attribute

An attribute can have more than one value. These attributes are known as a multivalued attribute.
The double oval is used to represent multivalued attribute.

For example, a student can have more than one phone number.
d. Derived Attribute

An attribute that can be derived from other attribute is known as a derived attribute. It can be
represented by a dashed ellipse.

For example, A person's age changes over time and can be derived from another attribute like
Date of birth.

3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.
Types of relationship are as follows:

a. One-to-One Relationship

When only one instance of an entity is associated with the relationship, then it is known as one to
one relationship.

For example, A female can marry to one male, and a male can marry to one female.

b. One-
to-many relationship

When only one instance of the entity on the left, and more than one instance of an entity on the
right associates with the relationship then this is known as a one-to-many relationship.

For example, Scientist can invent many inventions, but the invention is done by the only
specific scientist

c. Many-to-one relationship

When more than one instance of the entity on the left, and only one instance of an entity on the
right associates with the relationship then it is known as a many-to-one relationship.

For example, Student enrolls for only one course, but a course can have many students.
d. Many-to-many relationship

When more than one instance of the entity on the left, and more than one instance of an entity on
the right associates with the relationship then it is known as a many-to-many relationship.

For example, Employee can assign by many projects and project can have many employees.

Cardinality in DBMS (Mapping Constraints)


DBMS
DBMS stands for Database Management System, which is a tool, or a software used to do
various operations on a Database like the Creation of the Database, Deletion of the Database, or
Updating the current Database. To simplify processing and data querying, the most popular types
of Databases currently in use typically model their data as rows and columns in a set of tables.
The data may then be handled, updated, regulated, and structured with ease. For writing and
querying data, most Databases employ Structured Query Language (SQL).

Cardinality
Cardinality means how the entities are arranged to each other or what is the relationship structure
between entities in a relationship set. In a Database Management System, Cardinality represents
a number that denotes how many times an entity is participating with another entity in a
relationship set. The Cardinality of DBMS is a very important attribute in representing the
structure of a Database. In a table, the number of rows or tuples represents the Cardinality.

Cardinality Ratio
Cardinality ratio is also called Cardinality Mapping, which represents the mapping of one
entity set to another entity set in a relationship set. We generally take the example of a binary
relationship set where two entities are mapped to each other.

Cardinality is very important in the Database of various businesses. For example, if we want to
track the purchase history of each customer then we can use the one-to-many cardinality to find
the data of a specific customer. The Cardinality model can be used in Databases by Database
Managers for a variety of purposes, but corporations often use it to evaluate customer or
inventory data.

There are four types of Cardinality Mapping in Database Management Systems:

1. One to one
2. Many to one
3. One to many
4. Many to many

One to One
One to one cardinality is represented by a 1:1 symbol. In this, there is at most one relationship
from one entity to another entity. There are a lot of examples of one-to-one cardinality in real life
databases.

For example, one student can have only one student id, and one student id can belong to only
one student. So, the relationship mapping between student and student id will be one to one
cardinality mapping.

Another example is the relationship between the director of the school and the school because
one school can have a maximum of one director, and one director can belong to only one school.

There are four types of Cardinality Mapping in Database Management Systems:


1. One to one
2. Many to one
3. One to many
4. Many to many
One to One
One to one cardinality is represented by a 1:1 symbol. In this, there is at most one
relationship from one entity to another entity. There are a lot of examples of one-to-one
cardinality in real life databases.
For example, one student can have only one student id, and one student id can belong
to only one student. So, the relationship mapping between student and student id will be
one to one cardinality mapping.
Another example is the relationship between the director of the school and the school
because one school can have a maximum of one director, and one director can belong
to only one school. .

Many to One Cardinality:


In many to one cardinality mapping, from set 1, there can be multiple sets that can make
relationships with a single entity of set 2. Or we can also describe it as from set 2, and one entity
can make a relationship with more than one entity of set 1.

One to one Cardinality is the subset of Many to one Cardinality. It can be represented by M:1.
For example, there are multiple patients in a hospital who are served by a single doctor, so the
relationship between patients and doctors can be represented by Many to one Cardinality.

One to Many Cardinalities:


In One-to-many cardinality mapping, from set 1, there can be a maximum single set that can
make relationships with a single or more than one entity of set 2. Or we can also describe it as
from set 2, more than one entity can make a relationship with only one entity of set

ne to one cardinality is the subset of One-to-many Cardinality. It can be represented by 1: M.

For Example, in a hospital, there can be various compounders, so the relationship between the
hospital and compounders can be mapped through One-to-many Cardinality.
Many to Many Cardinalities:
In many, many cardinalities mapping, there can be one or more than one entity that can associate
with one or more than one entity of set 2. In the same way from the end of set 2, one or more
than one entity can make a relation with one or more than one entity of set 1.

It is represented by M: N or N: M.

One to one cardinality, One to many cardinalities, and Many to one cardinality is the subset of
the many to many cardinalities.

For Example, in a college, multiple students can work on a single project, and a single student
can also work on multiple projects. So, the relationship between the project and the student can
be represented by many to many cardinalities.
Entity Set: An Entity is an object of Entity Type and a set of all entities is called an
entity set. For Example, E1 is an entity having Entity Type Student and the set of all
students is called Entity Set. In ER diagram, Entity Type is represented as:
Entity Set

Relationship Type and Relationship Set


A Relationship Type represents the association between entity types. For example,
‘Enrolled in’ is a relationship type that exists between entity type Student and Course.
In ER diagram, the relationship type is represented by a diamond and connecting the
entities with lines.

Entity-Relationship Set

A set of relationships of the same type is known as a relationship set. The following
relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as registered
in C3.
Relationship Set

Degree of a Relationship Set


The number of different entity sets participating in a relationship set is called the degree
of a relationship set.
1. Unary Relationship: When there is only ONE entity set participating in a relation,
the relationship is called a unary relationship. For example, one person is married to
only one person.

Unary Relationship

2. Binary Relationship: When there are TWO entities set participating in a relationship,
the relationship is called a binary relationship. For example, a Student is enrolled in a
Course.
Binary Relationship

3. n-ary Relationship: When there are n entities set participating in a relation, the
relationship is called an n-ary relationship.
Participation Constraint
Participation Constraint is applied to the entity participating in the relationship set.
1. Total Participation – Each entity in the entity set must participate in the relationship.
If each student must enroll in a course, the participation of students will be total. Total
participation is shown by a double line in the ER diagram.
2. Partial Participation – The entity in the entity set may or may NOT participate in
the relationship. If some courses are not enrolled by any of the students, the
participation in the course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set having
total participation and Course Entity set having partial participation.

Symbols Used in ER Model


ER Model is used to model the logical view of the system from a data perspective
which consists of these symbols:
o Rectangles: Rectangles represent Entities in ER Model.
o Ellipses: Ellipses represent Attributes in ER Model.
o Diamond: Diamonds represent Relationships among Entities.
o Lines: Lines represent attributes to entities and entity sets with other relationship
types.
o Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
o Double Rectangle: Double Rectangle represents a Weak Entity.

Keys
o Keys play an important role in the relational database.
o It is used to uniquely identify any record or row of data from the table. It is also used to establish
and identify relationships between tables.

For example, ID is used as a key in the Student table because it is unique for each student. In the
PERSON table, passport_number, license_number, SSN are keys since they are unique for each
person.

Types of keys:
1. Primary key
o It is the first key used to identify one and only one instance of an entity uniquely. An entity can
contain multiple keys, as we saw in the PERSON table. The key which is most suitable from
those lists becomes a primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In the
EMPLOYEE table, we can even select License_Number and Passport_Number as primary keys
since they are also unique.
o For each entity, the primary key selection is based on requirements and developers.

2. Candidate key
o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
o Except for the primary key, the remaining attributes are considered a candidate key. The
candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of the
attributes, like SSN, Passport_Number, License_Number, etc., are considered a candidate key.

3. Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a
candidate key.
For example: In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME), the
name of two employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this
combination can also be a key.

PauseNext

Unmute

Current Time 0:14

Duration 18:10

Loaded: 5.50%
Â

Fullscreen

The super key would be EMPLOYEE-ID (EMPLOYEE_ID, EMPLOYEE-NAME), etc.

4. Foreign key
o Foreign keys are the column of the table used to point to the primary key of another table.
o Every employee works in a specific department in a company, and employee and department are
two different entities. So we can't store the department's information in the employee table. That's
why we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the
EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
5. Alternate key
There may be one or more attributes or a combination of attributes that uniquely identify each
tuple in a relation. These attributes or combinations of the attributes are called the candidate keys.
One key is chosen as the primary key from these candidate keys, and the remaining candidate
key, if it exists, is termed the alternate key. In other words, the total number of the alternate
keys is the total number of candidate keys minus the primary key. The alternate key may or may
not exist. If there is only one candidate key in a relation, it does not have an alternate key.

For example, employee relation has two attributes, Employee_Id and PAN_No, that act as
candidate keys. In this relation, Employee_Id is chosen as the primary key, so the other candidate
key, PAN_No, acts as the Alternate key.

6. Composite key
Whenever a primary key consists of more than one attribute, it is known as a composite key.
This key is also known as Concatenated Key.
For example, in employee relations, we assume that an employee may be assigned multiple
roles, and an employee may work on multiple projects simultaneously. So the primary key will
be composed of all three attributes, namely Emp_ID, Emp_role, and Proj_ID in combination. So
these attributes act as a composite key since the primary key comprises more than one attribute.

7. Artificial key
The key created using arbitrarily assigned data are known as artificial keys. These keys are
created when a primary key is large and complex and has no relationship with many other
relations. The data values of the artificial keys are usually numbered in a serial order.

For example, the primary key, which is composed of Emp_ID, Emp_role, and Proj_ID, is large
in employee relations. So it would be better to add a new virtual attribute to identify each tuple in
the relation uniquely.
Generalization
o Generalization is like a bottom-up approach in which two or more entities of lower level combine
to form a higher level entity if they have some attributes in common.
o In generalization, an entity of a higher level can also combine with the entities of the lower level
to form a further higher level entity.
o Generalization is more like subclass and superclass system, but the only difference is the
approach. Generalization uses the bottom-up approach.
o In generalization, entities are combined to form a more generalized entity, i.e., subclasses are
combined to make a superclass.

For example, Faculty and Student entities can be generalized and create a higher level entity
Person.

Specialization
o Specialization is a top-down approach, and it is opposite to Generalization. In specialization, one
higher level entity can be broken down into two lower level entities.
o Specialization is used to identify the subset of an entity set that shares some distinguishing
characteristics.
o Normally, the superclass is defined first, the subclass and its related attributes are defined next,
and relationship set are then added.

For example: In an Employee management system, EMPLOYEE entity can be specialized as


TESTER or DEVELOPER based on what role they play in the company.

Aggregation
In aggregation, the relation between two entities is treated as a single entity. In aggregation,
relationship with its corresponding entities is aggregated into a higher level entity.

For example: Center entity offers the Course entity act as a single entity in the relationship
which is in a relationship with another entity visitor. In the real world, if a visitor visits a
coaching center then he will never enquiry about the Course only or just about the Center instead
he will ask the enquiry about both
UNIT 2
Integrity Constraints
o Integrity constraints are a set of rules. It is used to maintain the quality of information.
o Integrity constraints ensure that the data insertion, updating, and other processes have to
be performed in such a way that data integrity is not affected.
o Thus, integrity constraint is used to guard against accidental damage to the database.

You might also like