Course code: ITI131 Course title : Database PART: 1 Prof. Taymoor Mohamed Nazmy Dept. of computer science, faculty of computer science, Ain Shams uni. Ex-vice dean of post graduate studies and research Cairo, Egypt 1
Course Description • 1-Data, database, • 2-DBMS, Model architecture, • 3-Database languages and models, • 4-Relational database, • 5-SQL, • 6,7-E/R digram • 8-Database design, • 9-Database integrity • 10-11Normalization • 12-Review • Required text: • Database system concept, 6th edition , McGraw-hill, 2011. 2
About the lecturer - Prof. of computer science since 2006, - Director of Ain Shams information network, - Vice dean of post graduate studies and research, - Vice dean of environmental and social affairs, - Member in editorial board of many Int. journals, - Member in Scientific committee of many int. conferences - Executive chair of int. conf. on information and intelligent systems, - Published more than 60 scientific papers in int. journals and conference, Supervised more than 20 master and Ph. D thesis. 3
About the course • The materials of this course were collected from many resources, include the reference book, other books, online courses, presentations, and web sites. • There are many details will be given to simplify topics in this course, however at the end of the course the most important topics will be highlighted. • The delivered materials through this presentation is your main resource, • This presentation will be delivered to you. • There will be term exams 40% • Lab 20% • The final exam 40% • The exam will include subjective and objective questions, • Through this course try to extends your knowledge by improving your self learning capabilities. 4
ntaymoor19600@gmail.com 5
Important • Try to continues improve your language by adding new vocabulary (word meaning), • Read the lecture more than one time, • You may got more details from references, • There are think to remember , and others to understand. 6
Understand a DBMS and define its components. Understand the architecture of a DBMS and its levels. Distinguish between different database models. Understand the concept of relational database operations on a relation. Course objectives Use Structured Query Language (SQL) to define simple relations. Use E/R diagram for modeling relations 7
Structure of complete Course Database Fundamentals Relational Model Normalization Conceptual Modeling Query Languages Advanced SQL Transaction Management Java DB Applications – JDBC Data Mining 0. Intro 1. Design 3. Applications 4. Advanced Topics Newbie Users ProfessionalsDesigners Database course 2. Querying Developers 8
9 Database Prehistory Data entry Storage and retrieval Query processing Sorting 9
Stages of Information System  Stage 0: Manual Information System • Records • Files • Index Cards  Stage 1: Sequential Information Systems • Tapes • Files • slow, non-interactive, redundancy,... .  Stage 2: File Based Information Systems • Disk (direct access) • application program has its own file • data redundancy  Stage 3: DBMS based Information Systems • Generalized data management software • Transaction processing 10
Office automation system • Refers to the varied computer machinery and software used to digitally create, collect, store, manipulate, and relay office information needed for accomplishing basic tasks. Its objectives are: • Paperless work environment • Simplify operations and minimize computational errors • Optimal utilization of resources • Better information sharing and transparency • Enhanced security and recovery of data 1111
Office automation system It includes many applications such as:  document management  word processing  desktop publishing  electronic communications  electronic scheduling  data management  project management 12
Document Management System A document management system (DMS) is a computer system (or set of computer programs) used to track and store electronic documents and/or images of paper documents. Document management is a process taken with document within an organization, with respect to the creation, distribution and deletion of documents 13
Document Management System benefits 1. Improve staff productivity 2. Reduce costs associated with manual document 3. Promote sharing of knowledge and information 4. Enhance corporate transparency and governance 5. E-mail and fax files instantly 6. Access documents while traveling 7. Publish documents to CD, DVD or the web, as appropriate 14
Data, database, DBMS definition 15
16 Databases are a Rich Area for Computer Science • Programming languages and software engineering, • Data structures and algorithms, • Logic, discrete maths, computation theory, • It is a common course for students from different department in computer and engineering science.
• Creation more databases has great impact towards better environment of the world, how? 17
Advantages of Using the Database Approach • More information from given data • Ad hoc queries can be performed • Redundancy can be reduced • Inconsistency can be avoided • Security restriction can be applied 18
Advantages of Using the Database Approach - 2 • Sharing of data among multiple users. • Providing backup and recovery services. • Providing multiple interfaces to different classes of users. • Representing complex relationships among data 19
What is a database?  Database  a collection of related records, organized into tables  Table  a set of data elements (values) organized by records (horizontal rows) and fields (vertical columns)  Record (or row)  a single, structured data item in a table  Field (or column)  a set of data values of a particular simple type, one for each row of the table 20
Slide 21 Database Tables: The Big Picture To calculate the number of entries in a database--multiply the number of fields by the number of records. Ex. 5 fields(columns)*26 rows (records) = 130 entries in this database. 2. ENTRIES (cells) 4. RECORDS (rows) 3. FIELDS (columns) 1. Table 3. Field 2. Entries 4. Record
Comparison of Terminology Formal (logical)relational term Informal (physical)equivalents relation table tuple row or record cardinality number of rows attribute column or field degree Primary key number of columns (unique) identifier domain pool of legal values 22
23 What is a Database? • It contains data specific to part of the real world, a miniworld. • Construction: populating tables with data. • Manipulation: queries, updates, deletes, etc. • It is maintained for specific applications. 23
Why do businesses use databases? • A database is a tool used by businesses to manage the huge amount of data required to run the business. – For example, a business may use several databases to keep track of employee information, customers, products, and prices of products. – All of the information can be easily and efficiently managed and shared by database software. Slide 24
Business Rules • What – Brief, precise, and unambiguous descriptions of operations in an organization • based on policies, procedures, or principles within a specific organization • Why – Enhance understanding & facilitate communication • Standardize company’s view of data • Constitute a communications tool between users and designers • How (sources) – Interviews • Company managers • Policy makers • Department managers • End users – Written documentation • Procedures, Standards, Operations manuals – Observation • Business operations 2525
What is a database tasks? • Database tasks: – Retrieving – Sorting – Summarizing – Inserting – Updating – Deleting 26
Types of Databases and Database Applications • Numeric and Textual Databases (e.g. IRS CADE) • Multimedia Databases (e.g. Cortina) • Geographic Information Systems (GIS) • Data Warehouses • Real-time and Active Databases 27
Examples of Databases • Web database: • Online library card catalog –Media Center information –Encyclopedias –Dictionaries • eBay and other online • businesses –Inventory –Keep track of customers, products, vendors, pay pal accounts Slide 28 ► Bank accounts ► Students records ► Hospital record ► Payroll • Stock market • Barcode scanner keeps • Real estate listings • Fingerprint database
Top 5 Largest Databases in the World • Library of Congress: 130 million items • The Central Intelligence Agency (CIA) • Amazon: 59 million active customers • YouTube: 100 million videos watched per day • ChoicePoint: Information on 250 million people 29
30
Database Users • End users – Use the database system to achieve some goal • Application developers – Write software to allow end users to interface with the database system • Database Administrator (DBA) – Designs & manages the database system • Database systems programmer – Writes the database software itself 31
Database users 3232
Which is better Database System or File System? 3333
Billing Program Customer file Accounts receivable file Accounts_Payable Program Vendor file Invoice file Sales Order Processing Program Customer file Inventory file Payroll Program Employee file Purchasing Program Buyer file Inventory file Vendor file File Processing Systems 34
File System: Weakness • File system was an old system to handle data, it can be used on small scale of data, but with larger data it become ineffective. • Weakness – “Islands of data” in scattered file systems. • Problems – Duplication • same data may be stored in multiple files – Inconsistency • same data may be stored by different names in different format – Rigidity • requires customized programming to implement any changes • cannot do ad-hoc queries • Implications – Waste of space – Data inaccuracies – High overhead of data manipulation and maintenance 3535
Evolution of Data Models • A data model (or datamodel is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities. 36 1960s 1970s 1980s 1990s 2000+ File-based Hierarchical Network Relational Object-oriented Web-based Entity-Relationship 36
Hierarchical model 37
Network model: graph 38
Relational model 39
Database design process • Determine the purpose of the database - This helps prepare for the remaining steps. • Find and organize the information required - Gather all of the types of information to record in the database, such as product name and order number. • Divide the information into tables - Divide information items into major entities or subjects, such as Products or Orders. Each subject then becomes a table. • Turn information items into columns - Decide what information needs to be stored in each table. Each item becomes a field, and is displayed as a column in the table. For example, an Employees table might include fields such as Last Name and Hire Date. 40
• Specify primary keys - Choose each table’s primary key. The primary key is a column, or a set of columns, that is used to uniquely identify each row. An example might be Product ID or Order ID. • Set up the table relationships - Look at each table and decide how the data in one table is related to the data in other tables. Add fields to tables or create new tables to clarify the relationships, as necessary. • Refine the design - Analyze the design for errors. Create tables and add a few records of sample data. Check if results come from the tables as expected. Make adjustments to the design, as needed. • Apply the normalization rules - Apply the data normalization rules to see if tables are structured correctly. Make adjustments to the tables, as needed. 41
Relational Data Structure • Data is stored in relations (tables) • Each relation has a scheme (pattern) (heading) • The scheme defines the relation’s attributes (columns) • Data takes the form of tuples (rows) Name Age John 23 Mary 20 Mark 18 Jane 21 Attributes Scheme Tuples 42
Relational Data Structure • More formally - – A scheme is a set of attributes – A tuple assigns a value to each attribute in its scheme – A relation is a set of tuples with the same scheme { { (Name, John), (Age, 23) }, { (Name, Mary), (Age, 20) }, { (Name, Mark), (Age, 18) }, { (Name, Jane), (Age, 21) } } Name Age John 23 Mary 20 Mark 18 Jane 21 43
Relations ID Name Salary Department M139 John Smith 18,000 Marketing M140 Mary Jones 22,000 Marketing A368 Jane Brown 22,000 Accounts P222 Mark Brown 24,000 Personnel A367 David Jones 20,000 Accounts Attributes are ID, Name, Salary, and Department Degree is 4 Cardinality is 5 Tuples, e.g. { (ID, A368), (Name, Jane Brown), (Salary, 22,000), (Department, Accounts) } Scheme is {ID, Name, Salary, Department} 44
45
A relational database management & data model 46
A relational database management system (RDBMS) • A software system used to maintain relational databases is a relational database management system (RDBMS). Virtually all relational database systems use SQL (Structured Query Language) for querying and maintaining the database. • A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as invented by Edgar F.Codd, of IBM's San Jose Research Laboratory, in 1970s. • In 2017, many of the databases in widespread use are based on the relational database model. • Each row in a table has its own unique key( primary key). Rows in a table can be linked to rows in other tables by adding a column for the unique key of the linked row 47
DBMS components  Hardware – the physical computer system that allows physical access to data.  Software – the actual program that allows users to access, maintain, and update physical data.  Data – stored physically on the storage devices  Users –  End users - Normal user and DBA (Database Administrator)  Application programs  Procedures – a set of rules that should be clearly defined and followed by the users.
50 DBMS Functions • Data dictionary management • Data storage management • Data transformation and presentation • Security management • Multiuser access control • Backup and recovery management • Data integrity management • Database language and • application programming interfaces • Database communication interfaces
51 DBMS functions 51 DBMS engine
52 The Database for a small company 52
The leading vendors (sellers) of RDBMS • There 10s 0f RDBMS vendors the leading ones are: **** RDBMS Vendors **** RDBMS Product **** Computer Associates INGRES IBM DB2 INFORMIX Software INFORMIX Oracle Corporation Oracle Microsoft Corporation MS Access Microsoft Corporation SQL Server MySQL AB MySQL NCR Teradata PostgreSQL Dvlp Grp PostgreSQL Sybase Sybase 11 53
RDBMS vendor market share 54 In 2016 the revenue reached 35 billion dollars for the Oracle.
Which relational DBMS is best for your company? • If you're a large organization with a mainframe and want • For Unix and Linux installations, your primary choices are Oracle and DB2. • Oracle is the market leader on those platforms, though IBM has a solid presence there, too. • For Windows development, Microsoft is the clear leader on its own OS. Also, , Microsoft has announced plans to support SQL Server on Linux. 55
Database design • Database design is the process of producing a detailed data model of database. • This data model contains all the needed logical and physical design choices and physical storage parameters needed to generate a design in a data definition language, which can then be used to create a database. • A fully attributed data model contains detailed attributes for each entity.
57 DBMS STANDARDIZATION ANSI ARCHITECTURE
ER model -- Database Modeling  Database designs include ER (entity-relationship model) diagrams. An ER diagram is a diagram that helps to design databases in an efficient way.  The ER data model employs three basic concepts:  entity sets,  relationship sets,  attributes.  The ER model can express the overall logical structure of a database graphically. 58
Basics of E-R Diagrams • Entity – represented by a rectangle with its name in capital letters. • Relationships – represented by an active or passive verb inside the diamond that connects the related entities. • Connectivities – i.e., types of relationship – written next to each entity box. 5959
60 Schemas • Relation schema = relation name and attribute list.  Optionally: types of attributes.  Example: Beers(name, manf) or Beers(name: string, manf: string) • Database = collection of relations. • Database schema = set of all relation schemas in the database. 60
Jan 29, 2002 61
Slide 62 COMPANY Database Schema 62
Slide 63 Example of a Relation STUDENT (Name, SSN, Home_phone, Addr, Office_phone, Age, GPA) 63
64 Example: University Database using E/R diagram • Professors work for only one department • Departments have many professors • Each course is taught by only one professor • Students make a plan or program of study • A course could meet in several rooms/times • Graduate students must have an advisor • Cities are categorized as resident/BirthPlace • Visiting prof. need an End/Start date
65 E/R diagram of a University Database BIRTH_PLACE CITY Name State PERSON LastName Age PROFESSOR TenureStatus Title STUDENT DEPARTMENT Name Phone Belongs_To ENROLLED Semester Grade PLANNED Semester GRADUATE_STUDENT ADVISED_BY VISITING_ PROFESSOR Start_Appointment End_Appointment COURSE Title Description Credits TAUGHT_BY Semester MEETS ROOM RoomNumber Building TIME Day Hour RESIDENT 1 N N N N N N N 1 N N N 1 1 N N 1
Types of Data 1. User data 2. Metadata 3. Indexes 4. Application Metadata 66
1. User Data • A table of data is called a relation • Columns are fields or attributes • Rows are specific instances • Relations must be structured properly 67
2. Metadata or Data Dictionary “a description of the structure of the database” System tables store metadata, including:  table names  field names  primary key fields  data types  field size  data constraints (e.g., input masks, and validation rules)  relationships between tables 68
3- index
4. Application Metadata Stores the structure and format of: – forms – reports – Queries:a query is a form of questioning using query language – other application components 71
Relations Keys “In rational database the relations are connects through unique identifier called key. It is a group of one or more attributes that uniquely identifies a row”
Types of Keys • A (primary) key is one (or more) columns of a relation that is (are) used to identify a row • A composite key is a key that contains two or more attributes – For a key to be unique, it is sometimes necessary to use a composite key – Can also generate a set of unique values to serve as the key (for example University ID numbers) 73
A Foreign Key • A table may be related to other tables (i.e., a relationship) • To create relationships, you may need to create a foreign key • A foreign key is a primary key from one table placed into another table • Referential integrity - every value of a foreign key must match a value of an existing primary key 75
Foreign Key Example 76
Properties of a Relation (table) 1. There are no duplicate tuples (rows). –The body of the relation is a mathematical set (i.e., a set of tuples), and sets in mathematics by definition do not include duplicate elements. –If a "relation" contains duplicate tuples, then it is not a relation. 77
Properties of a Relation 2. Tuples (rows) are unordered (top to bottom). – Sets in mathematics are not ordered. So, even if a relation A's tuples are reversely ordered, it is still the same relation. – Thus, there is no such thing as "the 5th tuple" or the last tuple. In other words, there is no concept of positional addressing. 78
Properties of a Relation 3. Attributes (columns) are unordered (left to right). –The heading of a relation is also defined as a set. –There is no such thing as "5th attribute (column)" or the last attribute. 79
Properties of a Relation 4. All attribute values are atomic. – At every row-and-column position within the table, there always exists precisely one value, never a list of values. Or equivalently, relations do not contain repeating groups. – A relation satisfying this condition is said to be in First Normal Form. 80
Entity Integrity Rule • Guarantees that each entity will have a unique identity and ensures that foreign key values can properly reference primary key values. • Requirement – No component of the primary key is allowed to accept nulls. – By "null" here, we mean that information is missing for some reason. 81
Foreign Key • An attribute in one table whose values must either match the primary key in another table or be null. • Attribute FK of base relation R2 is a foreign key if and only if it satisfies the following two time- • independent properties: – Each value of FK is either wholly null or wholly non- null. – Each non-null value of FK is identical to the value of PK in some tuple of R1. 82
Referential Integrity Enforcement • Referential Integrity (RI) is a database concept used to ensure that the relationships between your database tables remains synchronized during data modifications, you may choose among those items: • Restriction: – does not allow any deletion • Nullification: – if value is deleted, reference value will set to be null. • Cascading: – if value is deleted, reference value will also be deleted. – if value is updated, reference value will also be updated. • Default value: – if value is deleted, reference value will be have default value, which is provided by the system. 83
Referential Integrity Enforcement • Cascade Update Related Fields – Change of PK values in primary table  automatic change of FK values • Cascade Delete Related Fields – Delete of a record in the primary table  automatic delete of all records in the related table that have a matching FK value 84
Database languages
Data Definition Language (DDL) • : DDL is used for specifying the database schema. Let’s take SQL for instance to categorize the statements that comes under DDL. • To create the database instance – CREATE • To alter the structure of database – ALTER • To drop database instances – DROP • To delete tables in a database instance – TRUNCATE • To rename database instances – RENAME • All these commands specify or update the database schema that’s why they come under Data Definition language. 87
Data Manipulation Language (DML) • DML is used for accessing and manipulating data in a database • To read records from table(s) – SELECT • To insert record(s) into the table(s) – INSERT • Update the data in table(s) – UPDATE • Delete all the records from the table – DELETE 88
Data Control language (DCL) • DCL is used for granting and revoking user access on a database – • To grant access to user – GRANT • To revoke access from user – REVOKE • In practical data definition language, data manipulation language and data control languages are not separate language; rather they are the parts of a single database language such as SQL. 89
End part 1

Database part1-

  • 1.
    Course code: ITI131 Coursetitle : Database PART: 1 Prof. Taymoor Mohamed Nazmy Dept. of computer science, faculty of computer science, Ain Shams uni. Ex-vice dean of post graduate studies and research Cairo, Egypt 1
  • 2.
    Course Description • 1-Data,database, • 2-DBMS, Model architecture, • 3-Database languages and models, • 4-Relational database, • 5-SQL, • 6,7-E/R digram • 8-Database design, • 9-Database integrity • 10-11Normalization • 12-Review • Required text: • Database system concept, 6th edition , McGraw-hill, 2011. 2
  • 3.
    About the lecturer -Prof. of computer science since 2006, - Director of Ain Shams information network, - Vice dean of post graduate studies and research, - Vice dean of environmental and social affairs, - Member in editorial board of many Int. journals, - Member in Scientific committee of many int. conferences - Executive chair of int. conf. on information and intelligent systems, - Published more than 60 scientific papers in int. journals and conference, Supervised more than 20 master and Ph. D thesis. 3
  • 4.
    About the course •The materials of this course were collected from many resources, include the reference book, other books, online courses, presentations, and web sites. • There are many details will be given to simplify topics in this course, however at the end of the course the most important topics will be highlighted. • The delivered materials through this presentation is your main resource, • This presentation will be delivered to you. • There will be term exams 40% • Lab 20% • The final exam 40% • The exam will include subjective and objective questions, • Through this course try to extends your knowledge by improving your self learning capabilities. 4
  • 5.
  • 6.
    Important • Try tocontinues improve your language by adding new vocabulary (word meaning), • Read the lecture more than one time, • You may got more details from references, • There are think to remember , and others to understand. 6
  • 7.
    Understand a DBMSand define its components. Understand the architecture of a DBMS and its levels. Distinguish between different database models. Understand the concept of relational database operations on a relation. Course objectives Use Structured Query Language (SQL) to define simple relations. Use E/R diagram for modeling relations 7
  • 8.
    Structure of completeCourse Database Fundamentals Relational Model Normalization Conceptual Modeling Query Languages Advanced SQL Transaction Management Java DB Applications – JDBC Data Mining 0. Intro 1. Design 3. Applications 4. Advanced Topics Newbie Users ProfessionalsDesigners Database course 2. Querying Developers 8
  • 9.
    9 Database Prehistory Data entry Storageand retrieval Query processing Sorting 9
  • 10.
    Stages of InformationSystem  Stage 0: Manual Information System • Records • Files • Index Cards  Stage 1: Sequential Information Systems • Tapes • Files • slow, non-interactive, redundancy,... .  Stage 2: File Based Information Systems • Disk (direct access) • application program has its own file • data redundancy  Stage 3: DBMS based Information Systems • Generalized data management software • Transaction processing 10
  • 11.
    Office automation system •Refers to the varied computer machinery and software used to digitally create, collect, store, manipulate, and relay office information needed for accomplishing basic tasks. Its objectives are: • Paperless work environment • Simplify operations and minimize computational errors • Optimal utilization of resources • Better information sharing and transparency • Enhanced security and recovery of data 1111
  • 12.
    Office automation system Itincludes many applications such as:  document management  word processing  desktop publishing  electronic communications  electronic scheduling  data management  project management 12
  • 13.
    Document Management System Adocument management system (DMS) is a computer system (or set of computer programs) used to track and store electronic documents and/or images of paper documents. Document management is a process taken with document within an organization, with respect to the creation, distribution and deletion of documents 13
  • 14.
    Document Management System benefits 1.Improve staff productivity 2. Reduce costs associated with manual document 3. Promote sharing of knowledge and information 4. Enhance corporate transparency and governance 5. E-mail and fax files instantly 6. Access documents while traveling 7. Publish documents to CD, DVD or the web, as appropriate 14
  • 15.
    Data, database, DBMSdefinition 15
  • 16.
    16 Databases are aRich Area for Computer Science • Programming languages and software engineering, • Data structures and algorithms, • Logic, discrete maths, computation theory, • It is a common course for students from different department in computer and engineering science.
  • 17.
    • Creation moredatabases has great impact towards better environment of the world, how? 17
  • 18.
    Advantages of Usingthe Database Approach • More information from given data • Ad hoc queries can be performed • Redundancy can be reduced • Inconsistency can be avoided • Security restriction can be applied 18
  • 19.
    Advantages of Usingthe Database Approach - 2 • Sharing of data among multiple users. • Providing backup and recovery services. • Providing multiple interfaces to different classes of users. • Representing complex relationships among data 19
  • 20.
    What is adatabase?  Database  a collection of related records, organized into tables  Table  a set of data elements (values) organized by records (horizontal rows) and fields (vertical columns)  Record (or row)  a single, structured data item in a table  Field (or column)  a set of data values of a particular simple type, one for each row of the table 20
  • 21.
    Slide 21 Database Tables:The Big Picture To calculate the number of entries in a database--multiply the number of fields by the number of records. Ex. 5 fields(columns)*26 rows (records) = 130 entries in this database. 2. ENTRIES (cells) 4. RECORDS (rows) 3. FIELDS (columns) 1. Table 3. Field 2. Entries 4. Record
  • 22.
    Comparison of Terminology Formal(logical)relational term Informal (physical)equivalents relation table tuple row or record cardinality number of rows attribute column or field degree Primary key number of columns (unique) identifier domain pool of legal values 22
  • 23.
    23 What is aDatabase? • It contains data specific to part of the real world, a miniworld. • Construction: populating tables with data. • Manipulation: queries, updates, deletes, etc. • It is maintained for specific applications. 23
  • 24.
    Why do businessesuse databases? • A database is a tool used by businesses to manage the huge amount of data required to run the business. – For example, a business may use several databases to keep track of employee information, customers, products, and prices of products. – All of the information can be easily and efficiently managed and shared by database software. Slide 24
  • 25.
    Business Rules • What –Brief, precise, and unambiguous descriptions of operations in an organization • based on policies, procedures, or principles within a specific organization • Why – Enhance understanding & facilitate communication • Standardize company’s view of data • Constitute a communications tool between users and designers • How (sources) – Interviews • Company managers • Policy makers • Department managers • End users – Written documentation • Procedures, Standards, Operations manuals – Observation • Business operations 2525
  • 26.
    What is adatabase tasks? • Database tasks: – Retrieving – Sorting – Summarizing – Inserting – Updating – Deleting 26
  • 27.
    Types of Databasesand Database Applications • Numeric and Textual Databases (e.g. IRS CADE) • Multimedia Databases (e.g. Cortina) • Geographic Information Systems (GIS) • Data Warehouses • Real-time and Active Databases 27
  • 28.
    Examples of Databases •Web database: • Online library card catalog –Media Center information –Encyclopedias –Dictionaries • eBay and other online • businesses –Inventory –Keep track of customers, products, vendors, pay pal accounts Slide 28 ► Bank accounts ► Students records ► Hospital record ► Payroll • Stock market • Barcode scanner keeps • Real estate listings • Fingerprint database
  • 29.
    Top 5 LargestDatabases in the World • Library of Congress: 130 million items • The Central Intelligence Agency (CIA) • Amazon: 59 million active customers • YouTube: 100 million videos watched per day • ChoicePoint: Information on 250 million people 29
  • 30.
  • 31.
    Database Users • Endusers – Use the database system to achieve some goal • Application developers – Write software to allow end users to interface with the database system • Database Administrator (DBA) – Designs & manages the database system • Database systems programmer – Writes the database software itself 31
  • 32.
  • 33.
    Which is better DatabaseSystem or File System? 3333
  • 34.
  • 35.
    File System: Weakness •File system was an old system to handle data, it can be used on small scale of data, but with larger data it become ineffective. • Weakness – “Islands of data” in scattered file systems. • Problems – Duplication • same data may be stored in multiple files – Inconsistency • same data may be stored by different names in different format – Rigidity • requires customized programming to implement any changes • cannot do ad-hoc queries • Implications – Waste of space – Data inaccuracies – High overhead of data manipulation and maintenance 3535
  • 36.
    Evolution of DataModels • A data model (or datamodel is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities. 36 1960s 1970s 1980s 1990s 2000+ File-based Hierarchical Network Relational Object-oriented Web-based Entity-Relationship 36
  • 37.
  • 38.
  • 39.
  • 40.
    Database design process •Determine the purpose of the database - This helps prepare for the remaining steps. • Find and organize the information required - Gather all of the types of information to record in the database, such as product name and order number. • Divide the information into tables - Divide information items into major entities or subjects, such as Products or Orders. Each subject then becomes a table. • Turn information items into columns - Decide what information needs to be stored in each table. Each item becomes a field, and is displayed as a column in the table. For example, an Employees table might include fields such as Last Name and Hire Date. 40
  • 41.
    • Specify primarykeys - Choose each table’s primary key. The primary key is a column, or a set of columns, that is used to uniquely identify each row. An example might be Product ID or Order ID. • Set up the table relationships - Look at each table and decide how the data in one table is related to the data in other tables. Add fields to tables or create new tables to clarify the relationships, as necessary. • Refine the design - Analyze the design for errors. Create tables and add a few records of sample data. Check if results come from the tables as expected. Make adjustments to the design, as needed. • Apply the normalization rules - Apply the data normalization rules to see if tables are structured correctly. Make adjustments to the tables, as needed. 41
  • 42.
    Relational Data Structure •Data is stored in relations (tables) • Each relation has a scheme (pattern) (heading) • The scheme defines the relation’s attributes (columns) • Data takes the form of tuples (rows) Name Age John 23 Mary 20 Mark 18 Jane 21 Attributes Scheme Tuples 42
  • 43.
    Relational Data Structure •More formally - – A scheme is a set of attributes – A tuple assigns a value to each attribute in its scheme – A relation is a set of tuples with the same scheme { { (Name, John), (Age, 23) }, { (Name, Mary), (Age, 20) }, { (Name, Mark), (Age, 18) }, { (Name, Jane), (Age, 21) } } Name Age John 23 Mary 20 Mark 18 Jane 21 43
  • 44.
    Relations ID Name SalaryDepartment M139 John Smith 18,000 Marketing M140 Mary Jones 22,000 Marketing A368 Jane Brown 22,000 Accounts P222 Mark Brown 24,000 Personnel A367 David Jones 20,000 Accounts Attributes are ID, Name, Salary, and Department Degree is 4 Cardinality is 5 Tuples, e.g. { (ID, A368), (Name, Jane Brown), (Salary, 22,000), (Department, Accounts) } Scheme is {ID, Name, Salary, Department} 44
  • 45.
  • 46.
    A relational databasemanagement & data model 46
  • 47.
    A relational databasemanagement system (RDBMS) • A software system used to maintain relational databases is a relational database management system (RDBMS). Virtually all relational database systems use SQL (Structured Query Language) for querying and maintaining the database. • A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as invented by Edgar F.Codd, of IBM's San Jose Research Laboratory, in 1970s. • In 2017, many of the databases in widespread use are based on the relational database model. • Each row in a table has its own unique key( primary key). Rows in a table can be linked to rows in other tables by adding a column for the unique key of the linked row 47
  • 49.
    DBMS components  Hardware– the physical computer system that allows physical access to data.  Software – the actual program that allows users to access, maintain, and update physical data.  Data – stored physically on the storage devices  Users –  End users - Normal user and DBA (Database Administrator)  Application programs  Procedures – a set of rules that should be clearly defined and followed by the users.
  • 50.
    50 DBMS Functions • Datadictionary management • Data storage management • Data transformation and presentation • Security management • Multiuser access control • Backup and recovery management • Data integrity management • Database language and • application programming interfaces • Database communication interfaces
  • 51.
  • 52.
    52 The Database fora small company 52
  • 53.
    The leading vendors(sellers) of RDBMS • There 10s 0f RDBMS vendors the leading ones are: **** RDBMS Vendors **** RDBMS Product **** Computer Associates INGRES IBM DB2 INFORMIX Software INFORMIX Oracle Corporation Oracle Microsoft Corporation MS Access Microsoft Corporation SQL Server MySQL AB MySQL NCR Teradata PostgreSQL Dvlp Grp PostgreSQL Sybase Sybase 11 53
  • 54.
    RDBMS vendor marketshare 54 In 2016 the revenue reached 35 billion dollars for the Oracle.
  • 55.
    Which relational DBMSis best for your company? • If you're a large organization with a mainframe and want • For Unix and Linux installations, your primary choices are Oracle and DB2. • Oracle is the market leader on those platforms, though IBM has a solid presence there, too. • For Windows development, Microsoft is the clear leader on its own OS. Also, , Microsoft has announced plans to support SQL Server on Linux. 55
  • 56.
    Database design • Databasedesign is the process of producing a detailed data model of database. • This data model contains all the needed logical and physical design choices and physical storage parameters needed to generate a design in a data definition language, which can then be used to create a database. • A fully attributed data model contains detailed attributes for each entity.
  • 57.
  • 58.
    ER model --Database Modeling  Database designs include ER (entity-relationship model) diagrams. An ER diagram is a diagram that helps to design databases in an efficient way.  The ER data model employs three basic concepts:  entity sets,  relationship sets,  attributes.  The ER model can express the overall logical structure of a database graphically. 58
  • 59.
    Basics of E-RDiagrams • Entity – represented by a rectangle with its name in capital letters. • Relationships – represented by an active or passive verb inside the diamond that connects the related entities. • Connectivities – i.e., types of relationship – written next to each entity box. 5959
  • 60.
    60 Schemas • Relation schema= relation name and attribute list.  Optionally: types of attributes.  Example: Beers(name, manf) or Beers(name: string, manf: string) • Database = collection of relations. • Database schema = set of all relation schemas in the database. 60
  • 61.
  • 62.
  • 63.
    Slide 63 Example ofa Relation STUDENT (Name, SSN, Home_phone, Addr, Office_phone, Age, GPA) 63
  • 64.
    64 Example: University Databaseusing E/R diagram • Professors work for only one department • Departments have many professors • Each course is taught by only one professor • Students make a plan or program of study • A course could meet in several rooms/times • Graduate students must have an advisor • Cities are categorized as resident/BirthPlace • Visiting prof. need an End/Start date
  • 65.
    65 E/R diagram ofa University Database BIRTH_PLACE CITY Name State PERSON LastName Age PROFESSOR TenureStatus Title STUDENT DEPARTMENT Name Phone Belongs_To ENROLLED Semester Grade PLANNED Semester GRADUATE_STUDENT ADVISED_BY VISITING_ PROFESSOR Start_Appointment End_Appointment COURSE Title Description Credits TAUGHT_BY Semester MEETS ROOM RoomNumber Building TIME Day Hour RESIDENT 1 N N N N N N N 1 N N N 1 1 N N 1
  • 66.
    Types of Data 1.User data 2. Metadata 3. Indexes 4. Application Metadata 66
  • 67.
    1. User Data •A table of data is called a relation • Columns are fields or attributes • Rows are specific instances • Relations must be structured properly 67
  • 68.
    2. Metadata or DataDictionary “a description of the structure of the database” System tables store metadata, including:  table names  field names  primary key fields  data types  field size  data constraints (e.g., input masks, and validation rules)  relationships between tables 68
  • 70.
  • 71.
    4. Application Metadata Storesthe structure and format of: – forms – reports – Queries:a query is a form of questioning using query language – other application components 71
  • 72.
    Relations Keys “In rationaldatabase the relations are connects through unique identifier called key. It is a group of one or more attributes that uniquely identifies a row”
  • 73.
    Types of Keys •A (primary) key is one (or more) columns of a relation that is (are) used to identify a row • A composite key is a key that contains two or more attributes – For a key to be unique, it is sometimes necessary to use a composite key – Can also generate a set of unique values to serve as the key (for example University ID numbers) 73
  • 75.
    A Foreign Key •A table may be related to other tables (i.e., a relationship) • To create relationships, you may need to create a foreign key • A foreign key is a primary key from one table placed into another table • Referential integrity - every value of a foreign key must match a value of an existing primary key 75
  • 76.
  • 77.
    Properties of aRelation (table) 1. There are no duplicate tuples (rows). –The body of the relation is a mathematical set (i.e., a set of tuples), and sets in mathematics by definition do not include duplicate elements. –If a "relation" contains duplicate tuples, then it is not a relation. 77
  • 78.
    Properties of aRelation 2. Tuples (rows) are unordered (top to bottom). – Sets in mathematics are not ordered. So, even if a relation A's tuples are reversely ordered, it is still the same relation. – Thus, there is no such thing as "the 5th tuple" or the last tuple. In other words, there is no concept of positional addressing. 78
  • 79.
    Properties of aRelation 3. Attributes (columns) are unordered (left to right). –The heading of a relation is also defined as a set. –There is no such thing as "5th attribute (column)" or the last attribute. 79
  • 80.
    Properties of aRelation 4. All attribute values are atomic. – At every row-and-column position within the table, there always exists precisely one value, never a list of values. Or equivalently, relations do not contain repeating groups. – A relation satisfying this condition is said to be in First Normal Form. 80
  • 81.
    Entity Integrity Rule •Guarantees that each entity will have a unique identity and ensures that foreign key values can properly reference primary key values. • Requirement – No component of the primary key is allowed to accept nulls. – By "null" here, we mean that information is missing for some reason. 81
  • 82.
    Foreign Key • Anattribute in one table whose values must either match the primary key in another table or be null. • Attribute FK of base relation R2 is a foreign key if and only if it satisfies the following two time- • independent properties: – Each value of FK is either wholly null or wholly non- null. – Each non-null value of FK is identical to the value of PK in some tuple of R1. 82
  • 83.
    Referential Integrity Enforcement •Referential Integrity (RI) is a database concept used to ensure that the relationships between your database tables remains synchronized during data modifications, you may choose among those items: • Restriction: – does not allow any deletion • Nullification: – if value is deleted, reference value will set to be null. • Cascading: – if value is deleted, reference value will also be deleted. – if value is updated, reference value will also be updated. • Default value: – if value is deleted, reference value will be have default value, which is provided by the system. 83
  • 84.
    Referential Integrity Enforcement •Cascade Update Related Fields – Change of PK values in primary table  automatic change of FK values • Cascade Delete Related Fields – Delete of a record in the primary table  automatic delete of all records in the related table that have a matching FK value 84
  • 85.
  • 87.
    Data Definition Language(DDL) • : DDL is used for specifying the database schema. Let’s take SQL for instance to categorize the statements that comes under DDL. • To create the database instance – CREATE • To alter the structure of database – ALTER • To drop database instances – DROP • To delete tables in a database instance – TRUNCATE • To rename database instances – RENAME • All these commands specify or update the database schema that’s why they come under Data Definition language. 87
  • 88.
    Data Manipulation Language (DML) •DML is used for accessing and manipulating data in a database • To read records from table(s) – SELECT • To insert record(s) into the table(s) – INSERT • Update the data in table(s) – UPDATE • Delete all the records from the table – DELETE 88
  • 89.
    Data Control language(DCL) • DCL is used for granting and revoking user access on a database – • To grant access to user – GRANT • To revoke access from user – REVOKE • In practical data definition language, data manipulation language and data control languages are not separate language; rather they are the parts of a single database language such as SQL. 89
  • 90.