Advanced Database Management Systems
Name Zainab Ehsan
Roll Number 347
Semester 4
Section I
Date 15-June-2025
Assignment Number 2
Submitted To Ma’am Farwa Javed
Lahore Garrison University
Department of Computer Science
1
Question 1
Case Study: RetailMart Sales Data Warehouse
RetailMart, a national retail chain, maintains a Sales Data Warehouse to analyze product
performance across its various stores. The data is organized in an OLAP cube with the
following dimensions:
• Time: Year → Quarter → Month → Day
• Region: Country → State → City
• Product: Category → Subcategory → Product Name
• Measure: Total Sales, Quantity Sold
RetailMart’s management uses this data to make strategic decisions such as promotional
planning, stock replenishment, and store performance evaluation.
Making an OLAP Cube
Time
Region
Product
2
Scenario 1:
A manager wants to compare Total Sales for each Product Category in Q1 2025.
Q1. What OLAP operation is this?
A. Identifying OLAP Operations
This is a Slice & Dice (though only we’re slicing data) and Roll-Up operation.
• Slice & Dice: As we’re Filtering/Selecting data for a specific time (Q1 2025).
• Roll-Up: Aggregating data up to the Product Category level from the Product
dimension.
Q2. What dimensions are involved?
A. Dimensions Involved
• Time Dimension – Q1 2025.
• Product – Category Comparison
• Measure – Total Sales.
Q3. What part of the cube is being selected?
A. Cube Selection
The following are the parts of the cube being selected:
• Time (Q1 2025)
• Product (Category)
• Region (All)
• Measure (Total Sales)
3
Scenario 2:
A regional analyst wants to break down the sales in California into more detail, viewing by
city level instead of state.
Q1. What OLAP operation is being applied?
A. Drill-down OLAP operation
As we’re selecting “City” instead of “State”.
Q2. From which level to which level is the transition happening?
A. Transition
From State → City in the Region dimension.
4
Scenario 3:
The head office wants to summarize total sales across all U.S. states into a single value by
year.
Q1. What OLAP operation does this represent?
A. Roll-Up OLAP Operation
As we’re selecting “Total Sales” into a single value by “Year” and from “State” level to
“Country” level.
Q2. Which dimension is being rolled up?
A. Region Dimension
The Region dimension from State level → to Country level (United States) is being rolled up.
Additionally, the Time dimension is being viewed at the Year level.
5
Scenario 4:
A category manager wants to view product performance by rotating the OLAP cube to see
regions as columns and product categories as rows.
Q1. Which OLAP operation is applied here?
A. Pivot OLAP Operation
As we’re rotating the cube making regions as columns and product categories as rows.
Product Categories
Regions
Region Product Categories
Before Rotation After Rotation
Q2. What is the benefit of this operation?
A. Benefit of Pivot OLAP Operation
• It reorganizes the view to make analysis easier to interpret.
• Makes comparison between dimensions more visual and accessible.
• Help reveal patterns by changing perspective.
6
Scenario 5:
A store supervisor filters the data cube to focus only on "Electronics" sold in "New York"
during December 2024.
Q1. What OLAP operation(s) are involved?
A. Slice and Dice OLAP Operations
As we’re using more than one filter such as filtering the data by "Electronics" to a specific
country i.e. "New York" to being specific about time i.e. December 2024.
Q2. Is this a slice, dice, or both? Justify
A. Both OLAP Operation
• Slice: Fixing one value per dimension — e.g., Time (December 2024).
• Dice: Applying multiple filters across dimensions — Product (Electronics), Region
(New York).
Justification: Because more than one dimension is being filtered at specific values/ranges.
7
Question 2
Star Schema & Snowflake Schema
Part A: Conceptual Understanding
Q1. Define the following terms:
a) Fact Table
A. Basically, a Fact Table is the central table in a data warehouse schema that contains
quantitative data (measures) such as sales, revenue, stock or quantity. It includes foreign keys
about dimension tables.
b) Dimension Table
A. A Dimension Table contains descriptive, textual, or categorical information related to
dimensions of a fact contained by fact table (like product, time, region, colors). It helps slice
and dice the data in different ways.
c) Star Schema
A. A Star Schema is shaped like a star. It organizes data into a central fact table surrounded by
dimension tables, hence resembling a star shape. It helps simplify queries, improve
performance, enhance readability and handle large volumes of data.
Figure 1: Star Schema Example
8
d) Snowflake Schema
A. A Snowflake Schema is a normalized version of the star schema, where dimension tables
are normalized into multiple related tables, resembling a snowflake's branching structure. It
has a Fact Table, Dimension Table, Normalized Dimensions, Hierarchical Structure. It helps
reduce data redundancy, improve data integrity, allows for more details analysis by drilling
down into different levels of dimensions. Though it does require more complex joins to retrieve
data from multiple dimension tables and while storage is efficient, the joins needed for querying
can sometimes slow down performance.
Figure 2: Snowflake Schema Example
e) Normalization in the context of data warehousing
A. Normalization in the context of data warehousing means organizing data to reduce
redundancy and improve consistency by dividing tables into related sub-tables and linking
them via keys.
9
Part B: Diagram-Based Analysis
Refer to the diagrams given above. Answer the following questions:
Q2. Based on Diagram 1 (Snowflake Schema - Vehicle Sales):
a) Identify the fact table and write its name and attributes.
A. Sales Fact Table & Its Attributes
Fact Table: Sales Fact Table
Attributes:
• Time_key (FK)
• Item_key (FK)
• Branch_key (FK)
• Location_key (FK)
• Dollars_sold (Measure)
• Units_sold (Measure)
b) Name any 3 dimension tables and explain the type of data they store.
A. 3 dimension tables
1. Time Dimension Table
10
Stores: Temporal data to analyze sales over time.
Attributes: Time_key, Day, Month, Quarter, Year
2. Item Dimension Table
Stores: Product-level descriptive data.
Attributes: Item_key, Item_name, Brand, Type, Supplier_key
3. Branch Dimension Table
Stores: Information about store branches.
Attributes: Branch_key, Branch_name, Branch_type
c) Explain how normalization is applied between the Dealer, Location, and Country tables.
A. Normalization between Tables
• Instead of storing the full city, state, and country data repeatedly in the Location table,
it's stored in a separate City table.
• The Location table just keeps a reference via City_key (foreign key).
• This is a clear application of 3rd Normal Form (3NF), reducing redundancy and
ensuring data consistency.
11
Q3. Based on Diagram 2 (Sales Star/Snowflake Schema - Retail Sales):
a) Is this schema a star or snowflake schema? Justify your answer.
A. Snowflake Schema
This is a Snowflake Schema as it shows normalized dimension tables.
• Dealer table references Location_ID and Country_ID instead of storing full location
and country names.
• Product table references Variant_ID from a separate Variant table.
There are multiple levels of hierarchy in dimension tables, hence, making it a snowflake
schema.
b) Describe the relationship between the Location Dimension Table and Country
Dimension Table.
A. Relationship between the Location Table and Country Table
Dealer Table links to both Location_ID and Country_ID.
But there is no direct relationship between them.
It reflects a hierarchical normalization, Country holds broader geographical info, Location
specifies the region, Dealer acts as the bridge linking both.
12
c) How does the schema help in analyzing supplier-wise and city-wise sales?
A. Supplier-Wise Sales and City-Wise Sales
Supplier-Wise Sales
Using the Revenue fact table, Dealer dimension via Dealer_ID can be reached.
Dealer connects to Country via Country_ID, which leads to Country_Name.
This enables analysis like:
• Total sales per country
• Year-over-year revenue by country
• Comparing regional performance across countries
City-Wise Sales
Supplier table doesn’t exist
13
Part C: Application and Comparison
Q4. Application Scenario:
You are a data analyst for a chain of retail stores. Your manager wants to understand how
sales vary by time, product, supplier, and branch.
a) Which schema (star or snowflake) would you recommend for faster reporting and why?
A. Recommendations for a Schema
I would recommend Star Schema for faster reporting as faster query performance meaning
less joins required. Moreover, all data is denormalized.
b) Which schema is more suitable for minimizing redundancy and ensuring consistency?
A. More Suitable Schema
I would recommend Snowflake Schema for minimizing redundancy and ensuring
consistency as repetitive data is stored only once and reduces costs. Moreover, dimensions
are normalized.
14
Q5. Comparison Table:
Fill in the blanks in the following comparison table:
Feature Star Schema Snowflake Schema
Table Normalization Denormalized Normalized
Query Performance Faster Slower
Storage Requirement More Less
Maintenance Complexity Less More
15
Q6. Star Schema
Draw a simple star schema for a “Library Management System” including:
Fact table: Book Issues
Dimension tables: Book, Member, Time, Librarian
A. Star Schema for Library Management System
Book
Member
Fact Table Dimension Table
Dimension Table
Book Issues
Librarian Time
Dimension Table Dimension Table
16