0% found this document useful (0 votes)
7 views16 pages

ADBMS Assignment 2

The document is an assignment on Advanced Database Management Systems, focusing on a case study of RetailMart's Sales Data Warehouse. It covers OLAP operations, star and snowflake schemas, and includes definitions, diagrams, and application scenarios for data analysis. The assignment also compares the two schemas in terms of performance, redundancy, and complexity.

Uploaded by

eshayyy321
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views16 pages

ADBMS Assignment 2

The document is an assignment on Advanced Database Management Systems, focusing on a case study of RetailMart's Sales Data Warehouse. It covers OLAP operations, star and snowflake schemas, and includes definitions, diagrams, and application scenarios for data analysis. The assignment also compares the two schemas in terms of performance, redundancy, and complexity.

Uploaded by

eshayyy321
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 16

Advanced Database Management Systems

Name Zainab Ehsan

Roll Number 347

Semester 4

Section I

Date 15-June-2025

Assignment Number 2

Submitted To Ma’am Farwa Javed

Lahore Garrison University

Department of Computer Science

1
Question 1

Case Study: RetailMart Sales Data Warehouse

RetailMart, a national retail chain, maintains a Sales Data Warehouse to analyze product
performance across its various stores. The data is organized in an OLAP cube with the
following dimensions:
• Time: Year → Quarter → Month → Day
• Region: Country → State → City
• Product: Category → Subcategory → Product Name
• Measure: Total Sales, Quantity Sold
RetailMart’s management uses this data to make strategic decisions such as promotional
planning, stock replenishment, and store performance evaluation.

Making an OLAP Cube

Time

Region
Product

2
Scenario 1:

A manager wants to compare Total Sales for each Product Category in Q1 2025.

Q1. What OLAP operation is this?

A. Identifying OLAP Operations

This is a Slice & Dice (though only we’re slicing data) and Roll-Up operation.

• Slice & Dice: As we’re Filtering/Selecting data for a specific time (Q1 2025).
• Roll-Up: Aggregating data up to the Product Category level from the Product
dimension.

Q2. What dimensions are involved?

A. Dimensions Involved

• Time Dimension – Q1 2025.


• Product – Category Comparison
• Measure – Total Sales.

Q3. What part of the cube is being selected?

A. Cube Selection

The following are the parts of the cube being selected:

• Time (Q1 2025)


• Product (Category)
• Region (All)
• Measure (Total Sales)

3
Scenario 2:

A regional analyst wants to break down the sales in California into more detail, viewing by
city level instead of state.

Q1. What OLAP operation is being applied?

A. Drill-down OLAP operation

As we’re selecting “City” instead of “State”.

Q2. From which level to which level is the transition happening?

A. Transition

From State → City in the Region dimension.

4
Scenario 3:

The head office wants to summarize total sales across all U.S. states into a single value by
year.

Q1. What OLAP operation does this represent?

A. Roll-Up OLAP Operation

As we’re selecting “Total Sales” into a single value by “Year” and from “State” level to
“Country” level.

Q2. Which dimension is being rolled up?

A. Region Dimension

The Region dimension from State level → to Country level (United States) is being rolled up.
Additionally, the Time dimension is being viewed at the Year level.

5
Scenario 4:

A category manager wants to view product performance by rotating the OLAP cube to see
regions as columns and product categories as rows.

Q1. Which OLAP operation is applied here?

A. Pivot OLAP Operation

As we’re rotating the cube making regions as columns and product categories as rows.
Product Categories

Regions

Region Product Categories

Before Rotation After Rotation

Q2. What is the benefit of this operation?

A. Benefit of Pivot OLAP Operation

• It reorganizes the view to make analysis easier to interpret.


• Makes comparison between dimensions more visual and accessible.
• Help reveal patterns by changing perspective.

6
Scenario 5:

A store supervisor filters the data cube to focus only on "Electronics" sold in "New York"
during December 2024.

Q1. What OLAP operation(s) are involved?

A. Slice and Dice OLAP Operations

As we’re using more than one filter such as filtering the data by "Electronics" to a specific
country i.e. "New York" to being specific about time i.e. December 2024.

Q2. Is this a slice, dice, or both? Justify

A. Both OLAP Operation

• Slice: Fixing one value per dimension — e.g., Time (December 2024).
• Dice: Applying multiple filters across dimensions — Product (Electronics), Region
(New York).

Justification: Because more than one dimension is being filtered at specific values/ranges.

7
Question 2

Star Schema & Snowflake Schema

Part A: Conceptual Understanding

Q1. Define the following terms:

a) Fact Table

A. Basically, a Fact Table is the central table in a data warehouse schema that contains
quantitative data (measures) such as sales, revenue, stock or quantity. It includes foreign keys
about dimension tables.

b) Dimension Table

A. A Dimension Table contains descriptive, textual, or categorical information related to


dimensions of a fact contained by fact table (like product, time, region, colors). It helps slice
and dice the data in different ways.

c) Star Schema

A. A Star Schema is shaped like a star. It organizes data into a central fact table surrounded by
dimension tables, hence resembling a star shape. It helps simplify queries, improve
performance, enhance readability and handle large volumes of data.

Figure 1: Star Schema Example

8
d) Snowflake Schema

A. A Snowflake Schema is a normalized version of the star schema, where dimension tables
are normalized into multiple related tables, resembling a snowflake's branching structure. It
has a Fact Table, Dimension Table, Normalized Dimensions, Hierarchical Structure. It helps
reduce data redundancy, improve data integrity, allows for more details analysis by drilling
down into different levels of dimensions. Though it does require more complex joins to retrieve
data from multiple dimension tables and while storage is efficient, the joins needed for querying
can sometimes slow down performance.

Figure 2: Snowflake Schema Example

e) Normalization in the context of data warehousing

A. Normalization in the context of data warehousing means organizing data to reduce


redundancy and improve consistency by dividing tables into related sub-tables and linking
them via keys.

9
Part B: Diagram-Based Analysis

Refer to the diagrams given above. Answer the following questions:

Q2. Based on Diagram 1 (Snowflake Schema - Vehicle Sales):

a) Identify the fact table and write its name and attributes.

A. Sales Fact Table & Its Attributes

Fact Table: Sales Fact Table

Attributes:

• Time_key (FK)
• Item_key (FK)
• Branch_key (FK)
• Location_key (FK)
• Dollars_sold (Measure)
• Units_sold (Measure)

b) Name any 3 dimension tables and explain the type of data they store.

A. 3 dimension tables

1. Time Dimension Table

10
Stores: Temporal data to analyze sales over time.

Attributes: Time_key, Day, Month, Quarter, Year

2. Item Dimension Table

Stores: Product-level descriptive data.

Attributes: Item_key, Item_name, Brand, Type, Supplier_key

3. Branch Dimension Table

Stores: Information about store branches.

Attributes: Branch_key, Branch_name, Branch_type

c) Explain how normalization is applied between the Dealer, Location, and Country tables.

A. Normalization between Tables

• Instead of storing the full city, state, and country data repeatedly in the Location table,
it's stored in a separate City table.
• The Location table just keeps a reference via City_key (foreign key).
• This is a clear application of 3rd Normal Form (3NF), reducing redundancy and
ensuring data consistency.

11
Q3. Based on Diagram 2 (Sales Star/Snowflake Schema - Retail Sales):

a) Is this schema a star or snowflake schema? Justify your answer.

A. Snowflake Schema

This is a Snowflake Schema as it shows normalized dimension tables.

• Dealer table references Location_ID and Country_ID instead of storing full location
and country names.
• Product table references Variant_ID from a separate Variant table.

There are multiple levels of hierarchy in dimension tables, hence, making it a snowflake
schema.

b) Describe the relationship between the Location Dimension Table and Country
Dimension Table.

A. Relationship between the Location Table and Country Table

Dealer Table links to both Location_ID and Country_ID.

But there is no direct relationship between them.

It reflects a hierarchical normalization, Country holds broader geographical info, Location


specifies the region, Dealer acts as the bridge linking both.

12
c) How does the schema help in analyzing supplier-wise and city-wise sales?

A. Supplier-Wise Sales and City-Wise Sales

Supplier-Wise Sales

Using the Revenue fact table, Dealer dimension via Dealer_ID can be reached.

Dealer connects to Country via Country_ID, which leads to Country_Name.

This enables analysis like:

• Total sales per country


• Year-over-year revenue by country
• Comparing regional performance across countries

City-Wise Sales

Supplier table doesn’t exist

13
Part C: Application and Comparison

Q4. Application Scenario:

You are a data analyst for a chain of retail stores. Your manager wants to understand how
sales vary by time, product, supplier, and branch.

a) Which schema (star or snowflake) would you recommend for faster reporting and why?

A. Recommendations for a Schema

I would recommend Star Schema for faster reporting as faster query performance meaning
less joins required. Moreover, all data is denormalized.

b) Which schema is more suitable for minimizing redundancy and ensuring consistency?

A. More Suitable Schema

I would recommend Snowflake Schema for minimizing redundancy and ensuring


consistency as repetitive data is stored only once and reduces costs. Moreover, dimensions
are normalized.

14
Q5. Comparison Table:

Fill in the blanks in the following comparison table:

Feature Star Schema Snowflake Schema

Table Normalization Denormalized Normalized


Query Performance Faster Slower
Storage Requirement More Less
Maintenance Complexity Less More

15
Q6. Star Schema

Draw a simple star schema for a “Library Management System” including:

Fact table: Book Issues

Dimension tables: Book, Member, Time, Librarian

A. Star Schema for Library Management System

Book
Member
Fact Table Dimension Table
Dimension Table

Book Issues

Librarian Time

Dimension Table Dimension Table

16

You might also like