Types of Big Data
Dr. Anil Kumar Dubey
Associate Professor,
Computer Science & Engineering Department,
ABES EC, Ghaziabad
Affiliated to Dr. A.P.J. Abdul Kalam Technical University, Uttar
Pradesh, Lucknow
Types of Big Data
Following are the types of Big Data:
Structured
Unstructured
Semi-structured
According to Merrill Lynch, 80–90% of business
data is either unstructured or semi-structured.
Gartner also estimates that unstructured data
makes up about 80% of all enterprise data.
Types of Big Data: Structured
Any data that can be stored, accessed and
processed in a fixed format is termed
‘structured’ data.
10^21 bytes = 1 zettabyte;
in other words, one billion
terabytes form a zettabyte.
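The unit arithmetic behind this statement can be checked directly. The following is a minimal sketch, assuming decimal (SI) units; the constant names are illustrative and not part of the slides.

```python
# Decimal (SI) storage units, expressed in bytes.
TERABYTE = 10**12
ZETTABYTE = 10**21

# Number of terabytes needed to make up one zettabyte.
terabytes_per_zettabyte = ZETTABYTE // TERABYTE

print(terabytes_per_zettabyte)  # 1000000000, i.e. one billion
```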
Data stored in a relational database
management system is one example of
structured data.
Structured data is data in an
organized form (e.g., in rows and columns)
that can be easily used by a computer
program.
Relationships exist between entities of data,
such as classes and their objects.
Data stored in databases is an example of
structured data.
Structured Data Come from…
Structured vs. Semi-structured Data
Structured Data
Structured Data Retrieval
Structured Data Example
An ‘Employee’ table in a database is an
example of Structured Data.
Employee_ID  Employee_Name    Gender  Department  Salary_In_lacs
2365         Rajesh Kulkarni  Male    Finance     650000
3398         Pratibha Joshi   Female  Admin       650000
7465         Shushil Roy      Male    Admin       500000
7500         Shubhojit Das    Male    Finance     500000
7699         Priya Sane       Female  Finance     550000
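To illustrate what "fixed format" means in practice, the sketch below loads the Employee rows into a relational table and retrieves a subset with a query. It assumes SQLite through Python's standard sqlite3 module and an in-memory database; the column types are an assumption, since the slides give only the column names and values.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema is declared up front: every row must fit these columns.
# Column types are assumed; the slide shows only names and values.
conn.execute(
    """
    CREATE TABLE Employee (
        Employee_ID    INTEGER PRIMARY KEY,
        Employee_Name  TEXT NOT NULL,
        Gender         TEXT NOT NULL,
        Department     TEXT NOT NULL,
        Salary_In_lacs INTEGER NOT NULL
    )
    """
)

rows = [
    (2365, "Rajesh Kulkarni", "Male", "Finance", 650000),
    (3398, "Pratibha Joshi", "Female", "Admin", 650000),
    (7465, "Shushil Roy", "Male", "Admin", 500000),
    (7500, "Shubhojit Das", "Male", "Finance", 500000),
    (7699, "Priya Sane", "Female", "Finance", 550000),
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)

# Because the structure is known, retrieval is a simple declarative query.
finance = conn.execute(
    "SELECT Employee_Name, Salary_In_lacs FROM Employee "
    "WHERE Department = ? ORDER BY Employee_ID",
    ("Finance",),
).fetchall()

for name, salary in finance:
    print(name, salary)

conn.close()
```

Because every row follows the same declared schema, the query can refer to columns by name without inspecting the data first.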
Types of Big Data: Unstructured
Any data whose form or structure is unknown is
classified as unstructured data.
Besides being huge in size, unstructured
data poses multiple challenges when it is
processed to derive value from it.
A typical example of unstructured data is a
heterogeneous data source containing a
combination of simple text files, images, videos,
and so on.
Unstructured data refers to data that
lacks any specific form or structure.
This makes unstructured data difficult and
time-consuming to process and analyze.
Email is an example of unstructured data.
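The difficulty can be seen in a small sketch: an email body has no columns to query, so any information has to be pulled out by pattern matching. The message text, the addresses, and the regular expressions below are all hypothetical, chosen only for illustration; real emails would need far more robust handling.

```python
import re

# Hypothetical email body: free text with no fields or schema.
email_body = (
    "Hi team, the quarterly review has moved to 2024-03-15. "
    "Please send your slides to reviews@example.com before then. "
    "Questions can go to priya@example.com."
)

# With no schema to rely on, we guess at patterns in the text.
# Simplified email pattern: local part, "@", then dot-separated labels.
addresses = re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", email_body)

# Dates written as YYYY-MM-DD; other date styles would be missed.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", email_body)

print(addresses)
print(dates)
```

The patterns only find what they were written to expect, which is why processing unstructured data tends to be slow and error-prone compared with querying a table.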
Unstructured data
Unstructured data Come from…
Store Unstructured data
Extract information from Unstructured data
Unstructured Data: Example
The output returned by ‘Google Search’
Types of Big Data: Semi-structured
Semi-structured data can contain both
structured and unstructured forms of data.
Semi-structured data looks structured in
form, but it is not actually defined by a
fixed schema such as a table definition in
a relational DBMS.
An example of semi-structured data is data
represented in an XML file.
Semi-structured data is data that contains
both structured and unstructured
formats.
To be precise, it is data that has not been
organized in a particular repository
(such as a database) yet still contains
tags or other markers that separate
individual elements within the data.
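A minimal sketch of this idea, assuming JSON as the tagging format: each record names its own fields, and records need not share the same set of fields. The records and the email field are hypothetical and are not taken from the slides.

```python
import json

# Hypothetical records: each one carries its own field names (tags),
# and the records do not all have the same fields.
raw = """
[
  {"name": "Anil Dubey", "age": 33},
  {"name": "Sikha", "age": 31, "email": "sikha@example.com"}
]
"""

records = json.loads(raw)

# Discover the fields from the data itself; no schema was declared.
all_fields = sorted({key for rec in records for key in rec})

# Report which fields each record lacks.
missing_by_name = {
    rec["name"]: [f for f in all_fields if f not in rec] for rec in records
}

print(all_fields)
print(missing_by_name)
```

The tags make individual elements easy to locate, but a program still has to cope with fields that appear in some records and not in others.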
Semi-structured data
Semi-structured data come from…
Manage Semi-structured Data
Store Semi-structured Data
Extract Information from Semi-structured data
Semi-structured Data example
Personal data stored in an XML file
<rec><name>Anil Dubey</name><sex>Male</sex><age>33</age></rec>
<rec><name>Rohit Rastogi</name><sex>Male</sex><age>43</age></rec>
<rec><name>Sikha</name><sex>Female</sex><age>31</age></rec>
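The records above can be read with a standard XML parser. The sketch below assumes Python's xml.etree.ElementTree module; because a well-formed XML document needs a single root element, the three records are wrapped in a hypothetical <people> element that does not appear in the slides.

```python
import xml.etree.ElementTree as ET

# The three records from the slide, one per line.
fragment = (
    "<rec><name>Anil Dubey</name><sex>Male</sex><age>33</age></rec>"
    "<rec><name>Rohit Rastogi</name><sex>Male</sex><age>43</age></rec>"
    "<rec><name>Sikha</name><sex>Female</sex><age>31</age></rec>"
)

# Wrap the records in a single root; "people" is a hypothetical name.
root = ET.fromstring(f"<people>{fragment}</people>")

# The tags tell us which value is which, so no table definition is needed.
people = [
    {
        "name": rec.findtext("name"),
        "sex": rec.findtext("sex"),
        "age": int(rec.findtext("age")),
    }
    for rec in root.findall("rec")
]

for person in people:
    print(person["name"], person["sex"], person["age"])
```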