Introduction to Big Data

Big Data is a collection of data that is huge in volume and keeps growing exponentially with time. Its size and complexity are so great that traditional data management tools cannot store or process it efficiently. In short, big data is still data, only at a much larger scale.

Types of digital data

DIGITAL DATA

Digital data is information stored on a computer system as a series of 0s and 1s, that is, in binary form. Digital data takes discrete values, jumping from one value to the next in steps rather than varying continuously.

Example: Whenever we send an email, read a social media post, or take pictures with our digital camera, we are working with digital data.
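
To make the "series of 0s and 1s" concrete, here is a minimal sketch that prints the bits behind a short piece of text. The helper name to_bits and the choice of UTF-8 encoding are assumptions made for illustration only.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    # UTF-8 is an assumed encoding; other encodings give other bit patterns.
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


bits = to_bits("Hi")
print(bits)  # 01001000 01101001
```

Each group of eight bits is one byte: 'H' is stored as 01001000 (72) and 'i' as 01101001 (105).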

Digital data can be classified into three forms:


a. Unstructured Data: Data that does not conform to a data model, or is not in a form that a computer program can use easily, is categorized as unstructured data. About 80 to 90% of an organization's data is in this format.

Examples: Memos, chat rooms, PowerPoint presentations, images, videos, letters, research, white papers, the body of an email, etc.

The output returned by a Google search is another example.

Example Of Unstructured Data


b. Semi-Structured Data: Data that does not conform to a data model but has some structure is categorized as semi-structured data. Even so, it is not in a form that a computer program can use easily.

Examples: Emails, XML, markup languages such as HTML, etc. Metadata for this data is available but is not sufficient.

Personal data stored in an XML file:

<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
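
The tags in these records give a program some structure to work with. As a rough sketch, the records can be read with Python's standard xml.etree.ElementTree module. The <people> wrapper element is an assumption added here, because a well-formed XML document needs a single root element and the fragment above has none.

```python
import xml.etree.ElementTree as ET

# Records copied from the example. The <people> root is NOT in the
# original file; it is added so the fragment parses as one document.
xml_text = """<people>
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
</people>"""

root = ET.fromstring(xml_text)

# Turn each <rec> element into a plain dictionary.
records = [
    {
        "name": rec.findtext("name"),
        "sex": rec.findtext("sex"),
        "age": int(rec.findtext("age")),
    }
    for rec in root.findall("rec")
]

for record in records:
    print(record)
```

Nothing in the file itself says that age must be a number or that every record must have all three fields. The program has to assume that, which is one way of seeing why the metadata is described as available but not sufficient.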

c. Structured Data: Data that is in an organized form (i.e. in rows and columns) and can be easily used by a computer program is categorized as structured data. Relationships exist between entities of data, such as classes and their objects.

Example: Data stored in databases.

An ‘Employee’ table in a database is an example of structured data:

Employee_ID | Employee_Name   | Gender | Department | Salary_In_lacs
2365        | Rajesh Kulkarni | Male   | Finance    | 650000
3398        | Pratibha Joshi  | Female | Admin      | 650000
7465        | Shushil Roy     | Male   | Admin      | 500000
7500        | Shubhojit Das   | Male   | Finance    | 500000
7699        | Priya Sane      | Female | Finance    | 550000
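
Because the rows and columns are fixed, a program can query this data directly. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column types and the two queries are assumptions chosen for illustration, and the salary values are copied as they appear in the table.

```python
import sqlite3

# Rows copied from the 'Employee' table above.
rows = [
    (2365, "Rajesh Kulkarni", "Male", "Finance", 650000),
    (3398, "Pratibha Joshi", "Female", "Admin", 650000),
    (7465, "Shushil Roy", "Male", "Admin", 500000),
    (7500, "Shubhojit Das", "Male", "Finance", 500000),
    (7699, "Priya Sane", "Female", "Finance", 550000),
]

conn = sqlite3.connect(":memory:")  # temporary database, nothing is saved
conn.execute(
    """CREATE TABLE Employee (
           Employee_ID    INTEGER PRIMARY KEY,
           Employee_Name  TEXT,
           Gender         TEXT,
           Department     TEXT,
           Salary_In_lacs INTEGER
       )"""
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)

# Query 1: names of everyone in the Finance department.
finance_names = [
    name
    for (name,) in conn.execute(
        "SELECT Employee_Name FROM Employee "
        "WHERE Department = ? ORDER BY Employee_ID",
        ("Finance",),
    )
]
print(finance_names)

# Query 2: average salary per department.
avg_by_dept = dict(
    conn.execute(
        "SELECT Department, AVG(Salary_In_lacs) FROM Employee "
        "GROUP BY Department"
    )
)
print(avg_by_dept)
```

Unlike the XML sketch, the program does not have to guess at the shape of the data: the table definition states each column and its type up front.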
