This document provides an introduction and overview of data structures and algorithms. It begins by outlining the topics that will be covered, including data structures, algorithms, abstract data types, and object-oriented programming. It then defines what a data structure is and provides examples. Different types of data structures are discussed, including linear structures like lists, queues, and stacks, as well as non-linear structures like trees, graphs, and hash tables. The document also defines what an algorithm is and discusses why algorithms are important. It provides examples of successful algorithms and discusses the need for correctness and efficiency in algorithms. The relationship between programs, data structures, and algorithms is also briefly explained.
Data Structures &Algorithms Lecturer: Dr. Muhammad M Iqbal Ph.D (DCU), M.Phil (GCU) 1
2.
Outline • Introduction toData Structures • What are data structures? • What data structures do we study? • Why we need Algorithms? • What are Abstract Data Types? • Why Object-Oriented Programming (OOP) and Java for data structures? • How do I choose the right data structures? 2
Introduction to DataStructure 4 • Data Structure – The logical or mathematical model of a particular organization of data • A data structure is a way to logically organize data that specifies: • A set of data elements i.e., a data object and, • A set of operations which may legally be applied to elements of this data object. Driver and a License Plate
5.
Applications of DataStructure 5 Graphics Compiler Construction Database Management Systems Statistical analysis package Numerical Analysis Financial Modelling Artificial Intelligence Simulations
6.
Object Oriented Terminology (Revision) •Class: A class is nothing but a blueprint or a template for creating different objects which defines its properties and behaviors. A class can contain fields and methods to describe the behavior of an object. • Object: An object is an instance of a class created using a new operator. The new operator returns a reference to a new instance of a class. • Methods: Methods are nothing but members of a class that provide a service for an object or perform some business logic. • Interface: An interface is a reference type in Java. It is similar to class. It is a collection of abstract methods. A class implements an interface, thereby inheriting the abstract methods of the interface 6
7.
Data Structures inJava • A collection is a container object that holds a group of objects, often referred to as elements. • The Java Collections Framework supports three types of collections, named lists, sets, and maps. 7
8.
Data Structures inJava • Several data structures provided by the Java utility packages and the major DS are • Enumeration: It allows retrieval of successive elements from other data structures (acts as an interface). • BitSet: Collection of bits (used as flags) with the ability to set or clear as appropriate. • Vector: Variable length array (dynamic array) • Stack: Collection of objects with a defined order 8https://www.cs.princeton.edu/courses/archive/spr96/cs333/java/tutorial/tools/packages/java.util.html
9.
Data Structures inJava • Dictionary: Dictionary (map, association list) is a data structure, which is generally an association of unique keys with some values. One may bind a value to a key, delete a key (and naturally an associated value) and lookup for a value by the key. • Hashtable: Hashtable stores key/value pairs in a hash table. Keys are used to specify the objects along with their link to values. The keys are hashed and resulting hash code is used as the index at which the value is stored within the table. • Properties: Properties is a subclass of Hashtable. It is used to maintain the list of values in which the key is a String and the value is also a String. 9
10.
What is DataStructure? • A method of organizing information so that the information can be stored and retrieved efficiently. • The structure not only stores data, but also supports operations for accessing and manipulating the data. 10
11.
11 Why study algorithms? •Theoretical importance – the core of computer science – Operating systems used algorithms for several operations • Practical importance – A practitioner’s toolkit of known algorithms – Framework for designing and analyzing algorithms for new problems – Helps in decision making for finding the best solution for the problem https://www.youtube.com/watch?v=-q-3b_093do
12.
Development of Algorithm •Archimedes invented a method for determining the volume of an object with an irregular shape. • A votive crown of pure gold for a temple had been made for King Hiero II. • Archimedes was asked to determine whether some silver had been substituted by the goldsmith or not. 12 Archimedes of Syracuse Source: en.wikipedia.org • The challenge for Archimedes was to solve the problem without damaging crown. • He solved this problem during bath and he noticed that the level of the water in the tub rose as he got in. • He used this effect to determine the volume of the crown. • By dividing the mass of the crown by the volume of water displaced, the density of the crown could be obtained.
13.
13 What is analgorithm? An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time. “computer” problem algorithm input output Classes of algorithms on data structures: • Search • Sort • Insert • Update • Delete
14.
Algorithms • The algorithmsare independent of programming languages. • We can always use features of a language to enhance the performance of an algorithm. • Designing of an algorithm is an art and we are going to learn the skills of designing the algorithms in this module. • Whenever we design an algorithm, the upper and lower limit of input size for that algorithm should be clearly known in advance before the practical implementation for any problem. 14
15.
Features of Algorithm •Finiteness: An algorithm must always terminate after a finite number of steps. Similar procedures which differ only in that they do not terminate can be described as computational methods. • Definiteness: Each step of an algorithm must be precisely defined; the actions to be carried out must be rigorously and unambiguously specified for each case. • Input: An algorithm has zero or more inputs: quantities that are given to it initially before the algorithm begins, or dynamically as the algorithm runs. • Output: An algorithm has one or more outputs: quantities that have a specified relation to the inputs. • Effectiveness: An algorithm is also generally expected to be effective, in the sense that its operations must all be sufficiently basic that they can in principle be done 15
16.
Observations about TeaMachine • Can perform only tasks machine’s interface presents. • You must understand these tasks • Cannot access the inside of the machine • You can use the machine even though you do not know what happens inside. • Can perform only tasks specific to DS • Must follow to the specifications of the operations implemented in the DS • Cannot access data inside the DS without interface • Use DS application, even if you don’t know how data is stored • Usable even with new implementation. 16 Tea Machine Now, some old vending machines have the facility of credit/ debit cards
17.
Challenge for SolvingProblems • To solve a computational problem, we write code/ programs. • Programs are made up of components, data storage mechanisms and ways to manipulate/ interpret/use the stored data during the processing. • It is not good enough to do this in any way, shape or form, we must do it efficiently. 17
18.
Successful Examples ofAlgorithms • Consider large data industries: Amazon AWS • How do we handle networking traffic and rerouting if there is a problem somewhere in the world (Data Analytics algorithms). Facebook • Maintenance of hardware, keep track of problems automatically and deal with them automatically (Edge rank, Machine learning algorithms) 18 Google • How to manage large scale indexing to make it useful (Page Rank) Page C has a higher PageRank than Page E, even though there are fewer links to C
19.
What do weneed? Correctness and Efficiency Correctness can be described as • We can be more confident in the logic behind our programs if we can be sure of and agree on the logic of the data structures used • A given data structure has defined access and modification methods 19 Efficiency can be considered in terms of multiple resources • These include time and space considerations • We will mostly consider time but at times we may have to compromise with regard to space
20.
Maintainable Code • Weneed to take the following into account: i. Long functions ii. Multiple responsibilities iii. Don't repeat yourself iv. Naming conventions v. Use assertions vi. Refactoring vii. Testing 20
21.
i. Long Functions •Long functions can complicate the code • It becomes unclear what the purpose of the function is • Additionally, it is difficult to debug longer functions • Distribute the load of programming logic into small parts to obtain better efficiency 21
22.
ii. Multiple Responsibilities • Acrossthe board, ensure that the methods, classes and modules are only responsible for what they need to be responsible for • For example, the same class should not process data, display data, and handle I/O • This should all be separated out • Get and Set functions • High Cohesion = clearly defined responsibility 22
23.
iii. Don't RepeatYourself • You should know this from first year • If the same code is in multiple places, the updates need to be applied to all locations • Calling a suitable function rather than inserting a block of code increases legibility 23
24.
iv. Naming Conventions •Consider names as a form of documentation • Selecting good names and following a specific style makes everyone's life easier • For both naming conventions and code style we will be following the Google Style guides throughout this course 24
vi. Refactoring • Sometimesyou need to update your own code • Larger scale refactoring in a company is a big deal and prone to damage • Ideally, do this is small steps • Best to do so in conjunction with appropriate tests • Do not add new functionality while refactoring • Small steps, take a large function and break down into smaller ones one at a time 26
27.
vii. Testing • Testingis vital throughout your programming career • Used for refactoring • Used as test-driven development • Unit tests are considered as useful to the testing of a code 27https://netbeans.org/kb/docs/java/junit-intro.html
28.
28 Analysis of algorithms •How good is the algorithm? – time efficiency – space efficiency • Does there exist a better algorithm? – lower bounds – optimality
29.
Computer Program • Program= Data Structure + Algorithm 29 “computer” Program Data Structure + Algorithm input output Ways of Sorting Data Recipes on Operating
30.
Data Structures andAlgorithms ● An array is an aggregation of entries that are arranged in contiguous fashion with the provision of a single-step random access to any entry. – There are numerous other situations where more sophisticated data structures are required. – Data structure categories: i. Linear ii. Non-linear – Category is based on how the data is conceptually organized or aggregated. 30
31.
Data Structures andAlgorithms ● Linear structures – List, Queue and Stack are linear collections, each of them serves as a repository in which entries may be added or removed at will. ● Differ in how these entries may be accessed once they are added. 31
32.
Data Structures andAlgorithms ● List – The List is a linear collection of entries in which entries may be added, removed, and searched for without restrictions. ● Two kinds of list: – Ordered List (Sorted) – Unordered List (Not sorted) 32
33.
Data Structures andAlgorithms ● Queue – Entries may only be removed in the order in which they are added. ● First out (FIFO) data structures ● No search for an entry in the Queue ● Stack – Entries may only be removed in the reverse order in which they are added. ● Last In, First Out (LIFO) ● No search for an entry in the Stack. https://www.youtube.com/watch?v=BFMdRrjGCRw 33
34.
Data Structures andAlgorithms ● Trees – Non-linear arrangement. – There are various tree structures. ● Binary Tree – Consists of entries each of which contributes to the tree as a whole based on its position in the tree. ● Moving an entry from one position to another changes the meaning of the Binary Tree. 34
35.
Data Structures andAlgorithms 35 ● Binary Search Tree – A binary search tree (BST) is a binary tree where each node has a Comparable key (and an associated value) and satisfies the restriction that the key in any node is larger than the keys in all nodes in that node's left subtree and smaller than the keys in all nodes in that node's right subtree. – Arranged (effectively) in sorted order: tree analogue of the Ordered List.
36.
Data Structures andAlgorithms ● General Tree – Models a hierarchy such as the organizational structure of a company, or a family tree. ● A non-linear arrangement of entries, it is a generalization of the binary tree structure, hence the name. 36
37.
Data Structures andAlgorithms ● Heap as a Priority Queue – A priority queue is a specialization of the FIFO Queue. ● Entries are assigned priorities. ● The entry with the highest priority is the one to leave first. 37
38.
Data Structures andAlgorithms ● Hash Table – Stores entries with the sole aim of enabling efficient search. ● Requires a sound knowledge of certain mathematical properties of numbers, and so-called hash functions that manipulate numbers. 38
39.
Data Structures andAlgorithms – A general tree is a special kind of graph, since a hierarchy is a special system of relationships among entities. ● Graphs may be used to model systems of physical connections such as computer networks, airline routes, etc., as well as abstract relationships such as course pre-requisite structures. ● Standard graph algorithms answer certain questions we may ask of the system. ● Two kinds of graphs: – Directed Graph—asymmetric relationship – Undirected Graph—a symmetric relationship Graphs 39
40.
Abstract Data Types 40 •Our goals is to write a code that can be reused in many different applications. • One way to make code reusable is to encapsulate the data elements together with the methods that operate on that data. • An abstract data type (ADT) is a mathematical model for data types, where a data type is defined by its behavior (semantics) from the point of view of a user of the data and possible operations on data of this type, and the behavior of these operations.
41.
What are AbstractData Types? ● Software developers struggle to write code that is robust, easy to maintain, and reusable. – A data structure is an abstract data type. ● i.e. The primitive data types built into a language integer, real, character and boolean. – We do not at all worry about its internal representation. ● We know we can perform certain operations on integers that are guaranteed to work the way they are intended to. – “+”, “-”, “*”, “/” language designers and compiler writers were responsible for designing the interface for these data types, as well as implementing them. 41
42.
What are AbstractData Types? ● Exactly how these operations are implemented by the compiler in the machine code is of no concern. – On a different machine, the behavior of an integer does not change, even though it's internal representations may change. – As the programmers were concerned, the primitive data types were abstract entries. 42
43.
What are AbstractData Types? ● Viewers see the TV in terms of its volume, contrast, brightness, and other controls. – Going one level lower, the TV itself is built out of various components. – One of these may be programmable chips and then transistors. 43
44.
What are AbstractData Types? ● A data structure consisting of several layers of abstraction. – A Stack (of integers) may be built using a List (of integers), which in turn may be built using an array (of integers). ● Integer and array are language-defined types. ● Stack and List are user-defined. 44
45.
What are AbstractData Types? • Since an ADT makes a clean separation between interface and implementation, the user only sees the interface and therefore does not need tamper with the implementation. – The responsibility of maintaining the implementation is separated from the responsibility of maintaining the code that uses an ADT. This makes the code easier to maintain. – A List ADT may be used directly in the application code, or may be used to build another ADT, such as the Stack. In Java: An interface is like a contract that tells the applications programmer precisely what methods are available and describes the operations they perform.45
46.
What are AbstractData Types? ● A stack may be built using a List ADT. ● A stack interface defines four operations: – push – pop – get-Size – isEmpty 46 ● The stack object contains a List object which implements its state, and the behavior of the Stack object is implemented in terms of the List object's behavior.
47.
Why OOP andJava for Data Structures? 47 • OOP languages are designed to overcome the traditional problems of procedural languages. • The basic unit of OOP is a class, which encapsulates both the static properties and dynamic operations within a "box", and specifies the public interface for using these boxes. Actually, OOP combines the data structures and algorithms of a software entity inside the same box. • OOP languages permit higher level of abstraction for solving real-life problems. The OOP languages (such as Java, C++ and C#) let you think in the problem space, and use objects to represent and abstract entities of the problem space to solve the problem.
48.
How Do Ichoose the Right Data Structures? ● Example – Implementing a printer queue, requires a queue data structure. ● Maintains a collection of entries in no particular order. ● An unordered list would be the appropriate data structure in this case. – It is not too difficult to fit the requirements of the application to the operations supported by a data structure. ● It is more difficult to choose from a set of candidate data structures that all meet the operational requirements. 48
49.
How Do Ichoose the Right Data Structures? ● When we have more than one data structure implementation whose interfaces satisfy our requirements, we may have to select one based on comparing the running times of the interface operations. ● Time is traded off for space, – i.e. more space is consumed to increase speed, or a reduction in speed is traded for a reduction in the space consumption. 49
50.
How Do Ichoose the Right Data Structures? ● Time-space tradeoff – We are looking to “buy” the best implementation of a stack. ● StackA. Does not provide a getSize operation. i.e., there is not single operation that a client can use to get the number of entries in StackA. ● StackB. Provides a getSize operation, implemented in the manner, transferring entries back and forth between two stacks. ● StackC. Provides a getSize operation, implemented as follows: a variable called size is maintained that is incremented every time an entry is pushed, and decremented every time an entry is popped. 50
51.
51 Summary • Introduction toData Structures and OOP Terminology in JAVA • Availability of Default Data Structures is JAVA • Understanding of the Definition of Algorithm • Why we need algorithms in different areas of industry • Pseudo Code for the algorithm (Example) • Data Structures and Algorithms • What is meant by Abstract data type • How do we choose the right algorithm for the problem
52.
Resources/ References • Introductionto Algorithms, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein 3rd, 2009, MIT Press and McGraw-Hill. • Data Structures: Abstraction and Design Using Java, Elliott B. Koffmann, 2nd, 2010, Wiley. • Data Structures and algorithm analysis in java, Mark A. Weiss, 3rd, 2011, Prentice Hall. • Frank M. Carrano, Data Structures and Abstractions with Java™, 3rd Edition, Prentice Hall, PEARSON, ISBN-10: 0-13-610091-0. • This lecture used some images and videos from the Google search repository to raise the understanding level of students. https://www.google.ie/search? Dr. Muhammad M Iqbal*