Beginner’s Guide to Concepts of NoSQL and MongoDB
Documented by: Maulin Shah
Purpose
The purpose of this document is to introduce beginners to NoSQL concepts and the MongoDB database.

Introduction
In recent years, as the internet and new technologies have become accessible to more people, data generation has accelerated sharply: the amount of data that took a year to generate in the 1990s is now, in 2016, generated in an hour or perhaps less, and the rate keeps increasing every day. Traditional methods and technologies for collecting and processing data (such as SQL databases and the frameworks built around them) were not designed to handle this volume. New methods and technologies have therefore become a necessity, which is why we now see technologies such as Hadoop and NoSQL that are designed specifically for data at this scale.

History of database technologies
What are a database and a Database Management System (DBMS)?
A database is an organized collection of data. In a relational database, the data is organized into schemas and tables. A database management system is the software through which users interact with the database: the user submits a request in the form of a query, and the DBMS processes it and returns the appropriate output.

Three main era-based models in database technology
1) Navigational (1960-1970)
2) Relational (SQL) (1970-2000)
3) Post-relational (NoSQL) (2000-ongoing)
The relational database (RDBMS) was the most successful data model until the 2000s.

Shortcomings of RDBMS
1) Poor handling of unstructured and semi-structured data. As the internet has expanded year over year, more and more such data is being produced, and it does not fit well into fixed relational tables.
2) CRUD operations can become slow and costly at large scale, because the database has to perform joins and maintain relationships among different pieces of data.
3) The rigid schema structure makes it hard to scale out an RDBMS across many servers.
To overcome these shortcomings, NoSQL databases were introduced.

NoSQL (Not Only SQL)
NoSQL databases are intended for distributed data stores that need to hold very large volumes of data. They suit this role because they do not require fixed schemas, they avoid join operations, and they scale horizontally. The name was first used in 1998 for a lightweight relational database that stored its tables as ASCII files, with each tuple represented as a row and fields separated by tabs; today's NoSQL databases do not work this way, and each product uses its own storage format.

Types of NoSQL databases
1) Key-Value Oriented (Redis, Riak)
2) Column Oriented (HBase, Cassandra)
3) Document Oriented (MongoDB, CouchDB)
4) Graph Oriented (Neo4j)

Key-Value Oriented
In this type of database the client can get, put, or delete the value for a key. The value is a binary large object (blob): the database stores it without caring what is inside, and it is the responsibility of the application to understand what exactly is stored.

Column Oriented
In this type of database the data is organized by column, and every column is treated individually. The values of a single column are stored adjacently, and the data for each column is maintained in column-specific files.

Document Oriented
In this type of database, documents are stored in the value part of a key/value store. The documents are hierarchical tree data structures that can contain maps, collections, and scalar values.

Graph Oriented
This type of database uses a graph structure to answer queries. It is mainly used for storing nodes and the relationships between those nodes. Defining and finding relationships is quick and easy in this type of database.

CAP Theorem
The CAP theorem concerns three properties we would like any distributed system to have: C (Consistency), A (Availability), and P (Partition tolerance). Unfortunately, a distributed system can guarantee at most two of the three at the same time.
Consistency: every user sees the same data after an operation has executed.
Availability: every request gets a response, with little or no downtime.
Partition tolerance: the system keeps working properly even when communication among servers is unreliable.

Sharding in NoSQL databases
Sharding is a partitioning scheme for large databases distributed across several servers, and it is what provides high performance and scalability. It divides the database into smaller parts (shards) and spreads them across different servers. Replication, which keeps copies of the same data on several servers, is a separate mechanism that is often combined with sharding.
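To make the sharding idea above concrete, here is a minimal sketch in plain JavaScript. It assumes the simplest possible scheme, hashing a key and taking the remainder by the number of shards; the names NUM_SHARDS, hashKey, shardFor, put, and get are invented for this illustration and are not part of any database API. Real systems use more careful schemes, so treat this only as a picture of the routing step.

```javascript
// Illustrative sketch only: route each key to one of several shards by hashing.
// All names here are hypothetical, not MongoDB APIs.
const NUM_SHARDS = 3;

// A simple deterministic string hash (real databases use stronger functions).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep h an unsigned 32-bit integer
  }
  return h;
}

// Decide which shard owns a given key.
function shardFor(key) {
  return hashKey(key) % NUM_SHARDS;
}

// Each shard is modeled as an in-memory Map standing in for one server.
const shards = Array.from({ length: NUM_SHARDS }, () => new Map());

function put(key, doc) {
  shards[shardFor(key)].set(key, doc);
}

function get(key) {
  // Only the owning shard is consulted, not the whole data set.
  return shards[shardFor(key)].get(key);
}

put("user:1", { FName: "Maulin" });
put("user:2", { FName: "Paresh" });

console.log(get("user:1"));            // the document stored under "user:1"
console.log(shards.map((s) => s.size)); // how many documents each shard holds
```

Because the same key always hashes to the same shard, a read only has to contact one server, which is where the performance and scalability benefit comes from.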
MongoDB
Learning MongoDB is quite easy and fun. MongoDB is an open-source, document-oriented NoSQL database. It can be used as an alternative to an RDBMS, and it can also be used alongside more specialized NoSQL databases where high performance is needed.

Conceptual Understanding of MongoDB
1) MongoDB has the same concept of a database that you may know from SQL (comparable to a schema in Oracle). Within a MongoDB instance you can have zero or more databases, each acting as a high-level container for everything else.
2) A collection in MongoDB corresponds to a table in an RDBMS. A database can have zero or more collections.
3) A document in MongoDB can be seen as a row in an RDBMS. A collection is made of zero or more documents.
4) A field in MongoDB can be seen as a column in an RDBMS. A document can have zero or more fields.
5) The concept of indexes is the same as in SQL.
6) The cursor is a new concept: when we ask MongoDB for data, it returns a pointer to the result set, and that pointer is called a cursor.

Hierarchy: DATABASE > COLLECTIONS > DOCUMENTS > FIELDS

Basic operations in MongoDB
Let’s start playing with MongoDB.

Insert()
    db.User.insert([
      {_id: 1, FName: "Maulin", Lname: "Shah", address: {street: "Shyamalcrossroad", city: "Ahmedabad", zipCode: 380015}, Phone: [8980162257, 9409032647]},
      {_id: 2, FName: "Paresh", Lname: "Patel", address: {street: "kharadigate", city: "Surat", zipCode: 300018}, Phone: [8089612275]}
    ])
If the collection does not exist yet, insert() will create it. (Interesting!) Also, if we don’t specify the _id field, MongoDB will generate one by itself as an ObjectId. (Wow!)
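The two behaviors just described (a collection being created on first insert, and an _id being generated when none is given) can be imitated without a server. The following is a rough in-memory sketch in plain JavaScript; ToyCollection, collection(), and the counter-based "auto-N" ids are hypothetical stand-ins, and real MongoDB generates ObjectId values rather than these strings.

```javascript
// Toy in-memory model of database > collection > document > field.
// ToyCollection and collection() are hypothetical names for illustration;
// this is not how MongoDB is implemented.
class ToyCollection {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }

  // Accepts a single document or an array of documents.
  insert(docs) {
    const list = Array.isArray(docs) ? docs : [docs];
    for (const doc of list) {
      const stored = { ...doc };
      // Generate an id only when the caller did not supply one.
      if (!("_id" in stored)) {
        stored._id = `auto-${this.nextId++}`; // stand-in for an ObjectId
      }
      // _id values must be unique within a collection.
      if (this.docs.some((d) => d._id === stored._id)) {
        throw new Error(`duplicate _id: ${stored._id}`);
      }
      this.docs.push(stored);
    }
    return list.length;
  }
}

// The "database": collections are created lazily on first use.
const db = new Map();
function collection(name) {
  if (!db.has(name)) db.set(name, new ToyCollection());
  return db.get(name);
}

collection("User").insert([
  { _id: 1, FName: "Maulin", address: { city: "Ahmedabad" } },
  { FName: "Paresh", address: { city: "Surat" } }, // no _id given
]);

console.log(collection("User").docs);
```

Note how the second document comes back with a generated _id while the first keeps the one it was given, and how nested fields such as address simply travel along inside the document.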
Find()
It works like the select statement in SQL.
Example
    db.example.find()           ->  select * from example;
    db.example.find({Age: 24})  ->  select * from example where Age = 24;

Comparison operators
List of operators
    $ne     -> not equal to
    $gt     -> greater than
    $gte    -> greater than or equal to
    $lt     -> less than
    $lte    -> less than or equal to
    $in     -> in
    $nin    -> not in
    $mod    -> modulo
    $exists -> field exists
Example
    db.example.find({Name: {$ne: "Maulin"}})
The above query will return all documents from the example collection where Name is not Maulin.

Logical operators
List of operators
    $and, $or
Example
    db.example.find({Name: "Maulin", Age: 24})
It will return all documents where Name is Maulin AND Age is 24; listing several fields in one query document combines them with an implicit AND.

Easy, right? So, this is it from my side. Learning a new technology is not always hard.
** ALL THE BEST ** ** HAPPY LEARNING **
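For readers who want to experiment with the Find() operators without a running server, the filter semantics described above can be approximated in plain JavaScript. This is only a sketch of a small subset: the names operators, matches, and find are made up here, the sample documents and ages are invented, and real MongoDB matching has many more rules (arrays, nested paths, type ordering) that this ignores.

```javascript
// Hypothetical evaluator for a small subset of the query operators listed above.
// Not MongoDB's implementation: no arrays, nested paths, or type ordering.
const operators = {
  $ne: (v, arg) => v !== arg,
  $gt: (v, arg) => v > arg,
  $gte: (v, arg) => v >= arg,
  $lt: (v, arg) => v < arg,
  $lte: (v, arg) => v <= arg,
  $in: (v, arg) => arg.includes(v),
  $nin: (v, arg) => !arg.includes(v),
  $mod: (v, [divisor, remainder]) => v % divisor === remainder,
  $exists: (v, arg) => (v !== undefined) === arg,
};

// True when a condition looks like {$gt: 5, $lt: 10} rather than a plain value.
function isOperatorObject(cond) {
  if (cond === null || typeof cond !== "object" || Array.isArray(cond)) return false;
  const keys = Object.keys(cond);
  return keys.length > 0 && keys.every((k) => k.startsWith("$"));
}

// Does one document satisfy the query? Top-level fields are ANDed together.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => {
    if (field === "$and") return cond.every((q) => matches(doc, q));
    if (field === "$or") return cond.some((q) => matches(doc, q));
    const value = doc[field];
    if (isOperatorObject(cond)) {
      return Object.entries(cond).every(([op, arg]) => {
        if (!(op in operators)) throw new Error(`unsupported operator: ${op}`);
        return operators[op](value, arg);
      });
    }
    return value === cond; // plain equality, e.g. {Age: 24}
  });
}

// Rough counterpart of db.example.find(query) over an in-memory array.
function find(docs, query = {}) {
  return docs.filter((doc) => matches(doc, query));
}

// Invented sample data for the sketch.
const example = [
  { Name: "Maulin", Age: 24 },
  { Name: "Paresh", Age: 30 },
  { Name: "Maulin", Age: 31 },
];

console.log(find(example, { Name: "Maulin", Age: 24 }));
console.log(find(example, { Name: { $ne: "Maulin" } }));
console.log(find(example, { $or: [{ Age: 24 }, { Age: 30 }] }));
```

The same query documents can later be passed unchanged to find() in the MongoDB shell, which makes this a convenient way to get used to the query shape before installing anything.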
