MySQL vs MongoDB (NoSQL) database
MongoDB vs MySQL: Comparison Between Relational and Document Oriented Database
Both are used for storing the data and free to use that is both comes under open-sources software.
MongoDB might be unfamiliar at least to some of us as it is a relatively new compared to other established databases such as MySQL, PostgreSQL or Oracle.
The main and most important difference between these two are MySQL comes under RDBMS(relational database management system) whereas MongoDB not i.e relations between data in different tables is not possible using MongoDB.
Let’s discuss one by one both of these in terms of differences and features that makes these so useful for us/developers.
MySQL is the most popular relational database management system(due to its speed, robustness, and ease of use and obviously open-source) since 1995. As a database system, RDBMS stores data in a database object called table. Table is a structure of data set consist of row and column. In order to access the data stored in RDBMS, SQL language is used. The SQL use statement or commands as SELECT, INSERT, UPDATE, DELETE to manipulate, insert or delete the data stored in database.
In the modern business world, most data we have is non-structured, means the data that cannot be stored in a traditional table structure, known as structured data. Structured data is data which have a predefined data model that can be stored in SQL row and column format. Unstructured data refers to text-extensive data which may have numbers and structured data in the text, but it cannot be stored in the traditional SQL database.
Therefore, most of the data in our life cannot be stored in SQL format, then the term NoSQL http://en.wikipedia.org/wiki/NoSQL (Not only SQL) comes in existence.
NoSQL and MongoDB
With the huge amounts of data in the web, the need to have a database to store documents in becomes imperative. The initiative of creating a database capable of storing unstructured data began with NoSQL.
The idea of NoSQL has been started in 1998. It have a stream-operator paradigm to access data instead of SQL syntax, Stream operator paradigm is a database operator which resemble a mathematical operation in order to access data stored in database. Nevertheless, NoSQL is still using RDBMS model, therefore it has SQL equivalent in NoSQL operator and it does not really support the need to store document in database. NoSQL RDBMS is only qualified as “NoSQL RDBMS.”
MongoDB came with the aim of providing the new way of data storage. Therefore database can store document for the world wide web. Began in 2007, MongoDB is built to store data as an object in a dynamic schema, instead of a tabular database like SQL. The data in MongoDB is stored as object notation based on the format of JSON (Java Script Object Notation). JSON is a standard for data transfer over the network between the server and web application using human readable format. Prior to JSON, the XML is used for that purpose. MongoDB modified the JSON format into its own BSON, which store the object in a binary format instead of human readable format. Hence the acronym BSON stands for Binary JSON. BSON, due to its binary format provide more reliable and more efficient in storage space and speed.
The popularity of MongoDB gains when users love to use MongoDB for it delivers its promise as a document-oriented database.
The emergence of web-oriented data has shifted the database landscape with the arrival of Big Data. Big Data is the data which due to its nature can not be stored in a tabular model of relational database management system as exist today with the SQL syntax to manipulate the data.
Lets have a quick view over Big Data
Big data has grown due to 3-V i.e volume (the amount of data), velocity (the speed of data transfer), and variety (the range of data types and sources) has come to our everyday life.
Just imagine, we are now dealing with terabyte of data every day in data transfer, these were hardly found in 5-10 years ago. We also deal with the rapid increasing speed of data transfer through the advancement of computing technology which happens real fast. We also deal with the various source of data. The data we are analyzing is from multiple source(web iphone ipad tablets) in a multiple forms (many types video, audio, images or text). We practically can tap data anywhere we want it and any time we want to in a various format, with multiple size and multiple source.
This kind of unstructured data with its massive size in a single file hence the name Big Data is given to them, require a different approach to handle. We need more than just traditional SQL which is extremely good to manage structured data in its table format of row and column. However the unstructured data which take up to 90% of data we manage requires a non-tabular storage. We simply cannot rely on the SQL and traditional relational database management system to store and manage the data in form of multimedia file, image, blog and articles. We need a document oriented data which is MongoDB came from instead of MySQL to manage different kind of data: unstructured rather than structured data.
Now come on how MongoDB Stores Data?
MongoDB is a document-oriented database. Instead of storing your data in tables made out of individual rows, it stores your data in collections made out of individual documents. In MongoDB, a document is a big JSON blob with no particular format or schema.
There are a number of ways you could model this data. In a typical relational store, each of these boxes would be a table. You’d have a tv_shows table, a seasons table with a foreign key into tv_shows, an episodes table with a foreign key into seasons, and reviews and cast_members tables with foreign keys into episodes. So to get all the information for a TV show, you’re looking at a five-table join.
We could also model this data as a set of nested hashes. The set of information about a particular TV show is one big nested key/value data structure. Inside a TV show, there’s an array of seasons, each of which is also a hash. Within each season, an array of episodes, each of which is a hash, and so on. This is how MongoDB models the data. Each TV show is a document that contains all the information we need for one show.
All of the data we need for a TV show is under one document, so it’s very fast to retrieve all this information at once, even if the document is very large. There’s a TV show here in the US called “General Hospital” that has aired over 12,000 episodes over the course of 50+ seasons. PostgreSQL takes about a minute to get denormalized data for 12,000 episodes, while retrieval of the equivalent document by ID in MongoDB takes a fraction of a second.
So in many ways, this application presented the ideal use case for a document store.
Another example is social data?
When you come to a social networking site, there’s only one important part of the page: your activity stream. The activity stream query gets all of the posts from the people you follow, ordered by most recent. Each of those posts have nested information within them, such as photos, likes, re-shares, and comments.
The nested structure of activity stream data looks very similar to what we were looking at with the TV shows.
Users have friends, friends have posts, posts have comments and likes, each comment has one commenter and each like has one liker. Relationship-wise, it’s not a whole lot more complicated than TV shows. And just like with TV shows, we want to pull all this data at once, right after the user logs in. Furthermore, in a relational store, with the data fully normalized, it would be a seven-table join to get everything out.
Seven-table joins. Ugh. Suddenly storing each user’s activity stream as one big denormalized nested data structure, rather than doing that join every time, seems pretty attractive.
Till now we have gone through the basic features of MySQL and NoSQL obviously the MongoDB now the main and important question is when building a custom web application you need to consider the type of database that best suits the data, what will you choose and why as now we have two either table based database and document based first lets have a quick view on the differences between MySQL (Relational) and MongoDB (Non-Relational / NoSQL).
As in 2004, Ruby on Rails first came out and popularized web application frameworks. It also popularized ORM (Object-Relational Mapping) layers with its ActiveRecord object. An ORM layer basically provides an object oriented interface to a relational database. That means that instead of writing a query to insert or update a record, you assign some properties to an object and call a save method. Instead of writing queries with joins, you can access related data through properties of an object.I think you are well familiar with these if most of the PHP frameworks it is basically used.
For example, if you have a “post” object that represents a blog post, you can access it’s comments through the property “post.comments”. At first appearance, this might seem delightfully convenient, and to some extent it is. Fast forward a few years, and just about every web application framework out there implements an ORM layer. It lets you access relational data in a way that is convenient for your application. The problem, however, is that this is horribly inefficient and it teaches developers to ignore how the underlying database works. In fact, it’s so inefficient that it just about killed relational database systems altogether.
Let’s look at some of the differences between MongoDB and RDBMS like MySQL.
General Terms/Concepts in both database
SQL Terms/Concepts | MongoDB Terms/Concepts |
database | database |
table | collection |
row | document or BSON document |
column | field |
index | index |
table joins | embedded documents and linking |
primary key | primary key |
Data Representation
MySQL represents data in tables and rows, we all already know about this
Select Statement Comparison
Select * from Cust_Det.
This selects all rows and columns from table Cust_Det.
MongoDB represents data as collections of JSON documents.
The corresponding MongoDB documentation for the same would be:
db.Cust_Det.find();
This code literally finds all the documents and fields in Cust_Det collection.
Insert Statement Comparison
MySQL: INSERT INTO Cust_Det(call,radio,band) VALUES(“4XJHI”,”4XJHI/5″,144)
MongoDB: db.Cust_Det.insert({call:”4XJHI”, radio:”4XJHI/5″, band:”144″})
Note that the terms documents, fields, and collections used for MongoDB are equivalent to the terms rows, columns, and tables in MySQL.
If you think about it, a JSON document is very much like what you would be working with in your application layer. If you are using JavaScript, it’s exactly what you’re working with. If you’re using PHP, it’s just like an associative array. If you’re using python, its just like a dictionary object.
Querying
The SQL in MySQL stands for Structured Query Language. That’s because you have to put together a string in this query language that is parsed by the database system.
In MySQL, SQL Injection is possible due to the fact that it is a structured query language and you have to put together a string that is parsed by the database system. This is one of the major issues faced by MySQL, which makes it vulnerable to attacks from hackers and intruders. This problem has been known for a while and languages like PHP have a built-in function to get rid of these attacks. All you have to do is make use of mysql_real_escape_string function which takes up a string in a MySQL query and return the string with injection attempts safely escaped. This function escapes special characters in a string for use in an SQL statement. It replaces the troublesome (‘’) a user might enter, with an escape character, the backslash (\).
MongoDB uses a different mechanism called Object Querying, where we pass a document to explain what we are querying for. This significantly reduces the risk of being attacked by intruders and hackers. There isn’t any language to parse. If you’re already familiar with SQL, it’ll take a little bit of time to wrap your brain around this concept, but once you figure it out, it feels a lot more intuitive.
Relationships
One of the best things about MySQL and relational databases in general is the almighty JOIN operation. This allows you to perform queries across multiple tables.
MongoDB does not support joins, but it does multidimensional data types such as arrays and even other documents. Placing one document inside another is referred to as embedding. For example, if you were to create a blog using MySQL, you would have a table for posts and a table for comments. In MongoDB you might have a single collection of posts, and an array of comments within each post.
Transactions
Another great thing about MySQL is its support for atomic transactions. The ability to contain multiple operations within a transaction and roll back the whole thing as if it were a single operation.
MongoDB does not support transactions, but single operations are atomic.
Schema Definition
MySQL requires you to define your tables and columns before you can store anything, and every row in a table must have the same columns.
One of my favorite things about MongoDB is that you don’t define the schema. You just drop in documents, and two documents within a collection don’t even need to have the same fields.
Schema Design and Normalization
In MySQL there is really isn’t much flexibility in how you structure your data if you follow normalization standards. The idea is not to prefer any specific application pattern.
In MongoDB, you have to use embedding and linking instead of joins and you don’t have transactions. This means you have to optimize your schema based on how your application will access the data. This is probably pretty scary to MySQL experts, but if you continue reading, you’ll see there is a place for both MySQL and MongoDB.
Indexing
Indexes are used to find rows with specific column values quickly. In MySQL, if indexes are not defined, it must begin from the very first row and then scan through the entire table contents to find the relevant rows. Similarly, MongoDB also must scan every document in a collection to select those documents that match the query statement. Larger the table, larger the time it takes to fetch the desired output.
In MongoDB, defining indexes is an important task. Missing indexes can have a huge impact and could be potentially dangerous for your program. As MongoDB uses instance-wide-lock for writing queries, it’s very important to define indexes more precisely than it is in MySQL. Mostly, MySQL indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are formatted in B-trees. MongoDB indexes (DEFAULT_ID, SINGLE FIELD, COMPOUND INDEX, MULTIKEY INDEX, GEOSPATIAL INDEX, TEXT INDEX, HASH INDEX) are also stored in B-trees.
Performance
MySQL often gets blamed for poor performance. Well if you are using an ORM, performance will likely suffer. If you are using a simple database wrapper and you’ve indexed your data correctly, you’ll get good performance
By sacrificing things like joins and providing excellent tools for performance analysis, MongoDB can perform much better than a relational database. You still need to index your data and the truth is that the vast majority applications out there don’t have enough data to notice the difference.
This scenario accounts that overall MySQL is much better for select/read operations.MySQL performs better by a 2-5x speed for select operations. Notice the good insert speed on MongoDB: it is 2-3x faster than MySQL.
Is something make you more clear on when to use MySQL and MongoDB?
When should I use MySQL?
MySQL is well known for its scalability, flexibility, high performance, high availability, robust transaction support, strong data protection and management ease. If your database structure requires a number of tables and rows, MySQL will deal with your data with extreme robustness and ease of interaction. If you are just indexing your data properly, performance will not be a concern and there is a good chance that you won’t really need MongoDB. Furthermore, MySQL offers a powerful transactional database engine in the market. It features ACID (atomic, consistent, isolated, durable) transaction support and row-level locking transaction support, which prevent readers from blocking writers and vice-versa.
If your data structure fits nicely into tables and rows, MySQL will offer you robust and easy interaction with your data. If it’s performance that is your concern, there is a good chance you don’t really need MongoDB. Most likely, you just need to index your data properly. If you require SQL or transactions, you’ll have to stick with MySQL.
Another important feature present in MySQL is its ability to do complex JOIN operation queries. Using Inner, Outer, Left and Right JOIN operations, it is easy to fetch exactly the data you want from two or more tables with just one query. This literally reduces the overhead time required for writing multiple queries, thereby increasing accessibility and performance. Imagine you are developing an application in which you use a database having “n” number of tables. Let us say you need to perform a search for data that is stored in different tables. You can achieve this by writing multiple queries or by writing one JOIN operation query. If you write multiple queries, you will need to extract all matching rows for all queries and do whatever manipulation that are required for getting the same result.
When should I use MongoDB?
If your data seems complex to model in a relational database system, or if you find yourself DE-normalizing your database schema or coding around performance issues you should consider using MongoDB. If you find yourself trying to store serialized arrays or JSON objects, that’s a good sign that you are better off MongoDB. If you can’t pre-define your schema or you want to store records in the same collection that have different fields, that’s another good reason.
When documents are declared as independent units as in a document database, application logic can be written more easily. Also, unstructured data can be easily stored and retrieved because documents (objects) map nicely to programming language data types. Thus, for handling a vast amount of data in the web, the need to have a database that stores data as documents becomes a necessity. MongoDB is a document database and these products are intentionally designed to meet challenges.
You may consider MongoDB when or best practices for MongoDB:
**You expect a high write load
If you write tons of data with a low business value for each, MongoDB would be the best option. It prefers high insert rate over transaction safety. Adding new columns to RDBMS can lock up your entire database or create a major performance degradation in other. This happens particularly when your table size is larger than 10GB. Imagine you are using MySQL on a table that adds up several terabytes in a single table. Obviously, you cannot rely on such an environment where your entire data is in a risky atmosphere. As MongoDB is schema-less, adding new fields does not affect old rows and will be in tact.
**You need high availability in an Unreliable Environment
MongoDB is highly available on an unreliable environment such as a set of servers that acts as Master-Slaves. Apart from that, recovery from a data center breakdown is safe, instant and automatic.
**Your database is expected to grow big and Schema is Not Stable
MySQL table performance will degrade if you cross 10GB per table. If you want to partition your database, MongoDB has a built-in solution for that.
Adding new columns to RDBMS can lock the entire database in some database, or create a major load and performance degradation in other. Usually it happens when table size is larger than 1GB (and can be major pain for a system like BillRun that is described bellow and has several TB in a single table). As MongoDB is schema-less, adding a new field, does not effect old rows (or documents) and will be instant. Other plus is that you do not need a DBA to modify your schema when application changes.
**Your data is location based
MongoDB can be run on distributed network the load on the database in balanced between salves database, these can be location based as it has built in spacial functions, so finding relevant data from specific locations is fast and accurate.
**You don’t have a DBA
You don’t need a DBA to modify your scheme when application changes. If you don’t want to normalize your data and do joins, then MongoDB would be the right choice. Classes can be serialized to JSON objects and stored in MongoDB.
Conclusion
Choosing and designing database are crucial for your application. Both MySQL and MongoDB are very useful although they differ in their mode of operation, rather than just performance.
MySQL is a relational database management system (RDBMS) with SQL that has a fixed data model to store data in tabular model. Your structured data can be organized like sales statistics, so that in future, data reference and retrieval would not be a tedious job. On the other hand, MongoDB is a document-oriented database, which stores document and treat the document as data. MySQL follows a different approach. Choose the one that suits best for your application and need!
You probably thought this was going to be all about performance, but MySQL and MongoDB are both tremendously useful, and there are much more important differences in their basic operations than simply performance. It really comes down to the needs of your specific application.
References :
http://en.wikipedia.org/wiki/MySQL
http://en.wikipedia.org/wiki/MongoDB
http://docs.mongodb.org/manual/reference/sql-comparison/
http://www.tutorialspoint.com/mongodb/mongodb_php.htm
http://www.percona.com/blog/2013/10/22/designing-one-many-relations-mongodb-vs-mysql/
Pingback: PHP MYSQL Indonesia | Just another My blog Sites site