When making a choice in any kind of software, you must take the factors like price, performance, capacity, and capabilities. And that’s exactly what we will do today in helping you make the choice for your database solution.
Databases are applications that help you store large amounts of data and retrieve them as fast as you put them there. It solves the problem of: “I have all this data, and want to add some, but want to find something i want in this heap of data, what do i do?”
Most databases having features of indexing, searching, categorizing (for example, MySQL has the LIKE statement to find a range of columns containing a word). And knowing that computers can store / retrieve data much faster than humans, Databases were invented as a solution to the problem i described above.
We will enlist a list of database solutions available today in the market and do a comparison of their features, and if exist, limitations.
So, let’s start!
1 – MySQL
MySQL is one of the biggest and oldest in the game, as a relational database, serving over billions of websites out there, and is actually the default database for the majority of web hosts and web applications today. Here’s the list of features of MySQL:
- Uses multi-layered server design with independent modules.
- Designed to be fully multi-threaded using kernel threads, to easily use multiple CPUs if they are available.
- Provides transactional and nontransactional storage engines.
- Uses very fast B-tree disk tables (
MyISAM) with index compression.
- Designed to make it relatively easy to add other storage engines. This is useful if you want to provide an SQL interface for an in-house database.
- Uses a very fast thread-based memory allocation system.
- Executes very fast joins using an optimized nested-loop join.
- Implements in-memory hash tables, which are used as temporary tables.
- Implements SQL functions using a highly optimized class library that should be as fast as possible. Usually there is no memory allocation at all after query initialization.
- Provides the server as a separate program for use in a client/server networked environment, and as a library that can be embedded (linked) into standalone applications. Such applications can be used in isolation or in environments where no network is available.
The feature list goes on and on and on and on… now let’s be fair and enlist the limitations too:
- Performance drops as requests increase in size (meaning, it will be slower with 100 users than 10 users), which is not true in NoSQL solutions
- Can’t use TIMESTAMP in Date and Datetime data types
It is the default database that i use on my web applications, or any other appliance that i use with a need for storing big data.
2 – Microsoft SQL Server
Microsoft SQL Server is even an older player in this game, being the backbone for the Microsoft ISS Servers, ASP.Net applications, and other general Windows appliances.
And of course, as with all Microsoft products, it is proprietary. It comes out of the box with most Windows Server installations but you are still able to do download it from the Microsoft official Website.
Microsoft SQL Server has somewhat of a shadier startup than other database appliances. It initially bought Sybase’s database in the version of that time, and the first version of MSSQL was actually a copy of that version.
This databases features are the same as MySQL’s, except the performance and easy syntax one. MSSQL syntax is pretty ugly and it lacks a support for JSON. But it is a good default database for your server.
3 – MongoDB
This is a NoSQL (document-based) database that is used for the newest web applications. It has REST interfaces, and as i see newly, the guys at the PHP documentation made a page for it with a full tutorial:
Will remember this to use it when i can’t be bothered to install MySQL. I sincerely loved it’s syntax (Check out the BSON (Binary JSON) specification on http://bsonspec.org/#/specification). It uses JSON-styled syntax to save and retrieve documents.
- Document Database
- Documents (objects) map nicely to programming language data types.
- Embedded documents and arrays reduce need for joins.
- Dynamic schema makes polymorphism easier.
- High Performance
- Embedding makes reads and writes fast.
- Indexes can include keys from embedded documents and arrays.
- Optional streaming writes (no acknowledgments).
- High Availability
- Replicated servers with automatic master failover.
- Easy Scalability
- Automatic sharding distributes collection data across machines.
- Eventually-consistent reads can be distributed over replicated servers.
And as every application, it comes with limitations:
- Not Reclaiming Space – MongoDB does not automatically reclaim disk space used by deleted objects, which is a clear problem
- Poor Memory Management – MongoDB manages memory by memory mapping your entire data set, leaving page cache management and faulting up to the kernel. A more intelligent scheme would be able to do things like fault in your indexes before use as well as handle faulting in of cold/hot data more effectively. The result is that memory usage can’t be effectively reasoned about, and performance is non-optimal for a web application.
- Global write lock – MongoDB (as of the current version at the time of writing: 2.0), has aprocess-wide write lock. Conceptually this makes no sense. A write on collection X blocks a write on collection Y, despite MongoDB having no concept of transactions or join semantics. We reached practical limitations of MongoDB when pushing a mere 200 updates per second to a single server. At this point, all other operations including reads are blocked because of the write lock. When reaching out to 10gen for assistance, they recommended we look intosharding, since that is their general scaling solution. With other RDBMS solutions, we would at least be able to continue vertically scaling for some time before investigating sharding as a solution.
Overall, each of the appliances have pluses and minuses, and it is up to the reader to choose the best one of them. But i would still go and claim that MySQL is the best one of them all. Stability and reliability is a must for me and the open-source appliance provides enough of it for my web applications.
Thanks for the read and keep reading!