DBaaS and CosmosDB — Taking the buzz out of the buzzwords!

abhinav tripathi
Published in Cosmosnaut
4 min read · Mar 28, 2022

If you are a software developer, you probably consume SaaS like a staple. You have likely come across the term PaaS quite often too, and IaaS a little less often. And if I asked you about BaaS and DBaaS, many of you could guess the full forms but may not have run into them much. DBaaS stands for Database as a Service. In this article, I focus on explaining “why DBaaS” and try to simplify it using one of the most popular DBaaS offerings as the example: Cosmos DB.

What is CosmosDB?

Is CosmosDB a database like MySQL, Postgres, SQL Server, Firebase, Mongo, etc.? NO, it is not. Then what is it? CosmosDB is a multi-model, schema-agnostic, globally distributed database service that comes with SLA-backed availability and enterprise-grade security. Wait! Whaaaaaaattttttt? We are just getting started and most of the terms in that last sentence probably do not make sense yet. Aren’t you lost already?

Ok, let’s start with the fundamentals!

CosmosDB is a database service and not just a database. In a traditional database setup, while developing your application, you need to deploy (host) your database somewhere (a server), monitor the traffic on it, and make sure that everything is up and running. You also need to back up your data from time to time so that you do not lose it if anything crazy happens. And if users access your application from locations far from where your database is hosted, you need to deal with the delays they experience while information travels long distances to your server and back. As your user base grows, other factors come into play too; data security, for example, becomes crucial.

At this point, the question arises: do you want to get caught up in mundane maintenance jobs, or would you rather focus on delivering a delightful experience through your product while everything else happens in the background? Unless you know a trick to slow down time and have loads of money to burn, I guess the answer is simple: you’d like to focus on your product and the end-user experience.

This is where a database service comes into the picture. Just like with other cloud services, your database is hosted in the cloud, and depending on which DBaaS (Database as a Service) you use, many of the essential database operations are handled under the hood by the DBaaS provider. Along with the cloud-hosted database, you get tools and settings to monitor traffic, manage and configure backups, set up network and security policies, and manage your application’s availability across different geographical locations.

Makes sense, what’s with the buzzwords though?

We are in this together! Let’s tackle the buzzwords one by one. When you see schema-agnostic, it hints that we are talking about a non-relational database. Relational databases are pretty strict about schema, but non-relational databases are flexible. Many non-relational databases store JSON documents, where each item (similar to a row in a relational table) in a collection (similar to a table) can have different keys if you want. You do not need to hold off development until you have finalised a close-to-perfect schema with all the referential integrity checks (defined by foreign keys).
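To make “different keys per item” concrete, here is a minimal sketch in plain Python. It does not talk to CosmosDB at all; the `users` collection, its field names, and the helper `keys_by_item` are made up for illustration.

```python
import json

# A hypothetical "users" collection: each item is a JSON-like document.
# The two items deliberately do not share the same set of keys.
users = [
    {"id": "1", "name": "Asha", "email": "asha@example.com"},
    {"id": "2", "name": "Ravi", "phone": "+91-0000000000", "tags": ["beta"]},
]


def keys_by_item(items):
    """Return the sorted key names of each item, indexed by the item's id."""
    return {item["id"]: sorted(item.keys()) for item in items}


# Both items serialise to valid JSON even though their shapes differ.
for item in users:
    print(json.dumps(item))

print(keys_by_item(users))
```

In a relational table, the second item would force a schema change (new `phone` and `tags` columns) before it could be stored; in a schema-agnostic collection it simply goes in as-is.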

Ok, so schema-agnostic systems do away with the relational table structure, but what do we mean by multi-model? Let’s say that you have worked with SQL all your life, some of your requirements mandate Cassandra, and for others Mongo would be better suited. What do you do now? Do you deploy a different database for each use case? Well, no! With CosmosDB, you choose which API (Core (SQL), Mongo, Table, Cassandra, or Gremlin) you want to use. Within your Azure account, you can have multiple CosmosDB accounts, and for each account you specify the API it uses. You can query your data with an SQL-like syntax if you choose the SQL API, and likewise for the Mongo, Cassandra, and Gremlin APIs. So, multi-model just means that you can use different models (Mongo ~ document model, Cassandra ~ wide-column store, Gremlin ~ graph model, etc.) within a single database service. You still have to handle the logic that directs each request to the right collection, but the hassle of setting up different kinds of databases is taken care of!
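The “direct relevant requests to the respective collections” part can be as simple as a lookup table. The sketch below is plain Python and only an illustration: the use cases, the account names, and the `route` helper are all hypothetical, and the sample queries are there just to show the flavour of each query language, not to be run against a real account.

```python
# Hypothetical use cases mapped to made-up CosmosDB account names.
# Each account is tied to one API; the sample queries only illustrate
# what a query in that API's language tends to look like.
ACCOUNTS = {
    "orders": {
        "account": "shop-sql",
        "api": "Core (SQL)",
        "sample_query": "SELECT * FROM c WHERE c.city = 'Pune'",
    },
    "profiles": {
        "account": "shop-mongo",
        "api": "Mongo",
        "sample_query": "db.profiles.find({ city: 'Pune' })",
    },
    "activity": {
        "account": "shop-cassandra",
        "api": "Cassandra",
        "sample_query": "SELECT * FROM activity WHERE city = 'Pune'",
    },
    "friends": {
        "account": "shop-gremlin",
        "api": "Gremlin",
        "sample_query": "g.V().has('city', 'Pune')",
    },
}


def route(use_case):
    """Return the account entry that should handle the given use case."""
    try:
        return ACCOUNTS[use_case]
    except KeyError:
        raise ValueError(f"no account configured for use case: {use_case!r}")


for name in ACCOUNTS:
    target = route(name)
    print(f"{name} -> {target['account']} ({target['api']}): {target['sample_query']}")
```

The point of the sketch is that the routing decision lives in your application, while provisioning and running four different kinds of databases does not.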

Cool, schema-agnostic and multi-model make sense now, but what’s the big deal about global distribution? Our applications are available globally anyway, so why worry about it? Fair question, but do all your users get similar latencies across different geographies? And what would happen to your application if the cloud services or your servers went down in an outage? The aim is always to reduce response time by hosting databases close to where the users are, and to make sure that users can still be served if some regions have outages. So, when we say globally distributed, we are basically talking about the ability to make your databases available in multiple geographies. CosmosDB lets you easily add or remove the regions from which you serve your users.
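Here is a rough sketch of the two goals just described: serve each user from the closest region, and fall back to the next closest one during an outage. It is plain Python, not CosmosDB’s actual routing logic; the latency numbers are invented and `pick_region` is a hypothetical helper.

```python
# Invented round-trip latencies (in ms) from one user to each region.
latency_ms = {"West Europe": 25, "East US": 95, "Southeast Asia": 180}


def pick_region(latencies, unavailable=frozenset()):
    """Pick the lowest-latency region that is not currently unavailable."""
    candidates = {
        region: ms for region, ms in latencies.items() if region not in unavailable
    }
    if not candidates:
        raise RuntimeError("no region available")
    return min(candidates, key=candidates.get)


# Normal day: the user is served from the closest region.
print(pick_region(latency_ms))

# Outage in the closest region: the user is still served, just a bit slower.
print(pick_region(latency_ms, unavailable={"West Europe"}))
```

With only one region, the second call would have nothing to fall back to, which is exactly the case global distribution is meant to avoid.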

On to the last buzzword: SLA-backed. Service providers usually provide guarantees around read/write latencies, service availability, etc. They commit to a certain level of performance and make it a part of the Service Level Agreement (SLA).
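An availability guarantee is easier to picture once you turn the percentage into minutes. The sketch below does that arithmetic in Python. The percentages are illustrative examples only, not figures taken from any provider’s actual SLA, and `allowed_downtime_minutes` is a hypothetical helper that assumes a 30-day month.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime permitted over `days` at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Illustrative availability levels, not taken from a real SLA document.
for pct in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% available -> about {minutes:.2f} minutes of downtime per 30 days")
```

Each extra “nine” cuts the permitted downtime by a factor of ten, which is why the exact number in the agreement matters.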

These were all the buzzwords that I could think of. If you are having a hard time wrapping your head around other such buzzwords, send me suggestions about which ones I should cover next.
