MongoDB: Case Study with KPMG
Databases are an integral part, and one of the most critical elements, of almost every business. They are broadly divided into two categories:
- SQL databases
- NoSQL (Not Only SQL) databases
SQL databases have a predefined schema that governs how data is entered into the database. A schema is simply a blueprint that specifies which columns a table has and which attributes each record carries. SQL databases have been around for decades, but NoSQL databases have been rising in popularity.
The idea behind the relational model is that data is stored in tables, which are organized into columns. Each column stores one type of data, and each record occupies one row of the table.
Several products implement SQL databases, including MySQL, Oracle Database, and Microsoft SQL Server.
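To make the idea of a predefined schema concrete, here is a minimal sketch using SQLite from the Python standard library as a stand-in for any SQL database. The table and column names are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# The schema is declared up front: three columns, each with one type.
# "invoices", "customer" and "amount" are hypothetical names.
conn.execute(
    "CREATE TABLE invoices ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount REAL NOT NULL)"
)

# A row that matches the schema is accepted.
conn.execute(
    "INSERT INTO invoices (customer, amount) VALUES (?, ?)",
    ("Acme", 120.5),
)

# A row that uses a column the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO invoices (customer, amount, notes) VALUES (?, ?, ?)",
        ("Acme", 10.0, "rush order"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

rows = conn.execute("SELECT id, customer, amount FROM invoices").fetchall()
print(rejected, rows)
```

The second insert fails because the table has no `notes` column. Adding such a field would require changing the schema first.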
Growth of NoSQL
NoSQL databases don’t have such a rigid schema. Not Only SQL, or NoSQL, is an approach to data management and database design that is useful for very large sets of distributed data.
This meets a real need today. Companies around the world generate more and more unstructured data, which does not fit naturally into fixed tables and is therefore hard to analyze with SQL alone. This is the main reason NoSQL databases keep gaining popularity.
A common criticism of SQL databases is the mismatch between how data is stored in tables and how the applications that consume it represent that data. The rise of table-less databases to mainstream popularity is partly due to their flexibility and lack of constraints.
There are various NoSQL products on the market. They differ in their architecture, for example key-value, document, column-family, and graph stores.
Today we will discuss one such NoSQL database: MongoDB.
MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas.
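As a rough illustration of "JSON-like documents with optional schemas", the sketch below uses plain Python dictionaries as stand-ins for MongoDB documents. No MongoDB driver is involved, and the collection, field names, and values are all made up for the example.

```python
import json

# Two documents destined for the same hypothetical "customers" collection.
# They share some fields, but each also carries fields the other lacks.
restaurant = {
    "name": "Chez Anna",
    "industry": "restaurant",
    "tables": 24,
    "address": {"city": "Lyon", "country": "FR"},  # nested document
}
builder = {
    "name": "BuildCo",
    "industry": "construction",
    "active_sites": ["A12", "B07"],  # array field
}

# A list stands in for the collection; nothing forces the shapes to match.
collection = [restaurant, builder]

# Documents of this kind serialize to JSON directly.
as_json = [json.dumps(doc, sort_keys=True) for doc in collection]

# Fields every document has, versus fields only some documents have.
common_fields = set.intersection(*(set(doc) for doc in collection))
all_fields = set.union(*(set(doc) for doc in collection))
optional_fields = all_fields - common_fields

print(sorted(common_fields))
print(sorted(optional_fields))
```

Both documents can sit side by side even though only `name` and `industry` are common to them. That is the practical meaning of an optional schema.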
There are various benefits to using MongoDB. Although the choice of database depends on the use case at hand, here are a few features of MongoDB that make it a good choice for companies:
- Rich Object Model: MongoDB supports a rich and expressive object model. Objects can have properties, and objects can be nested in one another to multiple levels. This model is very “object-oriented” and can easily represent any object structure in your domain. You can also index a property of any object at any level of the hierarchy, which is brilliantly powerful!
- Secondary Indexes: Secondary indexes are a first-class construct in MongoDB. This makes it easy to index any property of a stored object, even a nested one, and to query the database by that property. Indexes speed up queries significantly, but they also slow down writes, because every write has to update the indexes as well.
- Replication and high availability: MongoDB supports a “single master” model. A replica set has one primary (master) node and a number of secondary (slave) nodes. If the primary goes down, one of the secondaries is automatically elected as the new primary. The election takes some time: before the 3.2 release it took 10–40 seconds, while in MongoDB 3.2 and later failures are detected faster and a new primary is elected within 2–10 seconds.
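The first two features in the list above, nested objects and secondary indexes on nested properties, can be sketched in plain Python. This is a toy model, not how MongoDB implements indexes (those are maintained by the server). The helper names `get_path` and `build_index` and the sample data are hypothetical.

```python
from collections import defaultdict


def get_path(doc, path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # the document does not have this field
        current = current[key]
    return current


def build_index(docs, path):
    """Map each value found at `path` to the positions of matching docs."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        value = get_path(doc, path)
        if value is not None:
            index[value].append(position)
    return index


customers = [
    {"name": "Chez Anna", "address": {"city": "Lyon"}},
    {"name": "BuildCo", "address": {"city": "Paris"}},
    {"name": "Le Zinc", "address": {"city": "Lyon"}},
]

# Secondary index on a nested property.
city_index = build_index(customers, "address.city")

# Lookup goes through the index instead of scanning every document.
lyon_names = [customers[i]["name"] for i in city_index["Lyon"]]
print(lyon_names)
```

The toy index also shows where the write cost comes from: each new or changed document means updating `city_index` too. In MongoDB itself the same dotted path is used to address nested fields. With the PyMongo driver, the corresponding call would be along the lines of `collection.create_index("address.city")`.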
Now that we have the gist of why databases matter and an idea of what MongoDB offers, let’s see how KPMG uses MongoDB in its tech stack.
The KPMG story
KPMG is one of the world’s largest professional services networks. Its member firms operate as independent businesses in 155 countries, with 174,000 staff. KPMG provides audit, tax and advisory services used by corporations, governments and not-for-profit organizations.
A business this large needs an efficient database system that copes well with unstructured data.
KPMG uses MongoDB in several ways.
All raw accounting data from the customers’ business systems, such as sales data, invoices, bank statements, cash transactions, expenses, and payroll, is ingested from Microsoft SQL Server into MongoDB. This data is then accessible to the CPAs, who use it to generate each customer’s KPIs. A unique capability KPMG has developed for its customers is financial benchmarking: using the data in the MongoDB data lake, customers can benchmark their financial performance against competitors operating in the same industry within a specified geographic region.
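The source does not describe how the benchmarking is computed, so the following is only a sketch of the general idea: compare one customer’s KPI with the median of its peers in the same industry and region. The records, field names, and the choice of the median are assumptions made for illustration.

```python
from statistics import median

# Hypothetical KPI records; margins are whole percentages.
records = [
    {"customer": "A", "industry": "restaurant", "region": "Rhone", "margin_pct": 12},
    {"customer": "B", "industry": "restaurant", "region": "Rhone", "margin_pct": 8},
    {"customer": "C", "industry": "restaurant", "region": "Rhone", "margin_pct": 10},
    {"customer": "D", "industry": "restaurant", "region": "Paris", "margin_pct": 20},
    {"customer": "E", "industry": "construction", "region": "Rhone", "margin_pct": 5},
]


def benchmark(records, customer):
    """Compare a customer's margin with same-industry, same-region peers."""
    me = next(r for r in records if r["customer"] == customer)
    peers = [
        r["margin_pct"]
        for r in records
        if r["industry"] == me["industry"]
        and r["region"] == me["region"]
        and r["customer"] != customer
    ]
    peer_median = median(peers)
    return {
        "customer_margin": me["margin_pct"],
        "peer_count": len(peers),
        "peer_median": peer_median,
        "above_median": me["margin_pct"] > peer_median,
    }


result = benchmark(records, "A")
print(result)
```

Customer A is compared only with B and C. D is in another region and E is in another industry, so neither counts as a peer.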
Another unique feature of the accounting suite is the ability to customize reporting for each customer, based on the specific criteria they want to track. For example, a restaurant chain will be interested in different metrics than a construction company. KPMG enables this customization by creating a unique schema for each customer, which inherits from a standard business application schema and is then written to MongoDB.
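One way to picture a per-customer schema that inherits from a standard one is shown below. The source does not say how KPMG implements this, so the structure of `STANDARD_SCHEMA`, the `derive_schema` helper, and every field and KPI name are hypothetical.

```python
import copy

# Hypothetical standard business application schema shared by all customers.
STANDARD_SCHEMA = {
    "fields": {"revenue": "number", "expenses": "number", "payroll": "number"},
    "kpis": ["gross_margin"],
}


def derive_schema(base, extra_fields=None, extra_kpis=None):
    """Return a customer schema: the base schema plus customer-specific parts."""
    schema = copy.deepcopy(base)  # never mutate the shared base schema
    schema["fields"].update(extra_fields or {})
    schema["kpis"] = schema["kpis"] + list(extra_kpis or [])
    return schema


# A restaurant chain and a construction company track different metrics.
restaurant_schema = derive_schema(
    STANDARD_SCHEMA,
    extra_fields={"covers_served": "number"},
    extra_kpis=["revenue_per_cover"],
)
construction_schema = derive_schema(
    STANDARD_SCHEMA,
    extra_fields={"open_sites": "number"},
    extra_kpis=["cost_per_site"],
)

print(restaurant_schema["kpis"])
print(construction_schema["kpis"])
```

Each derived schema is itself a nested, JSON-like structure, so it could be stored as a document alongside the data it describes.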
KPMG also uses MongoDB to store the millions of client requests that the Loop application receives each day. This enables the team to build Tableau reports on top of the logs to troubleshoot production performance issues for each user session, and for each of the 220 regional KPMG sites spread across France.
KPMG estimates that, by selecting MongoDB for the accounting suite, it achieved a time to market at least 50% faster than it would have with any other non-relational database.