How big MNCs store, manage, and manipulate thousands of terabytes of data with high speed and efficiency
To understand how, we first need to know exactly how much data is being processed.
Let's take a few MNCs as examples:
Google processes about 20 petabytes of data every day, i.e. 2 × 10¹⁰ megabytes of data.
Facebook generates 4 new petabytes of data every day, i.e. 4 × 10⁹ megabytes of data.
Amazon hosts about 1,000 petabytes (a full exabyte) of data on its servers.
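To make the scale above concrete, the petabyte figures can be converted into megabytes with simple decimal (SI) unit arithmetic, where 1 petabyte = 10⁹ megabytes. A minimal sketch:

```python
# Decimal (SI) units: 1 petabyte = 10**9 megabytes.
PB_IN_MB = 10**9

google_daily_pb = 20       # figure quoted above
facebook_daily_pb = 4      # figure quoted above
amazon_hosted_pb = 1000    # figure quoted above

print(google_daily_pb * PB_IN_MB)     # 20000000000 MB, i.e. 2 x 10**10
print(facebook_daily_pb * PB_IN_MB)   # 4000000000 MB, i.e. 4 x 10**9
print(amazon_hosted_pb * PB_IN_MB)    # 1000000000000 MB, a full exabyte
```

Even the smallest of these daily figures is billions of megabytes, which is why the storage methods discussed next fall short.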
Our traditional databases and data storage methods are not capable of storing or processing such a large amount of data, so a new term comes into the picture: Big Data.
What do we mean by “Big Data”?
Big Data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
The five Vs of Big Data
-> Volume: Volume is how much data we have — what used to be measured in Gigabytes is now measured in Zettabytes (ZB) or even Yottabytes (YB).
-> Velocity: Velocity is the speed at which data is processed and becomes accessible.
-> Variety: Variety describes one of the biggest challenges of big data. It can be unstructured and it can include so many different types of data from XML to video to SMS. Organizing the data in a meaningful way is no simple task, especially when the data itself changes rapidly.
-> Veracity: Veracity refers to the trustworthiness, authenticity, and accountability of the data.
-> Value: Data alone is of no use to a company unless some information/value is extracted from it.

We have various frameworks for storing and processing Big Data like:
1) Cassandra
2) Hadoop
3) Spark
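The core idea behind frameworks like Hadoop is the MapReduce programming model: a map phase emits key–value pairs and a reduce phase aggregates them, so the work can be spread across a cluster. As a hedged illustration only (real frameworks distribute these phases over many machines; this sketch runs them sequentially in plain Python), a classic word count looks like:

```python
# MapReduce-style word count in plain Python -- an illustration of the
# model Hadoop popularized, not a distributed implementation.
from collections import Counter
from itertools import chain

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    # Reduce: sum the counts per word.
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data needs big tools", "data at scale"]
pairs = chain.from_iterable(map_phase(line) for line in lines)
print(reduce_phase(pairs))
# -> {'big': 2, 'data': 2, 'needs': 1, 'tools': 1, 'at': 1, 'scale': 1}
```

In a real cluster, different machines run `map_phase` on different chunks of the input, and the framework shuffles the pairs so each reducer sees all counts for its words.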



