According to Newton’s law of universal gravitation, the attractive force between two particles is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.
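In symbols, with $F$ the attractive force, $m_1$ and $m_2$ the two masses, $r$ the distance between them, and $G$ the gravitational constant, the law reads:

```latex
F = G \frac{m_1 m_2}{r^2}
```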
What is data gravity?
Like physical gravity, data gravity refers to the power of a data set to attract other information, applications, and services. The greater the mass of a piece of matter, the greater its gravitational force and the more objects it can draw to itself.
In information technology (IT) terms, this universal law translates as follows: the more extensive a data set is, the more it attracts other information and applications. Because of data gravity, applications, services, and even additional information naturally gravitate toward the most massive data set.
A brief history of data gravity
The term ‘data gravity’ was first introduced in 2010 by Dave McCrory, former Vice President of Engineering at GE (General Electric) Digital.
McCrory described data gravity as the way accumulated data attracts services and applications toward itself, much as a planet’s gravity pulls on the objects around it.
As the mass of an object increases, so does the strength of its gravitational pull. Similarly, as data accumulates, its data gravity increases. A large body of accumulated data becomes difficult to process and can be virtually impossible to move.
Today, enterprises serve an increasing number of users and endpoints that constantly create and exchange data. The growing volume of interactions and transactions between users and systems creates a need for more processing and storage of both structured and unstructured data.
Overcoming the limitations of data gravity requires a connected community approach in which enterprises and cloud, connectivity, and content providers meet at data exchange centers to eliminate barriers and unlock new capabilities.
Effect of data gravity on enterprises
It is essential to manage data effectively to ensure that the information obtained from it is up to date, accurate, and valuable. Data gravity is a major consideration in data management and governance, so enterprises must take the data’s influence into account.
If proper policies, procedures, and rules of engagement are not maintained, the sheer amount of data in a warehouse, lake, or other dataset can become overwhelming. Worse, the data can become underutilized. Application users should draw only on data that helps them make correct decisions and reach sound conclusions.
Data gravity has a severe impact on data integration, especially when there is a drive to unify systems and reduce the resources wasted on errors or rework. Collecting data in one central place means that data gravity does not build up slowly over time; instead, it increases significantly in a short time.
Understanding the impact data gravity will have on the enterprise ensures that contingencies are in place to handle the data’s rapidly increasing influence on the system.
For instance, consider how data gravity affects data analysis. Moving massive datasets into analytic clusters is both inefficient and expensive. The enterprise needs to concentrate on developing storage optimization methods that allow data to be stored more efficiently.
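To see why moving a massive dataset is expensive, a rough transfer-time estimate helps. The sketch below is a back-of-the-envelope calculation, not a benchmark: the function name `transfer_time_hours`, the 500 TB dataset, the 10 Gbit/s link, and the 70% effective utilization are all illustrative assumptions rather than figures from any real deployment.

```python
def transfer_time_hours(dataset_tb: float, link_gbps: float,
                        utilization: float = 0.7) -> float:
    """Estimate the hours needed to copy a dataset over a network link.

    All inputs are hypothetical planning numbers:
    dataset_tb   -- dataset size in terabytes (1 TB = 10**12 bytes)
    link_gbps    -- nominal link speed in gigabits per second
    utilization  -- fraction of the nominal speed actually achieved (assumed)
    """
    if link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link_gbps must be positive and utilization in (0, 1]")
    dataset_bits = dataset_tb * 1e12 * 8
    effective_bits_per_second = link_gbps * 1e9 * utilization
    return dataset_bits / effective_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical example: 500 TB over a 10 Gbit/s link at 70% utilization.
    hours = transfer_time_hours(dataset_tb=500, link_gbps=10)
    print(f"Estimated transfer time: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions the copy takes roughly 159 hours, or nearly a week, before any validation or cut-over work is counted.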
Various effects of data gravity
McCrory compared data to a planet or other object of considerable mass, which exerts a correspondingly greater gravitational force. The three major effects related to data gravity are as follows:
- Force:
As data grows in size by collecting more information, it draws more apps and services. Google is often cited as an example: it holds massive volumes of data, which is why it can answer such a wide range of queries.
As a result, Google now powers many businesses and applications. Third-party developers, whose products are not owned by Google, need to take considerable measures to ensure that their creations are compatible with Google’s norms.
- Speed:
The acceleration of objects increases as they draw closer to the source of gravitational force. As a result, the closer an application is to a data mass, the more quickly it can process data.
For example, an application hosted in a data center in New York that has to access data from a database in Utah would experience latency. Data processing would be faster if the application’s data center were closer to the data.
- Non-portable:
As data accumulates, the dataset grows and its data gravity increases. The larger the dataset, the more difficult it is to move, much as a planet would be difficult to move. Additionally, shifting a large quantity of data results in slow migrations and ties up multiple resources in the process.
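The Speed effect described above can be put in rough numbers. The sketch below estimates only the propagation delay of a round trip over optical fiber, in which light travels at roughly 200,000 km/s (about two-thirds of its speed in a vacuum). The 3,000 km figure for a New York to Utah route, the 0.1 km figure for a local hop, and the 1,000 sequential queries are illustrative assumptions, and the function names are hypothetical. Real latency also includes routing, queuing, and processing time.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in optical fiber


def round_trip_ms(distance_km: float) -> float:
    """Lower-bound round-trip propagation delay in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


def total_wait_s(distance_km: float, sequential_queries: int) -> float:
    """Seconds spent waiting on the network when queries run one after another."""
    return round_trip_ms(distance_km) * sequential_queries / 1000


if __name__ == "__main__":
    # Both distances are assumed values chosen for illustration.
    for label, km in [("same data center (assumed 0.1 km)", 0.1),
                      ("cross-country (assumed 3,000 km)", 3000)]:
        print(f"{label}: {round_trip_ms(km):.3f} ms per round trip, "
              f"{total_wait_s(km, 1000):.3f} s for 1,000 sequential queries")
```

Under these assumptions a cross-country round trip costs about 30 ms, so 1,000 sequential queries spend about 30 seconds waiting on the network alone.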
Data gravity needs to be taken into consideration whenever data is migrated. Given the tremendous growth of datasets, it is critical for enterprises to develop their own migration plans. These plans should be based on requirements that account for the size the dataset will be at migration time, rather than its current size.
Data gravity amplifies how different services, applications, and additional data are attracted to the dataset and must be considered while determining future size. Migration will require a specialized, often creative, plan in order to be successful.
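Planning for the size a dataset will reach, rather than the size it is today, can be sketched as a compound-growth projection. The function name `projected_size_tb`, the 200 TB starting size, the 40% annual growth rate, and the 18-month horizon are hypothetical planning inputs chosen only for illustration. Steady compound growth is itself an assumption.

```python
def projected_size_tb(current_tb: float, annual_growth: float,
                      months_ahead: float) -> float:
    """Project dataset size assuming steady compound growth (an assumption).

    annual_growth is a fraction, e.g. 0.40 for 40% growth per year.
    """
    if current_tb < 0 or annual_growth <= -1 or months_ahead < 0:
        raise ValueError("inputs out of range")
    return current_tb * (1 + annual_growth) ** (months_ahead / 12)


if __name__ == "__main__":
    current = 200.0  # hypothetical size today, in TB
    planned = projected_size_tb(current, annual_growth=0.40, months_ahead=18)
    print(f"Size today: {current:.0f} TB")
    print(f"Size to plan for at migration time: {planned:.0f} TB")
```

With these inputs the plan would need to cover about 331 TB rather than 200 TB, which is why sizing a migration on today’s numbers tends to fall short.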
Data gravity has a greater pull
The law of data gravity states that whoever has the most data has the most power. Because of the power of the data sets, more applications and services are being developed to accommodate them.
This underscores that data gravity is a powerful force and should not be underestimated. After all, it is the moon’s gravity that pulls the earth’s water up, resulting in high tides.