In its inaugural Data Genomics Index, Veritas Technologies, the leader in information management, released the industry’s first accurate view of the composition of enterprise data. This real-time view of today’s corporate data reveals that over 40% of files have remained untouched for three years, creating an opportunity for businesses to reduce their bottom-line costs. The Data Genomics Index is the first step towards benchmarking enterprise data environments to catalyze interest and action in this growing challenge.
The Data Genomics Index is the first report to provide accurate insights into today’s enterprise data environment and act as a comparison standard. These insights can jump start an organization’s initiatives to act intelligently and devote remediation efforts where they will get the best return. Key findings from the report include:
Developers Dominate and Presentations Have Had Their Day
The Index reveals that images, developer files and compressed files take up almost one third of the total environment. By file count, developer files make up a massive 20% of the total. Looking at trends over the past 10 years, presentations have declined 500% relative to other file types. Finally, we are trending away from death by PowerPoint.
We’re Busiest in the Fall
Fall dominates from a file creation perspective. The most drastic increases are 91% more text files, 48% more spreadsheets, and 89% more geographic information system files. We apparently do most of our videography in summer and fall, and then save it to company disk. Videos jump 68% in the fall.
41% of the Data Environment Goes Untouched
With the exception of regulatory or compliance requirements, three years is a general standard for when data goes from potentially relevant to stale. Incredibly, 41% of the average environment is stale, or unmodified in the past three years.
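An organization wanting a rough comparison against that 41% figure could measure its own stale share along these lines. This is a minimal sketch, not the method used for the Index: the function name `stale_share` is hypothetical, and it assumes that a file's last-modified timestamp is a fair proxy for "untouched" and that three years can be approximated as 3 x 365 days.

```python
import os
import time

# Assumption: "three years" approximated as 3 * 365 days (leap days ignored).
THREE_YEARS_SECONDS = 3 * 365 * 24 * 60 * 60


def stale_share(root, now=None, max_age=THREE_YEARS_SECONDS):
    """Return (stale file fraction, stale byte fraction) for files under root.

    A file counts as stale when its modification time is more than
    max_age seconds before now.
    """
    now = time.time() if now is None else now
    total_files = stale_files = 0
    total_bytes = stale_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            total_files += 1
            total_bytes += info.st_size
            if now - info.st_mtime > max_age:
                stale_files += 1
                stale_bytes += info.st_size
    file_fraction = stale_files / total_files if total_files else 0.0
    byte_fraction = stale_bytes / total_bytes if total_bytes else 0.0
    return file_fraction, byte_fraction
```

Reporting the share by file count and by bytes separately matters because a handful of large stale files can dominate disk space even when they are a small part of the file count.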
Orphaned Data is Overly Burdensome
Data without an attributed owner, either through role changes or employee departures, is orphaned. This data is often out of sight and out of mind for organizations, and it is costing them. Based on the insights from the Index, orphaned data tends to be content-rich file types like videos, images and presentations – risky stuff to leave unattended. It is also taking up more than its fair share of disk space based on file count distribution – over 200% more.
Small Changes Can Have a Big Impact on Storage Costs
With similar insights into their own data, organizations can prioritize areas to achieve significant returns. Traditional “office” formats like presentations, spreadsheets and documents take up more stale space than they should, costing organizations unnecessarily. Visual formats like videos and images are also extra burdensome. These are where archiving, deletion or migration efforts are best spent. For an average 10 petabyte environment, an archive project focused on just stale presentations, documents, spreadsheets and text files could return as much as $2 million a year in storage savings.
“One thing we hear all the time from our customers is they’re struggling with two competing forces of nature – the exponential data growth curve, and the restriction of resources and budget to fight it with new servers and applications,” said Steve Vranyes, CTO, Veritas. “By aggregating Veritas’ unique understanding of key metadata characteristics we can surface an accurate representation of the average data environment. These insights will change the crippling growth dynamic enterprises are faced with today.”
The Data Genomics Index is the first research that benchmarks accurate details of real environments – from the file type composition, to the average age distribution, to the size proportions of individual files. To provide a community and forum for this research and discussion, Veritas is also announcing the Data Genomics Project, a first-of-its-kind research initiative to help organizations understand the true nature of the unstructured data that they create, store, and manage on a daily basis. The inaugural Data Genomics Index is the first contribution to this cause. The Project will be a community of data scientists, industry experts and thought leaders that further builds the data genome for information management, and shares its research and discussions with organizations worldwide that are struggling to solve tremendous data growth challenges. While Veritas is a founding member and contributor, the Project will remain commercially separate from the business.