Today's Big Data Problem and the Need for Data Interpreters

By SMU Graduate Studies on June 28, 2018


A staggering 2.5 quintillion bytes of data enters the virtual sphere every day, according to IBM. This extreme influx of data has only appeared over the last two years, effectively drowning IT departments and data analysts across industries in information.

The question confronting all industries right now is: how do the public and private sectors cope with this oversaturation? To answer it, researchers and professionals alike must first understand the background of the problem before coming up with a solution.

How Did This Explosion Happen?

So why did this explosion occur only now? After all, the internet has been around since the ’80s, even if only in a nascent form. The answer turns out to be multifaceted, involving a combination of factors.

Deepak Pareek, a technology strategist, offers a succinct overview of why the amount of data has exploded:

  1. Simply put, there’s more data available than ever before. In fact, experts predict a 4,300 percent increase in data produced annually by 2020.
  2. As computing power grows exponentially, especially with the introduction of cloud computing options, technological systems are able to process this amount of information without self-destructing.
  3. With the introduction of unstructured data, such as social media, videos, audio, and pictures, more data points exist on human behavior, not simply digitized information.
  4. Advances in analytical algorithms are expediting not only machine learning, but also forthcoming developments in speech recognition, artificial intelligence, and other forms of futuristic technology.
  5. The expansion of the Internet of Things (IoT), the network of “smart” devices that are speaking to each other, enables devices to more easily connect and exchange data.
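For a sense of scale, the two figures quoted so far can be restated with simple arithmetic. The short Python sketch below is illustrative only: the function name `growth_multiple` is an assumption made for this example, and "quintillion" is read in the short-scale sense of 10^18.

```python
# Illustrative arithmetic only; names here are assumptions for this example.

BYTES_PER_EXABYTE = 10**18  # decimal (SI) exabyte


def growth_multiple(percent_increase: float) -> float:
    """Return how many times larger the new value is than the baseline."""
    return 1 + percent_increase / 100


# 2.5 quintillion bytes per day, the IBM figure cited earlier in the article.
daily_bytes = 2.5 * 10**18
daily_exabytes = daily_bytes / BYTES_PER_EXABYTE

# 4,300 percent, the predicted increase cited in the list above.
multiple = growth_multiple(4300)

print(f"Daily volume: {daily_exabytes} exabytes")
print(f"A 4,300 percent increase is {multiple:.0f} times the baseline")
```

In other words, a 4,300 percent increase means annual data production reaches 44 times its starting level, not 43 times, because the baseline itself is included.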

While the reasons behind the explosion of data are fascinating, current analysts must clarify what they mean by “big data” and decide what to do with this vast quantity of data.

Big Data, a Problem or an Opportunity?

The term “big data” refers not to the sheer quantity of data, but rather to how structured and unstructured data interact with social media analytics, IoT data, and other external sources in order to paint a picture. With the advent of big data, professionals across businesses face a double-edged sword.

The amount of digital information that exists presents a problem, but the true problem often lies in analysts’ ability or inability to deliver cost-effective and successful data analytics.

One of the most infamous failures of “big data” occurred when the Centers for Disease Control and Prevention (CDC) partnered with Google to create the Google Flu Trends (GFT) project. The CDC, along with GFT, tried to show a correlation between areas with high rates of internet searches about the flu and areas where outbreaks of the flu would be recorded. Unfortunately, Google’s algorithm was constructed without careful enough parameters, and the result was a wildly inaccurate predictive model based on millions and millions of search terms.
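The kind of relationship described above is usually measured with a correlation coefficient. The sketch below computes a Pearson correlation in plain Python. It is not Google's algorithm, and the weekly numbers are invented purely for illustration; they are not real search or CDC data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values, made up for this example only.
flu_searches = [120, 150, 310, 480, 400, 220]
recorded_cases = [10, 14, 30, 41, 45, 25]

r = pearson(flu_searches, recorded_cases)
print(f"Correlation between searches and cases: r = {r:.2f}")
```

A coefficient near 1 on past data shows only that the two series moved together. It does not guarantee that search activity will keep predicting outbreaks, which is why the choice of parameters matters so much.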

Another problem with big data is that data is not infinitely useful; it has a lifespan of relevance. Those responsible for storing and saving this data do not necessarily adopt good housekeeping practices when evaluating it for importance and relevance, which can lead to wasted time and resources.

Professionals who can solve problems and produce pertinent research findings in a timely way are an invaluable resource to companies across fields.

Every Industry Must Make Data-driven Decisions

From the medical field to Wall Street, data has started driving important decision making. Often, decision makers must first decide whether statistical analysis of their data will deliver a sufficient return on investment. Data mining and analysis can end up being extremely expensive and inconclusive if the correct parameters are not established from the beginning.

But more and more industries are accepting or embracing the idea that their decisions should be driven by accurate information drawn from reliably collected data. This increases the pressure on managers and hiring departments to find people with the skills necessary to make data meaningful to the rest of the organization.

Enter: Statistical Science and Data Analytics Graduate Programs

Those with graduate degrees, specifically in the areas of Statistical Science, Applied Statistics, and Data Analytics, can offer their employers the necessary expertise to expose relevant data and optimize data analytics projects.

The wealth of information in the modern digital age may minimize knowledge gaps, but only if there are people with the ability to interpret the overwhelming volume of data.

The value of people who understand how to collect accurate information based on reliable data rises sharply as businesses and individuals struggle to make sense of the volume of information available. Those who grasp how to use big data productively, working within its acknowledged limitations to glean the best results, will be essential to coming innovation.

Interested in studying Statistical Science? Learn more about the varied programs and research opportunities at SMU.
