If your organization is like most, you might be overwhelmed by the volume of information available at your fingertips. To make matters worse, the pace keeps increasing: data (lots of it) is being produced more rapidly than ever before. How can you capture this data and use it to help you and your organization make informed decisions, and ideally to predict future behavior and events?
Historically, you might have approached this challenge in a very structured way. First, you would define a way of capturing the data; then you would test a hypothesis and use a standard approach to view the data, analyze it, and make decisions. Ideally, this approach would be formalized into an ongoing operational process that provided valuable insights to your organization.
This is a tried and tested approach which, given the luxury of time and resources, has the potential to produce useful results from the captured data. However, with information growing at a rapid pace, what happens when your data changes? Frequently, in the middle of this analysis, you may discover a new and potentially insightful data source. It could be very helpful, but it might also set back your timeline as you adjust to the new information or the new data stream.
Chances are good that you would need something quicker and more flexible to adapt to changes in the data or in the data sources. You can address these needs by adopting some of the techniques frequently associated with the popular term “Big Data”, which has characteristics that you may recognize and solutions that you can leverage. Some of these characteristics include:
- An abundance of data sources, with both structured and unstructured data.
- A tremendous number of variables with underutilized (or unknown) correlations and interactions.
- Real-time input streams that continually update and could influence your conclusions.
At some level, the above challenges exist for any organization. It is all relative, and the techniques and approaches can often be scaled down. Whether or not you truly have “Big Data”, you can apply some of the same elements and approaches to your own data and benefit from them:
- Start by defining the goals, understanding the sources of data, and setting the appropriate data boundaries (time, resources, questions, tolerance). This sounds simple and obvious, but when many data points and variables are involved, it is often hard to know which goals to focus on. I like the analogy of poking a hole in a dike to release the backflow of water. The knowledge may start out as a trickle, but as you start to widen that hole, the force of the water starts working on your side and the outflow of water (or information) accelerates.
- Take advantage of the advances in Data Mining and Statistical Analysis as you work toward a model. You will likely need some standard process for cleaning, integrating and transforming your data (although there are advances that may minimize the need for this).
- There are numerous software packages (both free and otherwise) that can be used to analyze and help visualize your data. These packages implement a variety of statistical algorithms such as Time Series, Regression, Clustering, and Decision Tree Algorithms, all of which can help distill your data to an easily digestible level.
- Analyzing your data can be an iterative process, with a constant back and forth of testing approaches, tweaking variables or algorithms, and comparing results. It is important to evaluate the cost and benefit of each change, as this cycle can easily burn time and resources.
- Finally, when you have found a way through your data to identify root problems or even to predict future actions, you need to operationalize the process and tools. This requires solidifying the data collection and transformation process, hardening the analysis process and then disseminating the results in a way that is easy to understand and act upon. It may take some pre-work to prep your colleagues to expect and accept the new information.
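
To make the steps above a little more concrete, here is a minimal sketch in Python of one pass through the clean, model, and compare loop. Everything in it is an assumption made for illustration: the records are invented, the function names (`clean`, `fit_line`, `mse`) are hypothetical, dropping incomplete records is only one of many possible cleaning rules, and a least-squares line stands in for whatever regression technique your own package provides.

```python
from statistics import mean

# Invented sample records: (weekly ad spend, weekly sales).
# None marks a missing value, as often happens with real sources.
raw = [(1.0, 3.1), (2.0, 4.9), (None, 5.0), (3.0, 7.2), (4.0, 8.8),
       (5.0, None), (6.0, 13.1), (7.0, 14.9), (8.0, 17.0)]


def clean(records):
    """Drop records with a missing field (one simple cleaning rule)."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mse(predict, points):
    """Mean squared error of a prediction function on some points."""
    return mean((predict(x) - y) ** 2 for x, y in points)


data = clean(raw)
train, test = data[:5], data[5:]  # hold back the last records for comparison

# Candidate 1: a baseline that always predicts the average training outcome.
baseline = mean(y for _, y in train)
baseline_mse = mse(lambda x: baseline, test)

# Candidate 2: a fitted regression line.
slope, intercept = fit_line(train)
line_mse = mse(lambda x: slope * x + intercept, test)

print(f"baseline error: {baseline_mse:.2f}")
print(f"regression error: {line_mse:.2f}")
```

Comparing each candidate against a simple baseline on held-back records is one inexpensive way to judge whether a tweak is worth its cost before you invest in operationalizing it.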
The journey from start to finish is not a straight line, even less so when you are just beginning. However, as you build experience and learn which data, transformations, algorithms, and outputs drive the most benefit, you will be on your way to leveraging advanced data analytics techniques to grow your organization’s capabilities and responsiveness.