The answer is probably yes.
It is 2014 – everybody has heard about “Big Data” and “Data Science,” and everybody has gigabytes and terabytes of data ready to analyze. Google Trends shows that interest in the search term “Big Data” has increased tenfold over the past few years alone.
The question is really not about the amount of data you have; it is about the information available in your datasets. As Rexer Analytics’s 2013 Annual Data Miner Survey shows, it’s unclear how much “Big Data” has even affected the typical data miner: most respondents say data volumes have increased, yet they report working with datasets similar in size to those they used in 2007. So if you’re worried you don’t have enough “Big Data,” don’t be. Today we see many financial services companies wondering whether they have enough data – or the right data – to build predictive models and apply more sophisticated analytical techniques to their pricing decisions. Because of this concern, they often postpone improving their pricing processes until they feel they have all their ducks in a row with the necessary datasets. Naturally, any project benefits from having robust datasets from day one, but in the real world such ideal preparedness is a rarity. Rather than wait for better days, most financial services companies would benefit from proactively harnessing the data they already have available and using it to make better decisions.
Financial services companies are often surprised to learn that they can use the data they already have to build predictive models. Contrary to popular belief, these models do not require hundreds of fields in a dataset (nobody wants an overfitting problem anyway), nor do they require gigabytes of volume. Most, if not all, companies already collect information about their products (offered and final prices, product characteristics, etc.) and their customers (age and gender where allowed by law, risk score, geography, relationship with the company, etc.), and many of them also have usable competitor information.
In addition to customer and product data, these companies also have many years of transaction data, which is a rich source of knowledge about customer behavior after a decision point. Profit metrics are highly dependent on prepayment, utilization, default, and retention, so transaction history is one of the best sources for deepening a company’s understanding of its customers.
And even if important pieces of data seem to be missing, well-established statistical techniques are available to deal with different data limitations. For example, bootstrapping can be used when there are not enough observations, and proxies can be used when important predictors are missing (different indices, for instance, may serve as good approximations of competitor information). Quick visual and correlation analysis may also help identify important (and interesting) patterns in the data.
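To make the bootstrapping idea concrete, here is a minimal sketch in Python using only the standard library. The loan-margin numbers are purely illustrative (not from the article), and `bootstrap_ci` is a hypothetical helper name: it resamples a small dataset with replacement many times to estimate a confidence interval for a statistic that a handful of observations alone could not support.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical small sample of observed loan margins (%, illustrative only)
margins = [2.1, 1.8, 2.5, 2.0, 1.9, 2.7, 2.2, 1.6, 2.4, 2.3]

def bootstrap_ci(data, stat=statistics.mean, n_resamples=5000, alpha=0.05):
    """Estimate a (1 - alpha) confidence interval for `stat` by
    resampling the data with replacement and taking percentiles
    of the resulting distribution of estimates."""
    estimates = sorted(
        stat(random.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    lo = estimates[int(n_resamples * alpha / 2)]
    hi = estimates[int(n_resamples * (1 - alpha / 2)) - 1]
    return lo, hi

low, high = bootstrap_ci(margins)
print(f"mean margin: {statistics.mean(margins):.2f}%, "
      f"95% CI: ({low:.2f}%, {high:.2f}%)")
```

The same pattern works for medians, retention rates, or any other statistic: only the `stat` argument changes, which is what makes the bootstrap attractive when the data is too thin for parametric assumptions.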
Predictive models then allow companies to take this data and translate it into information that enables better informed decisions. Beyond improving decisions, working on these models may help identify which important pieces of data are missing from company databases (for example, company systems may not capture valuable information on lost quotes). That in turn will improve data collection, which will improve the next iteration of models, and so on.
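As a sketch of how modest data can already feed a predictive model, the example below fits a simple logistic regression (by plain gradient descent, standard library only) that predicts whether a quote is accepted as a function of the offered rate. The quote outcomes are made-up illustrative numbers, not from the article; in practice a company would use its own quote history and a statistics library rather than hand-rolled fitting.

```python
import math

# Hypothetical quote history: (offered rate %, 1 = accepted, 0 = declined)
quotes = [(3.0, 1), (3.2, 1), (3.5, 1), (3.8, 0), (4.0, 1),
          (4.2, 0), (4.5, 0), (4.8, 0), (5.0, 0), (5.2, 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit intercept b0 and slope b1 by gradient descent on the log-loss
b0, b1 = 0.0, 0.0
learning_rate = 0.05
for _ in range(20000):
    g0 = g1 = 0.0
    for rate, accepted in quotes:
        p = sigmoid(b0 + b1 * rate)   # predicted acceptance probability
        g0 += p - accepted            # gradient w.r.t. intercept
        g1 += (p - accepted) * rate   # gradient w.r.t. slope
    b0 -= learning_rate * g0 / len(quotes)
    b1 -= learning_rate * g1 / len(quotes)

# Higher rates should lower the acceptance probability (b1 < 0),
# which is exactly the price-sensitivity signal a pricing team needs.
print(f"P(accept at 4.0%): {sigmoid(b0 + b1 * 4.0):.2f}")
```

A model this small is obviously a toy, but it illustrates the point of the paragraph above: even a ten-row quote history yields a price-sensitivity estimate, and running the exercise quickly reveals which fields (competitor rates, lost-quote reasons) the database is missing.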
So, instead of worrying about “Big Data” and whether you have enough of it, start working with what you already have at your disposal. You may be surprised to find that successfully implementing predictive models and other advanced analytical techniques does far more to improve data collection and usage than waiting around for a sign that it’s the perfect time to begin. The time is now!