Are we moving from “Big Data” to “Smart Data”?

By Ryan Weeks on 17th November 2015

Those within the industry will be all-too-familiar (weary, even) with the term “big data”. Simply put, the phrase refers to a vast collection of data, one that delves far deeper into the life of a prospective customer than a traditional Credit Reference Agency (CRA) might. The use of “big data” is often cited as a means of facilitating enlightened lending decisions, the thinking being that big data users can glean a supremely in-depth understanding of a prospective borrower’s ability to repay a loan. Banks and building societies typically won’t deal with anybody below a clearly defined point on the traditional credit scoring ladder. Online lenders, the major proponents of big data, are capable of servicing new segments of the market by leaning on thousands upon thousands of data points.

And yet the usage of data appears to be somewhat cyclical. In recent times, the emphasis in credit assessment has been on the volume of data considered when making a lending decision. Online lenders (both marketplace and direct lenders) have been hoovering up everything from Experian scores to data pertaining to the social media activity of a potential customer. But there now appears to be a clear shift away from quantity, and towards the observable quality of those individual data fields. In short, it is a shift from “big data” to “smart data”.

Creamfinance is firmly in the “smart data” camp. The company uses an “innovative credit scoring approach” to offer rapid credit solutions in five European markets: Poland, Latvia, the Czech Republic, Slovakia and Georgia. Creamfinance lends money off its own balance sheet, rather than operating a marketplace lending programme, and operates through a number of local platforms in each of its markets. I caught up with Creamfinance Co-Founder Davis Barons a few weeks ago, who described for me the effect of his company’s transition towards a “less is more” style of lending.

Davis explained that, to his mind, models with fewer variables enjoy a higher degree of stability in terms of outcome. Why? All data carries what Davis describes as “information value”. Data with less information value also carries lower predictive power, in terms of a borrower’s ability to repay a loan, and in many cases adds nothing to the decision-making process. The point appears to be that shaving down the number of data points assessed creates a greater consistency of customer experience. I borrow the term “consistency of experience” from recently appointed Zopa CEO Jaidev Janardana. It is a term of vital import to online lenders around the world, all of whom rely heavily on a high-quality customer experience as a means of distinguishing their services from those of the banks.
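The article does not spell out how Creamfinance quantifies “information value”, but in credit scoring the term conventionally refers to the Weight-of-Evidence-based Information Value (IV) metric, which ranks variables by how well they separate good borrowers from bad. A minimal sketch, assuming that standard definition (the bin counts below are purely illustrative, not real Creamfinance data):

```python
# Sketch of the standard Information Value (IV) metric used in credit
# scoring to rank a variable's predictive power. A variable is binned into
# categories; for each bin we compare the share of "good" (repaid) and
# "bad" (defaulted) borrowers it captures.
import math

def information_value(bins):
    """bins: list of (goods, bads) counts per category of one variable."""
    total_good = sum(g for g, b in bins)
    total_bad = sum(b for g, b in bins)
    iv = 0.0
    for goods, bads in bins:
        pct_good = goods / total_good
        pct_bad = bads / total_bad
        woe = math.log(pct_good / pct_bad)  # weight of evidence for the bin
        iv += (pct_good - pct_bad) * woe
    return iv

# Illustrative example: two variables, each binned into three categories.
# The first separates goods from bads sharply; the second barely at all.
strong = information_value([(400, 10), (300, 40), (100, 150)])
weak = information_value([(270, 60), (265, 70), (265, 70)])
```

A common rule of thumb treats variables with IV below roughly 0.02 as useless and above roughly 0.3 as strongly predictive; pruning a model down to the high-IV fields is one concrete way to implement the “less is more” approach Davis describes.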

Further to this, Davis noted that the greater the number of data fields scrutinised by a lender, the more burdensome the loan application process becomes. A smart data focus can lower such barriers, thus increasing customer conversion rates. Of course, the flip side of the argument pertains to how seamlessly borrower data can be sourced. Clearly a big data approach that relies upon borrowers manually providing heaps of personal information is unsustainable. Big data assessment will surely only function smoothly and effectively when the large quantities of required information can be gleaned through seamless technology solutions, rather than via manual input.

This point ties in with another crucial consideration: cost. There is of course a cost associated with the collection of most varieties of borrower data. Online lenders need to focus on the collection of demonstrably impactful data points; otherwise they are quite literally wasting their money hoarding information, much of which will bear no relevance to the success of their operations. And that wastefulness will likely have a negative impact on profitability, or perhaps, for the marketplace model, on investor returns.

For Creamfinance’s part, the transition to smart data appears to be running smoothly. Davis tells me that default rates over the past two years have fallen significantly in most of the countries covered by the platform. For context, 80% of the company’s decisioning is currently automatic, and there remains, according to Davis, space for that proportion to grow. Creamfinance’s platforms are currently lending over €4.5m a month, with an average scoring time of 50 seconds, an acceptance rate of 15-40% (depending on the market) and an expected loss rate of 3-10%.

So, how do we define a “smart data” approach? As a credit process that initially relied upon many thousands of data points but is now driven by just a few hundred, the unimportant factors having been whittled away over time. But how is that any different from the steady evolution of traditional credit models? There too we see large volumes of data compacted and leveraged; the only difference is that they are often baked into a singular outcome, such as a FICO score.

Patrick Koeck, Chief Risk Officer at Creamfinance, provided an answer: 

“The difference starts with the service execution: banks are more conservative and built on decision making based on traditional data usage. In this case, information from customers is collected partly in paper form, whereas non-bank lenders use electronic data instead. The difference between the methods is mainly reflected in risk management and service speed: due to more standardised electronic information, non-bank lenders are faster and more automated, which translates into speedier transactions.”

“Banks are built to give credit based on regulations that have existed for centuries, and such strict regulations do not allow banks to implement changes rapidly. Decision making is mainly supported by information given by a customer in paper form. Compared to banks, non-bank lenders are more data-driven: the service in many cases is supported by self-learning algorithms collecting electronic data, which minimises the need for human interference in decision making. This translates into better risk assessment and decision making. By utilising self-learning technology in risk assessment, companies are becoming more automated, disrupting existing banking and credit systems. That’s why Fintech is sometimes referred to as algorithm-based banking.”