The multi-armed bandit problem is a classic reinforcement learning problem that illustrates the exploration–exploitation tradeoff. To correctly estimate the causal effect of price changes or different financial product bundle offers on consumers’ purchasing decisions, the standard methodology is to perform a randomized controlled trial (RCT). However, RCTs can be expensive, and financial institutions are often reluctant to run them for the required period of time.
At AI Week TLV, Reuven Shnaps, Chief Analytics Officer at Earnix, discusses:
- The unique aspects of selling insurance & banking products
- The importance of estimating causal effects for pricing & product personalization
- How this can be achieved through a unique use of Contextual Multi-Armed Bandits that addresses some of the shortcomings of RCTs