
Model Analysis: Achieving Fairness Without Compromising Performance

Yuval Ben Dror, Data Science Researcher, Earnix & Eyal Bar Natan, Team Leader, Data Science, Earnix

January 5, 2025


Earnix Analytical and Technical Blog Series

What’s the best way to handle the analytical challenges facing insurers and banks today? We at Earnix believe it starts with asking the right questions, challenging assumptions, and finding innovative ways to move forward. 

This technical blog series is all about that process. We’re diving into some of the most pressing issues in financial analytics: how to tackle complex problems, develop smarter solutions, and stay ahead in a competitive industry.

Whether it’s dealing with biased data, balancing fairness and accuracy, or making machine learning models work better in the real world, each blog will take you through the core ideas behind these challenges and provide guidance on how best to address them.

These more technical posts are written for professionals with some background in actuarial science, data science, or analytics, but our goal is to keep things clear, straightforward, and easy to digest.

We hope you’ll find these insights useful and maybe even inspiring. Let’s get started with our first blog – Model Analysis. 

Introduction: Bias Is Inevitable 

In a perfect world, insurance data would be a completely random sample of the population and accurately reflect reality. In an even more perfect world, ML models would capture the true causal effects of each feature. However, in reality, data is often biased, whether due to uneven sampling (e.g., underrepresentation of certain demographics), or inherent selection effects (e.g., customers who choose certain policies being systematically different from those who do not). As a result (and due to the limitations of modeling algorithms), models tend to latch onto inherent biases in the dataset, leading to predictions based on correlations that don’t necessarily reflect causation. 

For example, consider a dataset in which dangerous drivers are overrepresented among women: 12% of the female drivers in the data are labeled dangerous, compared to only 5% of the male drivers. Many ML models would use this information to attribute higher risk to female drivers.

This is problematic, not only for ethical reasons but also because biased models may lack robustness and degrade over time. For example, a model might assign undue importance to the color of a vehicle simply because certain colors are temporarily associated with specific car types. However, trends change, and new car types may not maintain the same association with color that the model initially identified, leading to erroneous predictions. Beyond the ethical and logical reasons to avoid biases, regulatory restrictions may also prohibit the use of discriminatory models. 

 

See also: watch our webinar to learn how responsible AI helps ensure transparency and fairness in insurance.

 

Regulation Overview 

The insurance industry is facing increasing scrutiny over unfair pricing practices, with several regulatory bodies emphasizing the importance of fairness and transparency. The European Insurance and Occupational Pensions Authority (EIOPA) has proposed multiple metrics, including Demographic Parity and Equalized Odds, to assess fairness in insurance pricing, particularly to protect vulnerable consumers. They highlight the risks of practices like price-walking, where premiums increase without a corresponding rise in risk. EIOPA urges insurers using AI to prioritize fairness, ensuring that outcomes are non-discriminatory, especially when using sensitive data.  

Meanwhile, the UK’s Financial Conduct Authority (FCA) and the US Consumer Financial Protection Bureau (CFPB) have also addressed concerns regarding price discrimination, loyalty pricing, and the use of complex algorithms in financial services. These regulations seek to protect consumers from unfair pricing while recognizing that some pricing differences, such as those driven by price sensitivity, may not always be deemed unfair.

Clearly, insurers and banks need a tool designed to identify potential bias in ML models and mitigate discrimination while maintaining model performance. This is where the Earnix Model Analysis Lab comes into play. Before diving into our solution, we’ll give a brief overview of Earnix Labs.

Earnix Labs - Driving Innovation at Earnix 

Ever wonder how new ideas go from “Wouldn’t it be cool if…?” to tools you can actually use? At Earnix, we’re always asking questions like these—and Earnix Labs is how we find the answers. 

Earnix Labs is our playground for experimenting with fresh ideas and trying out features that could make a big difference for insurers and banks. Instead of waiting for a “perfect” solution, we make these features and tools available to our customers during the early development and ideation phase. The resulting customer feedback helps us fine-tune what works and scrap what doesn’t, so we can focus on what really matters: providing substantial value to our customers. 

Our Solution: Model Analysis 

In order to understand what the lab offers, let’s start by tackling the first question that comes to mind: Why not simply remove all sensitive features from the data? For example, why not just remove the gender feature entirely? While that approach might help, there could still be other features correlated with gender (e.g., car models) that indirectly introduce bias. For example, if a certain car model has 70% female drivers and is associated with relatively high risk, the model could indirectly discriminate based on gender. 

If bias can persist even after omitting sensitive features, how can we measure whether our model introduces it? There are several metrics to determine this (a short code sketch computing them follows the list):

  • Demographic Parity Difference – This metric measures the disparity in positive prediction rates across groups. In our earlier example, if the model predicts that 12% of women and only 5% of men are dangerous drivers, the demographic parity difference would be 7%.

  • Equalized Odds – This metric assesses whether the model's true positive rate (TPR) and false positive rate (FPR) are consistent across different groups. For example, if the model correctly predicts 80% of risky drivers (TPR) and incorrectly flags 10% of non-risky drivers (FPR) for women, but only 60% and 5% respectively for men, the difference in TPR is 20% and in FPR is 5%.  

  • Equal Opportunity – This metric ensures that the true positive rate (TPR) is equal across groups, focusing on fairness for those who qualify for a positive outcome. For instance, if the model correctly identifies 85% of good drivers for women but only 70% for men, the equal opportunity difference would be 15%.  
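If you want to experiment with these metrics yourself, here is a minimal sketch using the open-source Fairlearn package. The toy data, variable names, and the 12%/5% rates echoing our example are our own assumptions, not Earnix code or customer data:

```python
# Minimal sketch: computing the three fairness metrics with Fairlearn.
# The data below is synthetic and only mimics the 12% vs. 5% example.
import numpy as np
from fairlearn.metrics import (
    MetricFrame,
    demographic_parity_difference,
    equalized_odds_difference,
    true_positive_rate,
)

rng = np.random.default_rng(0)
gender = rng.choice(["female", "male"], size=1000)                   # sensitive feature
y_true = rng.binomial(1, np.where(gender == "female", 0.12, 0.05))   # 1 = dangerous driver
y_pred = rng.binomial(1, np.where(gender == "female", 0.12, 0.05))   # toy model predictions

# Demographic parity difference: gap in positive-prediction rates between groups
dpd = demographic_parity_difference(y_true, y_pred, sensitive_features=gender)

# Equalized odds difference: the larger of the TPR gap and the FPR gap
eod = equalized_odds_difference(y_true, y_pred, sensitive_features=gender)

# Equal opportunity: gap in true positive rates only
tpr_by_group = MetricFrame(
    metrics=true_positive_rate,
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender,
)
eo_diff = tpr_by_group.difference()

print(f"Demographic parity difference: {dpd:.3f}")
print(f"Equalized odds difference:     {eod:.3f}")
print(f"Equal opportunity difference:  {eo_diff:.3f}")
```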

One of the key features of the Model Analysis lab is its ability to provide metrics and visualizations that help uncover biases, based on the metrics described above and others. This is no small task; data is often intricate and filled with factors that can complicate analysis. Clear and precise visual metrics make it easier to quickly identify potential issues and determine how severe they are.

Alongside the provided metrics and visualizations that highlight biases in the data, we also offer insights into the model's mechanism and behavior through a feature importance analysis. This analysis is further augmented to emphasize the interaction between the most influential features and the sensitive features, providing a deeper understanding of potential biases. 
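To illustrate the idea (this is a sketch of the general technique, not the lab’s implementation), you could pair a standard permutation-importance run with a quick check of how the most influential features differ across a sensitive attribute. All feature names and data below are synthetic assumptions:

```python
# Hedged sketch: permutation importance plus a crude proxy-feature check.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Synthetic toy data (hypothetical feature names); "car_model_risk" is
# deliberately correlated with gender so it can act as a proxy feature.
rng = np.random.default_rng(3)
n = 2000
gender = rng.choice(["female", "male"], size=n)
X = pd.DataFrame({
    "driver_age": rng.integers(18, 80, size=n),
    "annual_mileage": rng.normal(12000, 3000, size=n),
    "car_model_risk": rng.normal(0, 1, size=n) + (gender == "female") * 0.8,
})
y = rng.binomial(1, np.where(gender == "female", 0.12, 0.05))

model = GradientBoostingClassifier().fit(X, y)

# Which features drive the model's predictions?
imp = permutation_importance(model, X, y, n_repeats=10, random_state=42)
top = pd.Series(imp.importances_mean, index=X.columns).sort_values(ascending=False)
print("Most influential features:\n", top)

# How does each influential feature vary across the sensitive groups?
# A large gap suggests the feature may act as a proxy for gender.
for col in top.index:
    print(col, X.groupby(gender)[col].mean().round(2).to_dict())
```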

Great! Now that we can measure bias in our model or data, what can we do to mitigate it? A straightforward approach might be to manually adjust either the data or the model outcomes according to a chosen metric. For our example, we’ll focus on Demographic Parity Difference. We could drop dangerous female drivers from the training data until their share falls from 12% to 5%, balancing the labeled rates across genders. However, that scheme risks losing valuable data and isn’t practical when dealing with multiple sensitive groups.

A better approach could be implementing a system that adjusts model predictions to account for biases, reducing the demographic parity difference. 
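One way to prototype that kind of prediction adjustment, shown here purely as an illustration rather than as the Model Analysis lab’s method, is Fairlearn’s ThresholdOptimizer, which re-thresholds an existing model’s scores so that positive-prediction rates are balanced across groups. The data and names below are synthetic:

```python
# Illustration only: post-processing a fitted model with Fairlearn's
# ThresholdOptimizer to reduce the demographic parity difference.
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import demographic_parity_difference

rng = np.random.default_rng(1)
gender = rng.choice(["female", "male"], size=2000)
X = np.column_stack([rng.normal(size=2000), (gender == "female").astype(float)])
y = rng.binomial(1, np.where(gender == "female", 0.12, 0.05))

base = LogisticRegression().fit(X, y)   # the original, unadjusted model

adjusted = ThresholdOptimizer(
    estimator=base,
    constraints="demographic_parity",   # the fairness target for the adjustment
    predict_method="predict_proba",
    prefit=True,                        # reuse the already-fitted model
)
adjusted.fit(X, y, sensitive_features=gender)
y_adj = adjusted.predict(X, sensitive_features=gender)

print("DPD before:", demographic_parity_difference(y, base.predict(X), sensitive_features=gender))
print("DPD after: ", demographic_parity_difference(y, y_adj, sensitive_features=gender))
```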

While these are possible solutions, they require manual, case-by-case tuning and could significantly affect model performance. That’s why the Model Analysis lab offers not only metrics and data visualizations to investigate bias, but also an automated solution to mitigate it while preserving model performance.

How Does It Work?  

The app uses an optimization algorithm that maximizes model performance while satisfying fairness constraints. It’s based on Fairlearn, a Microsoft open-source package that provides a dashboard and multiple algorithms for building new, bias-mitigated models.

For Demographic Parity Difference, the mitigation algorithm adjusts the weights in the loss function to ensure that the positive outcome rates are more balanced across groups. By assigning higher weights to individuals in groups with lower positive outcome rates, the model is nudged to predict positive outcomes for these groups more frequently. This reweighting process directly targets the disparity in outcomes, aiming to reduce the gap between groups while maintaining the overall accuracy of the model. The result? A new, adjusted model with minimal performance loss that meets fairness requirements. 
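As a rough sketch of what a reweighting-based mitigation looks like in Fairlearn, here is the package’s ExponentiatedGradient reduction; it is one of several Fairlearn algorithms and not necessarily the exact one the lab applies, and the data and names are synthetic:

```python
# Sketch: Fairlearn's ExponentiatedGradient reduction retrains the base model
# on iteratively reweighted samples until a demographic-parity constraint is
# (approximately) satisfied.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from fairlearn.metrics import demographic_parity_difference

rng = np.random.default_rng(2)
gender = rng.choice(["female", "male"], size=2000)
X = np.column_stack([rng.normal(size=2000), (gender == "female").astype(float)])
y = rng.binomial(1, np.where(gender == "female", 0.12, 0.05))

mitigator = ExponentiatedGradient(
    estimator=DecisionTreeClassifier(max_depth=4),  # any estimator accepting sample_weight
    constraints=DemographicParity(),                # could also use EqualizedOdds()
    eps=0.02,                                       # allowed constraint violation
)
mitigator.fit(X, y, sensitive_features=gender)
y_fair = mitigator.predict(X)

print("DPD after mitigation:",
      demographic_parity_difference(y, y_fair, sensitive_features=gender))
```

The eps parameter controls how much violation of the fairness constraint is tolerated, which is exactly the fairness-versus-accuracy dial described above.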

Returning to our example: after applying the optimization algorithm to the original model, we might see results like: 

  • 4% demographic parity difference with 87% predictive accuracy. 

Or: 

  • 5% demographic parity difference with 88% predictive accuracy. 

After selecting the best model, we can deploy it confidently, knowing it provides fairer outcomes for all customers while maintaining accuracy. 
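In practice, the selection step can be as simple as filtering candidates by a fairness tolerance and keeping the most accurate one. A tiny sketch using the hypothetical numbers above:

```python
# Toy selection: keep candidates within the fairness tolerance, pick the most accurate.
candidates = {
    "mitigated_a": {"dpd": 0.04, "accuracy": 0.87},
    "mitigated_b": {"dpd": 0.05, "accuracy": 0.88},
}
max_dpd = 0.05  # assumed business / regulatory tolerance

eligible = {name: m for name, m in candidates.items() if m["dpd"] <= max_dpd}
best = max(eligible, key=lambda name: eligible[name]["accuracy"])
print(best)  # -> mitigated_b
```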

 

Conclusion  

As we’ve seen, data is often messy and inherently biased, posing challenges for insurers and banks that must meet ethical and regulatory standards in their decision-making. At Earnix, we developed the Model Analysis Lab to address these biases in both data and modeling processes. Our primary goal is to automate bias detection and mitigation while ensuring that models remain optimized and compliant with necessary constraints and requirements. 

What are your greatest challenges when it comes to applying fairness in modeling? Contact us here to let us know.  

 

