Regression analysis isn’t just a statistical tool — it’s a strategic asset for financial institutions (FIs) navigating mounds of mortgage, consumer, and small business lending data and staying ahead of fair lending compliance requirements.
While it’s a game-changer, understanding how and when to apply fair lending regression analysis is crucial to maximizing its potential. Read on to learn everything you need to know to determine if it’s the right tool for your FI and how to integrate it into your fair lending program.
Want more regression analysis insights? Check out our Fair Lending Regression Analysis Primer.
What is regression analysis?
Regression analysis is a statistical method that estimates how an outcome relates to several variables at once. Rather than relying on line-by-line comparisons, organizations can evaluate data holistically to uncover patterns and explain disparities.
Regression analysis can be used in fair lending to analyze several factors — such as DTI (Debt-to-Income ratio), LTV (Loan-to-Value ratio), and credit score — to explain disparities and determine areas that need to be explored.
How does fair lending regression analysis work?
Many FIs face the challenge of navigating massive and complex data sets. When performed correctly, regression analysis can simplify data testing by identifying disparities that need further investigation to mitigate fair lending compliance risk. For example, multiple logistic regression is often used to analyze credit or approval decisions, while multiple linear regression is typically applied to pricing.
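As a rough illustration of the logistic case described above, the sketch below fits an approval model on synthetic data using only the Python standard library. Everything in it is an assumption made for illustration: the variable names, the standardized credit score and DTI controls, the simulated coefficients, and the hand-written Newton's method fit. A production analysis would normally use an established statistics package and real application data.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logistic(rows, outcomes, iterations=25):
    """Fit logistic regression by Newton's method (rows include an intercept column)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(rows, outcomes):
            z = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Synthetic applicants (hypothetical): standardized credit score and DTI,
# plus a 0/1 indicator for membership in a protected class.
rng = random.Random(0)
rows, approved = [], []
for _ in range(3000):
    score = rng.gauss(0, 1)
    dti = rng.gauss(0, 1)
    group = 1.0 if rng.random() < 0.3 else 0.0
    logit = 0.5 + 1.2 * score - 0.8 * dti - 0.6 * group  # assumed "true" model
    p_approve = 1.0 / (1.0 + math.exp(-logit))
    rows.append([1.0, score, dti, group])
    approved.append(1.0 if rng.random() < p_approve else 0.0)

beta = fit_logistic(rows, approved)
# exp(coefficient) is the odds ratio for approval, holding the controls constant.
print("group coefficient:", round(beta[3], 3))
print("approval odds ratio for protected group:", round(math.exp(beta[3]), 3))
```

An odds ratio well below 1 on the group indicator, after controlling for credit factors, is the kind of result that would prompt a closer file review rather than a conclusion on its own.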
Using regression analysis to identify fair lending risks
Underwriting and pricing are two areas of significant risk in lending. From potential biases in the loan underwriting process to unequal loan terms for similar applicants, discriminatory outcomes can spell trouble for lenders.
A lender can use regression analysis to estimate how protected factors (such as race or gender) are statistically related to the likelihood of a given outcome (e.g., approval or denial), controlling for legitimate credit factors.
Here are some other ways regression analysis can help lenders:
- In underwriting, regression predicts approval probabilities and evaluates whether actual decisions differ systematically from model expectations.
- In pricing, regression can be used to estimate expected annual percentage rate (APR) and note rates and assess whether actual pricing outcomes deviate systematically from those expectations.
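The pricing case can be sketched the same way with ordinary least squares. The example below is a minimal illustration on synthetic loans, not a recommended model specification: the field names, the standardized credit score and LTV controls, and the simulated 0.25-point APR difference are all assumptions chosen for the demonstration.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic originated loans (hypothetical): standardized credit score and LTV,
# plus a 0/1 indicator for membership in a protected class.
rng = random.Random(1)
rows, aprs = [], []
for _ in range(2000):
    score = rng.gauss(0, 1)
    ltv = rng.gauss(0, 1)
    group = 1.0 if rng.random() < 0.3 else 0.0
    # Assumed "true" pricing: better scores lower the APR, higher LTV raises it,
    # and the protected group pays 0.25 points more for no credit-related reason.
    apr = 6.0 - 0.5 * score + 0.3 * ltv + 0.25 * group + rng.gauss(0, 0.2)
    rows.append([1.0, score, ltv, group])
    aprs.append(apr)

coef = fit_ols(rows, aprs)
# coef[3] estimates the APR difference for the protected group
# after controlling for credit score and LTV.
print("estimated APR difference (percentage points):", round(coef[3], 3))
```

The coefficient on the group indicator is the quantity of interest: it estimates how much pricing differs for otherwise similar borrowers, which is what a raw average comparison cannot show.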
What data is needed for fair lending regression analysis?
The data required for regression analysis depends on an FI’s goals, risk profile, and lending type. While the core fields are often the same, the details shift depending on whether the focus is on underwriting, pricing, or a specific product line.
For most underwriting and loan pricing analyses, key data points include Credit Score, LTV, DTI, Interest Rate, APR, Branch ID, and Loan Term. These variables establish the baseline for how credit and pricing decisions are made across the portfolio.
From there, additional data are layered in depending on the type of lending. Mortgage reviews typically draw on Home Mortgage Disclosure Act Loan/Application Register (HMDA LAR) data, rate type, loan terms, and rate lock information. Auto lending requires vehicle-specific attributes such as age and mileage, along with buy rates and dealer details. Consumer lending analysis often centers on applicant demographics and the factors explicitly used in pricing or underwriting decisions.
Other factors can also shape credit decisions or pricing, often involving more subjective considerations such as special offers or promotional programs, borrower relationships, market-driven pricing differences, or the individual review of credit information.
| Category | Core/Key Data Fields | Additional/Lending-Specific Data | Subjective Factors |
| --- | --- | --- | --- |
| Underwriting & Pricing | Credit Score, Loan-to-Value (LTV), Debt-to-Income (DTI), Interest Rate, APR, Branch ID, Loan Term | N/A | Special offers or promotions, borrower relationships, market-driven pricing differences, and individual review of credit information |
| Mortgage Lending | Core fields above | HMDA LAR data, rate type, loan terms, rate lock information, etc. | Same as above |
| Auto Lending | Core fields above | Vehicle-specific attributes (age, mileage), buy rates, dealer information, etc. | Same as above |
| Consumer Lending | Core fields above | Credit type, rate set date, loan terms, etc. | Same as above |
Which lenders should use fair lending regression analysis?
Regression analysis is an ideal fit for FIs that:
- Have 1,000-plus applications in their data: High-volume and complex lenders benefit most from regression's analytical power.
- Have unexplained disparities: Basic statistical tests reveal many differences across groups, but without controls for legitimate credit factors, it’s unclear which disparities reflect actual fair lending risk.
- Need deeper data insights: Examiners have recommended more sophisticated analysis, or the FI is taking a proactive approach before an exam.
- Want to be ready for regulators: Examiners increasingly expect data-driven analyses that can quantify and explain disparities. Regression testing demonstrates analytical rigor and supports defensible fair lending conclusions.
What mistakes do organizations make when using regression analysis?
Even with powerful tools like regression analysis, lenders can undermine their own efforts by making mistakes, such as using a too-small dataset, analyzing inappropriate data, skipping basic fair lending reviews, or relying on poor-quality data.
If your FI can check “yes” next to any of the following statements, adjustments may be necessary before proceeding with regression analysis.
Your data volume is low
Regression estimates become unreliable with small datasets. Larger files yield more accurate and meaningful predictive models. A minimum of 1,000 records is generally recommended, as organizations with fewer records often find that regression analysis offers limited value.
The data contains minimal variation
Regression requires variation in both dependent and independent variables. It will not work effectively if, for example:
- All or none of the applicants belong to a protected class
- All or none of the applicants were denied
- All or none of the denied applicants belong to a protected class
- There is no variation in interest rates or pricing
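The four conditions in the list above can be checked mechanically before any model is fit. The function below is a minimal sketch; the record layout (`protected`, `denied`, and `rate` keys) is a hypothetical structure chosen for illustration, not a standard file format.

```python
def variation_issues(records):
    """Return a list of reasons the data lacks the variation regression needs."""
    issues = []

    # All or none of the applicants belong to a protected class.
    protected = {r["protected"] for r in records}
    if len(protected) < 2:
        issues.append("no variation in protected-class membership")

    # All or none of the applicants were denied.
    denied = {r["denied"] for r in records}
    if len(denied) < 2:
        issues.append("no variation in approve/deny outcome")

    # All or none of the denied applicants belong to a protected class.
    denied_protected = {r["protected"] for r in records if r["denied"]}
    if len(denied) == 2 and len(denied_protected) < 2:
        issues.append("denied applicants are all inside or all outside the protected class")

    # No variation in interest rates or pricing.
    rates = {r["rate"] for r in records if r["rate"] is not None}
    if len(rates) < 2:
        issues.append("no variation in interest rates")

    return issues

# Hypothetical sample: outcomes vary, but every priced loan has the same rate.
sample = [
    {"protected": True, "denied": False, "rate": 6.5},
    {"protected": False, "denied": False, "rate": 6.5},
    {"protected": True, "denied": True, "rate": None},
    {"protected": False, "denied": True, "rate": None},
]
flat_pricing_issues = variation_issues(sample)
print(flat_pricing_issues)
```

An empty list means none of the four conditions applies; it does not by itself mean the data is suitable for regression.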
Your data quality is poor
Regression relies on clean, accurate data. Rushing into analysis without reviewing data integrity undermines results. At a minimum, ensure that every record includes the price (rate/APR) and the action taken, and that the file contains no invalid values, such as negative credit scores or other obvious errors.
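A basic integrity pass along these lines might look like the sketch below. The field names (`rate`, `action_taken`, `credit_score`) and the 300 to 850 score range are assumptions for illustration; an FI would substitute its own file layout and the valid range for the scoring model it actually uses.

```python
def quality_issues(records, min_score=300, max_score=850):
    """Return (record index, problem) pairs for the minimum checks described above."""
    issues = []
    for i, r in enumerate(records):
        # Every record should carry a price (rate/APR).
        if r.get("rate") is None:
            issues.append((i, "missing rate/APR"))
        # Every record should carry the action taken.
        if not r.get("action_taken"):
            issues.append((i, "missing action taken"))
        # Flag impossible values such as negative credit scores.
        score = r.get("credit_score")
        if score is None or not (min_score <= score <= max_score):
            issues.append((i, "credit score missing or out of range"))
    return issues

# Hypothetical records: the first is clean, the other two have problems.
loans = [
    {"rate": 6.5, "action_taken": "originated", "credit_score": 720},
    {"rate": None, "action_taken": "originated", "credit_score": 680},
    {"rate": 7.1, "action_taken": "", "credit_score": -5},
]
problems = quality_issues(loans)
for index, problem in problems:
    print(f"record {index}: {problem}")
```

Flagged records are candidates for correction or documented exclusion before the analysis is run.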
Your data points don't have a direct effect on the tested outcome
Variables included in the model should have a direct relationship with the outcome being tested. For example, in underwriting analysis, including a variable such as lock term would invalidate the results, since lock term should not influence underwriting decisions.
How should organizations use regression analysis? Tips and best practices
- Know your goal. Start with a question, not a dataset. Regression should be applied purposefully to serve a specific compliance or business objective.
- Prepare your data. Clean, accurate data is essential. Review your dataset for quality and integrity.
- Focus on lending types suited for regression. While regression can be applied to any loan dataset, it is most commonly used for HMDA analysis, indirect auto lending, credit cards, and consumer loans, where there is variation in the products, product terms, and outcomes. For example, regression would not be helpful if the FI uses a flat pricing matrix where all borrowers receive the same interest rate.
- Maximize your resources. Automated fair lending compliance tools can help streamline your processes and simplify lending data analysis.
Want to learn more about regression analysis? Check out our compliance primer for a detailed breakdown of regression analysis for fair lending.
