FL Explainer

Credit scoring

Vertical Federated Learning for Credit Scoring Models Without Label Leakage

Industry

Banking sector, financial services, scoring products

Technology

Vertical Federated Learning (VFL)

Data type

Tabular (structured) data

ML models

Gradient Boosting (GBDT)

ML task

Credit scoring

Product

Guardora VFL

Customer profile

The customer is a vendor of analytical and predictive-analytics products that provides credit-scoring services to commercial banks.

The vendor holds an extensive body of data about users and, on this basis, builds creditworthiness assessment models for financial institutions.

The vendor’s clients use these models to make credit decisions, manage risk, and counter fraud.

Customer characteristics:
  • High requirements for compliance with personal-data protection legislation

  • Need to enrich models with external data sources without compromising confidentiality

  • Use of predictive analytics for credit scoring and risk assessment

Problem statement

The analytics-product vendor provides credit-scoring services to commercial banks. In the current process, the joint model is built using an ensembling method (stacking): each party trains its own model on its own data, after which the outputs of both models are combined into a final score.
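
For illustration, a minimal sketch of how such a stacking scheme might look in code (all names and hyperparameters here are assumptions for illustration, not the vendor’s actual pipeline):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

def train_stacking(X_bank, X_vendor, y):
    """Illustrative stacking: each party trains on its own features, a meta-model blends the scores."""
    # The bank trains on its own features and its own labels y.
    bank_model = GradientBoostingClassifier(n_estimators=100, max_depth=6).fit(X_bank, y)
    # The vendor trains on its features, but this step also needs the bank's labels y in the clear.
    vendor_model = GradientBoostingClassifier(n_estimators=100, max_depth=6).fit(X_vendor, y)

    # Combine the two model outputs into a final score with a simple meta-model
    # (out-of-fold predictions would normally be used here; kept simple for illustration).
    scores = np.column_stack([
        bank_model.predict_proba(X_bank)[:, 1],
        vendor_model.predict_proba(X_vendor)[:, 1],
    ])
    meta_model = LogisticRegression().fit(scores, y)
    return bank_model, vendor_model, meta_model
```

The second `fit` call is exactly where the difficulty described below arises: the vendor-side model cannot be trained without receiving the bank’s labels.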

To train the model on the vendor’s side, the labels (the target variable) must be transferred from the bank in the clear. The labels are confidential bank information containing the historical creditworthiness data of its clients. Transferring them therefore requires an agreement between the parties and adherence to internal information-security and compliance procedures.

The vendor’s core question: can different companies build a single ML model without disclosing their data to each other, so that the final model uses the datasets of all parties and its quality exceeds that of the individual local models?

Three key challenges:

  • Label confidentiality: the bank is not willing to transfer the target variable to an external partner.
  • Data decentralisation: features are distributed between the two parties.
  • Regulatory constraints: data transfer is restricted by law.

Testing objective: to compare the quality of a model trained with Vertical Federated Learning (VFL) against the quality of each party’s local model and of the two-model ensemble (stacking), under the condition that within VFL the labels are never transferred between parties.
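
In code, this comparison amounts to scoring each approach on a common hold-out set with ROC AUC, roughly as in this illustrative helper (the names are assumptions, not part of the test harness described here):

```python
from sklearn.metrics import roc_auc_score

def compare_approaches(y_holdout, scores_by_approach):
    """Rank approaches (local models, stacking, VFL) by ROC AUC on the same hold-out clients."""
    aucs = {name: roc_auc_score(y_holdout, scores)
            for name, scores in scores_by_approach.items()}
    return sorted(aucs.items(), key=lambda item: item[1], reverse=True)
```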

Solution

To solve this problem, the Guardora VFL platform was deployed: a vertical federated learning solution in which two parties jointly train an ML model without transferring raw data or labels.

Participants and data distribution

The test was conducted by simulating two parties:

01/

Active party (bank): holds the labels (the target variable, a binary creditworthiness indicator 0/1), the client identifiers, and some of the features.

02/

Passive party (vendor): holds ~200 features for the same clients, but has no labels.
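
Schematically, this vertical data split can be pictured as two tables that share only client identifiers (the column names and values below are purely illustrative):

```python
import pandas as pd

# Active party (bank): labels, client identifiers, and some of the features.
bank_df = pd.DataFrame({
    "client_id": [101, 102, 103],
    "income":    [52_000, 61_000, 43_000],   # some of the bank's own features
    "label":     [0, 1, 0],                  # binary creditworthiness indicator
})

# Passive party (vendor): ~200 features for the same clients, no labels.
vendor_df = pd.DataFrame({
    "client_id":   [101, 102, 103],
    "feature_001": [0.42, 0.13, 0.87],       # two of the ~200 vendor features shown
    "feature_002": [3, 7, 1],
})

# In VFL the two tables are never merged in the clear; records are only
# aligned by client_id during training.
```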

Models and data protection

The testing was performed using a Gradient Boosting on Decision Trees (GBDT) model (100 trees with a maximum depth of 6), the primary model class for working with tabular data.

Data protection is provided under two security modes:

  • Homomorphic encryption of gradients (the Paillier scheme): the maximum level of protection, in which neither raw data nor gradients leave a party’s perimeter in the clear (see the sketch after this list).
  • No-encryption mode: raw data is still not transferred, but gradients are exchanged in the clear; this mode provides high training speed.
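
As a rough illustration of the encrypted mode, the sketch below uses the open-source `phe` (python-paillier) library to show additively homomorphic aggregation of per-record gradients; the library choice and the aggregation step are assumptions for illustration, not a description of Guardora’s internals:

```python
from phe import paillier  # python-paillier: additively homomorphic Paillier encryption

# The label-holding (active) party generates the key pair and encrypts per-record gradients.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)
gradients = [0.12, -0.37, 0.05, 0.41]                   # illustrative per-record gradients
encrypted = [public_key.encrypt(g) for g in gradients]

# The other party can aggregate ciphertexts (e.g. per candidate tree split)
# without ever seeing the plaintext gradients.
encrypted_sum = encrypted[0]
for ciphertext in encrypted[1:]:
    encrypted_sum += ciphertext

# Only the active party, which holds the private key, can decrypt the aggregate.
print(private_key.decrypt(encrypted_sum))               # ≈ 0.21
```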

Results

Comparison of approaches

The table below presents the test results:

Approach | ROC AUC, % | Label transfer
Local model of party A | 67.6 | Not required
Local model of party B | 70.1 | Required
Ensemble of two models (stacking) | 71.3 | Required
Guardora VFL (GBDT) | ≈ 81.3 | Not required

Note that within the test scenario, party B’s local model could only be computed using the labels, which is why label transfer is marked as required for that approach.

Key result: after hyper-parameter optimisation, the VFL model achieved quality comparable to the two-model ensemble (stacking) and outperformed the local models of each party. Labels were not transferred between participants.

Thus, in the modelled scenario, the vendor’s clients (the banks) obtain a model at the ensemble-quality level without transferring labels.

The training and inference speed of the federated model is comparable to that of a standard scoring model:

  • Training the GBDT model without encryption on 300,000 records takes less than 9 minutes.
  • Training the gradient-boosting model with encryption on 50,000 records takes approximately 1.4 hours.
  • Inference speed is optimised: 650 requests per second in a multi-threaded implementation (0.008 s per request).

Detailed experimental results

The table below shows the ROC AUC of the federatively trained gradient-boosting model for different numbers of data rows (samples), with and without encryption.

No | Sample size | Encryption | ROC AUC, % | Time
1 | 10K | - | 76.72 | 36.5 s
2 | 50K | - | 79.15 | 1.5 m
3 | 100K | - | 79.06 | 2.9 m
4 | 200K | - | 79.77 | 5.9 m
5 | 300K | - | 78.82 | 8.9 m
6 | 10K | + | 76.32 | 38 m
7 | 50K | + | 78.78 | 1.4 h
8 | 400K | + | 81.30 | 13 h

*The results were obtained in experiments with varying model hyperparameters.

Key conclusions

The quality of the VFL model is comparable to the two-model ensemble (stacking) and exceeds the quality of each party’s local model, while labels are not transferred between training participants. Federated learning uses the data of both parties to build a unified model without violating confidentiality requirements.

The training speed without encryption is high, and with encryption it remains acceptable for industrial use.

The solution has undergone full-scale testing within the perimeter of a major technology company on real-world data.