What is vertical federated learning (VFL) for credit scoring?

Vertical federated learning (VFL) lets two organizations jointly train a credit scoring model when each side holds different features about the same clients. In the banking use case, the bank holds client identifiers, some features, and the labels (creditworthiness 0/1), while an external analytics vendor holds client identifiers, behavioral and socio-demographic features. VFL produces a single joint model without either side transferring raw data or labels to the other. Only intermediate gradients (optionally encrypted) cross the network.
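As a rough illustration of what "different features about the same clients" means, the sketch below aligns two toy tables on shared client identifiers. All field names and values are hypothetical. The source does not describe how identifiers are matched in practice; a real deployment would typically compute the intersection with a privacy-preserving protocol rather than by comparing identifiers directly.

```python
# Hypothetical toy data: every identifier, field name, and value is made up.
bank_rows = {
    "c1": {"income": 52_000, "label": 1},
    "c2": {"income": 31_000, "label": 0},
    "c3": {"income": 78_000, "label": 1},
}
vendor_rows = {
    "c2": {"app_sessions": 14, "age_band": 3},
    "c3": {"app_sessions": 2, "age_band": 5},
    "c4": {"app_sessions": 9, "age_band": 2},
}


def shared_ids(a, b):
    """Return the sorted client IDs that are present on both sides."""
    return sorted(set(a) & set(b))


ids = shared_ids(bank_rows, vendor_rows)

# Each party builds its own aligned view in the same client order.
# The two views are never merged into a single table.
bank_view = [(cid, bank_rows[cid]["income"], bank_rows[cid]["label"]) for cid in ids]
vendor_view = [
    (cid, vendor_rows[cid]["app_sessions"], vendor_rows[cid]["age_band"]) for cid in ids
]
```

Only clients known to both parties take part in training, and the label column exists only in the bank's view.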

Can a bank and an analytics vendor train a joint credit scoring model without the bank sharing its labels?

Yes. On the Guardora VFL platform, the bank keeps the target variable (historical creditworthiness labels) entirely on its side. The vendor retains behavioral and socio-demographic features on its side. The platform coordinates the exchange of homomorphically encrypted gradients over gRPC with a TLS channel, so the resulting GBDT model leverages information from both parties while neither party gains access to the other’s raw data or labels. In the described case, the resulting model achieved a ROC AUC of approximately 81.3, outperforming a standard ensemble (stacking), which additionally requires sharing the labels.

How does Guardora VFL protect bank labels and vendor features?

Guardora VFL supports two security modes. In the maximum-protection mode, gradients are encrypted using the Paillier homomorphic encryption scheme, so gradients never leave the parties’ perimeters in plaintext. In the high-performance mode, the raw data and labels still remain within their respective perimeters, but gradients are transmitted in the clear. In both modes, raw features, personal records, and labels remain within the environment of the party that owns them.

What model quality does Guardora VFL achieve on credit scoring compared to local and ensemble models?

In the described credit-scoring use case on tabular data with GBDT (100 trees, max depth 6), Guardora VFL achieved ROC AUC ≈ 81.3, outperforming a stacking ensemble of two separately trained models (71.3), the vendor's local model (70.1), and the bank's local model (67.6). In addition, the stacking ensemble requires label transfer between parties, while Guardora VFL does not.

How fast is Guardora VFL training and inference for credit scoring?

Training a GBDT model without encryption on 300,000 records takes less than 9 minutes. Training with Paillier homomorphic encryption on 50,000 records takes approximately 1.4 hours; encryption does not affect model quality but increases training time. Inference is optimized for production: 650 requests per second in a multi-threaded implementation (about 0.008 seconds per request), comparable to standard non-federated scoring models.

Is Guardora VFL suitable for regulated industries such as banking?

Yes. Guardora VFL was designed for scenarios where personal-data protection legislation and internal information-security procedures restrict data transfer between parties. Because raw records and labels never leave the perimeter of the party that owns them, the joint training process avoids the compliance review, data-processing agreements, and regulatory risk that accompany raw-data transfer. The solution has been tested on real-world data inside the perimeter of a major technology company.

What machine learning models and data types does Guardora VFL support?

The credit scoring case study uses Gradient Boosting on Decision Trees (GBDT), the standard model for tabular data in financial services. Guardora VFL is purpose-built for tabular (structured) data scenarios where features are vertically partitioned across two parties that share the same set of client identifiers but hold different attributes about each client.

How does Guardora VFL differ from FATE, OpenFL, and Flower for credit scoring?

FATE, OpenFL, and Flower are open-source federated learning frameworks that require significant engineering effort to build a production two-party VFL workflow with homomorphic encryption, gradient coordination, and inference serving. Guardora VFL is a commercial, purpose-built VFL platform with a ready-to-deploy bank-vendor workflow, tested performance benchmarks on credit scoring data (ROC AUC up to 81.3 on GBDT), production-grade inference (650 req/sec), and built-in Paillier encryption. The tradeoff is commercial licensing versus the zero license cost of open-source frameworks.


Vertical Federated Learning for Credit Scoring Models Without Label Leakage

Industry: Banking sector, financial services, scoring products
Technology: Vertical Federated Learning (VFL)
Data type: Tabular (structured) data
ML models: Gradient Boosting (GBDT)
ML task: Credit scoring
Product: Guardora VFL

Customer profile

The customer is a vendor company in the field of analytical products and predictive analytics that provides credit-scoring services to commercial banks.

The vendor holds an extensive body of data about users and, on this basis, builds creditworthiness assessment models for financial institutions.

The vendor’s clients use these models to make credit decisions, manage risk, and counter fraud.

Customer characteristics:

High requirements for compliance with personal-data protection legislation
Need to enrich models with external data sources without compromising confidentiality
Use of predictive analytics for credit scoring and risk assessment

Problem statement

The analytics-product vendor provides credit-scoring services to commercial banks. In the current process, the joint model is built using an ensembling method (stacking): each party trains its own model on its own data, after which the outputs of both models are combined into a final score.

To train the model on the vendor’s side, the labels (the target variable) must be transferred from the bank in the clear. The labels are confidential bank information containing the historical creditworthiness data of its clients. Transferring the labels requires agreement between the parties and compliance with internal information-security and compliance procedures.
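The stacking process described above can be sketched as a small meta-model that combines the two parties' scores. This is a minimal illustration, not the vendor's actual pipeline: the scores and labels are hypothetical, and the logistic meta-model fitted by plain gradient descent is an assumed choice of combiner.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta(score_pairs, labels, lr=0.5, epochs=2000):
    """Fit a logistic meta-model (w1, w2, b) on two base-model scores."""
    w1 = w2 = b = 0.0
    n = len(labels)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (s1, s2), y in zip(score_pairs, labels):
            err = sigmoid(w1 * s1 + w2 * s2 + b) - y  # log-loss gradient
            g1 += err * s1
            g2 += err * s2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


# Hypothetical scores produced by each party's own base model.
bank_scores = [0.2, 0.4, 0.3, 0.6, 0.7, 0.8]
vendor_scores = [0.1, 0.4, 0.2, 0.5, 0.9, 0.7]
# The vendor's base model can only be trained if these labels are shared.
labels = [0, 0, 0, 1, 1, 1]

pairs = list(zip(bank_scores, vendor_scores))
w1, w2, b = fit_meta(pairs, labels)
final_scores = [sigmoid(w1 * s1 + w2 * s2 + b) for s1, s2 in pairs]
```

The point of the sketch is the dependency on `labels`: every supervised step, including the vendor's base model, needs the bank's target variable.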

The vendor’s core question: can different companies build a single ML model without disclosing data to each other, so that the final model uses the datasets of all parties, and the model quality is better than the quality of the individual local models?

Three key challenges:

Label confidentiality: the bank is not willing to transfer the target variable to an external partner.
Data decentralisation: features are distributed between the two parties.
Regulatory constraints: data transfer is restricted by law.

Testing objective: to compare the quality of a model trained using Vertical Federated Learning (VFL) with the quality of each party’s local model, and with the quality of the two-model ensemble (stacking), on the condition that within VFL, the labels are not transferred between parties.

Solution

To solve this problem, the Guardora VFL platform was deployed: a vertical federated learning solution in which two parties jointly train an ML model without transferring raw data or labels.

Participants and data distribution

The test simulated two parties:

01/ Active party (bank): holds the labels (the target variable, a binary creditworthiness indicator 0/1), the client identifiers, and some of the features.

02/ Passive party (vendor): holds ~200 features for the same clients but has no labels.
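The division of roles between the two parties can be sketched with one common design for federated gradient boosting, in which the label owner computes gradients and the other party aggregates them per feature bin. This is an assumption for illustration only: the source does not describe Guardora VFL's internal protocol, and all function names and data below are hypothetical.

```python
import math


def grad_hess(labels, raw_scores):
    """Active party (label owner): log-loss gradient and Hessian per client."""
    out = []
    for y, s in zip(labels, raw_scores):
        p = 1.0 / (1.0 + math.exp(-s))
        out.append((p - y, p * (1.0 - p)))
    return out


def passive_histogram(bin_ids, gh, n_bins):
    """Passive party: sum (g, h) per feature bin without ever seeing labels."""
    hist = [[0.0, 0.0] for _ in range(n_bins)]
    for b, (g, h) in zip(bin_ids, gh):
        hist[b][0] += g
        hist[b][1] += h
    return hist


def best_split(hist, lam=1.0):
    """Active party: scan bin boundaries, return (gain, last bin on the left)."""
    G = sum(g for g, _ in hist)
    H = sum(h for _, h in hist)
    best = (0.0, None)
    gl = hl = 0.0
    for b in range(len(hist) - 1):
        gl += hist[b][0]
        hl += hist[b][1]
        gr, hr = G - gl, H - hl
        gain = 0.5 * (gl * gl / (hl + lam) + gr * gr / (hr + lam) - G * G / (H + lam))
        if gain > best[0]:
            best = (gain, b)
    return best


labels = [0, 0, 0, 1, 1, 1]       # held by the bank only
scores = [0.0] * 6                # initial raw predictions
vendor_bins = [0, 0, 1, 2, 2, 2]  # vendor's binned feature, same client order
gh = grad_hess(labels, scores)    # in an encrypted mode these would be ciphertexts
hist = passive_histogram(vendor_bins, gh, n_bins=3)
gain, split_bin = best_split(hist)
```

In this toy run the best boundary falls after bin 1, which separates the three negative clients from the three positive ones even though the vendor never sees a label.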

Models and data protection

Testing used a Gradient Boosting on Decision Trees (GBDT) model with 100 trees and a maximum depth of 6. GBDT is the primary model for working with tabular data.

Data protection is provided in two security modes:

Homomorphic encryption of gradients (Paillier algorithm): the maximum level of protection. Raw data never leaves the owner’s perimeter, and gradients leave it only in encrypted form.
No-encryption mode: raw data is still not transferred, but gradients are transmitted in the clear, which gives high training speed.
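To show why Paillier encryption suits gradient exchange, here is a toy implementation of its additive homomorphism: ciphertexts can be summed without decrypting them. It is a sketch under stated assumptions, not Guardora's implementation. The primes are far too small to be secure, and the fixed-point `SCALE` used to encode float gradients is an illustrative choice.

```python
import math
import random

SCALE = 10**6  # fixed-point scale for float gradients (illustrative choice)


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy key generation from two known Mersenne primes (NOT secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (phi, mu)


def encrypt(n, value):
    """Encrypt a float with the public key n."""
    m = round(value * SCALE) % n  # negative values wrap around modulo n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % (n * n)


def decrypt(n, priv, c):
    phi, mu = priv
    m = (pow(c, phi, n * n) - 1) // n * mu % n
    if m > n // 2:  # map the upper half of the range back to negatives
        m -= n
    return m / SCALE


n, priv = keygen()
gradients = [0.5, -0.25, 0.125]
ciphertexts = [encrypt(n, g) for g in gradients]

# A party without the private key can still aggregate the gradients.
total = ciphertexts[0]
for c in ciphertexts[1:]:
    total = add(n, total, c)

gradient_sum = decrypt(n, priv, total)
```

Only the key holder can read `gradient_sum`; the aggregating party works on ciphertexts throughout. The big-integer exponentiations are also the reason encrypted training is so much slower than training in the clear.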

Results

Comparison of approaches

The table below presents the test results:

Approach	ROC AUC	Label transfer
Local model of party A (bank)	67.6	Not required
Local model of party B (vendor)	70.1	Required
Ensemble of two models (stacking)	71.3	Required
Guardora VFL (GBDT)	≈ 81.3	Not required

Party B does not own the labels, so training its local model in the test scenario required the bank’s labels; this is why label transfer is marked as required for that row.
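For readers less familiar with the metric in the table, ROC AUC can be read as the probability that a randomly chosen positive client is scored above a randomly chosen negative one. The sketch below computes it directly from that definition on hypothetical scores. Reading the table values as ROC AUC multiplied by 100 is an assumption, although values such as 81.3 only make sense on that scale.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
auc_x100 = 100 * auc  # the scale the table appears to use
```

The pairwise loop is quadratic and meant only for clarity; rank-based formulas give the same value far faster on datasets of the size used in the test.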

Key result: after hyper-parameter optimisation, the VFL model (ROC AUC ≈ 81.3) outperformed both the two-model ensemble (stacking, 71.3) and the local models of each party (67.6 and 70.1). Labels were not transferred between participants.

Thus, in the modelled scenario, the vendor’s clients (the banks) obtain a model of at least ensemble-level quality without transferring labels.

The training and inference speeds of the federated model are comparable to those of a standard scoring model:

Training the GBDT model without encryption on 300,000 records takes less than 9 minutes.
Training the gradient-boosting model with encryption on 50,000 records takes approximately 1.4 hours.
Inference speed is optimised: 650 requests per second in a multi-threaded implementation (0.008 s per request).
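The two inference figures can be related with Little's law (requests in flight = throughput × time per request). This is a back-of-the-envelope reading, assuming that 0.008 s is the latency of an individual request; the source does not say how that figure was measured or how many threads were used.

```python
throughput_rps = 650.0       # from the text
seconds_per_request = 0.008  # from the text; assumed to be per-request latency

# Little's law: average number of requests being processed at the same time.
in_flight = throughput_rps * seconds_per_request

# What a single worker could serve alone at that latency.
single_worker_rps = 1.0 / seconds_per_request

print(f"requests in flight: {in_flight:.1f}")
print(f"single-worker throughput: {single_worker_rps:.0f} req/s")
```

Under that assumption, roughly five requests are processed concurrently, which is consistent with the multi-threaded implementation mentioned above.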

Detailed experimental results

The table below shows the ROC AUC of the federated gradient-boosting model and its training time for different numbers of data rows, with and without encryption.

No.	Rows	Encryption	ROC AUC	Time
1	10K	No	76.72	36.5 s
2	50K	No	79.15	1.5 min
3	100K	No	79.06	2.9 min
4	200K	No	79.77	5.9 min
5	300K	No	78.82	8.9 min
6	10K	Yes	76.32	38 min
7	50K	Yes	78.78	1.4 h
8	400K	Yes	81.30	13 h

*The results were achieved through experiments involving changes to the model hyperparameters.
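The cost of encryption can be estimated from the table itself. The sketch below transcribes the reported times into seconds and computes the slowdown for the sample sizes measured in both modes, plus the unencrypted time per 1,000 rows. The inputs are the rounded figures from the table, so the derived numbers are approximate.

```python
# (rows, encrypted, seconds) transcribed from the table above.
runs = [
    (10_000, False, 36.5),
    (50_000, False, 1.5 * 60),
    (100_000, False, 2.9 * 60),
    (200_000, False, 5.9 * 60),
    (300_000, False, 8.9 * 60),
    (10_000, True, 38 * 60),
    (50_000, True, 1.4 * 3600),
    (400_000, True, 13 * 3600),
]

plain = {rows: t for rows, enc, t in runs if not enc}
encrypted = {rows: t for rows, enc, t in runs if enc}

# Slowdown factor for sample sizes measured both with and without encryption.
slowdown = {
    rows: encrypted[rows] / plain[rows]
    for rows in sorted(plain.keys() & encrypted.keys())
}

# Unencrypted training time per 1,000 rows.
per_1k_plain = {rows: t / (rows / 1000) for rows, t in plain.items()}

for rows, factor in slowdown.items():
    print(f"{rows:>7} rows: encryption is ~{factor:.0f}x slower")
```

On these figures, encrypted training is roughly 56 to 62 times slower at 10K and 50K rows, while unencrypted training scales close to linearly at about 1.7 to 1.8 seconds per 1,000 rows from 50K rows upward.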

Key conclusions

The quality of the VFL model exceeds that of both the two-model ensemble (stacking) and each party’s local model, while labels are not transferred between training participants. Federated learning uses the data of both parties to build a unified model without violating confidentiality requirements.

The training speed without encryption is considered high, and with encryption, acceptable for industrial use.

The solution has undergone full-scale testing within the perimeter of a major technology company on real-world data.
