STRIKE: Additive Feature-Group-Aware Stacking Framework for Credit Default Prediction

Published 19 Apr 2026 in cs.LG | (2604.17622v1)

Abstract: Credit risk default prediction remains a cornerstone of risk management in the financial industry. The task involves estimating the likelihood that a borrower will fail to meet debt obligations, an objective critical for lending decisions, portfolio optimization, and regulatory compliance. Traditional machine learning models such as logistic regression and tree-based ensembles are widely adopted for their interpretability and strong empirical performance. However, modern credit datasets are high-dimensional, heterogeneous, and noisy, increasing overfitting risk in monolithic models and reducing robustness under distributional shift. We introduce STRIKE (Stacking via Targeted Representations of Isolated Knowledge Extractors), a feature-group-aware stacking framework for structured tabular credit risk data. Rather than training a single monolithic model on the complete dataset, STRIKE partitions the feature space into semantically coherent groups and trains independent learners within each group. This decomposition is motivated by an additive perspective on risk modeling, where distinct feature sources contribute complementary evidence that can be combined through a structured aggregation. The resulting group-specific predictions are integrated through a meta-learner that aggregates signals while maintaining robustness and modularity. We evaluate STRIKE on three real-world datasets spanning corporate bankruptcy and consumer lending scenarios. Across all settings, STRIKE consistently outperforms strong tree-based baselines and conventional stacking approaches in terms of AUC-ROC. Ablation studies confirm that performance gains stem from meaningful feature decomposition rather than increased model complexity. Our findings demonstrate that STRIKE is a stable, scalable, and interpretable framework for credit risk default prediction tasks.


Authors (3)

Summary

The paper introduces STRIKE, a novel stacking framework that leverages semantic feature grouping to enhance credit default prediction.
It employs a low-capacity additive meta-learner to merge base model predictions, outperforming traditional monolithic approaches by significant AUC margins.
Empirical evaluations on datasets like Polish Bankruptcy, LendingClub, and HomeCredit confirm superior robustness, transparency, and predictive accuracy.

Additive Feature-Group-Aware Stacking for Credit Default Prediction: The STRIKE Framework

Introduction

Credit risk modeling confronts high-dimensional, heterogeneous, and noisy tabular data, especially in contemporary financial applications. Prevailing machine learning approaches, including tree-based ensembles and neural methods, have achieved notable empirical results but struggle with interpretability, robustness under distributional shift, and susceptibility to feature interference. These problems are exacerbated when features from disparate origins (demographics, bureau records, delinquencies, and vintage signals) are pooled within a single monolithic model. The paper "STRIKE: Additive Feature-Group-Aware Stacking Framework for Credit Default Prediction" (2604.17622) introduces an approach that explicitly exploits domain semantics through feature-grouped stacking, offering improved predictive performance and the transparency needed in both operational and regulatory contexts of credit risk analysis.

Methodological Framework

Structured Feature Grouping

STRIKE decomposes the feature space into semantically coherent groups (e.g., Demographics, Vintage, Delinquency). The partition can be defined from domain knowledge or constructed by automated clustering (e.g., based on mutual information or correlation). Each group is modeled by a diverse set of base learners (XGBoost, LightGBM, Random Forest, Logistic Regression), and stratified K-fold cross-validation is used to generate unbiased out-of-fold (OOF) predictions, so that every training row is scored by a model that was not fit on it. The explicit decomposition is theoretically motivated by an additive log-odds perspective on binary classification, in which the feature groups are treated as approximately conditionally independent given the outcome.
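
The per-group OOF step can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. The synthetic data, the group-to-column mapping, the two candidate learners, and the rule of keeping the candidate with the highest OOF AUC are all illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in data: 6 features with a hypothetical group assignment.
rng = np.random.default_rng(0)
n_samples = 400
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, 6)) + 0.8 * y[:, None]
feature_groups = {"demographics": [0, 1], "vintage": [2, 3], "delinquency": [4, 5]}


def make_candidates():
    """Fresh, unfitted candidate base learners (an assumed, reduced set)."""
    return {
        "logreg": LogisticRegression(max_iter=1000),
        "forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }


# The same stratified folds are reused for every group and every learner.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

oof_by_group = {}
for group, cols in feature_groups.items():
    best_auc, best_oof = -1.0, None
    for name, model in make_candidates().items():
        # Each row is scored by a model that was fit on the other folds only.
        oof = cross_val_predict(
            model, X[:, cols], y, cv=cv, method="predict_proba"
        )[:, 1]
        auc = roc_auc_score(y, oof)
        if auc > best_auc:
            best_auc, best_oof = auc, oof
    oof_by_group[group] = best_oof

# One column of OOF probabilities per feature group: the meta-dataset.
meta_X = np.column_stack([oof_by_group[g] for g in feature_groups])
print(meta_X.shape)  # (400, 3)
```

Selecting the best learner on the same OOF scores that later feed the meta-learner introduces a mild selection effect; a stricter variant would nest the selection inside an outer cross-validation loop.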

Meta-Learning with Additive Aggregation

OOF predictions from the top-performing base models in each group are concatenated to form the meta-dataset. By employing a low-capacity additive meta-learner (default: logistic regression), STRIKE fuses groupwise signals while inherently mitigating overfitting. The aggregation reflects the weighted sum of group-specific logit estimates, adhering to the conditional independence approximation, but retaining the flexibility to absorb residual cross-group dependencies as justified by the data.
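
To make the additive aggregation concrete, the following sketch combines group-level probabilities as a weighted sum of logits followed by a sigmoid. The function names, the choice to feed logits rather than raw probabilities to the meta-learner, and the example weights are assumptions made for illustration; in practice the weights and bias would be fit by logistic regression on the OOF meta-dataset.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def additive_meta_score(group_probs, weights, bias):
    """Weighted sum of group-specific logits mapped back to a probability."""
    z = bias + sum(weights[g] * logit(p) for g, p in group_probs.items())
    return sigmoid(z)


def independence_bias(prior, n_groups):
    """Bias implied by exact conditional independence with unit weights.

    Each group logit already contains the prior log-odds once, so the
    surplus (n_groups - 1) copies are subtracted.
    """
    return -(n_groups - 1) * logit(prior)


# Two hypothetical groups that each assign an 80% default probability,
# with a balanced prior of 0.5 and unit weights.
probs = {"demographics": 0.8, "delinquency": 0.8}
weights = {g: 1.0 for g in probs}
score = additive_meta_score(probs, weights, independence_bias(0.5, len(probs)))
print(round(score, 4))  # 0.9412
```

With unit weights the two pieces of evidence reinforce each other (odds of 4 times 4 gives 16 to 1). A fitted meta-learner that assigns weights below 1 would discount groups whose signals overlap, which is one way residual cross-group dependence can be absorbed.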

Empirical and Theoretical Justification

The theoretical basis for STRIKE is the additive decomposition of the Bayes-optimal logit under groupwise conditional independence. Empirically, the conditional mutual information between groups (given the label) is shown to be low, which supports the practical validity of this assumption for real-world credit datasets. The additivity constraint acts as a beneficial inductive bias: STRIKE avoids the spurious interactions prevalent in high-dimensional, noisy tabular data and delivers improved robustness compared to monolithic or interaction-heavy models.
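
The additive decomposition can be sketched as follows, assuming exact conditional independence of the groups given the label. The notation is introduced here for illustration and is not taken from the paper: $x_g$ denotes the features of group $g$, $G$ the number of groups, and $\pi = P(y=1)$ the prior default rate.

```latex
\begin{aligned}
\log\frac{P(y=1\mid x)}{P(y=0\mid x)}
  &= \log\frac{\pi}{1-\pi}
     + \sum_{g=1}^{G}\log\frac{p(x_g\mid y=1)}{p(x_g\mid y=0)} \\
  &= \sum_{g=1}^{G}\log\frac{P(y=1\mid x_g)}{P(y=0\mid x_g)}
     - (G-1)\,\log\frac{\pi}{1-\pi}
\end{aligned}
```

The first line applies Bayes' rule and factorizes the class-conditional likelihood over groups. The second uses the fact that each group-level logit equals the prior log-odds plus that group's log-likelihood ratio. Under this idealization the optimal combiner is a sum of group logits with unit weights and a constant offset, which is the functional form a logistic regression meta-learner can represent.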

Experimental Evaluation

Benchmarks and Datasets

STRIKE is evaluated on three benchmark datasets representing diverse credit risk scenarios:

Polish Bankruptcy: Corporate bankruptcy prediction, high class imbalance.
LendingClub: Peer-to-peer lending risk, moderate imbalance and heterogeneous features.
HomeCredit: Large-scale consumer credit data, extreme imbalance, significant sparsity and noise.

Uniform preprocessing is applied, with each dataset’s features partitioned according to meaningful financial groupings.

Comparative Results

STRIKE consistently yields superior AUC-ROC performance across all datasets, outperforming strong baselines, including XGBoost, LightGBM, GBDT, and deep neural architectures like DeepFM, DCN-V2, TabNet, and SR1D-CNN. Notably, STRIKE exceeds SR1D-CNN by over 15 percentage points on the Polish dataset and delivers 0.7661 AUC-ROC on HomeCredit, well above traditional and deep learning benchmarks. These results are obtained without hyperparameter tuning, emphasizing the framework’s architectural advantage rather than reliance on model fine-tuning.

STRIKE also outperforms conventional stacking methods that aggregate models trained on the full feature space, with an AUC improvement of roughly 3.4%, corroborating the efficacy of the explicit feature-group decomposition strategy.
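
For readers less familiar with the metric behind these comparisons, the sketch below computes AUC-ROC directly from its pairwise definition: the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counted as one half. The function name and toy data are illustrative assumptions, and the quadratic loop is meant for exposition rather than large datasets.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair 0.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(labels, scores))  # 0.75
```

Because the metric depends only on the ranking of scores, it is unaffected by monotone rescaling of the predictions, such as moving between probabilities and logits.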

Ablation and Sensitivity Analyses

The benefit of structured feature grouping is shown to be robust to how the groups are built: manually defined and unsupervised (mutual information or correlation-based) groupings yield similar results, whereas random grouping degrades performance below that of monolithic learners. This indicates that STRIKE's gains derive from capturing coherent signal structure, not merely from increased ensembling or model complexity.
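
One simple way an unsupervised, correlation-based grouping could be built is sketched below. The paper does not specify its clustering procedure here, so the technique shown (linking features whose absolute Pearson correlation exceeds a threshold and taking connected components, i.e. single-linkage grouping), the threshold value, and the synthetic data are all assumptions.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_groups(columns, threshold=0.5):
    """Group feature indices into connected components of the graph that
    links two features when |correlation| >= threshold (union-find)."""
    n = len(columns)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Synthetic features: columns 0-1 share one latent factor, columns 2-3 another.
rng = random.Random(0)
n_rows = 500
z1 = [rng.gauss(0, 1) for _ in range(n_rows)]
z2 = [rng.gauss(0, 1) for _ in range(n_rows)]


def noisy(z):
    return [v + 0.3 * rng.gauss(0, 1) for v in z]


columns = [noisy(z1), noisy(z1), noisy(z2), noisy(z2)]
groups = correlation_groups(columns, threshold=0.5)
print(groups)  # [[0, 1], [2, 3]]
```

A random partition of the same four columns would often split each correlated pair across groups, which gives some intuition for why random grouping in the ablation discards the coherent structure that the base learners are meant to exploit.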

Alternative meta-learners (GAM, EBM) provide modest improvements in AUC, but logistic regression meta-learning suffices to surpass all monolithic baselines, indicating that additive aggregation is particularly well matched to the credit risk context.

Implications and Future Directions

The STRIKE framework targets major practical and theoretical requirements in credit modeling: robustness to noise and feature redundancy, scalability to high-dimensionality, and transparent attribution of predictive signals. By structurally isolating semantically coherent sources of information and leveraging targeted base models, STRIKE addresses key failure modes of conventional tabular deep learning and ensemble aggregation—especially those highlighted by recent literature criticizing the deployment of convolutional architectures on non-grid, non-spatial tabular data.

From a regulatory perspective, STRIKE’s modular predictions facilitate traceability and post-hoc analysis, meeting expectations for model transparency in compliance settings (e.g., Basel III). For operational credit systems, improved accuracy and interpretability translate into tangible risk mitigation in lending decisions and portfolio management.

The methodology of feature-grouped stacking introduced here is broadly applicable and suggests further avenues for exploration:

Automated, data-driven group discovery that is adaptive to evolving data distributions.
Integration of sparse interaction models at the meta-learning stage to better capture cross-group dependencies while preserving interpretability.
Extension to other domains characterized by heterogeneous, high-dimensional tabular data (e.g., healthcare risk, fraud detection).

Conclusion

STRIKE constitutes a significant advancement in credit default prediction for structured data, introducing a feature-group-aware additive stacking framework with robust empirical performance and strong theoretical motivation (2604.17622). By reconciling modular specialization and controlled aggregation, STRIKE establishes a compelling methodological template for tackling predictive challenges inherent in noisy, heterogeneous tabular domains.
