Instance-dependent cost-sensitive learning for detecting transfer fraud
From MaRDI portal
Abstract: Card transaction fraud is a growing problem affecting card holders worldwide. Financial institutions increasingly rely upon data-driven methods for developing fraud detection systems, which are able to automatically detect and block fraudulent transactions. From a machine learning perspective, the task of detecting fraudulent transactions is a binary classification problem. Classification models are commonly trained and evaluated in terms of statistical performance measures, such as likelihood and AUC, respectively. These measures, however, do not take into account the actual business objective, which is to minimize the financial losses due to fraud. Fraud detection is to be acknowledged as an instance-dependent cost-sensitive classification problem, in which the costs due to misclassification vary between instances and which requires adapted approaches for learning a classification model. In this article, an instance-dependent threshold is derived, based on the instance-dependent cost matrix for transfer fraud detection, that allows for making the optimal cost-based decision for each transaction. Two novel classifiers are presented, based on lasso-regularized logistic regression and gradient tree boosting, which directly minimize the proposed instance-dependent cost measure when learning a classification model. The proposed methods are implemented in the R packages cslogit and csboost, and compared against state-of-the-art methods on a publicly available data set from the machine learning competition website Kaggle and a proprietary card transaction data set. The results of the experiments highlight the potential of reducing fraud losses by adopting the proposed methods.
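The idea of an instance-dependent threshold can be illustrated with the standard minimum-expected-cost decision rule: flag a transaction when the expected cost of flagging is no larger than the expected cost of letting it pass. The Python sketch below is not taken from the article or from the cslogit and csboost packages. The function names, and the example cost matrix (a fixed handling cost for every flagged transaction, the transferred amount for a missed fraud, zero otherwise), are assumptions made for illustration only, since the abstract does not state the cost matrix.

```python
def cost_threshold(c_tp, c_fp, c_fn, c_tn):
    """Fraud probability at or above which flagging has the lower expected cost.

    Flagging costs   p * c_tp + (1 - p) * c_fp  in expectation,
    not flagging     p * c_fn + (1 - p) * c_tn.
    Equating the two and solving for p gives the threshold returned here.
    """
    denom = (c_fp - c_tn) + (c_fn - c_tp)
    if denom <= 0:
        raise ValueError("cost matrix does not yield a well-defined threshold")
    return (c_fp - c_tn) / denom


def flag_transaction(p_fraud, amount, fixed_cost):
    """Decide for one transaction under the ASSUMED illustrative cost matrix.

    Assumption (not from the abstract): every flagged transaction costs
    `fixed_cost`, a missed fraud costs `amount`, a passed legitimate
    transaction costs nothing.
    """
    threshold = cost_threshold(
        c_tp=fixed_cost, c_fp=fixed_cost, c_fn=amount, c_tn=0.0
    )
    return p_fraud >= threshold


if __name__ == "__main__":
    # Same predicted fraud probability, different amounts, different decisions.
    print(flag_transaction(p_fraud=0.05, amount=1000.0, fixed_cost=10.0))
    print(flag_transaction(p_fraud=0.05, amount=20.0, fixed_cost=10.0))
```

Under these assumed costs the threshold reduces to the fixed cost divided by the amount, so a large transfer is flagged at a much lower predicted probability than a small one, which is the behaviour a single global threshold cannot reproduce.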
Cites work
- scientific article; zbMATH DE number 4055377
- scientific article; zbMATH DE number 845714
- scientific article; zbMATH DE number 1391397
- scientific article; zbMATH DE number 6438182
- A new look at the statistical model identification
- Additive logistic regression: a statistical view of boosting. (With discussion and a rejoinder by the authors)
- Algorithm 733: TOMP–Fortran modules for optimal control calculations
- Applied logistic regression
- Development and application of consumer credit scoring models using profit-based classification measures
- Estimating the dimension of a model
- Greedy function approximation: A gradient boosting machine
- Machine learning. A probabilistic perspective
- Optimal auditing with scoring: theory and application to insurance fraud
- Profit driven decision trees for churn prediction
- Statistical comparisons of classifiers over multiple data sets
- Using neural network rule extraction and decision tables for credit-risk evaluation
Cited in (7)
- To do or not to do? Cost-sensitive causal classification with individual treatment effect estimates
- Claims fraud detection with uncertain labels
- Explainable AI for operational research: a defining framework, methods, applications, and a research agenda
- Cost-sensitive thresholding over a two-dimensional decision region for fraud detection
- Off-the-peg and bespoke classifiers for fraud detection
- Robust instance-dependent cost-sensitive classification
- B2Boost: instance-dependent profit-driven modelling of B2B churn