Credit-g Dataset

algorithms, we choose seventeen datasets (Breast-cancer, Breast-w, Car, Credit-g, Cylinder-bands, Diabetes, Heart-c, Heart-statlog, Hepatitis, Ionosphere, Labor, Molecular, Sick, Sonar, Spect, Spectf, and Tic-tac-toe) from the UCI Machine Learning Repository [1]. Since misclassification costs are not available for the.

Jul 12, 2016. Analyze Credit Risk with Spark Machine Learning Scenario. Our data is from the German Credit Data Set, which classifies people described by a set of attributes as good or bad credit risks. For each bank loan application, we have the following information: The German credit CSV file has the following format:

We apply these techniques to the German credit data using an 80:20 training:test split, and compare the performance of the models fitted using the three classification techniques. The probability of default p_i for each observation i in the test set is calculated using the models fitted on the training dataset. Each test set sample i.
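The split-then-score procedure described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the applicants are synthetic (two made-up numeric features, not the actual German credit attributes), logistic regression stands in for the unnamed "three classification techniques", and the helper names (`train_test_split`, `fit_logistic`, `predict_proba`, `make_synthetic_applicants`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_applicants(n=200, seed=1):
    """Generate synthetic (features, default_flag) rows.

    Assumption: two standard-normal features and an arbitrary true model;
    this is NOT the German credit data.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p_default = sigmoid(1.5 * x[0] - 1.0 * x[1])
        y = 1 if rng.random() < p_default else 0
        rows.append((x, y))
    return rows


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and split it 80:20 by default."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def predict_proba(weights, x):
    """Return the modelled probability of default p_i for one observation."""
    # weights[-1] is the intercept; zip stops at the last feature weight.
    z = weights[-1] + sum(w * v for w, v in zip(weights, x))
    return sigmoid(z)


def fit_logistic(train, n_features, epochs=200, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        for x, y in train:
            err = predict_proba(weights, x) - y
            for j in range(n_features):
                weights[j] -= lr * err * x[j]
            weights[-1] -= lr * err
    return weights


rows = make_synthetic_applicants()
train, test = train_test_split(rows)           # 80% to fit, 20% held out
weights = fit_logistic(train, n_features=2)    # fitted on training rows only
test_probs = [predict_proba(weights, x) for x, _ in test]  # p_i per test row
```

The model never sees the held-out rows during fitting, so `test_probs` holds one out-of-sample probability of default per test observation. Comparing several techniques would mean repeating the fit and scoring steps on the same split for each model.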

with two publicly available credit datasets as the study samples confirms the superiority of the proposed. Therefore, a cutoff suitable for one particular credit dataset might not be appropriate.

