Why discreet variables?

Adidas · 07-12-2017

Why do credit models use discreet variables rather than continuous variables?

For example, <9% util vs. <30% util, instead of 3%, 4%, 5% ...

FICO 08:743 EQ Bankcard from Citi, 764 EX from AmEx, 747 TU from Disc all updated 8/2017

Discover It $8,600 Since 08/2014 // AmEx BCE $23,100 Since 10/2015 // Citi DC $7,000 Since 06/2016 // BoA BBR $1,800 Since 11/2016 // US Bank Cash+ $3,000 Since 11/2016

Anonymous · 07-12-2017

I think you mean discrete rather than discreet.

At the low end of the utilization spectrum it's likely because there actually isn't a genuine risk difference. I.e. a person who is reporting a 1% utilization is truly at no less risk than someone with a 7% utilization. Where it becomes a good question is why there should be big gaps after a scoring penalty begins. E.g. why is there no gradually increasing penalty between 9% and 29% (or wherever the breakpoints are).

A question you don't ask (but which could just as well be raised) is why risk should be correlated with base 10 arithmetic. Why should risk become significant at around 10%? Why not 7% or 12%? Why should so many breakpoints fall around changes in base 10 notation (10%, 30%, 50%, etc.)? After all, our use of base 10 is a kind of biological accident linked to us having 10 fingers.

My guess is because the FICO developers are people who think in base 10, and they put in breakpoints because it is easier to create the models that way. Reality may be closer to a continuum (and likewise unrelated to base 10 notation) but it was easier to make the models with discrete breakpoints and the developers tended to think in tens.
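To make the idea of discrete breakpoints concrete, here is a minimal sketch of a binned utilization penalty. The breakpoints (9%, 30%, 50%) and the point values are assumptions chosen only to echo the figures mentioned in this thread; they are not FICO's actual values, and `binned_penalty` is a hypothetical name.

```python
from bisect import bisect_right

# Hypothetical bin edges (percent) and penalties (points).
# Illustrative only: these are NOT FICO's real breakpoints or point values.
BREAKPOINTS = [9, 30, 50]
PENALTIES = [0, 15, 30, 50]  # one penalty per bin, so one more than the edges


def binned_penalty(utilization_pct: float) -> int:
    """Return the penalty of the bin that contains utilization_pct.

    Values below 9 land in the first bin, 9 up to (not including) 30 in
    the second, and so on.
    """
    return PENALTIES[bisect_right(BREAKPOINTS, utilization_pct)]


# Inside a bin nothing changes; crossing an edge changes the result at once.
print(binned_penalty(1), binned_penalty(7))    # same bin
print(binned_penalty(12), binned_penalty(13))  # same bin
print(binned_penalty(8.9), binned_penalty(9))  # different bins
```

With a step function like this, 1% and 7% are treated identically, and so are 12% and 13%, while a move from 8.9% to 9% triggers the whole penalty for the next bin.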

Just some guesswork on my part. I don't know any of the FICO developers so I can't say for sure, of course. Happy to hear from someone who knows more.

NRB525 · 07-12-2017

Programming the software algorithm requires having some defined output (action) for every possible combination of input variables. With all the dimensional scales credit scoring has to deal with, any area where they can reasonably consolidate factors, such as ranges of utilization percentages, reduces programming complexity.

If all else in the profile is the same, is there any change in risk of default for someone going from 12% utilization to 13% utilization? No.

High Bal Jan 2009 $116k on $146k limits 80% Util.
Oct 2014 $46k on $127k 36% util EQ 722 TU 727 EX 727
April 2018 $18k on $344k 5% util EQ 806 TU 810 EX 812
Jan 2019 $7.6k on $360k EQ 832 TU 839 EX 831
March 2021 $33k on $312k EQ 796 TU 798 EX 801
May 2021 Paid all Installments and Mortgages, one new Mortgage EQ 761 TU 774 EX 777
April 2022 EQ=811 TU=807 EX=805 - TU VS 3.0 765

Anonymous · 07-13-2017

@NRB525 wrote:

If all else in the profile is the same, is there any change in risk of default for someone going from 12% utilization to 13% utilization? No.

Probably not, but in going from 11% to 27%, for example, there very well could be. Or, even in CGID's example above of 1% to 7%, there could be a significant change. What these percentages don't take into account, and perhaps something that wasn't considered in the algorithm, is monster limits, IMO.

Someone with $50k annual income who has credit limits of $100k would be carrying CC debt of $11,000 versus $27,000 in the first example. Would there be any greater risk of default when considering those 2 numbers? For someone who pushed their limits to $200k, double those numbers to $22,000 in debt versus $54,000.
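The arithmetic behind those figures can be checked with a short sketch. The helper name `card_debt` is hypothetical; the limits, percentages, and income are the ones given in the post.

```python
def card_debt(total_limit: int, utilization_pct: int) -> int:
    """Dollar balance implied by a utilization percentage on a total limit."""
    return total_limit * utilization_pct // 100


ANNUAL_INCOME = 50_000  # income figure from the example above

for total_limit in (100_000, 200_000):
    low = card_debt(total_limit, 11)
    high = card_debt(total_limit, 27)
    # Also show the higher balance relative to income, which the
    # utilization percentage alone does not capture.
    print(
        f"${total_limit:,} limits: 11% = ${low:,}, 27% = ${high:,} "
        f"({high / ANNUAL_INCOME:.0%} of income)"
    )
```

The same 11% to 27% move represents very different dollar amounts depending on the total limit, which is the point being made about large limits.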

In CGID's example, consider $1000 in debt versus $7000. I think most people on an average income could plan to nix that $1000 pretty easily. The $7000 could be a bit more difficult and take a year or so to eliminate, depending on income/expenses. To me, by definition, that makes the $7000 debt more of a risk: with the greater time to pay it off, more variables could be introduced (loss of job, medical issue, etc.) to hinder the ability TO pay it off.

Adidas · 07-13-2017

Oops yes I meant discrete.

Regarding the comments about it being easier to program or that developers think in base 10 so therefore they used base 10, I find it hard to believe they wouldn't make their models as accurate as possible. If this score is going to be used by large institutions to make lending decisions that help them earn a profit then it's worth their time to focus on the details to get as accurate as possible. Also it wouldn't be that hard to write an equation to relate different factors with the final score so the idea that they need to program in individual inputs <-> output associations doesn't make sense either.

I agree the difference in 12% vs 13% probably isn't big but that's why they could program it to have a very small difference, say just a few points.
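As a rough illustration of what a continuous alternative might look like, here is a sketch that interpolates linearly between a few anchor points, so that 12% and 13% differ by only a small amount. The anchor points and the name `continuous_penalty` are assumptions made up for this example, not anything known about how FICO models work.

```python
# Hypothetical anchor points: (utilization %, penalty in points).
# Illustrative only; not taken from any real scoring model.
ANCHORS = [(0, 0), (9, 0), (30, 20), (50, 40), (100, 90)]


def continuous_penalty(utilization_pct: float) -> float:
    """Piecewise-linear penalty: interpolate between neighbouring anchors."""
    u = min(max(utilization_pct, 0.0), 100.0)  # clamp to the 0..100 range
    for (x0, y0), (x1, y1) in zip(ANCHORS, ANCHORS[1:]):
        if u <= x1:
            return y0 + (y1 - y0) * (u - x0) / (x1 - x0)
    return float(ANCHORS[-1][1])


# Moving from 12% to 13% now costs a fraction of a point rather than
# nothing (same bin) or a whole step (bin edge crossed).
print(continuous_penalty(13) - continuous_penalty(12))
```

Whether a real model would gain accuracy from this is exactly the open question in the thread; the sketch only shows that the programming itself is not the obstacle.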

One thought I had is that perhaps the points for each factor aren't designed to simply add up to your score. In other words, you can't add up how many points you get from util + AAOA + AOA - baddies = FICO, because maybe having 1 baddie makes a 1% util difference less important. I realize having separate scorecards accomplishes this a little bit too.

Anonymous · 07-13-2017

Yes indeed. Having different scorecards does accomplish this. And furthermore, it's quite possible that different scorecards have different breakpoints, depending on the factor.
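A minimal sketch of that idea, assuming just two scorecards selected by whether the file has any derogatory items. The scorecard names, the selection rule, the breakpoints, and the point values are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical scorecards, each with its own utilization bin edges (percent)
# and penalties (points). None of these values come from a real model.
SCORECARDS = {
    "clean": {"breakpoints": [9, 30, 50], "penalties": [0, 15, 30, 50]},
    "derogatory": {"breakpoints": [30, 70], "penalties": [0, 10, 25]},
}


def pick_scorecard(derogatory_count: int) -> str:
    """Assumed selection rule: any derogatory item moves the file."""
    return "derogatory" if derogatory_count > 0 else "clean"


def utilization_penalty(utilization_pct: float, derogatory_count: int) -> int:
    """Look up the utilization penalty on the scorecard the file falls into."""
    card = SCORECARDS[pick_scorecard(derogatory_count)]
    return card["penalties"][bisect_right(card["breakpoints"], utilization_pct)]


# The same move from 8% to 12% matters on one scorecard and not on the other.
for baddies in (0, 1):
    print(baddies, utilization_penalty(8, baddies), utilization_penalty(12, baddies))
```

Under these assumed numbers, crossing 9% costs points on the clean scorecard but nothing on the derogatory one, which is one way a factor's effect can depend on the rest of the profile.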

Regarding the base 10 issue, it seems like the testers here have vast numbers of test cases showing that the model developers often create breakpoints (for many factors) that mirror base 10 arithmetic. The likelihood that actual risk happens to mirror base 10 arithmetic seems to me a priori immensely improbable. If a physicist submitted a paper with a model that expected the moons of Neptune to exhibit motion that just happened to mirror base 10, that would be a red flag that his own arbitrary system of notation had interfered with his model building.

The simplest explanation I can think of is that the FICO model builders think in base 10, and they created models that were "good enough" back in the day based on that. Remember too that when a new model is released, it (in practice) needs to track somewhat with previous models for reasons of backward compatibility. FICO has learned in a hard school that when they completely rework their models, virtually none of their customers adopt them. This happened with FICO NextGen in 2001.