
Banking Case Study Example 3: Logistic Regression

2020-07-22 08:37 Author: python風控模型


Reference: http://ucanalytics.com/blogs/case-study-example-banking-logistic-regression-3/

The Beautiful Formula


The Beautiful Formula – by Roopam

Mathematicians often hold competitions for the most beautiful formula of all. First place, almost every time, goes to the formula discovered by Leonhard Euler, displayed below.

e^{i\pi}+1=0

This formula is phenomenal because it combines the five most important constants in mathematics, i.e.

0: Additive identity
1: Multiplicative identity
π: King of geometry and trigonometry
i: King of complex algebra
e: King of logarithms

It is just beautiful how such a simple equation links these fundamental constants in mathematics. I was mesmerized when I learned this formula of Euler's in high school and still am. Euler is also responsible for coining the symbol e (our king of logarithms), which is also known as Euler's number. The name is an apt choice for another reason: Euler is considered the most prolific mathematician of all time. He used to produce novel mathematics at an exponential rate. This is particularly startling since Euler was partially blind for more than half his life and completely blind for around the last two decades of it. Incidentally, he was producing a high-quality scientific paper a week for a significant period when he was completely blind.

Today, before we discuss logistic regression, we must pay tribute to the great man, Leonhard Euler, as Euler's number (e) forms the core of logistic regression.
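For readers who like to see the formula in numbers, here is a minimal sketch that evaluates e^{iπ} + 1 with Python's standard library. The variable name `residual` is only an illustrative choice, and the result is a floating-point approximation rather than an exact zero.

```python
import cmath
import math

# Evaluate e^{i*pi} + 1 in floating-point complex arithmetic.
residual = cmath.exp(1j * math.pi) + 1

# The result is not exactly zero because pi is only stored approximately,
# but its magnitude is at the level of rounding error.
print(abs(residual))
```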


Case Study Example – Banking

In our last two articles (Part 1) & (Part 2), you were playing the role of the Chief Risk Officer (CRO) for CyndiCat bank. The bank had disbursed 60816 auto loans in the quarter between April and June 2012. Additionally, you had noticed a bad rate of around 2.5%. You did some exploratory data analysis (EDA) using data visualization tools and found a relationship between bad rates and both age (Part 1) and FOIR (Part 2). Now, you want to create a simple logistic regression model with just age as the variable. If you recall, you observed the following normalized histogram for age overlaid with bad rates.

We shall use this plot for creating the coarse classes to run a simple logistic regression. However, the idea over here is to learn the nuances of logistic regression. Hence, let us first go through some basic concepts in logistic regression.


Logistic regression

In a previous article (Logistic Regression), we discussed some aspects of logistic regression. Let me reuse a picture from that article. I would recommend that you read it, as it will help you understand some of the concepts mentioned here.


Logistic Regression

In our case, Z is a function of age, and we define the probability of a bad loan as follows.



P(Bad Loan)=\frac{e^{Z}}{1+e^{Z}}=\frac{e^{\beta \times Age+Constant}}{1+e^{\beta \times Age+Constant}}=\frac{Odds}{1+Odds}

You must have noticed the impact of Euler's number on logistic regression. The probability of a bad loan, P(Bad Loan), approaches 0 as Z → –∞ and 1 as Z → +∞. This keeps the probability within the bounds of 0 and 1 at either extreme.
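The bounded behaviour described above can be checked numerically. The following is a minimal sketch, not part of the original article; the function name `p_bad` and the sample Z values are illustrative assumptions.

```python
import math


def p_bad(z: float) -> float:
    """Logistic function e^z / (1 + e^z), arranged to avoid overflow."""
    if z >= 0:
        # Equivalent form 1 / (1 + e^-z) keeps the exponent non-positive.
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative Z values: the probability stays between 0 and 1 throughout.
for z in (-50.0, -5.0, 0.0, 5.0, 50.0):
    print(f"Z = {z:6.1f}  ->  P(Bad Loan) = {p_bad(z):.6f}")
```

At Z = 0 the probability is exactly 0.5, and it moves toward 0 or 1 as Z becomes strongly negative or positive.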

Additionally, we know that the probability of a good loan is one minus the probability of a bad loan, hence:

P(Good Loan)=1-P(Bad Loan)=\frac{1}{1+e^{\beta \times Age+Constant}}

If you have ever indulged in betting of any sort, you will know that bets are placed in terms of odds. Mathematically, odds are defined as the probability of winning divided by the probability of losing. If we calculate the odds for our problem, we get the following equation.



\frac{P(Bad Loan)}{P(Good Loan)}={e^{\beta \times Age+Constant}}

Here Euler's number stands out in all its majesty.
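The relationship between probability and odds can be verified with a few lines of code. This is a minimal sketch; the helper names and the value of Z are illustrative assumptions, not figures from the case study.

```python
import math


def odds_from_probability(p: float) -> float:
    """Odds = P(bad) / P(good) = p / (1 - p)."""
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Inverse mapping: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


z = -3.0  # illustrative value of Z, not taken from the case study
p = math.exp(z) / (1.0 + math.exp(z))

# Dividing P(Bad Loan) by P(Good Loan) cancels the denominator and leaves e^Z.
print(odds_from_probability(p), math.exp(z))
```

The two printed numbers agree up to rounding error, which is the cancellation that the odds equation relies on.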


Coarse Classing

Now, let us create coarse classes for the age groups from the data set we saw in the first article of this series. Coarse classes are formed by combining groups that have similar bad rates while maintaining the overall trend in bad rates. We have done this for the age groups as shown below.



Coarse Classing

Table 1 – Coarse Class

We will use the above four coarse classes to run our logistic regression algorithm. As discussed in the earlier article, the algorithm estimates the coefficients that define Z. In our case, Z is a linear combination of the age-group indicators, i.e. Z = β1×G1 + β2×G2 + β3×G3 + Constant. You must have noticed that we have not used G4 in this equation. This is because the constant absorbs the information for G4, which makes G4 the baseline group. This is similar to using dummy variables in linear regression. If you want to learn more about this, you could post your questions on this blog and we can discuss it further.
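The dummy-variable idea can be sketched in code. The coefficient values below are hypothetical placeholders chosen only to show the mechanics, and the helper names `dummy_encode` and `z_value` are assumptions rather than anything from the original article.

```python
GROUPS = ("G1", "G2", "G3", "G4")
BASELINE = "G4"  # the group whose information is absorbed by the constant


def dummy_encode(group: str) -> dict:
    """Return 0/1 indicators for every group except the baseline."""
    if group not in GROUPS:
        raise ValueError(f"unknown group: {group}")
    return {g: int(g == group) for g in GROUPS if g != BASELINE}


def z_value(group: str, betas: dict, constant: float) -> float:
    """Z = beta1*G1 + beta2*G2 + beta3*G3 + Constant."""
    dummies = dummy_encode(group)
    return constant + sum(betas[g] * dummies[g] for g in dummies)


# Hypothetical coefficients, for illustration only.
example_betas = {"G1": 1.0, "G2": 0.5, "G3": 0.25}
example_constant = -4.0

for g in GROUPS:
    print(g, dummy_encode(g), z_value(g, example_betas, example_constant))
```

For the baseline group all indicators are 0, so Z reduces to the constant alone; for any other group Z is the constant plus that group's coefficient.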


Logistic Regression

Now, we are all set to generate our final logistic regression through a statistical program for the following equation.



\frac{P(Bad Loan)}{P(Good Loan)}=e^{\beta _{1}\times G_{1}+\beta _{2}\times G_{2}+\beta _{3}\times G_{3}+Constant}

You could use either commercial software (SAS, SPSS or Minitab) or open source software (R) for this purpose. They will all generate a table similar to the one shown below:


Let us quickly decipher this table and understand how the coefficients are estimated. Look at the last column in the table, i.e. Odds Ratio. How did the software arrive at the value of 3.07 for G1? The odds (bad loans/good loans) for G1 are 206/4615 = 4.46% (refer to Table 1 – Coarse Class above). Additionally, the odds for G4 (the baseline group) are 183/12605 = 1.45%. The odds ratio is the ratio of these two numbers: 4.46%/1.45% = 3.07. Now, take the natural log of 3.07, i.e. ln(3.07) = 1.123; this is our coefficient for G1. Similarly, you could find the coefficients for G2 and G3 as well. Try it with your calculator!
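The calculator steps above can be reproduced as a short script. It uses only the G1 and G4 counts quoted in the text; the variable names are illustrative, and treating the constant as the log of the baseline odds is an assumption that holds when every group has its own indicator, as in this model.

```python
import math

# Counts quoted in the text from Table 1.
bad_g1, good_g1 = 206, 4615     # G1
bad_g4, good_g4 = 183, 12605    # G4, the baseline group

odds_g1 = bad_g1 / good_g1
odds_g4 = bad_g4 / good_g4
odds_ratio = odds_g1 / odds_g4

beta_g1 = math.log(odds_ratio)   # coefficient for G1
constant = math.log(odds_g4)     # log-odds of the baseline group

print(f"odds G1    = {odds_g1:.2%}")
print(f"odds G4    = {odds_g4:.2%}")
print(f"odds ratio = {odds_ratio:.2f}")
print(f"beta G1    = {beta_g1:.3f}")
print(f"constant   = {constant:.3f}")
```

Working with unrounded odds avoids the small discrepancies that appear when the rounded percentages are divided by hand.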

These coefficients are the β values to our original equation and hence the equation will look like the following


A rough calculator check with rounded odds (in percent), presumably G2 and G3 against the G4 baseline:

3.5/1.4 = 2.5

2.4/1.4 ≈ 1.71



\frac{P(Bad Loan)}{P(Good Loan)}=e^{1.123\times G_{1}+0.909\times G_{2}+0.508\times G_{3}-4.232}
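As a cross-check on the equation above, the sketch below fits a logistic regression by Newton-Raphson on grouped counts. It is a simplified illustration, not the procedure of any particular statistical package: only G1 and G4 are included because only their counts are quoted in the text, and the function name `fit_logistic` is an assumption. Since each group has its own indicator, leaving out G2 and G3 should not change the G1 coefficient or the constant.

```python
import math

# (G1 indicator, bad loans, good loans), using counts quoted in the text.
groups = [
    (1, 206, 4615),    # G1
    (0, 183, 12605),   # G4 (baseline)
]


def fit_logistic(data, iterations=25):
    """Newton-Raphson for Z = constant + beta * G1 on grouped data."""
    constant, beta = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, bad, good in data:
            n = bad + good
            p = 1.0 / (1.0 + math.exp(-(constant + beta * x)))
            resid = bad - n * p          # gradient contribution
            w = n * p * (1.0 - p)        # curvature weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 Newton system by Cramer's rule.
        det = h00 * h11 - h01 * h01
        constant += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return constant, beta


constant, beta_g1 = fit_logistic(groups)
print(f"constant = {constant:.3f}, beta G1 = {beta_g1:.3f}")
```

The fitted values agree with the log-odds arithmetic, which is why the coefficients in this simple model can be recovered with a calculator.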

Remember, G1, G2 and G3 can only take values of either 0 or 1. Additionally, since they are mutually exclusive, when one of them is 1 the others automatically become 0. If you make G1 = 1, the equation takes the following form.

\frac{P(Bad Loan)}{P(Good Loan)}=e^{1.123\times 1+0.909\times 0+0.508\times 0-4.232}=e^{-3.109}\approx 0.0446

Similarly, we can find the estimated bad rate for G1.

P(Bad Loan)=\frac{0.0446}{1+0.0446}\approx 0.0427=4.27\%

This is precisely the value we have observed: 206 bad loans out of 206 + 4615 = 4821 loans in G1, i.e. 4.27%. Hence, the logistic regression does a good job of estimating the bad rate. Great! We have just created our first model.
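The same substitution can be carried out for every group. The sketch below plugs the published coefficients into the fitted equation and compares the G1 estimate with the observed bad rate; the function name `predicted_bad_rate` is an illustrative assumption, and the G4 entry of 0.0 reflects its role as the baseline.

```python
import math

# Coefficients from the fitted equation; G4 is the baseline, so it adds nothing.
BETAS = {"G1": 1.123, "G2": 0.909, "G3": 0.508, "G4": 0.0}
CONSTANT = -4.232


def predicted_bad_rate(group: str) -> float:
    """Convert Z for a group into odds, then into a probability."""
    z = CONSTANT + BETAS[group]
    odds = math.exp(z)
    return odds / (1.0 + odds)


observed_g1 = 206 / (206 + 4615)   # bad loans / all loans in G1

for g in BETAS:
    print(f"{g}: estimated bad rate = {predicted_bad_rate(g):.2%}")
print(f"G1 observed bad rate = {observed_g1:.2%}")
```

The small remaining difference between the estimated and observed G1 rates comes from the coefficients being rounded to three decimals.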

Sign-off Note

Euler, though blind, showed us the way to come so far! Let me also reveal some more facts about the most beautiful formulae we discussed at the beginning of this article. In the top five places, you will find two more formulae discovered by Leonhard Euler. That is 3 out of the 5 most beautiful formulae. Wow! I guess we need to redefine blind.

To learn more about Leonhard Euler, watch the following YouTube video by William Dunham (Video).


