|
Post by Radek on Oct 19, 2014 18:28:39 GMT
Dear All, I see strange results when analyzing LoadData.xlsx (downloaded on 19th of October 2014). I have converted the file into csv and imported it into sqlite database for easier processing. I verified the import by cross checking couple of parameters, everything seems to be fine. I than executed couple of queries focusing on two columns: InterestAndPenaltiesPaid and EAD1. My logic being quite simple: In order to make profit, paid interests + fees needs to be bigger than principal lost during the default. I used following query: SELECT Country, SUM(InterestAndPenaltiesPaid), SUM(EAD1)
FROM LoanData
WHERE CreditDecision = 1 AND WasFunded = 1 AND x60D_FromFirstPayment = 1
GROUP BY Country; and I got following data out Country
| SUM(InterestAndPenaltiesPaid)
| SUM(EAD1)
| EE | 2013696
| 1207513
| ES | 204727
| 557113
| FI | 334004
| 379173
| SK | 18318
| 49932
|
What strike me the most is that everywhere with exception of Estonia, the EAD1(outstanding principal at default) is higher than InterestAndPenaltiesPaid. That means that without recovery person would be actually losing money by investing into anywhere outside of Estonia. Additionally I noticed something strange about InterestAndPenaltiesPaid. If I sum them up I got to the number 2570745 (=2.57M) - this is nowhere close to number Interest paid out (€) - total from www.bondora.ee/en/invest/statistics/loans, which is currently at 3567493 (=3.56M). I know that I am filtering out all loans which are not yet 60days old, but even with them it would be 2699810 (=2.70M), which is still nowhere close. Just to be sure, I checked directly in Excel and same number there: 2699810 (=2.70M) Questions- Q1: Do you see something fundamentally wrong with my logic of comparing InterestAndPenaltiesPaid with EAD1?
- Q2: Have you find similar observations regarding the countries - does really only investing in Estonia makes sense?
- Q3: Do you have any explanation regarding the wrongly calculated sum between dataset and Bondora statistics page?
Thanks, Radek
|
|
JamesFrance
Member of DD Central
Port Grimaud 1974
Posts: 1,323
Likes: 897
|
Post by JamesFrance on Oct 20, 2014 9:17:25 GMT
That is a very interesting analysis and I will try to answer based on my own portfolio. Q1: Do you see something fundamentally wrong with my logic of comparing InterestAndPenaltiesPaid with EAD1? Yes I do, most defaults occur during the early stages of repayment, so most of the loans over 6 months old will continue to earn interest until they are fully repaid. The non Estonian loans have not had several years of repayments so comparisons are difficult. The volume of loans has grown considerably this year, so defaulted amounts are high compared with interest from older loan. Q2: Have you find similar observations regarding the countries - does really only investing in Estonia makes sense? Yes the default rates seem to be lower in Estonia but Finland is improving recently. Everything probably depends on recovery rates and the non Estonian loans are too recent for that to have any noticeable effect. They have a good history of recovery but that is in Estonia, as you don't see much recovered in the first year after default. I do prefer Estonian loans but it is very difficult to invest much in those because of the high demand, so a rationing system operates. These default curves are not up to date but we don't have a more recent version which I have been asking for on the Bondora forum, Default20curves.xlsx (240.24 KB) Q3: Do you have any explanation regarding the wrongly calculated sum between dataset and Bondora statistics page? No I don't understand that unless you only used A1000 grade loans.
|
|
|
Post by coolrunning on Oct 20, 2014 13:08:44 GMT
|
|
|
Post by Radek on Oct 20, 2014 17:37:39 GMT
Hi jamesfrance Yes I do, most defaults occur during the early stages of repayment, so most of the loans over 6 months old will continue to earn interest until they are fully repaid. The non Estonian loans have not had several years of repayments so comparisons are difficult. The volume of loans has grown considerably this year, so defaulted amounts are high compared with interest from older loan. I see, that is indeed interesting observation, out of curiosity I executed the following query, which takes into account only loans with LoanDate higher than 2014-03-13, this is the day when first Slovakian loan was issued. SELECT Country, ROUND(SUM(InterestAndPenaltiesPaid),1), ROUND(SUM(EAD1),1), ROUND(100.0*SUM(CurrentLoanHasBeenExtended)/COUNT(*), 2)
FROM LoanData
WHERE CreditDecision = 1 AND WasFunded = 1 AND x60D_FromFirstPayment = 1 AND LoanDate > "2014-03-13" GROUP BY Country
and here is the output Country | SUM(InterestAndPenaltiesPaid) | SUM(EAD1) | Extended Schedule Percentage
| EE
| 238099.1
| 100766.1
| 10.32%
| ES
| 108634.6
| 295893.4
| 14.12%
| FI
| 92681.7
| 104112.2
| 19.86%
| SK
| 17924.4
| 45108.2
| 23.46%
|
this should give exactly the same condition to all the countries and honestly I do not see much of the difference. It still seems that only Estonian loans are able to earn for themselves (assuming pessimistic 0% recovery). Additionally I included one more column about extended schedule percentage. One possibility I have considered was, that if loan get extended schedule before it actually defaults, that could drive the total sum in Default lower. Does not seem to be the case as Estonia seems to have the lowest Extended Schedule Percentage. These numbers are also only includes loans since start of Slovakia. Yes the default rates seem to be lower in Estonia but Finland is improving recently. Everything probably depends on recovery rates and the non Estonian loans are too recent for that to have any noticeable effect. They have a good history of recovery but that is in Estonia, as you don't see much recovered in the first year after default. I do prefer Estonian loans but it is very difficult to invest much in those because of the high demand, so a rationing system operates. These default curves are not up to date but we don't have a more recent version which I have been asking for on the Bondora forum, I agree, the numbers for Finland are improving, but that maybe because it has almost double re-schedule rate compare to Estonia. I believe assuming Estonian rates for Spain or Slovakia would be quite optimistic. No I don't understand that unless you only used A1000 grade loans. That is two of us. I did the sum in Excel on the whole sheet just to be sure without any filtering.
|
|
JamesFrance
Member of DD Central
Port Grimaud 1974
Posts: 1,323
Likes: 897
|
Post by JamesFrance on Oct 20, 2014 18:09:35 GMT
I have to agree with your conclusions, it does appear that rescheduling is occurring to keep the default rates from being even higher for some areas and are only delaying the inevitable. I have a Finnish loan which was removed from the default total 3 months ago and is back there today after making no rescheduled payments.
I am hoping to see some solid recovery of my early defaults, but they are only from a year ago so it is too soon to expect much activity.
Personally I don't have the knowledge needed to do the analysis you are doing, so thanks for your useful input. When I retired the internet and tools such as excell were very new and I remember writing my own business programs in basic and using msdos on a Tandy TRS80 computer.
|
|
|
Post by coolrunning on Oct 21, 2014 16:25:24 GMT
Without doing anything hard, like what you were doing with sqllite, I just simply added up all the EAD1s in all the 20000 loans.
I got a figure a magnitude lower than your totals of selected EAD1 values.
Can you give me the simple EAD1 total from your database for comparison?
|
|
|
Post by reeknralf on Oct 21, 2014 17:28:12 GMT
Running the analysis in excel, I get similar, but not identical, answers to OP.
If the point is to find out which countries pay better, the analysis is fine. But we all already know Estonia pays better.
If, as it appears, the point of the analysis is to work out ROI before default collection, the analysis is not fine. Consider 10 loans which have run for six months, one of which has already defaulted. EAD1 will be greater than interest paid, but this tells us next to nothing about ROI before defaults, because the interest on the remaining 9 is still coming in. Analysing more recent loans exacerbates this problem.
Because Bondora is expanding fast, much of the database is ongoing loans. You could look at older loans, but then you'd lose the inter country comparison, and loans from 3 year ago probably don't tell us much about current loans.
Bondora give us ROI by country and loan type, which includes default recovery. I think they assume to be 27%. They also give us default rates, ~15%. So you can estimate what part of ROI is due to debt recovery, and subtract it from the total. I think this would be better than what you are doing.
|
|
|
Post by Radek on Oct 21, 2014 19:17:54 GMT
Without doing anything hard, like what you were doing with sqllite, I just simply added up all the EAD1s in all the 20000 loans. I got a figure a magnitude lower than your totals of selected EAD1 values. Can you give me the simple EAD1 total from your database for comparison? sure - 2262451.6946 EUR, which is exactly same as what I see as sum when I select Column EN - EAD1 in the LoanData.xlsx Please don't tell me that I have somehow corrupted original dataset. What number you get when you do the sum?
|
|
|
Post by Radek on Oct 21, 2014 19:52:49 GMT
Running the analysis in excel, I get similar, but not identical, answers to OP. If the point is to find out which countries pay better, the analysis is fine. But we all already know Estonia pays better. If, as it appears, the point of the analysis is to work out ROI before default collection, the analysis is not fine. Consider 10 loans which have run for six months, one of which has already defaulted. EAD1 will be greater than interest paid, but this tells us next to nothing about ROI before defaults, because the interest on the remaining 9 is still coming in. Analysing more recent loans exacerbates this problem. Because Bondora is expanding fast, much of the database is ongoing loans. You could look at older loans, but then you'd lose the inter country comparison, and loans from 3 year ago probably don't tell us much about current loans. Bondora give us ROI by country and loan type, which includes default recovery. I think they assume to be 27%. They also give us default rates, ~15%. So you can estimate what part of ROI is due to debt recovery, and subtract it from the total. I think this would be better than what you are doing. You are raising some interesting points. Being quite new to all this stuff I was assuming that probability of default is equally distributed over the time. I was hoping that Bondora screening would give A1000 grade only to people that are able to pay in the standard situations and defaults should be happening only because of loosing jobs, getting seriously sick, injured or unexpected expenses such as car accident, etc. My assumption was that probability that person looses the job, etc. is more or less constant over the time of the loan. If this is not the case and there is a reason for defaults to happens more often during the beginning of the loan, than you have to keep the other paying loans till they are paid out, which has quite a significant affect on liquidity.
|
|
james
Posts: 2,205
Likes: 955
|
Post by james on Oct 22, 2014 4:30:50 GMT
I was assuming that probability of default is equally distributed over the time. It definitely isn't evenly distributed over time. There tends to be a clear peak point then some decline in default rate. Just when this happens depends on many factors including length of loan and market. I was hoping that Bondora screening would give A1000 grade only to people that are able to pay in the standard situations and defaults should be happening only because of loosing jobs, getting seriously sick, injured or unexpected expenses such as car accident, etc. A1000 refers to none of those things. A, B or C refers only to the discretionary income available after deductions for committed expenses. 1000, 900, 800, 700 or 600 only refer to the information from external credit reference agencies. In particular, Bondora will completely ignore late payments on previous or current loans unless Bondora has reported those to the agencies, which usually does not happen for late payments. So you can find a borrower who is repeatedly late still able to borrow in A1000 credit group. My assumption was that probability that person looses the job, etc. is more or less constant over the time of the loan. Some people will apply for a loan when they know they may lose their job. That way they will have money to live on. Others will borrow to repay other lending but spend the money instead. In both cases after some time has passed the problems that were present at the start show up. Eventually the situation does become more based on the circumstances you expect. In a P2P first for me I have one loan where Bondora has said "To initiate criminal proceedings" about a borrower who made no payments. I don't know whether the claimed borrower was the real borrower. If this is not the case and there is a reason for defaults to happens more often during the beginning of the loan, than you have to keep the other paying loans till they are paid out, which has quite a significant affect on liquidity. Why would one affect the other? Why would you not keep the other loans anyway? I know of one reason - tax, which can make it better to sell near to the end of the loan, if you can manage to get a reasonable price. If the tax rate is sufficiently high the cost of the Bondora fees can be less than the tax saved by selling.
|
|
duck
Member of DD Central
Posts: 2,864
Likes: 6,898
|
Post by duck on Oct 22, 2014 16:34:09 GMT
I've picked up over 100 60 day 'defaults' in the last year, 34% failed to make the first or any subsequent payments. There are peaks and troughs for defaults and the initial peak can be high!
|
|
|
Post by coolrunning on Oct 22, 2014 16:44:40 GMT
Without doing anything hard, like what you were doing with sqllite, I just simply added up all the EAD1s in all the 20000 loans. I got a figure a magnitude lower than your totals of selected EAD1 values. Can you give me the simple EAD1 total from your database for comparison? sure - 2262451.6946 EUR, which is exactly same as what I see as sum when I select Column EN - EAD1 in the LoanData.xlsx Please don't tell me that I have somehow corrupted original dataset. What number you get when you do the sum? Whoops. You are right. I redid the calc and get your answer this time. Apologies for doubting you.
|
|
|
Post by Radek on Oct 23, 2014 7:06:32 GMT
Thanks for all the answers above, this really brings me a lot of insight into what is actually happening. Why would you not keep the other loans anyway? This was just speculative question, what would happen in case I would need the money back earlier than expected. From what I have heard it should be quite easy to sell all well paying loans on secondary market quickly when selling them for principal value. Anyway if I would do so, this would causes quite some losses as it is these loans, that are suppose to earn back money for the loans which are already in 60+ overdue. Good to know and thank you for pointing me to this. It definitely isn't evenly distributed over time. I fully agree now. Did not noticed this before. Out of curiosity I did a bit of analysis on how defaults are distributed based on the field DefaultedOnDay in the xlsx sheet. Following query check all A1000 Income+Expenses verified loans, which were issued before 1st of October 2013 and defaulted within one year. On purpose I wanted to analyze only the part of spectrum, where each loan had the equal possibility to default. SELECT 20*ROUND(DefaultedOnDay/20,0) As DefaultedOnDayMin, 19+20*ROUND(DefaultedOnDay/20,0) As DefaultedOnDayMax, COUNT(*) As DefaultCnt
FROM LoanData
WHERE AD=1 AND VerificationType = 4 AND CreditGroup = 'A' AND credit_score = 1000 AND LoanDate < "2013-10-01" AND DefaultedOnDay < 365 and results are here DefaultedOnDayMin DefaultedOnDayMax DefaultCnt ----------------- ----------------- ---------- 40.0 59.0 1 80.0 99.0 2 100.0 119.0 12 120.0 139.0 10 140.0 159.0 10 160.0 179.0 4 180.0 199.0 3 200.0 219.0 5 220.0 239.0 8 240.0 259.0 11 260.0 279.0 7 280.0 299.0 13 300.0 319.0 10 320.0 339.0 5 340.0 359.0 2 360.0 379.0 2 I am bit surprised that there is a loan (id=64f53d93-9c2e-46b7-b9aa-9e8e014cb6aa), which manged to default in less than 60 days, but it can be some mistake in Bondora's data. Most of loans indeed start defaulting at the beginning = ca 40days till the first payment and than another 60days after that.
|
|
|
Post by coolrunning on Oct 23, 2014 9:26:14 GMT
Useful post. I shall have to install SQLite myselt.
|
|
|
Post by brettb on Nov 3, 2014 12:10:06 GMT
Yes it is a useful post - thanks contributors.
I've just started investing in Bondora. I'm also a SQL guy so I'd love to do some analysis on the dataset. Let me know what needs to be researched. I've only had a quick look at the data (there's an awful lot of it!) but I have some ideas. Like finding pockets of value in the secondary market I guess.
From my experience on Zopa since 2009, most bad loans I had there seemed to go bad after 3 years. So I'd tend to avoid 5 year loans on any P2P platform without a contingency (although most Bondora loans seem to be for 5 years from what I've seen).
|
|