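Suitable as a final paper for a machine learning course. Starting from a preliminary data analysis of the Lending Club dataset, the paper selects four different feature groups, applies the same algorithm (logistic regression, LR) to each for classification, and identifies three relatively strong features: loan_amnt, annual_inc, and term. It then applies three algorithms (a neural network, a Bayesian classifier, and a decision tree) to a "multi-source dataset" and, comparing the resulting model metrics, finds the decision tree to be the best of the three. Finally, returning to the Lending Club dataset, the paper preprocesses the data, selects 55 features, and recasts the binary classification problem as a three-class problem. It compares a single tree model (decision tree) with two ensemble tree models (random forest and extremely randomized trees) and concludes that the ensemble methods achieve better accuracy and generalization than the single model, at the cost of more computing resources.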
2022-11-16 18:32:34 1.15MB machine-learning course-paper classification LendingClub
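This resource is the Word version of the original paper. Suitable as a final paper for a machine learning course. Starting from a preliminary data analysis of the Lending Club dataset, the paper selects four different feature groups, applies the same algorithm (logistic regression, LR) to each for classification, and identifies three relatively strong features: loan_amnt, annual_inc, and term. It then applies three algorithms (a neural network, a Bayesian classifier, and a decision tree) to a "multi-source dataset" and, comparing the resulting model metrics, finds the decision tree to be the best of the three. Finally, returning to the Lending Club dataset, the paper preprocesses the data, selects 55 features, and recasts the binary classification problem as a three-class problem. It compares a single tree model (decision tree) with two ensemble tree models (random forest and extremely randomized trees) and concludes that the ensemble methods achieve better accuracy and generalization than the single model, at the cost of more computing resources.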
2022-11-16 18:32:30 1.81MB machine-learning classification LendingClub course-paper
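The course "Financial Cash-Loan User Data Analysis and User Profiling" uses Python to analyze loan data from the LendingClub platform and build user profiles. Aimed at banking, consumer finance, and cash-loan scenarios, it teaches students to analyze credit-applicant data in Python. The project uses more than 120,000 real LendingClub credit records with dozens of dimensions, including annual income, total loan amount, installment amount, number of installments, job title, and home ownership. According to the course, the data for the fourth quarter of 2019 show very heavy multi-lender borrowing in the United States, which the course describes as sowing the seeds of a global systemic financial crisis.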
Lending Club loan data for the second quarter of 2018. Columns: "id","member_id","loan_amnt","funded_amnt","funded_amnt_inv","term","int_rate","installment","grade","sub_grade","emp_title","emp_length","home_ownership","annual_inc","verification_status","issue_d","loan_status","pymnt_plan","url","desc","purpose","title","zip_code","addr_state","dti","delinq_2yrs","earliest_cr_line","inq_last_6mths","mths_since_last_delinq","mths_since_last_record","open_acc","pub_rec","revol_bal","revol_util","total_acc","initial_list_status","out_prncp","out_prncp_inv","total_pymnt","total_pymnt_inv","total_rec_prncp","total_rec_int","total_rec_late_fee","recoveries","collection_recovery_fee","last_pymnt_d","last_pymnt_amnt","next_pymnt_d","last_credit_pull_d","collections_12_mths_ex_med","mths_since_last_major_derog","policy_code","application_type","annual_inc_joint","dti_joint","verification_status_joint","acc_now_delinq","tot_coll_amt","tot_cur_bal","open_acc_6m","open_act_il","open_il_12m","open_il_24m","mths_since_rcnt_il","total_bal_il","il_util","open_rv_12m","open_rv_24m","max_bal_bc","all_util","total_rev_hi_lim","inq_fi","total_cu_tl","inq_last_12m","acc_open_past_24mths","avg_cur_bal","bc_open_to_buy","bc_util","chargeoff_within_12_mths","delinq_amnt","mo_sin_old_il_acct","mo_sin_old_rev_tl_op","mo_sin_rcnt_rev_tl_op","mo_sin_rcnt_tl","mort_acc","mths_since_recent_bc","mths_since_recent_bc_dlq","mths_since_recent_inq","mths_since_recent_revol_delinq","num_accts_ever_120_pd","num_actv_bc_tl","num_actv_rev_tl","num_bc_sats","num_bc_tl","num_il_tl","num_op_rev_tl","num_rev_accts","num_rev_tl_bal_gt_0","num_sats","num_tl_120dpd_2m","num_tl_30dpd","num_tl_90g_dpd_24m","num_tl_op_past_12m","pct_tl_nvr_dlq","percent_bc_gt_75","pub_rec_bankruptcies","tax_liens","tot_hi_cred_lim","total_bal_ex_mort","total_bc_limit","total_il_high_credit_limit","revol_bal_joint","sec_app_earliest_cr_line","sec_app_inq_last_6mths","sec_app_mort_acc","sec_app_open_acc","sec_app_revol_util","sec_app_open_act_il","sec_app_num_rev_accts","sec_app_chargeoff_within_12_mths","sec_app_collections_12_mths_ex_med","sec_app_mths_since_last_major_derog","hardship_flag","hardship_type","hardship_reason","hardship_status","deferral_term","hardship_amount","hardship_start_date","hardship_end_date","payment_plan_start_date","hardship_length","hardship_dpd","hardship_loan_status","orig_projected_additional_accrued_interest","hardship_payoff_balance_amount","hardship_last_payment_amount","disbursement_method","debt_settlement_flag","debt_settlement_flag_date","settlement_status","settlement_date","settlement_amount","settlement_percentage","settlement_term"
2022-05-20 10:45:12 100.99MB CSV data
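With this many columns, it is usually easier to read the file by header name and keep only the fields of interest. The following is a minimal sketch using only the Python standard library. The column names come from the header listed for this file; the two sample rows and the helper name `load_columns` are invented for illustration and are not taken from the actual data.

```python
import csv
import io

# Hypothetical excerpt: the real file has far more columns, and these
# row values are made up purely to show the mechanics.
SAMPLE = '''"id","loan_amnt","term","int_rate","annual_inc","loan_status"
"1","10000","36 months","13.56","55000","Current"
"2","24000","60 months","18.94","90000","Charged Off"
'''


def load_columns(text, wanted):
    """Return one dict per row, keeping only the columns named in `wanted`."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in wanted if c not in header]
    if missing:
        # Fail early if a requested column is absent from the header.
        raise KeyError(f"columns not in header: {missing}")
    return [{c: row[c].strip() for c in wanted} for row in reader]


rows = load_columns(SAMPLE, ["loan_amnt", "term", "annual_inc", "loan_status"])
for row in rows:
    print(row)
```

For a real file of this size, the same `csv.DictReader` call can wrap an open file object instead of `io.StringIO`, so rows are streamed rather than loaded into memory at once.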
Loan quick picker: an Insight Data Science project for predicting loan defaults on LendingClub.com
2022-01-03 12:12:49 92.69MB JavaScript
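An analysis that uses machine learning algorithms to identify credit risk with a dataset from LendingClub. Overview: the purpose of this analysis is to understand how machine learning algorithms can make predictions from patterns in the data provided. In this challenge we focus on supervised learning with a free dataset from LendingClub, a P2P lending services company, to assess and predict credit risk. It is called "supervised learning" because the data include labeled outcomes. To complete the analysis, we use different machine learning techniques to train and evaluate models on data with imbalanced classes. The LendingClub dataset is imbalanced because good loans far outnumber risky loans. To balance the classes for more meaningful predictions and better accuracy scores, we apply several algorithms that resample the data. These algorithms include RandomOverSampler , SMOTE , ClusterCentroids , SMOTE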
2021-11-11 21:13:08 19.39MB JupyterNotebook
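To make the resampling idea concrete, here is a minimal hand-written sketch of random oversampling in plain Python. It is not the imbalanced-learn `RandomOverSampler` that the project names, only an illustration of the same idea: minority-class rows are duplicated at random until every class is as large as the largest one. The function name `random_oversample`, the labels, and the toy data are all assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Rows of this class that may be duplicated.
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Toy data (invented): 6 low-risk loans and 2 high-risk loans.
X = [[i] for i in range(8)]
y = ["low_risk"] * 6 + ["high_risk"] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class proportions.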
Official site: https://www.lendingclub.com/statistics/additional-statistics. You can download the data there yourself or download my copy.
2021-03-09 19:54:19 98.47MB LendingClub LendingClub2019
Official site: https://www.lendingclub.com/statistics/additional-statistics. You can download the data there yourself or download my copy.
2021-03-02 19:37:41 84.89MB LendingClub LendClub2017
Official site: https://www.lendingclub.com/statistics/additional-statistics. You can download the data there yourself or download my copy.
2021-03-02 19:01:38 82.08MB LendingClub LendingClub2016
Official site: https://www.lendingclub.com/statistics/additional-statistics. You can download the data there yourself or download my copy.
2021-01-28 02:48:52 94.77MB LendingClub LendingClub2018