[1] ZHENG Lie (郑列), MU Xinyu (穆新宇). The Application of Improved XGBoost in the Prediction of Short-Term Rental Housing Price[J]. Journal of Hubei University of Technology, 2021(2): 104-109.

 The Application of Improved XGBoost in the Prediction of Short-Term Rental Housing Price

Journal of Hubei University of Technology [ISSN: 1003-4684 / CN: 42-1752/Z]

Volume:
Issue:
No. 2, 2021
Pages:
104-109
Section:
Journal of Hubei University of Technology
Publication date:
2021-04-22

Article Info

Title:
 The Application of Improved XGBoost in the Prediction of Short-Term Rental Housing Price
Article ID:
1003-4684(2021)02-0104-06
Author(s):
 ZHENG Lie (郑列), MU Xinyu (穆新宇)
 School of Sciences, Hubei University of Technology, Wuhan 430068, Hubei, China
Keywords:
 online short-term rental; OLS regression; quantile regression; XGBoost; grid search
CLC number:
F299.23
Document code:
A
Abstract:
 This paper studies price prediction for short-term rental listings based on listing information. First, missing values and outliers in the raw data set are handled. Second, a feature system of 23 features is constructed to capture the factors that influence short-term rental prices. Third, OLS regression and quantile regression are used to analyze the magnitude and direction of each factor's influence. Finally, the 18 features with strong statistical significance are selected to build an XGBoost model for predicting listing prices, with hyperparameters tuned by grid search. The goodness of fit of the XGBoost model reaches 0.60, whereas that of the linear regression model is only 0.38. XGBoost is therefore the better choice for predicting short-term rental prices, and combining it with OLS regression and quantile regression retains the interpretability of traditional statistical models while improving prediction accuracy.
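
 The quantities named in the abstract can be illustrated with a minimal standard-library Python sketch. It assumes that "goodness of fit" means the coefficient of determination R² and that quantile regression is evaluated with the usual check (pinball) loss; the abstract does not spell out either definition. The function names, the parameter grid, and `toy_score` are illustrative and not taken from the paper. In particular, `toy_score` is a made-up stand-in for "train a model with these parameters and return its validation R²".

```python
from itertools import product
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pinball_loss(y_true, y_pred, tau):
    """Mean check loss that quantile regression minimises at level tau."""
    return mean(
        tau * (t - p) if t >= p else (1.0 - tau) * (p - t)
        for t, p in zip(y_true, y_pred)
    )


def grid_search(param_grid, score_fn):
    """Score every parameter combination and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid; max_depth and learning_rate are XGBoost parameter
# names, but these candidate values are assumptions for illustration.
param_grid = {"max_depth": [3, 4, 5, 6], "learning_rate": [0.05, 0.1, 0.3]}


def toy_score(params):
    # Made-up score surface peaking at max_depth=4, learning_rate=0.1.
    # A real study would fit the model here and return validation R^2.
    return -((params["max_depth"] - 4) ** 2) - ((params["learning_rate"] - 0.1) ** 2)


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

 In practice `score_fn` would wrap model training and cross-validated scoring, so the cost of the search grows with the product of the grid sizes (12 fits per fold in this toy grid).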

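Memo:
[Received] 2020-09-10
[Funding] Humanities and Social Sciences Research Planning Fund of the Ministry of Education (17YJA790098)
[First author] ZHENG Lie (b. 1963), male, from Yingshan, Hubei; professor at Hubei University of Technology; research interest: applied mathematics
[Corresponding author] MU Xinyu (b. 1996), female, from Fengxian, Jiangsu; master's student at Hubei University of Technology; research interest: data mining
Last Update: 2021-04-23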