[1] HUA Man, SHAO Xiongkai, GAO Rong. Research and Application of Telecom Portrait Based on Spark[J]. Journal of Hubei University of Technology, 2019, 34(5): 78-82.

Research and Application of Telecom Portrait Based on Spark

Journal of Hubei University of Technology [ISSN: 1003-4684 / CN: 42-1752/Z]

Volume:
34
Issue:
No. 5, 2019
Pages:
78-82
Column:
Journal of Hubei University of Technology
Publication date:
2019-10-30

Article Info

Title:
Research and Application of Telecom Portrait Based on Spark
Article ID:
1003-4684(2019)05-0078-05
Author(s):
HUA Man, SHAO Xiongkai, GAO Rong
School of Computer Science, Hubei University of Technology, Wuhan 430068, Hubei, China
Keywords:
user portrait; Spark; precise recommendation; K-means; data dimensionality reduction
CLC number:
TP391
Document code:
A
Abstract:
In the Internet environment, traditional telecom operators struggle to understand their users accurately, which causes them to lose users and become gradually marginalized. Meanwhile, the TB-level log files generated by existing users are stored for a period of time and then discarded as useless junk. This paper proposes a user portrait method for telecom operators. Using Spark-based big data technology, the operators' data are mined, analyzed, dimension-reduced, and optimized, and clustering is then carried out to discover the personalized needs of different users, so as to provide precise recommendations and personalized services. The experimental results show that this method portrays the behavior of different users well. The optimized PK-means clustering method also achieves significantly higher accuracy, and computation is much faster than with the traditional data processing mode.

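Memo:
[Received] 2019-06-04
[First author] HUA Man (born 1994), male, from Huanggang, Hubei; master's student at Hubei University of Technology; research interests: big data and cloud computing.
Last Update: 2019-11-21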