[1] CHEN Zhigang, YUE Qian, ZHAO Wei. Research on Sentiment Classification of Barrage Text Based on BERT-wwm and BiLSTM[J]. Journal of Hubei University of Technology, 2021, (6): 56-61.

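 Research on a Sentiment Classification Model for Barrage Text: Based on a Chinese Pre-trained Model and a Bidirectional Long Short-Term Memory Network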

Journal of Hubei University of Technology [ISSN: 1003-4684 / CN: 42-1752/Z]

Volume:
Issue:
2021, No. 6
Pages:
56-61
Column:
Journal of Hubei University of Technology
Publication date:
2021-12-31

Article Information

Title:
 Research on Sentiment Classification of Barrage Text Based on BERT-wwm and BiLSTM
Article ID:
1003-4684(2021)06-0056-06
Author(s):
 CHEN Zhigang 1, YUE Qian 1, ZHAO Wei 2
 1 School of Economics and Management, Hubei University of Technology, Wuhan 430068, Hubei, China;
 2 Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
Keywords:
 barrage text sentiment classification; BERT-wwm; BiLSTM
CLC number:
TP391
Document code:
A
Abstract:
 Barrage text is colloquial, full of internet slang, and often polysemous. To address these characteristics, a BERT-wwm-BiLSTM model is proposed to improve the accuracy of sentiment classification and to provide a reference for subsequent video content mining and online public opinion governance. The model uses the BERT-wwm pre-trained model to obtain dynamic, context-aware word vectors, applies a BiLSTM to extract features, and finally uses softmax to perform sentiment classification. Experiments on two self-built barrage datasets, collected from bilibili and Tencent Video, show that the model outperforms the other models on all four metrics (Acc, P, R, and F1) and performs particularly well on polysemous barrage text, which demonstrates its effectiveness for barrage text sentiment classification.
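
The final softmax step and the four reported metrics (Acc, P, R, F1) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary labels with class 1 as the positive class, and the logits and gold labels below are made-up values standing in for the output of the BiLSTM layer. The abstract does not state the actual label set or averaging scheme.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class (assumed: 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"acc": acc, "p": precision, "r": recall, "f1": f1}


# Hypothetical per-comment logits [negative, positive]; illustrative only.
logits_batch = [[0.2, 1.5], [2.0, -1.0], [0.1, 0.3], [1.2, 0.4]]
y_true = [1, 0, 0, 1]  # hypothetical gold labels
y_pred = [predict(logits) for logits in logits_batch]
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

With more than two sentiment classes, the same counts would be computed per class and then averaged (for example, macro-averaged); which scheme the paper uses is not stated here.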



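Memo:
[Received] 2021-07-22
[First author] CHEN Zhigang (born 1970), male, from Hanchuan, Hubei; associate professor at Hubei University of Technology; research interests: e-commerce theory and practice, the digital economy, and high-tech enterprise management
[Corresponding author] YUE Qian (born 1996), female, from Bazhong, Sichuan; master's student at Hubei University of Technology; research interest: information analysis
Last Update: 2022-01-04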