[1] TAO Qi, JIN Huazhong, LI Wenxuan, et al. A Scene Graph Generation Method With Enhanced Spatial Relationship[J]. Journal of Hubei University of Technology, 2022, (4): 36-42.

A Scene Graph Generation Method With Enhanced Spatial Relationship

Journal of Hubei University of Technology [ISSN: 1003-4684 / CN: 42-1752/Z]

Volume:
Issue:
No. 4, 2022
Pages:
36-42
Section:
Journal of Hubei University of Technology
Publication date:
2022-08-28

Article Info

Title:
 A Scene Graph Generation Method With Enhanced Spatial Relationship
Article ID:
1003-4684(2022)04-0036-07
Author(s):
 TAO Qi (陶琪), JIN Huazhong (靳华中), LI Wenxuan (李文萱), LI Lin (黎林), YUAN Fuxiang (袁福祥)
 School of Computer Science, Hubei University of Technology, Wuhan 430068, Hubei, China
Keywords:
 scene graph generation; spatial information; spatial relationship; relationship statistics; relationship detection
CLC number:
TP391
Document code:
A
Abstract:
 To make full use of the spatial information between objects and to describe the relationships between objects in a scene more accurately, a scene graph generation method with enhanced spatial relationships is proposed. The method makes two main contributions: relationship statistics between objects and spatial relationship enhancement. First, a relationship value matrix for object pairs is built from the database and used to reduce the number of candidate object pairs, which simplifies relationship detection. Second, the relative size, relative position, and intersection over union (IoU) of each object pair are computed from the pair's coordinates and used to enhance the spatial relationship between the objects. Experiments on the Visual Genome dataset show that the proposed method outperforms the Neural Motifs model on the scene graph generation, scene graph classification, and predicate classification tasks.
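
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the formulas, so the sketch assumes the following: boxes are `(x1, y1, x2, y2)` corners, relative size is an area ratio, relative position is a centre offset normalised by the second box, and the relationship value matrix is a simple count of annotated class pairs. The names `build_pair_counts`, `candidate_pairs`, `spatial_features` and the `min_count` threshold are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_pair_counts(triples):
    """Count how often each (subject class, object class) pair carries any relationship.

    `triples` is an iterable of (subject, predicate, object) annotations.
    This count table stands in for the relationship value matrix (assumption).
    """
    return Counter((subj, obj) for subj, _pred, obj in triples)


def candidate_pairs(labels, pair_counts, min_count=1):
    """Keep only ordered index pairs whose class pair was seen at least `min_count` times."""
    return [
        (i, j)
        for i, j in permutations(range(len(labels)), 2)
        if pair_counts[(labels[i], labels[j])] >= min_count
    ]


def spatial_features(a, b):
    """Return (relative size, dx, dy, IoU) of box `a` with respect to box `b`.

    Boxes are (x1, y1, x2, y2) with positive width and height (assumption).
    """
    aw, ah = a[2] - a[0], a[3] - a[1]
    bw, bh = b[2] - b[0], b[3] - b[1]
    area_a, area_b = aw * ah, bw * bh

    # Relative size: ratio of the two box areas (one possible definition).
    rel_size = area_a / area_b

    # Relative position: offset between box centres, normalised by b's size.
    dx = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) / bw
    dy = ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) / bh

    # Intersection over union: overlap area divided by union area.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    iou = inter / (area_a + area_b - inter)

    return rel_size, dx, dy, iou
```

With counts built from training annotations, `candidate_pairs` would drop class pairs that never appear with a relationship before relationship detection runs, and `spatial_features` would supply the extra geometric inputs for each remaining pair.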

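Memo:
[Received] 2021-10-19
[Funding] College Students' Innovation and Entrepreneurship Training Program (S201910500074)
[First author] TAO Qi (b. 1995), male, from Xiaogan, Hubei; master's student at Hubei University of Technology; research interest: computer vision
[Corresponding author] JIN Huazhong (b. 1973), male, from Honghu, Hubei; associate professor at Hubei University of Technology; research interest: computer vision
Last Update: 2022-08-29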