Scene Segmentation: A Research Survey
Scene Segmentation - Video surveillance techniques like scene segmentation are playing an increasingly important role in multimedia Internet-of-Things (IoT) systems. [1] The 3D model of the scene can be used to navigate the robot or for scene segmentation. [2] Despite rapid progress in scene segmentation in recent years, 3D segmentation methods are still limited when there is severe occlusion. [3] The main contribution of our approach is a cost calculation framework which is a hybrid of cross-correlation between stereo-image pairs and scene segmentation (HCS). [4] Graph Convolutional Networks (GCNs) have been successfully used for object classification and scene segmentation in point clouds, and also to predict grasping points in simple laboratory experiments. [5] The idea of convergent networks is adopted to implement infrared airport scene segmentation. [6] With a more thorough study of road scene segmentation issues, we face the problem that existing benchmark suites such as MOTS and KITTI, as well as recent DNNs for road/lane semantic segmentation, employ only mutually exclusive classes i. [7] In this article, we propose a Dual Relation-aware Attention Network (DRANet) to handle the task of scene segmentation. [8] Finally, the superpixels are aggregated by fuzzy c-means (FCM) to obtain the water scene segmentation. [9] In order to solve the problem of missing pedestrian features and poor counting statistics in a large-angle overlooking scene, in this paper we propose a human head statistics system that consists of head detection, head tracking and head counting, where the proposed You-Only-Look-Once-Head (YOLOv5-H) network, improved from YOLOv5, is taken as the head detection benchmark, the DeepSORT algorithm with the Fusion-Hash algorithm for feature extraction (DeepSORT-FH) is proposed to track heads, and heads are counted by the proposed cross-boundary counting algorithm based on scene segmentation.
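Sentence [8] above describes aggregating superpixels with fuzzy c-means (FCM) to obtain a water scene segmentation. The following is a minimal NumPy sketch of the FCM step only, applied to hypothetical superpixel colour descriptors; the `fuzzy_cmeans` function and the toy features are illustrative, not the cited authors' implementation:

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy c-means: returns cluster centers and soft memberships."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        W = U ** m                               # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))           # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Toy stand-in for superpixel descriptors: mean colour of each superpixel,
# forming two well-separated groups (e.g. "water" vs "non-water" regions).
feats = np.array([[0.10, 0.20], [0.15, 0.25], [0.12, 0.22],
                  [0.80, 0.90], [0.85, 0.95], [0.90, 0.85]])
centers, U = fuzzy_cmeans(feats, c=2)
labels = U.argmax(axis=1)                        # hard assignment per superpixel
print(labels)
```

Hardening the soft memberships with `argmax` yields the final two-region segmentation of the superpixels.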
[10] Semantic scene segmentation is vital for a large variety of applications as it enables understanding of 3D data. [11] Traditional scene segmentation algorithms are mostly based on the analysis of frames and cannot automatically generate labels. [12] We propose a scene segmentation network based on local Deep Implicit Functions as a novel learning-based method for scene completion. [13] The results of experiments on scene segmentation tasks using contact center conversational datasets demonstrate the effectiveness of the proposed method. [14] Note that, compared with the Dual Attention Network for scene segmentation, our attention module greatly reduces the consumption of computing resources while ensuring accuracy. [15] Our proposed framework was based on semantic scene segmentation using an optimized convolutional neural network. [16] In addition, we present the first drone-perspective panoramic scene segmentation dataset, Aerial-PASS, with annotated labels of track, field, and others. [17] Based on the key observation that urban buildings usually have piecewise planar rooftops and vertical walls, we propose a segment-based modeling method, which consists of three major stages: scene segmentation, roof contour extraction, and building modeling. [18] However, compared with natural scene segmentation, surgical instrument segmentation is more difficult. [19] We take a further step in this article by proposing a multitask learning method, namely DPSNet, which can jointly perform depth and camera pose estimation and semantic scene segmentation. [20] Our approach can be used for LiDAR-only 360° surround-view semantic scene segmentation while being suitable for real-time critical systems. [21] This paper introduces the novel task of scene segmentation on narrative texts and provides an annotated corpus, a discussion of the linguistic and narrative properties of the task, and baseline experiments towards automatic solutions.
[22] Finally, two robotic bin-picking experiments are demonstrated, and the Part Mask RCNN for scene segmentation is evaluated on the constructed 3D object datasets. [23] The fully convolutional network (FCN) has achieved tremendous success in dense visual recognition tasks, such as scene segmentation. [24] The processing time of the process is also accelerated through 3D scene segmentation. [25] In this work, we define and address a novel domain adaptation (DA) problem in semantic scene segmentation, where the target domain not only exhibits a data distribution shift w. [26] , scene segmentation and multi-modal tagging. [27] In particular, urban scene segmentation is a significant integral module commonly equipped in the perception system of autonomous vehicles to understand the real scene like a human. [28] We annotate two robotic surgery datasets of MICCAI robotic scene segmentation and Transoral Robotic Surgery (TORS) with the captions of procedures, and empirically show that our proposed method improves the performance of both source- and target-domain surgical report generation in the unsupervised, zero-shot, one-shot, and few-shot learning settings. [29] The extensive use of SSL methods is dominant in the field of computer vision, for example, image classification, human activity recognition, object detection, scene segmentation, and image generation. [30] In order to deeply incorporate local structures and global context to support 3D scene segmentation, our network is built on four repeatedly stacked encoders, where each encoder has two basic components: EdgeConv, which captures local structures, and NetVLAD, which models global context. [31] The focus of this paper is to construct a road scene segmentation model with a simple structure and no need for large computing power, under the premise of a certain accuracy. [32] As a result, object detection and road scene segmentation are critical in navigation for recognizing the drivable and non-drivable areas.
[33] In addition, a multi-task network architecture is proposed to optimize the camera parameters based on the scene segmentation. [34] The core of the proposed methodology concerns a video scene segmentation algorithm based on visual and semantic features extracted using various CNN architectures. [35] The application of deep learning techniques has led to substantial progress in solving a number of critical problems in machine vision, including the fundamental problems of scene segmentation and depth estimation. [36] The automatic temporal video scene segmentation (also known as video story segmentation) is still an open problem without definite solutions in most cases. [37] In particular, graph-based point-cloud deep neural networks (DNNs) have demonstrated promising performance in 3D object classification and scene segmentation tasks. [38] SegNet is characterized as a scene segmentation network and U-NET as a medical segmentation tool. [39] Semantic scene segmentation plays a critical role in a wide range of robotics applications, e. [40] Moreover, under the BP-CGMM, a detection method based on a batch test is given to detect small sea-surface targets, which is composed of scene segmentation, aided by a Bayesian threshold and morphological filtering, and an adaptive generalized likelihood ratio test linear-threshold detector (GLRT-LTD) applied separately in each set. [41] Convolutional neural networks (CNNs) are a powerful tool for image classification that has been widely adopted in applications of automated scene segmentation and identification. [42] KEYWORDS: 3D Pose Estimation, 3D Scene Reconstruction, 3D Scene Segmentation, Augmented Reality, Disaster Simulation, Virtual Reality. International Journal of Multimedia Data Engineering and Management, Volume 10, Issue 1, January-March 2019.
[43] Video scene segmentation has become one of the research hotspots in the video field because of its importance for improving retrieval accuracy, and it also plays a key role in the construction of virtual scenes. [44] In this paper, our aim is to find the best exploitation of different imaging modalities for road scene segmentation, as opposed to using a single RGB modality. [45] Various retinal vessel segmentation methods based on convolutional neural networks were proposed recently, and Dense U-net, as a new semantic segmentation network, was successfully applied to scene segmentation. [46] This paper describes 3D semantic scene segmentation with convolutional neural networks for the unordered point clouds of autonomous robots. [47] Deep learning is also widely used in object recognition, object detection, scene segmentation and other image processing tasks. [48] Many clinical procedures could benefit from automatic scene segmentation and subsequent action recognition. [49] Road scene segmentation is an important problem that supports a higher level of scene understanding. [50]
Semantic Scene Segmentation
Semantic scene segmentation is vital for a large variety of applications as it enables understanding of 3D data. [1] Our proposed framework was based on semantic scene segmentation using an optimized convolutional neural network. [2] We make a further step in this article by proposing a multitask learning method, namely DPSNet, which can jointly perform depth and camera pose estimation and semantic scene segmentation. [3] Our approach can be used for LiDAR-only 360° surround-view semantic scene segmentation while being suitable for real-time critical systems. [4] In this work, we define and address a novel domain adaptation (DA) problem in semantic scene segmentation, where the target domain not only exhibits a data distribution shift w. [5] Semantic scene segmentation plays a critical role in a wide range of robotics applications, e. [6] This paper describes 3D semantic scene segmentation with convolutional neural networks for the unordered point clouds of autonomous robots. [7] Processing speeds are increased by using semantic scene segmentation and a tiered inference scheme to focus processing on the most salient regions of the 43° x 7. [8] Specifically, a deep network architecture has been proposed which consists of a cascaded combination of 3D point-based residual networks for simultaneous semantic scene segmentation and object classification. [9] Promising recent work suggests that semantic scene segmentation can provide a robust regularizing prior for resolving ambiguities in stereo correspondence and reconstruction problems. [10]
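Semantic scene segmentation methods such as those cited above are conventionally evaluated with per-class intersection-over-union (IoU) and its mean (mIoU). A small self-contained sketch of that metric, using hypothetical predicted and ground-truth label maps:

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Intersection-over-union per class, computed from two integer label maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union else float('nan'))  # nan if class absent
    return np.array(ious)

# Tiny illustrative label maps (3 classes over a 2x3 image).
pred = np.array([[0, 0, 1], [1, 1, 2]])
gt   = np.array([[0, 1, 1], [1, 1, 2]])
ious = per_class_iou(pred, gt, num_classes=3)
print(ious)              # class-wise IoU
print(np.nanmean(ious))  # mIoU, ignoring absent classes
```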
Road Scene Segmentation
With a more thorough study of road scene segmentation issues, we face the problem that existing benchmark suites such as MOTS and KITTI, as well as recent DNNs for road/lane semantic segmentation, employ only mutually exclusive classes i. [1] The focus of this paper is to construct a road scene segmentation model with a simple structure and no need for large computing power, under the premise of a certain accuracy. [2] As a result, object detection and road scene segmentation are critical in navigation for recognizing the drivable and non-drivable areas. [3] In this paper, our aim is to find the best exploitation of different imaging modalities for road scene segmentation, as opposed to using a single RGB modality. [4] Road scene segmentation is an important problem that supports a higher level of scene understanding. [5]
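Sentence [3] of this subsection ties road scene segmentation to recognizing drivable and non-drivable areas. A common post-processing step is to collapse the predicted class map into a binary drivable mask; a sketch with hypothetical class IDs (real datasets define their own label sets):

```python
import numpy as np

# Hypothetical class IDs for a road-scene label map (dataset-specific in practice).
ROAD, SIDEWALK, CAR, BUILDING = 0, 1, 2, 3
DRIVABLE = {ROAD}

labels = np.array([[BUILDING, ROAD, ROAD],
                   [SIDEWALK, ROAD, CAR]])

# Binary drivable mask: True wherever the predicted class is drivable.
drivable_mask = np.isin(labels, list(DRIVABLE))
print(drivable_mask.sum())  # number of drivable pixels
```

In a navigation stack, the mask (or its largest connected component) would then feed path planning.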
3D Scene Segmentation
The processing time of the process is also accelerated through 3D scene segmentation. [1] In order to deeply incorporate local structures and global context to support 3D scene segmentation, our network is built on four repeatedly stacked encoders, where each encoder has two basic components: EdgeConv, which captures local structures, and NetVLAD, which models global context. [2] KEYWORDS: 3D Pose Estimation, 3D Scene Reconstruction, 3D Scene Segmentation, Augmented Reality, Disaster Simulation, Virtual Reality. International Journal of Multimedia Data Engineering and Management, Volume 10, Issue 1, January-March 2019. [3] We demonstrate the effectiveness of our formulation on optimizing object correspondences, estimating dense image maps via neural networks, and 3D scene segmentation via map networks of diverse 3D representations. [4]
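Sentence [2] of this subsection describes encoders built from EdgeConv (local structure) and NetVLAD (global context). As a rough illustration of the EdgeConv idea only, here is a minimal NumPy sketch: per-point k-nearest-neighbour edges, edge features [x_i, x_j - x_i], a shared random linear map standing in for the learned MLP, and max-pooling over edges. This is not the cited network, just the basic mechanism:

```python
import numpy as np

def edgeconv(points, k=3, seed=0):
    """Minimal EdgeConv-style layer over an (n, d) point cloud."""
    rng = np.random.default_rng(seed)
    n, d = points.shape
    W = rng.standard_normal((2 * d, 8))            # shared linear map (stand-in MLP)
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    knn = np.argsort(dists, axis=1)[:, 1:k + 1]    # k nearest neighbours (excl. self)
    out = np.empty((n, 8))
    for i in range(n):
        edges = np.hstack([np.tile(points[i], (k, 1)),   # x_i repeated per edge
                           points[knn[i]] - points[i]])  # relative offsets x_j - x_i
        out[i] = np.maximum(edges @ W, 0).max(axis=0)    # ReLU, then max-pool edges
    return out

pts = np.random.default_rng(1).standard_normal((10, 3))
feat = edgeconv(pts)
print(feat.shape)  # one 8-dim local feature per point
```

Using relative offsets x_j - x_i is what makes the aggregation capture local geometric structure rather than absolute position alone.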
Video Scene Segmentation
The core of the proposed methodology concerns a video scene segmentation algorithm based on visual and semantic features extracted using various CNN architectures. [1] The automatic temporal video scene segmentation (also known as video story segmentation) is still an open problem without definite solutions in most cases. [2] Video scene segmentation has become one of the research hotspots in the video field because of its importance for improving retrieval accuracy, and it also plays a key role in the construction of virtual scenes. [3]
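A common baseline for the temporal video scene segmentation discussed above is to place a boundary wherever the similarity between consecutive frame features drops. A minimal sketch with hypothetical frame feature vectors (in practice these would come from a CNN, as in sentence [1]):

```python
import numpy as np

def scene_cuts(frame_feats, threshold=0.8):
    """Flag a scene boundary wherever cosine similarity between consecutive
    frame feature vectors falls below a threshold; returns cut frame indices."""
    f = frame_feats / np.linalg.norm(frame_feats, axis=1, keepdims=True)
    sims = (f[:-1] * f[1:]).sum(axis=1)            # cosine sim of adjacent frames
    return [i + 1 for i, s in enumerate(sims) if s < threshold]

# Toy features: frames 0-2 look alike, frames 3-4 look alike -> cut at frame 3.
feats = np.array([[1.0, 0.0], [0.99, 0.10], [0.98, 0.15],
                  [0.0, 1.0], [0.10, 0.99]])
print(scene_cuts(feats))
```

The threshold is the only tuning knob here; story-level segmentation methods typically replace this hard cut with shot clustering and temporal smoothing.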
Automatic Scene Segmentation
Many clinical procedures could benefit from automatic scene segmentation and subsequent action recognition. [1] Moreover, automatic scene segmentation and object detection are combined for traffic scene understanding. [2]
Scene Segmentation Algorithm
Traditional scene segmentation algorithms are mostly based on the analysis of frames and cannot automatically generate labels. [1] The core of the proposed methodology concerns a video scene segmentation algorithm based on visual and semantic features extracted using various CNN architectures. [2]
Scene Segmentation Network
We propose a scene segmentation network based on local Deep Implicit Functions as a novel learning-based method for scene completion. [1] SegNet is characterized as a scene segmentation network and U-NET as a medical segmentation tool. [2]
Scene Segmentation Task
The results of experiments on scene segmentation tasks using contact center conversational datasets demonstrate the effectiveness of the proposed method. [1] In particular, graph-based point-cloud deep neural networks (DNNs) have demonstrated promising performance in 3D object classification and scene segmentation tasks. [2]