This column collects papers in computer vision. Date: July 9, 2021. Source: Paper Digest.
1, TITLE: Crowd Counting Via Perspective-Guided Fractional-Dilation Convolution AUTHORS: Zhaoyi Yan ; Ruimao Zhang ; Hongzhi Zhang ; Qingfu Zhang ; Wangmeng Zuo CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To address this issue, this paper proposes a novel convolution neural network-based crowd counting method, termed Perspective-guided Fractional-Dilation Network (PFDNet).
2, TITLE: Tensor Methods in Computer Vision and Deep Learning AUTHORS: YANNIS PANAGAKIS et al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: This article provides an in-depth and practical review of tensors and tensor methods in the context of representation learning and deep learning, with a particular focus on visual data analysis and computer vision applications.
3, TITLE: Feature Pyramid Network for Multi-task Affective Analysis AUTHORS: Ruian He ; Zhen Xing ; Bo Yan CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We propose a novel model named feature pyramid networks for multi-task affect analysis.
4, TITLE: Comparing ML Based Segmentation Models on Jet Fire Radiation Zone AUTHORS: CARMINA PÉREZ-GUERRERO et al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: One such characterization would be the segmentation of different radiation zones within the flame, so this paper presents an exploratory research regarding several traditional computer vision and Deep Learning segmentation approaches to solve this specific problem.
5, TITLE: An Embedded Iris Recognition System Optimization Using Dynamically Reconfigurable Decoder with LDPC Codes AUTHORS: Longyu Ma ; Chiu-Wing Sham ; Chun Yan Lo ; Xinchao Zhong CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, the proposed design includes a minimal set of computer vision modules and multi-mode QC-LDPC decoder which can alleviate variability and noise caused by iris acquisition and follow-up process.
6, TITLE: Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs AUTHORS: Yikang Zhang ; Zhuo Chen ; Zhao Zhong CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: In this paper, we propose a Collaboration of Experts (CoE) framework to pool together the expertise of multiple networks towards a common aim.
7, TITLE: $S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks AUTHORS: XINLIN LI et al. CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: To address these issues, we propose $S^3$ low-bit re-parameterization, a novel technique for training low-bit shift networks.
8, TITLE: An Audiovisual and Contextual Approach for Categorical and Continuous Emotion Recognition In-the-wild AUTHORS: Panagiotis Antoniadis ; Ioannis Pikoulis ; Panagiotis P. Filntisis ; Petros Maragos CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work we tackle the task of video-based audio-visual emotion recognition, within the premises of the 2nd Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW).
9, TITLE: A Dataset and Method for Hallux Valgus Angle Estimation Based on Deep Learing AUTHORS: Ningyuan Xu ; Jiayan Zhuang ; Yaojun Wu ; Jiangjian Xiao CATEGORY: cs.CV [cs.CV, cs.AI, I.4.7; I.2.10; I.5.1] HIGHLIGHT: However, it lack of dataset and the keypoints based method which made a great success in pose estimation is not suitable for this field. To solve the problems, we made a dataset and developed an algorithm based on deep learning and linear regression.
10, TITLE: Automated Object Behavioral Feature Extraction for Potential Risk Analysis Based on Video Sensor AUTHORS: Byeongjoon Noh ; Wonjun Noh ; David Lee ; Hwasoo Yeo CATEGORY: cs.CV [cs.CV, cs.CY] HIGHLIGHT: In this paper, we propose an automated and simpler system for effectively extracting object behavioral features from video sensors deployed on the road.
11, TITLE: Causal Affect Prediction Model Using A Facial Image Sequence AUTHORS: Geesung Oh ; Euiseok Jeong ; Sejoon Lim CATEGORY: cs.CV [cs.CV, cs.HC] HIGHLIGHT: In this paper, we propose the causal affect prediction network (CAPNet), which uses only past facial images to predict corresponding affective valence and arousal.
12, TITLE: Instance-Level Relative Saliency Ranking with Graph Reasoning AUTHORS: Nian Liu ; Long Li ; Wangbo Zhao ; Junwei Han ; Ling Shao CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we investigate a practical problem setting that requires simultaneously segment salient instances and infer their relative saliency rank order.
13, TITLE: Use of Affective Visual Information for Summarization of Human-Centric Videos AUTHORS: Berkay Köprü ; Engin Erzin CATEGORY: cs.CV [cs.CV, cs.HC] HIGHLIGHT: In this study, we investigate the affective-information enriched supervised video summarization task for human-centric videos.
14, TITLE: NccFlow: Unsupervised Learning of Optical Flow With Non-occlusion from Geometry AUTHORS: Guangming Wang ; Shuaiqi Ren ; Hesheng Wang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: This paper reveals novel geometric laws of optical flow based on the insight and detailed definition of non-occlusion.
15, TITLE: Uncertainty-Aware Camera Pose Estimation from Points and Lines AUTHORS: Alexander Vakhitov ; Luis Ferraz Colomina ; Antonio Agudo ; Francesc Moreno-Noguer CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We propose PnP(L) solvers based on EPnP and DLS for the uncertainty-aware pose estimation.
16, TITLE: Exploiting The Relationship Between Visual and Textual Features in Social Networks for Image Classification with Zero-shot Deep Learning AUTHORS: Luis Lucas ; David Tomas ; Jose Garcia-Rodriguez CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture in multimodal environments (image and text) from social media.
17, TITLE: Technical Report for Valence-Arousal Estimation in ABAW2 Challenge AUTHORS: Hong-Xia Xie ; I-Hsuan Li ; Ling Lo ; Hong-Han Shuai ; Wen-Huang Cheng CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we describe our method for tackling the valence-arousal estimation challenge from ABAW2 ICCV-2021 Competition.
18, TITLE: Uncertainty-aware Human Motion Prediction AUTHORS: Pengxiang Ding ; Jianqin Yin CATEGORY: cs.CV [cs.CV] HIGHLIGHT: Hence, we propose an uncertainty-aware framework for human motion prediction (UA-HMP).
19, TITLE: Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation Under Zero-Shot Pedestrian Identity Setting AUTHORS: Jian Jia ; Houjing Huang ; Xiaotang Chen ; Kaiqi Huang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: Thus, we propose two datasets, PETA$_{ZS}$ and RAP$_{ZS}$, constructed following the zero-shot settings on pedestrian identity.
20, TITLE: Weight Reparametrization for Budget-Aware Network Pruning AUTHORS: Robin Dupont ; Hichem Sahbi ; Guillaume Michel CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we introduce an "end-to-end" lightweight network design that achieves training and pruning simultaneously without fine-tuning.
21, TITLE: Video 3D Sampling for Self-supervised Representation Learning AUTHORS: Wei Li ; Dezhao Luo ; Bo Fang ; Yu Zhou ; Weiping Wang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we propose a novel self-supervised method for video representation learning, referred to as Video 3D Sampling (V3S).
22, TITLE: SCSS-Net: Superpoint Constrained Semi-supervised Segmentation Network for 3D Indoor Scenes AUTHORS: Shuang Deng ; Qiulei Dong ; Bo Liu CATEGORY: cs.CV [cs.CV] HIGHLIGHT: Specifically, we use the pseudo labels predicted from unlabeled point clouds for self-training, and the superpoints produced by geometry-based and color-based Region Growing algorithms are combined to modify and delete pseudo labels with low confidence.
23, TITLE: Case-based Similar Image Retrieval for Weakly Annotated Large Histopathological Images of Malignant Lymphoma Using Deep Metric Learning AUTHORS: NORIAKI HASHIMOTO et al. CATEGORY: cs.CV [cs.CV, H.3.3; I.2.1; J.3] HIGHLIGHT: In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma.
24, TITLE: Investigate The Essence of Long-Tailed Recognition from A Unified Perspective AUTHORS: Lei Liu ; Li Liu CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we systematically investigate the essence of the long-tailed problem from a unified perspective.
25, TITLE: Multi-frame Collaboration for Effective Endoscopic Video Polyp Detection Via Spatial-Temporal Feature Transformation AUTHORS: Lingyun Wu ; Zhiqiang Hu ; Yuanfeng Ji ; Ping Luo ; Shaoting Zhang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we present Spatial-Temporal Feature Transformation (STFT), a multi-frame collaborative framework to address these issues.
26, TITLE: EEG-ConvTransformer for Single-Trial EEG Based Visual Stimuli Classification AUTHORS: Subhranil Bagchi ; Deepti R. Bathula CATEGORY: cs.CV [cs.CV] HIGHLIGHT: This work introduces an EEG-ConvTransformer network that is based on multi-headed self-attention.
27, TITLE: Grid Partitioned Attention: Efficient Transformer Approximation with Inductive Bias for High Resolution Detail Generation AUTHORS: Nikolay Jetchev ; Gökhan Yildirim ; Christian Bracher ; Roland Vollgraf CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: We present Grid Partitioned Attention (GPA), a new approximate attention algorithm that leverages a sparse inductive bias for higher computational and memory efficiency in image domains: queries attend only to few keys, spatially close queries attend to close keys due to correlations.
28, TITLE: Relation-Based Associative Joint Location for Human Pose Estimation in Videos AUTHORS: Yonghao Dang ; Jianqin Yin CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, unlike the prior methods, we propose a Relation-based Pose Semantics Transfer Network (RPSTN) to locate joints associatively.
29, TITLE: Complete Scanning Application Using OpenCv AUTHORS: Ayushe Gangal ; Peeyush Kumar ; Sunita Kumari CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: In the following paper, we have combined the various basic functionalities provided by the NumPy library and OpenCv library, which is an open source for Computer Vision applications, like conversion of colored images to grayscale, calculating threshold, finding contours and using those contour points to take perspective transform of the image inputted by the user, using Python version 3.7.
30, TITLE: Multi-Modality Task Cascade for 3D Object Detection AUTHORS: Jinhyung Park ; Xinshuo Weng ; Yunze Man ; Kris Kitani CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.RO] HIGHLIGHT: To provide a more integrated approach, we propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions, which are then used to further refine the 3D boxes.
31, TITLE: Task Fingerprinting for Meta Learning in Biomedical Image Analysis AUTHORS: Patrick Godau ; Lena Maier-Hein CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we address the problem of quantifying task similarity with a concept that we refer to as task fingerprinting.
32, TITLE: Prior Aided Streaming Network for Multi-task Affective Recognition at The 2nd ABAW2 Competition AUTHORS: WEI ZHANG et al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we introduce our submission to the 2nd Affective Behavior Analysis in-the-wild (ABAW2) Competition.
33, TITLE: Optimizing Data Processing in Space for Object Detection in Satellite Imagery AUTHORS: Martina Lofqvist ; José Cano CATEGORY: cs.CV [cs.CV, cs.DC, cs.LG, eess.IV] HIGHLIGHT: In this work, we investigate the performance of CNN-based object detectors on constrained devices by applying different image compression techniques to satellite data.
34, TITLE: Adiabatic Quantum Graph Matching with Permutation Matrix Constraints AUTHORS: Marcel Seelbach Benkner ; Vladislav Golyanik ; Christian Theobalt ; Michael Moeller CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we address such problems with emerging quantum computing technology and propose several reformulations of QAPs as unconstrained problems suitable for efficient execution on quantum hardware.
35, TITLE: TGHop: An Explainable, Efficient and Lightweight Method for Texture Generation AUTHORS: Xuejing Lei ; Ganning Zhao ; Kaitai Zhang ; C. -C. Jay Kuo CATEGORY: cs.CV [cs.CV] HIGHLIGHT: An explainable, efficient and lightweight method for texture generation, called TGHop (an acronym of Texture Generation PixelHop), is proposed in this work.
36, TITLE: Image Resolution Susceptibility of Face Recognition Models AUTHORS: Martin Knoche ; Stefan Hörmann ; Gerhard Rigoll CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: To tackle this problem, we propose the following two methods: 1) Train a state-of-the-art face-recognition model straightforward with $50\%$ low-resolution images directly within each batch.
37, TITLE: Staying in Shape: Learning Invariant Shape Representations Using Contrastive Learning AUTHORS: Jeffrey Gu ; Serena Yeung CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To produce representations that are specifically isometry and almost-isometry invariant, we propose new data augmentations that randomly sample these transformations.
38, TITLE: Malware Classification Using Deep Boosted Learning AUTHORS: Muhammad Asam ; Saddam Hussain Khan ; Tauseef Jamal ; Umme Zahoora ; Asifullah Khan CATEGORY: cs.CR [cs.CR, cs.CV, cs.LG] HIGHLIGHT: This work proposes a novel deep boosted hybrid learning-based malware classification framework and named as Deep boosted Feature Space-based Malware classification (DFS-MC).
39, TITLE: Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers AUTHORS: Ruihan Yang ; Minghao Zhang ; Nicklas Hansen ; Huazhe Xu ; Xiaolong Wang CATEGORY: cs.LG [cs.LG, cs.CV, cs.RO] HIGHLIGHT: In this paper, we introduce LocoTransformer, an end-to-end RL method for quadrupedal locomotion that leverages a Transformer-based model for fusing proprioceptive states and visual observations.
40, TITLE: Active Safety Envelopes Using Light Curtains with Probabilistic Guarantees AUTHORS: Siddharth Ancha ; Gaurav Pathak ; Srinivasa G. Narasimhan ; David Held CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO] HIGHLIGHT: We evaluate our approach in a simulated urban driving environment and a real-world environment with moving pedestrians using a light curtain device and show that we can estimate safety envelopes efficiently and effectively.
41, TITLE: RMA: Rapid Motor Adaptation for Legged Robots AUTHORS: Ashish Kumar ; Zipeng Fu ; Deepak Pathak ; Jitendra Malik CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO] HIGHLIGHT: This paper presents Rapid Motor Adaptation (RMA) algorithm to solve this problem of real-time online adaptation in quadruped robots.
42, TITLE: CamTuner: Reinforcement-Learning Based System for Camera Parameter Tuning to Enhance Analytics AUTHORS: SIBENDU PAUL et al. CATEGORY: cs.LG [cs.LG, cs.CV] HIGHLIGHT: We propose CamTuner, which is a system to automatically, and dynamically adapt the complex sensor to changing environments.
43, TITLE: LanguageRefer: Spatial-Language Model for 3D Visual Grounding AUTHORS: Junha Roh ; Karthik Desingh ; Ali Farhadi ; Dieter Fox CATEGORY: cs.RO [cs.RO, cs.CL, cs.CV] HIGHLIGHT: In this paper, we develop a spatial-language model for a 3D visual grounding problem.
44, TITLE: 4D Attention: Comprehensive Framework for Spatio-Temporal Gaze Mapping AUTHORS: Shuji Oishi ; Kenji Koide ; Masashi Yokozuka ; Atsuhiko Banno CATEGORY: cs.RO [cs.RO, cs.CV] HIGHLIGHT: This study presents a framework for capturing human attention in the spatio-temporal domain using eye-tracking glasses.
45, TITLE: 3D Neural Scene Representations for Visuomotor Control AUTHORS: Yunzhu Li ; Shuang Li ; Vincent Sitzmann ; Pulkit Agrawal ; Antonio Torralba CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG] HIGHLIGHT: In this work, we desire to learn models for dynamic 3D scenes purely from 2D visual observations.
46, TITLE: Modality Completion Via Gaussian Process Prior Variational Autoencoders for Multi-Modal Glioma Segmentation AUTHORS: Mohammad Hamghalam ; Alejandro F. Frangi ; Baiying Lei ; Amber L. Simpson CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG] HIGHLIGHT: In this paper, we propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
47, TITLE: Elastic Deformation of Optical Coherence Tomography Images of Diabetic Macular Edema for Deep-learning Models Training: How Far to Go? AUTHORS: DANIEL BAR-DAVID et al. CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG] HIGHLIGHT: Elastic Deformation of Optical Coherence Tomography Images of Diabetic Macular Edema for Deep-learning Models Training: How Far to Go?
48, TITLE: Regional Differential Information Entropy for Super-Resolution Image Quality Assessment AUTHORS: Ningyuan Xu ; Jiayan Zhuang ; Jiangjian Xiao ; Chengbin Peng CATEGORY: eess.IV [eess.IV, cs.CV, I.4.3; I.4.4] HIGHLIGHT: To solve the problem, we proposed a method called regional differential information entropy to measure both of the similarities and perceptual quality.
49, TITLE: Joint Motion Correction and Super Resolution for Cardiac Segmentation Via Latent Optimisation AUTHORS: SHUO WANG et al. CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: Here we propose a novel latent optimisation framework that jointly performs motion correction and super resolution for cardiac image segmentations.
50, TITLE: Deep Learning Based Image Retrieval in The JPEG Compressed Domain AUTHORS: Shrikant Temburwar ; Bulla Rajesh ; Mohammed Javed CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: Here, we propose a unified model for image retrieval which takes DCT coefficients as input and efficiently extracts global and local features directly in the JPEG compressed domain for accurate image retrieval.
51, TITLE: A Hybrid Deep Learning Framework for Covid-19 Detection Via 3D Chest CT Images AUTHORS: Shuang Liang CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: In this paper, we present a hybrid deep learning framework named CTNet which combines convolutional neural network and transformer together for the detection of COVID-19 via 3D chest CT images.
52, TITLE: Label-set Loss Functions for Partial Supervision: Application to Fetal Brain 3D MRI Parcellation AUTHORS: LUCAS FIDON et al. CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG] HIGHLIGHT: In this paper, we propose the first axiomatic definition of label-set loss functions that are the loss functions that can handle partially segmented images.
53, TITLE: Atlas-Based Segmentation of Intracochlear Anatomy in Metal Artifact Affected CT Images of The Ear with Co-trained Deep Neural Networks AUTHORS: JIANING WANG et al. CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: We propose an atlas-based method to segment the intracochlear anatomy (ICA) in the post-implantation CT (Post-CT) images of cochlear implant (CI) recipients that preserves the point-to-point correspondence between the meshes in the atlas and the segmented volumes.