TABLE OF CONTENTS

Sunday, July 25, 2021

Oral 1-1 Detection and Segmentation

O1-1-1 Boosting Semi-Supervised Anomaly Detection via Contrasting Synthetic Images [paper]
Sheng-Feng Yu (Macronix International Co., Ltd.)*; Wei-Chen Chiu (National Chiao Tung University) [supplementary]
O1-1-2 Crack Segmentation for Low-Resolution Images using Joint Learning with Super-Resolution [paper]
Yuki Kondo (TTI-J)*; Norimichi Ukita (TTI-J)
O1-1-3 Distant Bird Detection for Safe Drone Flight and Its Dataset [paper]
Sanae Fujii (Toyota Technological Institute); Kazutoshi Akita (TTI-J)*; Norimichi Ukita (TTI-J)
O1-1-4 Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU [paper]
Napat Wanchaitanawong (Tokyo Institute of Technology)*; Masayuki Tanaka (Tokyo Institute of Technology); Takashi Shibata (NTT Corporation); Masatoshi Okutomi (Tokyo Institute of Technology)

Oral 1-2 Relationship Modeling

O1-2-1 Human-Object Interaction Detection with Missing Objects [paper]
Kaen Kogashi (Kyoto University)*; Yang Wu (Kyoto University); Shohei Nobuhara (Kyoto University); Ko Nishino (Kyoto University) [supplementary]
O1-2-2 Group Activity Recognition Using Joint Learning of Individual Action Recognition and People Grouping [paper]
Chihiro Nakatani (TTI-J)*; Kohei Sendo (TTI-J); Norimichi Ukita (TTI-J)
O1-2-3 Saliency based Subject Selection for Diverse Image Captioning [paper]
An Quoc Luong (The Graduate University for Advanced Studies, SOKENDAI)*; Minh-Duc Vo (The University of Tokyo); Akihiro Sugimoto (NII)
O1-2-4 Semantic Hierarchy Preserving Deep Hashing for Large-scale Image Retrieval [paper]
Ming Zhang (City University of Hong Kong)*

Oral 1-3 Action and Event Localization

O1-3-1 Live Video Action Recognition from Unsupervised Action Proposals [paper]
Roberto Javier Lopez-Sastre (University of Alcala)*; Marcos Baptista-Rios (Gradiant); Francisco J. Acevedo-Rodríguez (University of Alcalá); Pilar Martín Martín (Universidad de Alcalá); Saturnino Maldonado-Bascon (Universidad de Alcalá)
O1-3-2 Action Spotting and Temporal Attention Analysis of Events in Soccer Videos [paper]
Hiroaki Minoura (Chubu University)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University); Mitsuru Nakazawa (Rakuten Institute of Technology, Rakuten Group, Inc.); Yeongnam Chae (Rakuten Institute of Technology); Bjorn Stenger (Rakuten Institute of Technology)
O1-3-3 Selecting an Iconic Pose From an Action Video [paper]
Geethu Jacob (Rakuten Institute of Technology); Bjorn Stenger (Rakuten Institute of Technology)*
O1-3-4 Leveraging Frequency Based Salient Spatial Sound Localization to Improve 360° Video Saliency Prediction [paper]
Mert Cokelek (Hacettepe University)*; Nevrez Imamoglu (AIST); Cagri Ozcinar (Samsung); Erkut Erdem (Hacettepe University); Aykut Erdem (Koc University)

Poster 1

P1-1 Machine-learning-based Quality-level-estimation System for Inspecting Steel Microstructures [paper]
Hiromi Nishiura (Hitachi, Ltd.)*; Atsushi Miyamoto (Hitachi, Ltd.); Akira Ito (Hitachi, Ltd.); Shogo Suzuki (Hitachi Metals, Ltd.); Kouhei Fujii (Hitachi Metals, Ltd.); Hiroshi Morifuji (Hitachi Metals, Ltd.); Hiroyuki Takatsuka (Hitachi Metals, Ltd.)
P1-2 Contextual Information based Network with High-Frequency Feature Fusion for High Frame Rate and Ultra-Low Delay Small-Scale Object Detection [paper]
Dongmei Huang (Waseda University)*; Jihan Zhang (Waseda University); Tingting Hu (Waseda University); Ryuji Fuchikami (Panasonic); Takeshi Ikenaga (Waseda University)
P1-3 Position Estimation of Pedestrians in Surveillance Video using Face Detection and Simple Camera Calibration [paper]
Toshio Sato (Waseda University)*; Xin Qi (Waseda University); Keping Yu (Waseda University); Zheng Wen (Waseda University); Yutaka Katsuyama (Waseda University); Takuro Sato (Waseda University)
P1-4 Facial landmark detection transfer learning for a specific user in driver status monitoring systems [paper]
Jaechul Kim (Kyocera Corporation)*; Kensuke Taguchi (Kyocera Corporation); Yusuke Hayashi (Kyocera Corporation); Jungo Miyazaki (Kyocera Corporation); Hironobu Fujiyoshi (Chubu University)
P1-5 FBNet: FeedBack-Recursive CNN for Saliency Detection [paper]
Guanqun Ding (University of Tsukuba)*; Nevrez Imamoglu (AIST); Ali Caglayan (National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan); Masahiro Murakawa (National Institute of Advanced Industrial Science and Technology (AIST)); Ryosuke Nakamura (National Institute of Advanced Industrial Science and Technology)
P1-6 Angular Margin Constrained Loss for Automatic Liver Fibrosis Staging [paper]
Katsuhiro Nakai (Yamaguchi University)*; Xu Qiao (Shandong University); Xian-Hua Han (Yamaguchi University)
P1-7 Attention Mining Branch for Optimizing Attention Map [paper]
Takaaki Iwayoshi (Chubu University)*; Masahiro Mitsuhara (Chubu University); Masayuki Takada (Chubu University); Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University)
P1-8 Critically Compressed Quantized Convolution Neural Network based High Frame Rate and Ultra-Low Delay Fruit External Defects Detection [paper]
Jihan Zhang (Waseda University)*; Dongmei Huang (Waseda University); Tingting Hu (Waseda University); Ryuji Fuchikami (Panasonic); Takeshi Ikenaga (Waseda University)
P1-9 Lossless AI: Toward Guaranteeing Consistency between Inferences Before and After Quantization via Knowledge Distillation [paper]
Tomoyuki Okuno (Panasonic)*; Yohei Nakata (Panasonic); Yasunori Ishii (Panasonic); Sotaro Tsukizawa (Panasonic)
P1-10 Joint Learning of Object Detection and Pose Estimation using Augmented Autoencoder [paper]
Ryota Hayashi (TTI-J); Asei Shimokura (TTI-J)*; Takuya Matsumoto (TTI-J); Norimichi Ukita (TTI-J)
P1-11 Relational Subgraph for Graph-based Path Prediction [paper]
Masaki Miyata (Chubu University)*; Katsutoshi Shiraki (Chubu University); Hiroaki Minoura (Chubu University); Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University)
P1-12 Image Information Assistance Neural Network for VideoPose3D-based Monocular 3D Pose Estimation [paper]
Hao Wang (Waseda University)*; Dingli Luo (Waseda University); Takeshi Ikenaga (Waseda University)
P1-13 Video Summarization With Frame Index Vision Transformer [paper]
Tzu-Chun Hsu (National Chung Hsing University); Yi-Sheng Liao (National Chung Hsing University); Chun-Rong Huang (National Chung Hsing University)*
P1-14 Multi-physical and Temporal Feature Based Self-correcting Approximation Model for Monocular 3D Volleyball Trajectory Analysis [paper]
Jiaxu Dong (Waseda University)*; Xina Cheng (Xidian University); Takeshi Ikenaga (Waseda University)
P1-15 Japanese Sentence Dataset for Lip-reading [paper]
Tatsuya Shirakata (Kyushu Institute of Technology); Takeshi Saitoh (Kyushu Institute of Technology)*

Monday, July 26, 2021

Oral 2-1 Attentive and Structural Prediction

O2-1-1 Augmenting Discriminative Correlation Filters with Stereo Blob Tracking for Long-Term Tracking of Underwater Animals [paper]
Miao Zhang (Stanford University)*; Stephen Rock (Stanford University)
O2-1-2 Predicting Next Local Appearance for Video Anomaly Detection [paper]
Pankaj PRR Roy (École Polytechnique de Montréal)*; Guillaume-Alexandre Bilodeau (Polytechnique Montréal); Lama Seoud (Polytechnique Montreal)
O2-1-3 HMA-Depth: A New Monocular Depth Estimation Model Using Hierarchical Multi-Scale Attention [paper]
Zhaofeng Niu (NAIST)*; Yuichiro Fujimoto (NAIST); Masayuki Kanbara (Nara Institute of Science and Technology); Hirokazu Kato (NAIST)
O2-1-4 Shape-Based Floor Plan Retrieval Using Parse Tree Matching [paper]
Philip Kenneth Lee (Stanford University); Bjorn Stenger (Rakuten Institute of Technology)*

Oral 2-2 Robust and Adaptive Learning

O2-2-1 Estimating Contribution of Training Datasets using Shapley Values in Data-scale for Visual Recognition [paper]
Takayuki Semitsu (Mitsubishi Electric Corporation)*; Mitsuki Nakamura (Mitsubishi Electric Corporation); Shotaro Ishigami (Mitsubishi Electric Corporation); Teng-Yok Lee (Mitsubishi Electric); Toru Aoki (Mitsubishi Electric Corporation); Yoshimi Isu (Mitsubishi Electric Corporation)
O2-2-2 Data Augmentation for Human Motion Prediction [paper]
Takahiro Maeda (TTI-J)*; Norimichi Ukita (TTI-J) [media]
O2-2-3 Content Filtering in Streaming Video Using Domain Adaptation [paper]
Utsav Shah (Rakuten Institute of Technology)*; Muhammad Rasyid Aqmar (Bukalapak); Mitsuru Nakazawa (Rakuten Institute of Technology, Rakuten Group, Inc.); Bjorn Stenger (Rakuten Institute of Technology)
O2-2-4 Occlusion-Robust 3D Hand Pose Estimation from a Single RGB Image [paper]
Asuka Ishii (NEC)*; Gaku Nakano (NEC Corporation); Tetsuo Inoshita (NEC)

Oral 2-3 Physics and Geometric based Modeling

O2-3-1 Information Hiding Using a Coded Aperture as a Key [paper]
Tomoki Minamata (Kagoshima University)*; Shoma Ishida (Kagoshima University); Shingo Takeshita (Kagoshima University); Hiroshi Kawasaki (Kyushu University); Hajime Nagahara (Osaka University); Satoshi Ono (Kagoshima University)
O2-3-2 An Optical Model for Show-through Cancellation in Ancient Document Imaging with Dark and Bright Mounts [paper]
Yuri Ueno (Nara Institute of Science and Technology)*; Kenichiro Tanaka (Ritsumeikan University); Takuya Funatomi (Nara Institute of Science and Technology); Yasuhiro Mukaigawa (NAIST)
O2-3-3 Self-Supervised Deep Fisheye Image Rectification Approach using Coordinate Relations [paper]
Masaki Hosono (Waseda University)*; Edgar Simo-Serra (Waseda University); Tomonari Sonoda (Utagoe Inc.)
O2-3-4 Expandable Spherical Projection and Feature Fusion Methods for Object Detection from Fisheye Images [paper]
Songeun Kim (Kyungpook National University); Soon Yong Park (Kyungpook National University)* [supplementary]

Poster 2

P2-1 Temporal Extension for Encoder-Decoder-based Crowd Counting Approaches [paper]
Thomas Golda (Karlsruhe Institute of Technology)*; Florian Krüger (Fraunhofer Institute for Optronics, System Technologies and Image Exploitation IOSB); Jürgen Beyerer (Fraunhofer IOSB)
P2-2 Model-based Crack Width Estimation using Rectangle Transform [paper]
Christian Benz (Bauhaus-Universität Weimar)*; Volker Rodehorst (Bauhaus-Universität Weimar)
P2-3 A baseline for semi-supervised learning of efficient semantic segmentation models [paper]
Ivan Grubišić (University of Zagreb, Faculty of Electrical Engineering and Computing)*; Marin Oršić (UNIZG-FER); Sinisa Segvic (UniZg-FER) [supplementary]
P2-4 Efficient transfer learning for multi-channel convolutional neural networks [paper]
Aloïs de La Comble (Rakuten)*; Ken Prepin (Rakuten)
P2-5 On the Influence of Viewpoint Change for Metric Learning [paper]
Marco Filax (Chair of Software Engineering, OvGU Magdeburg)*; Frank Ortmeier (Chair of Software Engineering, OvGU Magdeburg)
P2-6 Analysis of Evaluation Metrics with the Distance between Positive Pairs and Negative Pairs in Deep Metric Learning [paper]
Hajime Oi (The University of Tokyo)*; Rei Kawakami (Tokyo Institute of Technology); Takeshi Naemura (The University of Tokyo)
P2-7 Seeing Farther Than Supervision: Self-supervised Depth Completion in Challenging Environments [paper]
Seiya Ito (Aoyama Gakuin University)*; Naoshi Kaneko (Aoyama Gakuin University); Kazuhiko Sumi (Aoyama Gakuin University) [supplementary]
P2-8 Pix2Point: Learning Outdoor 3D Using Sparse Point Clouds and Optimal Transport [paper]
Rémy Leroy (ONERA)*; Pauline Trouvé (ONERA); Frédéric Champagnat (ONERA); Bertrand Le Saux (ESA / Phi-lab); Marcela Carvalho (Upciti) [supplementary]
P2-9 Practical Descattering of Transmissive Inspection Using Slanted Linear Image Sensors [paper]
Takahiro Kushida (Nara Institute of Science and Technology)*; Kenichiro Tanaka (Ritsumeikan University); Takuya Funatomi (Nara Institute of Science and Technology); Komei Tahara (Vienex Corporation); Yukihiro Kagawa (Vienex Corporation); Yasuhiro Mukaigawa (NAIST)
P2-10 Recurrent RLCN-Guided Attention Network for Single Image Deraining [paper]
Yizhou Li (Tokyo Institute of Technology)*; Yusuke Monno (Tokyo Institute of Technology); Masatoshi Okutomi (Tokyo Institute of Technology)
P2-11 AVM Image Quality Enhancement by Synthetic Image Learning for Supervised Deblurring [paper]
Kazutoshi Akita (TTI-J)*; Masayoshi Hayama (TTI-J); Haruya Kyutoku (Toyota Technological Institute); Norimichi Ukita (TTI-J)
P2-12 Shape from shading and polarization constrained by approximate shape [paper]
Wataru Muraoroshi (Hiroshima City University); Daisuke Miyazaki (Hiroshima City University)*
P2-13 Illumination Planning for Measuring Per-Pixel Surface Roughness [paper]
Kota Arieda (Kyushu Institute of Technology); Takahiro Okabe (Kyushu Institute of Technology)*
P2-14 ROT-Harris: A Dynamic Approach to Asynchronous Interest Point Detection [paper]
Shane P Harrigan (Ulster University)*; Sonya Coleman (School of Computing and Intelligent Systems, University of Ulster); Dermot Kerr (Ulster University); Dr. Yogarajah Pratheepan (Ulster University, UK); Zheng Fang (Northeastern University); Chengdong Wu (Northeastern University)
P2-15 Encoding-free Incrementing Hough Transform for High Frame Rate and Ultra-low Delay Straight-line Detection [paper]
Ziwei Dong (Waseda University)*; Tingting Hu (Waseda University); Ryuji Fuchikami (Panasonic); Takeshi Ikenaga (Waseda University)

Tuesday, July 27, 2021

Poster 3

P3-1 Open-set Recognition with Supervised Contrastive Learning [paper]
Yuto Kodama (The University of Tokyo); Yinan Wang (The University of Tokyo)*; Rei Kawakami (Tokyo Institute of Technology); Takeshi Naemura (The University of Tokyo) [supplementary]
P3-2 Learning VAE with Categorical Labels for Generating Conditional Handwritten Characters [paper]
Keita Goto (Tokyo Institute of Technology)*; Nakamasa Inoue (Tokyo Institute of Technology)
P3-3 Understanding the Reason for Misclassification by Generating Counterfactual Images [paper]
Muneaki Suzuki (Meijo University); Yoshitaka Kameya (Meijo University)*; Takuro Kutsuna (DENSO CORPORATION); Naoki Mitsumoto (DENSO CORPORATION)
P3-4 Adversarial Defense Through High Frequency Loss Variational Autoencoder Decoder and Bayesian Update With Collective Voting [paper]
Zhixun He (University of California, Merced)*; Mukesh Singhal (UC Merced)
P3-5 Weakly Supervised Domain Adaptation using Super-pixel labeling for Semantic Segmentation [paper]
Masaki Yamazaki (Honda)*; Xingchao Peng (Boston University); Kuniaki Saito (Boston University); Ping Hu (Boston University); Kate Saenko (Boston University); Yasuhiro Taniguchi (Honda)
P3-6 Output augmentation works well without any domain knowledge [paper]
Shu Eguchi (Fukuoka University)*; Ryo Nakamura (Fukuoka University); Masaru Tanaka (Fukuoka University)
P3-7 Cut and paste curriculum learning with hard negative mining for point-of-sale systems [paper]
Jaechul Kim (Kyocera Corporation)*; Xiaoyan Dai (Kyocera Corporation); Yisan Hsieh (Kyocera Corporation); Hiroki Tanimoto (Kyocera Corporation); Hironobu Fujiyoshi (Chubu University)
P3-8 Synthetically Generating Motion Blur in a Depth Map from Time-of-Flight Sensors [paper]
Bryan D Rodriguez (Southern Methodist University)*; Xinxiang Zhang (Southern Methodist University); Dinesh Rajan (Southern Methodist University)
P3-9 Bi-directional Recurrent MVSNet for High-resolution Multi-view Stereo [paper]
Taku Fujitomi (Aoyama Gakuin University)*; Seiya Ito (Aoyama Gakuin University); Naoshi Kaneko (Aoyama Gakuin University); Kazuhiko Sumi (Aoyama Gakuin University) [supplementary]
P3-10 Video-Based Camera Localization Using Anchor View Detection and Recursive 3D Reconstruction [paper]
Hajime Taira (Tokyo Institute of Technology)*; Koki Onbe (Tokyo Institute of Technology); Naoyuki Miyashita (Olympus R&D.); Masatoshi Okutomi (Tokyo Institute of Technology)
P3-11 Multiple Fisheye Camera Calibration and Stereo Measurement Methods for Uniform Distance Errors throughout Imaging Ranges [paper]
Nobuhiko Wakai (Panasonic Corporation)*; Takeo Azuma (OmniVision Technologies, Inc); Kunio Nobori (Panasonic Corporation) [supplementary] [media]