Tao Wang

Publications

Learning to Zoom and Detect Aerial Objects

Tao Wang, Chenyu Lin, Zhengqing Zang, Chenwei Tang, Ji-Zhe Zhou, Li Yuan, Jian Zhao, Jiancheng Lv

NeurIPS 2024 (under review)

We design a novel zoom-in-and-detect framework that applies adaptive, non-uniform zooming to the input images before detecting objects, which significantly improves small object detection.

Zero-Shot Aerial Object Detection with Visual Description Regularization (DescReg)

Zhengqing Zang, Chenyu Lin, Chenwei Tang, Tao Wang† (Corresponding Author), Jiancheng Lv

AAAI 2024

We identify the challenge of weak semantic-visual correlation in zero-shot aerial object detection and propose a visual description regularization method to improve zero-shot detection.

Learning Box Regression and Mask Segmentation under Long-tailed Distribution with Gradient Transfusing (CRAT)

Tao Wang, Li Yuan, Jiashi Feng, Xinchao Wang

IJCV 2024

We study how box regression and mask segmentation are affected by a long-tailed distribution and propose CRAT, which is guided by Fisher to augment tail-class training during back-propagation.

Learning to Detect and Segment for Open Vocabulary Object Detection (CondHead)

Tao Wang, Nan Li

CVPR 2023

CondHead conditions bounding box regression and mask segmentation on text embeddings to facilitate open vocabulary object detection.

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

Kehong Gong*, Bingbing Li*, Jianfeng Zhang*, Tao Wang*, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang (* equal contribution)

CVPR 2022 Oral, [Paper][Code][Video]

We construct an effective self-supervised framework for 3D human pose estimation that improves itself by generating physically plausible 2D-3D training pose data.

SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations

Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

TIP 2022, [Paper][Code]

We reveal the usefulness of neighboring mask predictions and introduce a simple and efficient neighbor aggregation method to improve dense instance segmentation models.

Learnable Central Similarity Quantization for Efficient Image and Video Retrieval

Li Yuan, Tao Wang, Xiaopeng Zhang, Francis EH Tay, Zequn Jie, Yonghong Tian, Wei Liu, Jiashi Feng

TNNLS 2022, [Paper][Code]

We propose a novel concept “Hash Center” to formulate the central similarity for deep hash learning.

Detect Multi-person with 3D Pose Directly from Multi-view images (Multi-view Pose Transformer, MvP)

Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng

NeurIPS 2021, [Paper][Code][XRMoCap implementation][Video][Slides]

We develop a simple transformer algorithm that directly detects multiple people and predicts their 3D poses from multi-view images.

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E.H. Tay, Jiashi Feng, Shuicheng Yan

ICCV 2021, [Paper][Code][Video]

We introduce a Tokens-to-Token (T2T) transformation scheme that progressively structures the image into tokens by recursively aggregating neighboring tokens.

PnP-DETR: Towards Efficient Visual Analysis with Transformers

Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

ICCV 2021, [Paper][Code][Video]

We propose Poll and Pool sampling to reduce the spatial redundancy of image features for efficient transformer processing.

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng

CVPR 2020 Oral, [Paper][Code][Video]

We propose a specially redesigned softmax classification module that further improves over SimCal on long-tail object detection and instance segmentation.

The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation

Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, and Jiashi Feng

ECCV 2020, [Paper][Code][Video]

We show that inaccurate classification is the main obstacle in long-tail instance segmentation and propose a simple calibration framework, SimCal, to address it.

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

Li Yuan, Yichen Zhou, Shuning Chang, Yupeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan

ACM Multimedia 2020, [Paper]

We focus on improving spatio-temporal action recognition by fully utilizing the information of scenes and collecting new data.

Classification Calibration for Long-tail Instance Segmentation, Joint COCO and Mapillary Workshop at ICCV 2019: LVIS Challenge Track

Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

ICCV 2019 Workshop, [Paper]

Winning solution for the first LVIS Challenge at ICCV 2019.

Revisiting Knowledge Distillation via Label Smoothing Regularization

Li Yuan, Francis EH Tay, Guilin Li, Tao Wang, Jiashi Feng

CVPR 2020 Oral, [Paper][Code][Video]

We reveal that knowledge distillation (KD) works as a learned label smoothing regularization, and further propose a novel Teacher-free Knowledge Distillation (Tf-KD) framework.

Central Similarity Quantization for Efficient Image and Video Retrieval

Li Yuan, Tao Wang, Xiaopeng Zhang, Francis EH Tay, Zequn Jie, Wei Liu, Jiashi Feng

CVPR 2020, [Paper][Code][Video]

We propose a novel concept “Hash Center” to formulate the central similarity for deep hash learning.

Distilling Object Detectors with Fine-grained Feature Imitation

Tao Wang, Li Yuan, Xiaopeng Zhang, Jiashi Feng

CVPR 2019, [Paper][Code]

We develop a knowledge distillation framework for object detection, based on feature-level imitation of the estimated foreground object regions.

Few-shot Adaptive Faster R-CNN

Tao Wang, Xiaopeng Zhang, Li Yuan, Jiashi Feng

CVPR 2019, [Paper][Code]

We adapt a Faster R-CNN detector to a new domain using only a few target-domain examples.