Email address: twangnh[dot]ai[at]gmail[dot]com
Other links: Google Scholar, GitHub
(2025/05) Congrats to my master's student Zhengqin Zang on obtaining a Ph.D. scholarship at Zhejiang University (ZJU)!
(2025/03) Received support from the Emei Talent Program (Tianfu Emei Plan, formerly the Sichuan Province Thousand Talents Plan)!
(2024/11) Congrats to my first master's student Chenyu Lin on obtaining a Ph.D. scholarship at Hong Kong Baptist University!
(2024/10) 3rd-tier award at the Jittor Competition, with a 10,000 RMB prize; congrats to Chenyu and the other team members!
(2024/05) One paper is accepted by IJCV!
(2024/02) One paper is accepted by TNNLS!
(2024/12) Received support from the Overseas Talent Program (Ministry of Education Special Program for Overseas Talent Recruitment)!
(2023/12) One paper is accepted by AAAI'24! It is our top-entry method submitted to the VisDrone2023 Zero-shot Aerial Object Detection challenge.
(2023/10) 3rd place at the Visual Continual Learning Object Detection Challenge at ICCV 2023, congratulations to Chenyu.
(2023/08) One paper is accepted by TNNLS!
(2023/03) PnP-DETR is integrated into detrex, a new open-source codebase for detection transformers!
(2023/02) One paper is accepted by CVPR'23!
(2023/02) Our object detector distillation technique is implemented in YOLOv5!
(2022/11) CondHead is available on arXiv!
(2022/09) MvP is integrated into XRMoCap, a new open-source PyTorch-based codebase for the use of multi-view motion capture, from OpenXRLab
(2022/03) One paper is accepted by CVPR'22 Oral
(2022/03) T2T-ViT is included in Most Influential ICCV Papers by Paper Digest (ranked 3rd in ICCV 2021)
(2022/02) Offered Research Scientist Internship at Facebook AI Research (FAIR).
(2022/01) One paper is accepted by TIP'22.
(2021/09) One paper is accepted by NeurIPS'21.
(2021/07) Two papers accepted by ICCV'21.
(2020/03) Internship at Sea AI Lab
(2020/10) We are the best grand challenge winner at ACM MM 2020!
(2020/07) One paper accepted by ECCV'20.
(2020/06) We won 1st place in the ACM MM grand challenge Human in Events, Track 4.
(2020/06) We won 2nd place in the ACM MM grand challenge Human in Events, Track 2.
(2020/02) Three papers accepted by CVPR'20, two as Oral
(2020/01) Internship at Yitu Tech
(2019/11) Invited talk at ICCV 2019 to present our winning solution on LVIS, glad to meet Ross Girshick!
(2019/10) We won 1st place in the LVIS challenge!
(2019/08) Our object detector distillation technology is integrated into product development at Huawei SG.
(2019/04) Two papers accepted by CVPR'19
Chenyu Lin 2022.09-now Master
Zhengqin Zang 2022.09-now Master
Yifan Wang 2023.09-now Master
Yusheng He 2024.09-now Ph.D.
Xingyu Wang 2024.09-now Master
Jieyu Liu 2024.09-now Master
Learning Box Regression and Mask Segmentation under Long-tailed Distribution with Gradient Transfusing (CRAT)
Tao Wang, Li Yuan, Jiashi Feng and Xinchao Wang
We study how box regression and mask segmentation are affected by long-tailed distribution and propose CRAT, which is guided by Fisher to augment tail class training during back-propagation.
Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax
Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng
CVPR 2020 Oral, [Paper][Code][Video]
Widely adopted by LVIS challenge 2020 and 2021 top-entries
We propose a specifically re-designed softmax classification module which further improves over SimCal on long-tail object detection and instance segmentation.
The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation (SimCal)
Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, and Jiashi Feng
ECCV 2020, [Paper][Code][Video]
Based on our LVIS winner solution, we further extend it and improve the performance by discovering a better initialization strategy.
Joint COCO and Mapillary Workshop at ICCV 2019: LVIS Challenge Track: Classification Calibration for Long-tail Instance Segmentation
Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng
Winning solution for the 1st LVIS challenge at ICCV 2019 [Tech Report]
Zero-Shot Aerial Object Detection with Visual Description Regularization (DescReg)
Zhengqing Zang, Chenyu Lin, Chenwei Tang, Tao Wang†(Corresponding Author), Jiancheng Lv
We identify the challenge of weak semantic-visual correlation in zero-shot aerial object detection and propose a visual description regularization method to improve zero-shot detection.
Learning to Detect and Segment for Open Vocabulary Object Detection (CondHead)
Tao Wang, Nan Li
CVPR 2023, [Paper][Code]
CondHead conditions the bounding box regression and mask segmentation on the text embeddings, to facilitate open vocabulary object detection.
PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision
Kehong Gong*, Bingbing Li*, Jianfeng Zhang*, Tao Wang*, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang (* equal contribution)
CVPR 2022 Oral, [Paper][Code][Video]
We construct an effective self-supervised framework for 3D human pose estimation that self-improves by generating physically plausible 2D-3D training pose data.
Revisiting Knowledge Distillation via Label Smoothing Regularization
Li Yuan, Francis EH Tay, Guilin Li, Tao Wang, Jiashi Feng
CVPR 2020 Oral, [Paper][Code][Video]
We reveal that knowledge distillation (KD) works as a learned label smoothing regularization, and further propose a novel Teacher-free Knowledge Distillation (Tf-KD) framework.
Distilling Object Detectors with Fine-grained Feature Imitation
Tao Wang, Li Yuan, Xiaopeng Zhang, Jiashi Feng
Highly cited work on knowledge distillation for object detection models.
We develop a knowledge distillation (KD) framework for object detection, based on feature-level imitation of the estimated foreground object regions.
Few-shot Adaptive Faster R-CNN
Tao Wang, Xiaopeng Zhang, Li Yuan, Jiashi Feng
CVPR 2019, [Paper][Code]
SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations
Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng
We reveal the usefulness of neighboring mask predictions and introduce a simple and efficient neighbor aggregation method to improve dense instance segmentation models.
Detect Multi-person with 3D Pose Directly from Multi-view images (Multi-view Pose Transformer, MvP)
Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng
NeurIPS 2021, [Paper][Code][Industrial Recognition by XRMoCap][Video][Slides]
We develop a simple transformer algorithm that directly detects multi-person and predicts their 3D pose from multi-view images.
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li Yuan*, Yunpeng Chen, Tao Wang*, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E.H. Tay, Jiashi Feng, Shuicheng Yan (Work done during internship at Yitu)
ICCV 2021, [Paper][Code][Video][Most Influential ICCV Papers]
We introduce a Tokens-to-Token (T2T) transformation scheme that progressively structurizes the image into tokens by recursively aggregating neighboring tokens.