Downloads | Wenhao Chai

Featured

Codebases

SAMURAI

Visual object tracking.

Aurora

Efficient multimodal large language model.

StableVideo

Diffusion-based video editing.

MovieChat

LMM for long-form video understanding.

Featured

Datasets

VDC

The first benchmark for detailed video captioning, featuring over one thousand videos with significantly longer and more detailed captions.

MovieChat

A manually labeled long-video QA and captioning dataset containing 1,000 videos, each longer than ten thousand frames.

VFD-2000

A video fight detection dataset collected from YouTube, containing 2,000 video clips in diverse scenarios.

Featured

Surveys

Deep vision multimodal learning: Methodology, benchmark, and trend

Applied Sciences

Deep Learning Methods for Small Molecule Drug Discovery: A Survey

IEEE Transactions on Artificial Intelligence

A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision

arXiv Preprint

Awesome-list: Vector Quantized Variational Autoencoder (VQ-VAE)

GitHub Repo

Featured

Templates

arXiv Template

Overleaf

Download

Curriculum Vitae (CV) Template

Overleaf

Download

Project Page Template

GitHub

Download
