Code | Datasets

Featured Codebases

SAMURAI

Visual object tracking.

Aurora

Efficient multimodal large language model.

StableVideo

Diffusion-based video editing.

MovieChat

LMM for long-form video understanding.


Featured Datasets

VDC

The first benchmark for detailed video captioning, featuring over one thousand videos with significantly longer and more detailed captions.

MovieChat

A manually labeled long-video QA and captioning dataset containing 1,000 videos, each longer than ten thousand frames.

VFD-2000

A video fight detection dataset collected from YouTube, containing 2,000 video clips in diverse scenarios.


Featured Surveys

Deep vision multimodal learning: Methodology, benchmark, and trend

Applied Sciences

Deep Learning Methods for Small Molecule Drug Discovery: A Survey

IEEE Transactions on Artificial Intelligence

A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision

arXiv Preprint

Awesome-list: Vector Quantized Variational Autoencoder (VQ-VAE)

GitHub Repo


Featured Templates

arXiv Template

Overleaf

Download
Curriculum Vitae (CV) Template

Overleaf

Download
Project Page Template

GitHub

Download