Code and Datasets
Aurora Series
As an open-source project for efficient large multimodal models, we provide training, evaluation, and deployment codebases along with all the models and data. We are still working on releasing next-generation models with better performance and easier-to-use code.
Featured Datasets
VDC
The first benchmark for detailed video captioning, featuring over one thousand videos with significantly longer and more detailed captions.
MovieChat
A manually labeled long-video QA and captioning dataset containing 1,000 videos, each longer than ten thousand frames.
VFD-2000
A video fight detection dataset collected from YouTube, containing 2,000 video clips in diverse scenarios.