Papers, code and datasets about deep learning and multi-modal learning for video analysis
Updated Oct 10, 2021
Generic PyTorch dataset implementation to load and augment videos for deep learning training loops.
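As a minimal sketch of what such a video dataset class typically looks like (the class name, `decode_fn` hook, and sampling helper are illustrative, not taken from the repository above): a map-style dataset only needs `__len__` and `__getitem__`, which is exactly the protocol `torch.utils.data.Dataset` and `DataLoader` rely on. `torch` itself is omitted here so the sketch stays self-contained; in real use you would subclass `torch.utils.data.Dataset` and return stacked tensors.

```python
def uniform_frame_indices(num_frames, num_samples):
    """Pick `num_samples` frame indices spread evenly over a clip."""
    if num_samples >= num_frames:
        return list(range(num_frames))
    step = num_frames / num_samples
    # centre of each of the `num_samples` equal segments
    return [int(step * i + step / 2) for i in range(num_samples)]


class VideoClipDataset:
    """Map-style dataset over (video_path, label) pairs (sketch).

    Implements the __len__/__getitem__ protocol that
    torch.utils.data.DataLoader expects. `decode_fn` is any callable
    returning the decoded frames for a path (e.g. a cv2- or
    decord-based reader in a real implementation).
    """

    def __init__(self, samples, decode_fn, clip_len=8, transform=None):
        self.samples = samples        # list of (path, label) pairs
        self.decode_fn = decode_fn    # path -> list of frames
        self.clip_len = clip_len      # frames per returned clip
        self.transform = transform    # optional augmentation hook

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        frames = self.decode_fn(path)
        indices = uniform_frame_indices(len(frames), self.clip_len)
        clip = [frames[i] for i in indices]
        if self.transform is not None:
            clip = self.transform(clip)
        return clip, label
```

With a real decoder plugged in, an instance of this class can be passed straight to a `DataLoader`, since the loader only calls `__len__` and `__getitem__`.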
A dataset of 500,000 multimodal short videos, with baseline models (TensorFlow 2.0).
Tools for loading video datasets and applying transforms to videos in PyTorch. Video files can be loaded directly, without preprocessing.
Summary of Video-to-Text datasets. This repository accompanies the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*.
Surveillance Perspective Human Action Recognition Dataset: 7,759 videos from 14 action classes, aggregated from multiple sources, all cropped spatio-temporally and filmed from a surveillance-camera-like position.
Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)
Official project page of AVCAffe - AAAI 2023
Trailers12k: Improving Transfer Learning with a Dual Image and Video Transformer for Multi-label Movie Trailer Genre Classification
Code for extracting images and masks from a video segmentation dataset using the OpenCV library in Python.
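As a sketch of that kind of extraction (function and file names here are illustrative, not taken from the repository above), the usual pattern is a `cv2.VideoCapture` read loop that writes each frame, plus a per-frame mask, to numbered image files. A simple grayscale threshold stands in for whatever annotation a real segmentation dataset provides; `cv2` is imported lazily so the pure filename helper can be reused without OpenCV installed.

```python
import os


def frame_filename(out_dir, index, prefix="frame", ext=".png"):
    """Zero-padded, sortable filename for the index-th extracted image."""
    return os.path.join(out_dir, f"{prefix}_{index:06d}{ext}")


def extract_frames_and_masks(video_path, out_dir, mask_threshold=127):
    """Read a video with OpenCV and save each frame plus a binary mask.

    The mask is a plain grayscale threshold here (a stand-in for real
    segmentation labels). Returns the number of frames written.
    """
    import cv2  # imported lazily; only needed when actually extracting

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream or decode error
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, mask_threshold, 255, cv2.THRESH_BINARY)
        cv2.imwrite(frame_filename(out_dir, index, "frame"), frame)
        cv2.imwrite(frame_filename(out_dir, index, "mask"), mask)
        index += 1
    cap.release()
    return index
```

Zero-padding the index keeps the extracted files in frame order under a plain lexicographic sort, which most dataset loaders assume.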
LIVE-YT-HFR Video Quality Assessment Database
Dataset repository of "MetaVD: A Meta Video Dataset for enhancing human action recognition datasets."
Synthetically Generated Surveillance Perspective Human Action Recognition Dataset: 6,901 videos from 10 action classes, generated with a 3D simulation, all cropped spatio-temporally and filmed from a surveillance-camera-like position.