Massive Data
1.1K movies, 60K trailers, metadata for 375K movies and more, all freely available at MovieNet.
Rich Annotation
Various annotations of actors, scene and cinematic styles are provided.
Diverse Tasks
Supports several vision and language tasks, from low-level to high-level.
Open-Source Toolbox
Easy-to-use toolbox designed for experiments and data processing.

Explore MovieNet

Data in MovieNet

Movie
There are 1100 movies in MovieNet. The movies were selected to cover a wide range of years, countries and genres.
Trailer
There are 60K trailers from 33K unique movies in MovieNet. The URLs of the trailers are provided.
Photo
There are 3.9M photos of 7 types in MovieNet. We provide all the photos as well as the processed results.
Text
There are 3 main types of texts in MovieNet, namely Subtitle, Synopsis and Script.
Meta Data
MovieNet contains metadata for 375K movies. Each record is stored in a JSON file.
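Because each metadata record is stored as a JSON file, reading one needs only the Python standard library. The following is a minimal sketch, not the official loader: the function names and the field names `title` and `genres` are hypothetical placeholders and are not taken from the MovieNet schema.

```python
import json
from pathlib import Path


def load_meta(path):
    """Read one per-movie metadata JSON file and return it as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def summarize(meta, keys=("title", "genres")):
    """Pick a few fields, tolerating records where a field is missing.

    The key names are placeholders; substitute the keys found in the real files.
    """
    return {key: meta.get(key) for key in keys}
```

Using `dict.get` keeps the summary robust when a record lacks one of the requested fields, which is common in crawled metadata.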

Annotation in MovieNet

Character BBox & ID
Characters play an important role in movie analysis. We annotate 1.1M instances of 3087 unique credited cast members.
Scene Segmentation
We annotate scene boundaries to support research on scene segmentation, resulting in 42K scenes.
Action & Place Tags
We annotate 42K segments with 19.6K place tags and 45K action tags. There are 80 action classes and 90 place classes.
Shot Cinematic Style
We annotate two commonly used kinds of cinematic tags for each shot: shot scale and shot movement.
Movie & Synopsis
To support the movie segment retrieval task, we manually associate movie segments and synopsis paragraphs.
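To make the relationship between the scene boundaries and the scenes mentioned under Scene Segmentation concrete, the sketch below groups consecutive shots into scenes. It assumes a boundary is given as the index of the last shot of a scene. That representation and the function name are illustrative assumptions, not the format of the MovieNet annotation files.

```python
def shots_to_scenes(num_shots, boundaries):
    """Group shots 0..num_shots-1 into scenes.

    `boundaries` holds the indices of shots after which a new scene starts
    (an assumed representation). Returns a list of (first_shot, last_shot)
    pairs, both inclusive.
    """
    scenes = []
    start = 0
    for b in sorted(set(boundaries)):
        if not 0 <= b < num_shots - 1:
            continue  # skip boundaries outside the movie or after the last shot
        scenes.append((start, b))
        start = b + 1
    if num_shots > 0:
        scenes.append((start, num_shots - 1))  # the final scene runs to the last shot
    return scenes
```

For example, `shots_to_scenes(6, [1, 3])` returns `[(0, 1), (2, 3), (4, 5)]`: two boundaries split six shots into three scenes.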

Benchmarks

Detection & Identification
Scene Segmentation
Scene Tag Prediction
Cinematic Style Prediction
Movie Synopsis Association
Easy-to-Use Toolbox
We provide the following easy-to-use toolboxes and codebases for accessing, processing and managing the dataset, extracting features and conducting experiments.
movienet-tools (View on GitHub)
A basic toolbox for movie understanding research. movienet-tools provides:
1. Crawlers for movie-related websites, for example IMDb.
2. Useful tools for video/image/audio preprocessing, e.g. a shot detector.
3. Multimodal feature extractors (e.g. an action feature extractor) for experiment preparation.
movienet-core (View on GitHub)
The Python package movienet-core contains multiple modules for processing the dataset, including:
1. Core representation and parsers for data and annotation in MovieNet.
2. Python extension utilities for movie analysis.
movienet-host (View on GitHub)
The Python package movienet-host is a flexible MovieNet data manager that helps you deploy a MovieNet data server remotely or locally.
Tutorial and Documentation
To help anyone get started quickly with research on MovieNet, we provide detailed tutorials and documentation, including:
1. Wiki: fundamental knowledge of movie analysis and details about MovieNet.
2. Documentation for codebases such as movienet-tools, movienet, etc.

Download MovieNet

Basic Download
Download all the data (excluding movies), annotations and pretrained features through OpenDataLab. You will need to register an account and download the data under the User Service Agreement and Privacy Policy.
Specifically, it contains the following:
1. Annotation (Last updated on 02/08/2020, size: 53MB, after unzip: 881MB)
2. Meta (Last updated on 06/08/2020, size: 537MB, after unzip: 2.3GB)
3. Movie per-shot keyframes (240P) (Last updated on 29/08/2020, size: 161GB, after unzip: 161GB)
4. Movie List (1100) (Last updated on 29/08/2020, size: 10KB)
5. Movie1K train/val/test split (Last updated on 29/08/2020, size: 30KB)
6. Poster4M image meta information (Last updated on 02/09/2020, size: 1.3GB; the images can be downloaded from the URLs provided in the JSON file)
7. Subtitles (815 files) from movie1K (Last updated on 05/09/2020, size: 29.9MB, after unzip: 84.4MB)
8. Script (479 files) from movie1K (Last updated on 05/09/2020, size: 27.9MB, after unzip: 101.8MB)
9. Shot detection result for movie1K (Last updated on 05/09/2020, size: 20.9MB, after unzip: 50.7MB)
10. Audio feature for movie1K (Last updated on 06/09/2020, size: 89.7GB)
11. Place feature for movie1K (Last updated on 06/09/2020, size: 11GB)
12. Video info for movie1K movies, including fps, frame_count, etc. (Last updated on 09/09/2020, size: 118KB)
13. Trailer 30K URLs (Last updated on 17/10/2020, size: 2.155MB)
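Item 12 lists per-movie video info such as fps and frame_count. As a hedged illustration of how these two values are typically used, the sketch below converts a frame index to a timestamp and derives a movie's duration. The dictionary layout and function names are assumptions; only the names fps and frame_count come from the list above.

```python
def frame_to_timestamp(frame_idx, fps):
    """Convert a zero-based frame index to an H:MM:SS.mmm string."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    total_ms = round(frame_idx * 1000 / fps)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def duration_seconds(info):
    """Duration in seconds, given a dict with 'fps' and 'frame_count' keys."""
    return info["frame_count"] / info["fps"]
```

A conversion like this is useful when aligning shot keyframes or annotations, which are indexed by frame, with subtitles, which are indexed by time.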
Download Movies
Because of copyright restrictions, we plan to release the movies only under very strict conditions. If you would like to download the movies, please see the instructions and apply (coming soon).
FAQ: When do you plan to release the movies?
We've prepared the agreements for users to sign. The lawyers have polished the agreement. Currently we are waiting for approval from the university (CUHK) legal team.
Extra Download
If you would like to download the data corresponding to a particular MovieNet-related paper (e.g. Cast In Movies Dataset, CVPR18), please find the download option on the project page of the paper (see the MovieNet Projects section).

MovieNet Projects

Holistic Movie Understanding Dataset

MovieNet: A Holistic Dataset for Movie Understanding
Qingqiu Huang*, Yu Xiong*, Anyi Rao, Jiaze Wang, Dahua Lin
European Conference on Computer Vision (ECCV), 2020 (Spotlight)

Movie Cinematic Style Analysis

A Unified Framework for Shot Type Classification Based on Subject Centric Lens
Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin
European Conference on Computer Vision (ECCV), 2020

Unsupervised Face Recognition

Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation
Qingqiu Huang, Lei Yang, Huaiyi Huang, Tong Wu, Dahua Lin
European Conference on Computer Vision (ECCV), 2020

Online Person Search

Online Multi-modal Person Search in Videos
Jiangyue Xia, Anyi Rao, Qingqiu Huang, Linning Xu, Jiangtao Wen, Dahua Lin
European Conference on Computer Vision (ECCV), 2020

Movie Scene Temporal Segmentation

A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin
Computer Vision and Pattern Recognition (CVPR), 2020
Also appeared at the LUV 2020 Workshop (15-min talk) and the Sight and Sound 2020 Workshop (5-min talk)

Movie Synopsis Association

A Graph-based Framework to Bridge Movies and Synopses
Yu Xiong, Qingqiu Huang, Lingfeng Guo, Hang Zhou, Bolei Zhou, Dahua Lin
International Conference on Computer Vision (ICCV), 2019 (Oral)

Person Search

Person Search in Videos with One Portrait Through Visual and Temporal Links
Qingqiu Huang, Wentao Liu, Dahua Lin
European Conference on Computer Vision (ECCV), 2018

Person Recognition

Unifying Identification and Context Learning for Person Recognition
Qingqiu Huang, Yu Xiong, Dahua Lin
Computer Vision and Pattern Recognition (CVPR), 2018

Trailer Analysis

From Trailers to Storylines: An Efficient Way to Learn from Movies
Qingqiu Huang, Yuanjun Xiong, Yu Xiong, Yuqi Zhang, Dahua Lin
arXiv:1806.05341, 2018

News

Jul. 2020 movienet-tools v0.0.1 is released.
Jul. 2020 MovieNet Dataset v0.1 is released.

Contact Us

Huang Qingqiu (黄青虬): hqqasw@gmail.com
Xiong Yu (熊宇): xiongyuxy@gmail.com
Rao Anyi (饶安逸): anyirao@link.cuhk.edu.hk
We have found that some e-mails sent to ie.cuhk.edu.hk addresses are identified as spam and deleted directly by our department mail system. If we do not reply to your e-mail, please contact movienet.mmlab@gmail.com, since an e-mail sent to the addresses above may never reach us and we receive no notification when that happens.
BibTeX for MovieNet: A Holistic Dataset for Movie Understanding
@inproceedings{huang2020movienet,
	title={MovieNet: A Holistic Dataset for Movie Understanding},
	author={Huang, Qingqiu and Xiong, Yu and Rao, Anyi and Wang, Jiaze and Lin, Dahua},
	booktitle = {The European Conference on Computer Vision (ECCV)}, 
	year={2020}
}
BibTeX for A Unified Framework for Shot Type Classification Based on Subject Centric Lens
@inproceedings{rao2020unified,
	title={A Unified Framework for Shot Type Classification Based on Subject Centric Lens},
	author={Rao, Anyi and Wang, Jiaze and Xu, Linning and Jiang, Xuekun and Huang, Qingqiu and Zhou, Bolei and Lin, Dahua},
	booktitle = {The European Conference on Computer Vision (ECCV)}, 
	year={2020}
}
BibTeX for Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation
@inproceedings{huang2020caption,
	title={Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation},
	author={Huang, Qingqiu and Yang, Lei and Huang, Huaiyi and Wu, Tong and Lin, Dahua},
	booktitle = {The European Conference on Computer Vision (ECCV)}, 
	year={2020}
}
BibTeX for Online Multi-modal Person Search in Videos
@inproceedings{xia2020online,
	title={Online Multi-modal Person Search in Videos},
	author={Xia, Jiangyue and Rao, Anyi and Xu, Linning and Huang, Qingqiu and Wen, Jiangtao and Lin, Dahua},
	booktitle = {The European Conference on Computer Vision (ECCV)}, 
	year={2020}
}
BibTeX for A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
@inproceedings{rao2020local,
	title={A Local-to-Global Approach to Multi-modal Movie Scene Segmentation},
	author={Rao, Anyi and Xu, Linning and Xiong, Yu and Xu, Guodong and Huang, Qingqiu and Zhou, Bolei and Lin, Dahua},
	booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
	pages={10146--10155},
	year={2020}
}
BibTeX for A Graph-based Framework to Bridge Movies and Synopses
@InProceedings{Xiong_2019_ICCV,
	author = {Xiong, Yu and Huang, Qingqiu and Guo, Lingfeng and Zhou, Hang and Zhou, Bolei and Lin, Dahua},
	title = {A Graph-Based Framework to Bridge Movies and Synopses},
	booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
	month = {October},
	year = {2019}
}
BibTeX for Person Search in Videos with One Portrait Through Visual and Temporal Links
@inproceedings{huang2018person,
	title={Person Search in Videos with One Portrait Through Visual and Temporal Links},
	author={Huang, Qingqiu and Liu, Wentao and Lin, Dahua},
	booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
	pages={425--441},
	year={2018}
}
BibTeX for Unifying Identification and Context Learning for Person Recognition
@InProceedings{Huang_2018_CVPR,
	author = {Huang, Qingqiu and Xiong, Yu and Lin, Dahua},
	title = {Unifying Identification and Context Learning for Person Recognition},
	booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
	month = {June},
	year = {2018}
}
BibTeX for From Trailers to Storylines: An Efficient Way to Learn from Movies
@article{huang2018trailers,
	title={From Trailers to Storylines: An Efficient Way to Learn from Movies},
	author={Huang, Qingqiu and Xiong, Yuanjun and Xiong, Yu and Zhang, Yuqi and Lin, Dahua},
	journal={arXiv preprint arXiv:1806.05341},
	year={2018}
}