About
filmglot
Filmglot helps Mandarin learners choose films and shows by vocabulary difficulty, then pre-study the words that are likely to interrupt watching.
How It Works
Each title is scored from its subtitle vocabulary. Filmglot segments Chinese text, normalizes traditional and simplified forms, resolves words through open dictionaries and HSK lists, then ranks titles and word lists by approximate learner difficulty.
The difficulty score is a practical study aid, not an official proficiency claim. Chengyu and literary phrases are weighted more heavily because they tend to be the words that break comprehension during a scene.
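As a rough illustration of the scoring idea described above, here is a minimal Python sketch. It is not Filmglot's actual implementation: the word data, the weight values, and the function names (`word_weight`, `title_difficulty`, `rank_titles`) are all hypothetical, and the averaging rule is an assumption made for the example. It also assumes the subtitle text has already been segmented into words and converted to simplified characters upstream.

```python
from __future__ import annotations

# Hypothetical sample data. A real build would load HSK lists and a
# chengyu list from the sources named under References.
HSK_LEVEL = {"我": 1, "喜欢": 1, "电影": 1, "复杂": 4}
CHENGYU = {"一见钟情", "莫名其妙"}

# Assumed weights, chosen only for illustration.
UNLISTED_WEIGHT = 7.0   # words that appear in no HSK list
CHENGYU_WEIGHT = 10.0   # chengyu count more than any HSK level


def word_weight(word: str) -> float:
    """Return an approximate difficulty weight for one word."""
    if word in CHENGYU:
        return CHENGYU_WEIGHT
    return float(HSK_LEVEL.get(word, UNLISTED_WEIGHT))


def title_difficulty(tokens: list[str]) -> float:
    """Average the weights of the distinct words in a title's subtitles."""
    vocab = set(tokens)
    if not vocab:
        return 0.0
    return sum(word_weight(w) for w in vocab) / len(vocab)


def rank_titles(titles: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Sort titles from easiest to hardest by their difficulty score."""
    scored = [(name, title_difficulty(tokens)) for name, tokens in titles.items()]
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    sample = {
        "Title A": ["我", "喜欢", "电影"],
        "Title B": ["我", "一见钟情", "复杂"],
    }
    for name, score in rank_titles(sample):
        print(name, score)
```

In this sketch a single chengyu is enough to move "Title B" well above "Title A", which mirrors the heavier weighting described above. Scoring distinct words rather than every occurrence is one possible design choice; a frequency-aware score would be another.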
References
- CC-CEDICT via MDBG for Chinese-English dictionary entries, pinyin, and definitions.
- krmanik/HSK-3.0 for HSK 3.0 vocabulary, character, and grammar reference data.
- pwxcoo/chinese-xinhua for chengyu, xiehouyu, word, and character data.
- wordfreq for broad word-frequency signals.
- nodejieba and jieba for Chinese word segmentation.
- opencc-js and OpenCC for traditional/simplified Chinese conversion.
- TMDB for title metadata, posters, and cast information. Filmglot uses the TMDB API but is not endorsed or certified by TMDB.
- OMDb API for IMDb and Rotten Tomatoes rating metadata where available.
Contact
Recommendations, corrections, and source suggestions are welcome at [email protected].