Select a topic to present in class. Your choice must be approved by the instructor.
- Video Analysis
  - The DeepFake Detection Challenge Dataset (link)
  - CLEVRER: Collision Events For Video Representation And Reasoning (link)
- Semantic Video Understanding
  - Action recognition with spatial-temporal discriminative filter banks (link)
  - Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications (link)
- Tracking
  - Combining detection and tracking for human pose estimation in videos (link)
  - Multiple Object Tracking with Siamese Track-RCNN (link)
- Efficient Learning and Inference
  - X3D: Expanding Architectures for Efficient Video Recognition (link)
  - A Multigrid Method for Efficiently Training Video Models (link)
- Multimodal
  - Audiovisual SlowFast Networks for Video Recognition (link)
  - Music Gesture for Visual Sound Separation (link)
  - VideoBERT: A Joint Model for Video and Language Representation Learning (link)
- Lip Reading
  - Deep Lip Reading: A comparison of models and an online application (link)
  - Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis (link)
  - Lip Reading Sentences in the Wild (link)
  - LipNet: End-to-End Sentence-level Lipreading (link)
- VR
  - MEgATrack: Monochrome Egocentric Articulated Hand-Tracking for Virtual Reality (link)
  - The Eyes Have It: An Integrated Eye and Face Model for Photorealistic Facial Animation (link)