From Video to Virtual: Object-centric 3D scene understanding from videos

Virtual: https://events.vtools.ieee.org/m/468355

The growing demand for immersive, interactive experiences has underscored the importance of 3D data in understanding our surroundings. Traditional methods for capturing 3D data are often complex and equipment-intensive. In contrast, my research aims to use unconstrained videos, such as those from augmented reality glasses, to effortlessly capture scenes and objects in their full 3D complexity. As a first step, I will describe a method for incorporating epipolar geometry priors into multi-view Transformer models, enabling objects to be identified across extreme pose variations. Next, I will discuss my work "Contrastive Lift" on 3D object segmentation using 2D pre-trained foundation models, and I will conclude by addressing the same problem using language.

Speaker(s): Yash Bhalgat

Agenda:
- Invited talk by Yash Bhalgat, a final-year PhD student at the University of Oxford's Visual Geometry Group (VGG), supervised by Andrew Zisserman, Andrea Vedaldi, Joao Henriques and Iro Laina.
- Q/A Session