Bootcamp Keynote Session on Vision Transformers

This live session features a deep dive into recent advances in computer vision: from Convolutional Nets to Spectral Transformers, with Dr. Vijay Srinivas Agneeswaran, Senior Director and ML Research Leader at Microsoft.

This session sheds light on the latest in computer vision, vision transformers, and their role in Large Language Models (LLMs).

Towards the end, our host Mudit Srivastava from Pathway talks with Vijay as he answers curated questions from fellow learners who joined live.

What you'll learn from this session

  • The transition from traditional convolutional networks to pre-trained transformers in computer vision.

  • The synergy between these advanced transformers and LLMs leading to enhanced image classification and other tasks.

  • Insight into Scattering Vision Transformers (SVT), detailing their development, technical aspects, and performance.

  • Demonstration of SVT's leading performance in tasks like image classification (ImageNet dataset) and instance segmentation (MS COCO dataset).

More about the keynote speaker

Dr. Vijay Srinivas Agneeswaran, Sr. Director and ML Research Leader at Microsoft, brings over two decades of expertise in AI, machine learning, and data science. He holds a Ph.D. from IIT Madras and completed postdoctoral research at EPFL; his specializations include computer vision, efficient transformers, and large language models. At Microsoft, he has led pioneering research in AI for C+AI data and developed spectral transformers for computer vision, showcased at NeurIPS 2023. He is a champion of responsible AI, ensuring compliance for nearly 50 AI models, and has led teams at organizations such as Walmart Global Tech, Oracle, and Cognizant. Dr. Agneeswaran also holds five US patents and is a prolific contributor to tech conferences and publications.
