Looking at People: The past, the present and the future
Tutorial at CVPR 2012, Providence, Rhode Island, USA, 2012
|Organizers:||Thomas B. Moeslund (email@example.com), Aalborg University, Denmark|
|Leonid Sigal (firstname.lastname@example.org), Disney Research, Pittsburgh, USA|
|Adrian Hilton (A.Hilton@surrey.ac.uk), University of Surrey, UK|
|Volker Krüger (email@example.com), Aalborg University, Denmark|
|Instructors:||Aaron Bobick, Georgia Tech, USA|
|Amit Roy Chowdhury, UC Riverside, USA|
|Jeffrey Cohn, CMU, USA|
|Rogerio Feris, IBM T.J. Watson Research Center, New York|
|David Fleet, University of Toronto, Canada|
|Shaogang Gong, Queen Mary University, UK|
|Raghuraman Gopalan (on behalf of Rama Chellappa), AT&T Labs-Research, USA|
|Haowei Liu, Intel Santa Clara|
|Deva Ramanan, UC Irvine, USA|
|Fernando De la Torre, CMU, USA|
|Mohan Trivedi, UC San Diego, USA|
|Time:||June 21st, 2012|
|Duration:||Full-day (~8 hours)|
This tutorial builds on the book Visual Analysis of Humans: Looking at People, published by Springer in 2012. The book is a collection of chapters written by the top experts in the field; the organizers of the tutorial are also the editors of the upcoming book. The list of contributing authors and content of the book can be found here. The book is intended to serve the dual purpose of being a reference and a tutorial for people entering the field.

Because this tutorial is an extension of this idea, it will similarly consist of a series of talks by experts in the corresponding fields. The tutorial will be broken down into 4 parts: (1) detection and tracking, (2) articulated pose estimation and tracking, (3) activity recognition, and (4) applications. Each part will have 2-3 invited lecturers. Each invited lecturer will give a talk of roughly 35 minutes on a focused subject within the larger context of looking at people. The lectures will be geared towards a general computer vision audience and will outline the key advances and future challenges in the problems involved. The rough schedule, the list of proposed invited lecturers, and the topics covered are listed below.
- [8:40 - 8:50] Introduction, motivation and welcome remarks by the organizers
- [8:50 - 10:00] Detection and tracking
coffee break (30 minutes)
- [10:30 - 11:40] Articulated pose estimation and tracking
- [11:40 - 12:15] Activity recognition
- On human action (by Aaron Bobick)
lunch (1h 35min)
- [1:50 - 3:00] Activity recognition
coffee break (30 minutes)
- [3:30 - 4:40] Applications
Recently, Dr. Bobick has also explored the development of interactive environments where advanced sensing modalities provide input based upon the users' actions and, hopefully, intentions. The intriguing element of interactive environments is that the context of the situation can be exploited in the interpretation of the user's behavior. An example of such an environment is the KidsRoom, the world's first interactive narrative play-space for children. The room employed large-scale video and sound to take the children through a fantasy story; all the sensing was accomplished using computer vision. A more current and ambitious project is the Aware Home Research Initiative. The goal of that effort is to impart sufficient perception and interface capabilities to a house such that it can enhance the quality of life of the inhabitants. A domestic setting provides a wealth of contextual information that will be needed to assist in understanding the activities of the people within.
David J. Fleet received his PhD in Computer Science from the University of Toronto in 1991. He was on faculty at Queen's University in Kingston from 1991 to 1998, and then Area Manager and Research Scientist at the Palo Alto Research Center (PARC) from 1999 to 2003. In 2004 he joined the University of Toronto as Professor of Computer Science. His research interests include computer vision, image processing, visual perception, and visual neuroscience. He has published research articles, book chapters and one book on various topics including the estimation of optical flow and stereoscopic disparity, probabilistic methods in motion analysis, modeling appearance in image sequences, motion perception and human stereopsis, hand tracking, human pose tracking, latent variable models, and physics-based models for human motion analysis. In 1996 Dr. Fleet was awarded an Alfred P. Sloan Research Fellowship for his work on computational models of perception. He has won paper awards at ICCV 1999, CVPR 2001, UIST 2003, and BMVC 2009. In 2010 he was awarded the Koenderink Prize for his work with Michael Black and Hedvig Sidenbladh on human pose tracking. He has served as Area Chair for numerous computer vision and machine learning conferences. He was Program Co-Chair for the 2003 IEEE Conference on Computer Vision and Pattern Recognition and will be Program Co-Chair for the 2014 European Conference on Computer Vision. He has been Associate Editor and Associate Editor-in-Chief for IEEE TPAMI, and currently serves on the TPAMI Advisory Board.
Raghuraman Gopalan is a senior member of technical staff at AT&T Labs-Research. He received his Ph.D. in Electrical and Computer Engineering from the University of Maryland, College Park in 2011. His research interests are in computer vision and machine learning, with a specific focus on object recognition and video understanding problems.
Haowei Liu is a research engineer in the Perceptual Computing Group at Intel, Santa Clara. He received his PhD degree from the University of Washington in June 2011. During his PhD studies he interned at major research organizations, including Intel Lab Seattle and IBM T.J. Watson Research Center. Prior to his PhD studies, he was a software design engineer at Microsoft. He holds an MS and a BS in Computer Science from the University of California, San Diego and National Taiwan University, respectively.
Deva Ramanan is an assistant professor of Computer Science and the co-director of the Computational Vision Lab at the University of California at Irvine. Prior to joining UCI, he was a Research Assistant Professor at the Toyota Technological Institute at Chicago (2005-2007). He also held visiting researcher positions in the Robotics Institute at Carnegie Mellon University in 2006 and at Microsoft Research in 2008. He received his B.S. degree with distinction in computer engineering from the University of Delaware in 2000, graduating summa cum laude. He received his Ph.D. in Electrical Engineering and Computer Science with a Designated Emphasis in Communication, Computation, and Statistics from UC Berkeley in 2005. His research interests span computer vision, machine learning, and computer graphics, with a focus on the application of understanding people through images and video. His past work focused on articulated tracking, while recent work has focused on object recognition. His work in this area won or received special recognition at the PASCAL Visual Object Classes Challenge, 2007-2010, including a Lifetime Achievement Prize in 2010. His work on contextual object modeling won the 2009 David Marr Prize. He was awarded an NSF CAREER Award in 2010. His work is supported by NSF, ONR, and DARPA, as well as industrial collaborations with the Intel Science and Technology Center for Visual Computing, Google Research, and Microsoft Research. He serves on the editorial board of the International Journal of Computer Vision (IJCV), is a senior program committee member for the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and has served on multiple NSF panels for computer vision and machine learning.
Mohan Trivedi received his PhD in Electrical Engineering from Utah State University in 1979, after completing undergraduate work in India. At Utah State, he received a Graduate Research Scholarship, and went on to teach at .... He has published extensively and has edited over a dozen volumes including books, special issues, video presentations, and conference proceedings. Trivedi is a recipient of the Pioneer Award and the Meritorious Service Award from the IEEE Computer Society, and of the Distinguished Alumnus Award from Utah State University. He is a Fellow of the International Society for Optical Engineering (SPIE). He is a founding member of the Executive Committee of the UC System-wide Digital Media Innovation Program (DiMI). Trivedi is also Editor-in-Chief of Machine Vision & Applications.
Adrian Hilton is Professor of Computer Vision and Graphics and Head of the Visual Media Research Group at the University of Surrey, UK. Over the past decade he has published over 100 refereed journal and international conference research articles on robust computer vision techniques for building models of real-world objects from images to meet the requirements of the entertainment and communication industries. His scientific contributions have been recognized by two journal and one conference best paper awards. Innovations from this research, which led to the first commercial hand-held 3D scanner and the first system for capturing animated models of people, have been recognized through two EU IST Awards for Innovation, a DTI Manufacturing Industry Achievement Award and a Computer Graphics World Innovation Award. He currently serves as an area editor for the journal Computer Vision and Image Understanding, on the EPSRC Peer Review College for UK funding applications, and on the Executive of the IEE Professional Network in Multimedia Communications. He is a Chartered Engineer and a member of IEE, IEEE and ACM.