The superior performance of natural over artificial
intelligence rests on the ability of the human brain to integrate and
process complex sensory information for useful actions. Future advances
in our understanding of the human brain will need integrating approaches
across disciplines, including psychology, computer science, robotics,
In Bülthoff's department a group of about 70 biologists, computer scientists, mathematicians, physicists and psychologists study cognitive processes including object recognition and categorization, perception and action in virtual environments, human-robot interaction and perception, computer graphics and computer vision. Traditional psychophysical methods emphasize the analysis of perception using simple stimuli; however, computer vision studies have made it clear that further advances in our understanding of perception and cognition will rely on the use of realistic stimuli and tasks.
In our new Cyberneum building we use methods developed from computer graphics and virtual reality to build simulated naturalistic environments under precise experimental control in order to investigate cognition in a closed perception-action loop. In psychophysical studies we have shown that humans can integrate multimodal sensory information in a statistically optimal way, in which cues are weighted according to their reliability.
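Reliability-weighted cue combination can be illustrated with a small sketch. It assumes the standard maximum-likelihood formulation for independent Gaussian cues, in which each cue's weight is proportional to its inverse variance; the function name `integrate_cues` and the example numbers are hypothetical and not taken from the department's studies.

```python
from math import sqrt


def integrate_cues(estimates, sigmas):
    """Combine single-cue estimates, weighting each by its reliability 1/sigma^2.

    Assumes independent Gaussian noise on every cue (an illustrative assumption).
    """
    reliabilities = [1.0 / s ** 2 for s in sigmas]
    total = sum(reliabilities)
    # Each weight is the cue's share of the total reliability.
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    # The combined estimate is at least as reliable as the best single cue.
    combined_sigma = sqrt(1.0 / total)
    return combined, combined_sigma, weights


# Hypothetical example: a visual cue (sigma = 1) and a haptic cue (sigma = 2).
combined, combined_sigma, weights = integrate_cues([10.0, 12.0], [1.0, 2.0])
```

In this toy case the more reliable visual cue receives the larger weight, so the combined estimate lies closer to it, and the combined standard deviation is smaller than that of either cue alone.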
Many of our results from basic research in perception and cognition are further developed into useful applications. Our group leads or participates in several European research projects: myCopter, SUPRA, TANGO, VR-Hyperspace.
Research fields in my department are:
Recognition and Categorization
Ongoing EU projects:
myCopter - Enabling Technologies for Personal Aerial Transportation Systems
SUPRA - Simulation of Upset Recovery in Aviation
TANGO - Emotional interaction grounded in realistic context
VR-HYPERSPACE - research and development leading to a paradigm shift in relation to passenger comfort
Heinrich Bülthoff is a scientific member of the Max Planck Society and director at the Max Planck Institute for Biological Cybernetics in Tübingen. He is head of the Department Human Perception, Cognition and Action, in which a group of about 70 researchers investigates psychophysical and computational aspects of higher-level visual processes in object and face recognition, sensory-motor integration, spatial cognition, and perception and action in virtual environments. He holds a Ph.D. degree in the natural sciences from the Eberhard-Karls-Universität in Tübingen. From 1980 to 1988 he worked as a research scientist at the Max Planck Institute for Biological Cybernetics and the Massachusetts Institute of Technology. He was Assistant, Associate and Full Professor of Cognitive Science at Brown University in Providence from 1988 to 1993 before becoming director at the Max Planck Institute for Biological Cybernetics. He is Honorary Professor at the Eberhard-Karls-Universität (Tübingen) and Korea University (Seoul) and Editor of several international journals. Heinrich Bülthoff is involved in many international collaborations and is a member of several European research networks. He has participated in many projects funded by the European Commission and is currently leading the EU project myCopter.
|•||Bülthoff HH (August-28-2014) Invited Lecture: My home is my airport, International Aviation and Space Symposium (AIR14), Payerne, Switzerland. |
|•||Bülthoff HH , de la Rosa S and Chang D-S (August-2014) Abstract Talk: Action recognition and the semantic meaning of actions: How does the brain categorize different social actions?, 12th Biannual Conference of the German Cognitive Science Society (KogWis 2014), Tübingen, Germany. |
|•||Bülthoff HH (June-18-2014) Invited Lecture: Projekt Mycopter: Die Autos der Zukunft werden fliegen, 13. Zukunftskongress 2014, Wolfsburg, Germany. |
|•||Bülthoff HH and de la Rosa S (May-16-2014) Keynote Lecture: What are you doing? Recent advances in visual action recognition research, 14th Annual Meeting of the Vision Sciences Society (VSS 2014), St. Pete Beach, FL, USA14 9. |
The visual recognition of actions is critical for humans when interacting with their physical and social environment. The unraveling of the underlying processes has sparked wide interest in several fields including computational modeling, neuroscience, and psychology. Recent research endeavors on how people recognize actions provide important insights into the mechanisms underlying action recognition. Moreover, they give new ideas for man-machine interfaces and have implications for artificial intelligence. The aim of the symposium is to provide an integrative view on recent advances in our understanding of the psychological and neural processes underlying action recognition. Speakers will discuss new and related developments in the recognition of mainly object- and human-directed actions from a behavioral, neuroscientific, and modeling perspective. These developments include, among other things, a shift from the investigation of isolated actions to the examination of action recognition under more naturalistic conditions including contextual factors and the human ability to read social intentions from the recognized actions. These findings are complemented by neuroscientific work examining the action representation in motor cortex. Finally, a novel theory of goal-directed actions will be presented that integrates the results from various action recognition experiments. The symposium will first discuss behavioral and neuroscientific aspects of action recognition and then will shift its attention to the modeling of the processes underlying action recognition. More specifically, Nick Barraclough will present research on action recognition using adaptation paradigms and object-directed and locomotive actions. He will talk about the influence of the observer's mental state on action recognition using displays that present the action as naturalistic as possible. Cristina Becchio will talk about actions and their ability to convey social intentions. 
She will present research on the translation of social intentions into kinematic patterns of two interacting persons and discuss the observers' ability to visually use these kinematic cues for inferring social intentions. Stephan de la Rosa will focus on social actions and talk about the influence of social and temporal context on the recognition of social actions. Moreover, he will present research on the visual representation underlying the recognition of social interactions. Ehud Zohary will discuss the representation of actions within the motor pathway using fMRI and the sensitivity of the motor pathway to visual and motor aspects of an action. Martin Giese will wrap up the symposium by presenting a physiologically plausible neural theory for the perception of goal-directed hand actions and discuss this theory in the light of recent physiological findings. The symposium is targeted towards the general VSS audience and provides a comprehensive and integrative view of an essential ability of human visual functioning.
|•||Bülthoff HH , de la Rosa S and Streuber S (May-16-2014) Abstract Talk: The influence of context on the visual recognition of social actions, 14th Annual Meeting of the Vision Sciences Society (VSS 2014), St. Pete Beach, FL, USA, Journal of Vision 14 (10) 1469. |
Actions do not occur out of the blue. Rather, they are often a part of human interactions and are, therefore, embedded in an action sequence. Previous research on visual action recognition has primarily focused on elucidating the perceptual and cognitive mechanisms in the recognition of individual actions. Surprisingly, the social and temporal context in which actions are embedded has received little attention. I will present studies examining the importance of context for action recognition. Specifically, we examined the influence of social context (i.e. competitive vs. cooperative interaction settings) on the observation of actions during real-life interactions and found that social context modulates action observation. Moreover, we investigated the influence of perceptual and temporal factors (i.e. action context as provided by visual information about preceding actions) on action recognition using an adaptation paradigm. Our results provide evidence that experimental effects are modulated by temporal context. These results suggest that action recognition is guided not only by the immediate visual information but also by temporal and social contexts.
|•||Bülthoff HH (March-28-2014) Invited Lecture: From Flying Robots to Flying Cars, Technische Universität Kaiserslautern: Wahrnehmung - Public talk series, Kaiserslautern, Germany. |
|•||Bülthoff HH (December-4-2013) Invited Lecture: Novel Technologies for a Personal Air Transport System, Korea Aerospace Research Institute (KARI), Daejeon, South Korea. |
Our brain is constantly processing a vast amount of sensory and intrinsic information in order to understand and interact with the world around us. In my department at the Max Planck Institute for Biological Cybernetics in Tübingen and also in my research group in the Biological Cybernetics Lab at Korea University we aim to best model human perception and action and to test these models to predict human action for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with the most advanced state of the art motion simulators. I will briefly present our research philosophy of basic research at the Max Planck Institute before presenting a novel framework to overcome the congestion problems with current ground-based transportation. In the myCopter project (www.mycopter.eu) we study together with other European partners the enabling technologies for traveling between homes and working places, and for flying in swarms at low altitude in urban environments. The project focuses on three research areas: human-machine interfaces and training, automation technologies, and social acceptance. Within the project, developments for automation technologies have focused on vision-based algorithms. We have integrated such algorithms in the control and navigation architecture of unmanned aerial vehicles (UAVs). Detecting suitable landing spots from monocular camera images recorded in flight has proven to reliably work off-line, but further work is required to be able to use this approach in real time. Furthermore, we have built multiple low-cost UAVs and equipped them with sensors to test collision avoidance strategies in real flight. Such algorithms are currently under development and will take inspiration from crowd simulations. 
Finally, using technology assessment methodologies, we have assessed potential markets for PAVs and challenges for their integration into the current transportation system. This will lead to structured discussions on expectations and requirements of potential PAV users.
|•||Bülthoff HH (November-11-2013) Keynote Lecture: The Cybernetics of Aerial Machines: From Perception and Action for Aerial Robots to a Transport System based on Personal Aerial Vehicles, 3rd IFAC Symposium on Telematics Applications (TA 2013), Seoul, South Korea. |
Our brain is constantly processing a vast amount of sensory and intrinsic information in order to understand and interact with the world around us. In my department at the Max Planck Institute for Biological Cybernetics in Tübingen and also in my research group in the Biological Cybernetics Lab at Korea University we aim to best model human perception and action and to test these models to predict human action for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with the most advanced state of the art motion simulators. I will present two examples to illustrate our research philosophy, the first in the area of telepresence and the second about the enabling technologies of futuristic transportation systems: (1) An ideal telepresence system should enable the user to perceive and act on the remote environment as if sensed directly. In this context, we study new ways to interface human operators and teams of autonomous remote robots in a shared bilateral control architecture. (2) A novel framework to overcome the congestion problems with current ground-based transportation is a personal air transport system (PATS). In the myCopter project (www.mycopter.eu) we study together with other European partners the enabling technologies for traveling between homes and working places, and for flying in swarms at low altitude in urban environments. All our efforts are guided by the accepted vision that in the future humans and machines will seamlessly cooperate in shared or remote spaces, thus becoming an integral part of our daily life. For instance, robots or vehicles should be able to autonomously reason about their remote environment, i.e., to possess a significant level of autonomy in order to perform local tasks and take decisions.
|•||Bülthoff HH (September-27-2013) Invited Lecture: MYCOPTER: Enabling Technologies for Personal Air-Transport Systems , AirTN Forum: Enabling and promising technologies for achieving the goals of Europe's Vision Flightpath 2050, Cranfield, UK. |
|•||Bülthoff HH (July-5-2013) Keynote Lecture: Wahrnehmen und Handeln aus kybernetischer Sicht: Implikationen für sozio-technische Systeme, Konferenz für Wirtschafts- und Sozialkybernetik 2013, Bern, Switzerland. |
|•||Bülthoff HH (June-29-2013) Invited Lecture: Virtual Reality and Simulation Research in the Max Planck Cyberneum, Workshop on Human Perception in Virtual Environments, York University, Toronto, Canada. |
|•||Bülthoff HH (June-23-2013) Invited Lecture: Und wenn wir einfach zur Arbeit fliegen?, Fachforum auf dem Heliday 2013, Kelheim, Germany. |
|•||Bülthoff HH (June-22-2013) Invited Lecture: Neue Konzepte für Autopiloten durch wahrnehmungsbasierte Flugsimulationen, Rollout des neuen Fama-Jetkopters K209, Giebelstadt, Germany. |
|•||Bülthoff HH (June-20-2013) Invited Lecture: Cyberneum reloaded: Virtual Reality and Simulation Research, Opening presentation for the new Cyberneum building at the Max Planck Campus, Tübingen, Germany. |
|•||Bülthoff HH , de la Rosa S , Curio C , Streuber S and Giese M (May-11-2013) Abstract Talk: Visual adaptation aftereffects to actions are modulated by high-level action interpretations, 13th Annual Meeting of the Vision Sciences Society (VSS 2013), Naples, FL, USA, Journal of Vision 13 (9) 126. |
Action recognition is critical for successful human interaction. Previous research highlighted the importance of the motor system to visual action recognition. Little is known about the visual tuning properties of processes involved in action recognition. Here we examined the visual tuning properties of processes involved in action recognition by means of a behavioral adaptation paradigm. Participants looked at an adaptor image (showing a person hitting or waving) for 4 s and subsequently categorized a briefly presented test image as either hitting or waving. The test images were sampled from a video sequence showing a person moving from a hitting to a waving pose. We found the perception of the ambiguous test image to be significantly biased away from the adapted action (action adaptation aftereffect (AAA)). In subsequent experiments we investigated the origin of the AAA. The contrast inversion and mirror flipping of the adaptor image relative to the test images did not abolish the AAA, suggesting that local contrast-sensitive units are not solely responsible for the AAA. Similarly, the AAA was present when we chose adaptor images that were equated in terms of their emotional content, indicating that the AAA is not merely mediated by units sensitive to the emotional content of an action. Moreover, presenting words (e.g. "hitting" or "waving") instead of images as adaptors led to the disappearance of the AAA, providing evidence that abstract high-level linguistic cues about actions alone did not induce the AAA. Finally, we changed the action interpretation of the adaptors by means of priming, leaving their physical properties unchanged. We found that the priming of the action interpretation of the adaptors modulated the size of the AAA. In summary, these results suggest that mechanisms underlying action recognition are particularly sensitive to the high-level interpretation of an action.
|•||Bülthoff HH (December-12-2012) Invited Lecture: Flying Robots and Flying Cars, Korea Advanced Institute of Science and Technology: Robotics and Simulation Laboratory, Daejeon, South Korea. |
|•||Bülthoff HH , Mulder M and Nieuwenhuizen FM (November-29-2012) Abstract Talk: Changes in Pilot Control Behaviour across Stewart Platform Motion Systems, Autumn Flight Simulation Conference: Flight Simulation Research New Frontiers, London, UK. |
Low-cost motion systems have been proposed for certain training tasks that would otherwise be performed on high-performance full flight simulators. These systems have shorter stroke actuators, lower bandwidth, and higher noise. The influence of these characteristics on pilot perception and control behaviour is unknown, and can be investigated by simulating a model of a simulator with limited capabilities on a high-end simulator. The platform limitations, such as a platform filter, time delay, and simulator noise characteristics, can then be removed one by one and their effect on control behaviour studied in isolation. By applying a cybernetic approach, human behaviour can be measured objectively in target-following disturbance-rejection control tasks. Experimental results show that small changes in time delay and simulator noise characteristics do not negatively affect human behaviour in these tasks. However, the motion system bandwidth has a significant effect on performance and control behaviour. Participants barely use motion cues when these have a low bandwidth, and instead rely on visual cues to generate lead to perform the control task. Therefore, simulator motion cues must be considered carefully in piloted control tasks in simulators and measured results depend on simulator characteristics as pilots adapt their control behaviour to the available cues.
|•||Bülthoff HH (November-26-2012) Invited Lecture: Cognitive Science and its Impact on Future Convergence Technology, Future Convergence Technology Forum & Exhibition 2012, Seoul, South Korea. |
|•||Bülthoff HH (November-6-2012) Invited Lecture: What Computer Vision and Computer Graphics can learn about Faces from Human Psychophysics , ACCV 2012 Workshop on Face Analysis: The Intersection of Computer Vision and Human Perception, Daejeon, South Korea. |
|•||Bülthoff HH , Chuang L and Nieuwenhuizen F (November-2012) Abstract Talk: myCopter: Enabling Technologies for Personal Aerial Transportation Systems: A progress report, 4th International HELI World Conference at the International Aerospace Supply Fair AIRTEC 2012, Frankfurt a.M., Germany. |
The volume of both road and air transportation continues to increase despite many concerns regarding its financial and environmental impact. The European Union ‘Out of the Box’ study suggests a personal aerial transportation system (PATS) as an alternative means of transport for daily commuting. The aim of the myCopter project is to determine the social and technical aspects needed to set up such a transportation system based on personal aerial vehicles (PAVs). The project focuses on three research areas: the human-machine interface and training, automation technologies, and social acceptance. In the first phase of the project, requirements were defined for automation technologies in terms of sensors and test platforms. Additionally, desirable features for PAVs were investigated to support the design and evaluation of technologies for an effective human-machine interface. Furthermore, an overview of the social-technological environment provided insight into the challenges and issues that surround the realisation of a PATS and its integration into the current transportation system in Europe. The presentation will elaborate on the second phase of the myCopter project, in which initial designs for a human-machine interface and training are developed. These are evaluated experimentally with a focus on aiding non-expert pilots in closed-loop control scenarios. Additionally, first evaluations of novel automation technologies are performed in simulated environments and evaluations on flying test platforms. At the same time, technological issues are evaluated that contribute towards a reflexive design of PAV technologies based on criteria that are acceptable to the general public. The presentation will also focus on the next stages of the project, in which further experimental evaluations will be performed on technologies for human-machine interfaces, and where developed automation technologies will be fully tested on unmanned flying vehicles. 
The expectations and perspectives of potential PAV users will be evaluated in group interviews in different European countries. Interesting technological and regulatory challenges need to be resolved for the development of a transportation system based on PAVs. The myCopter consortium combines the expertise from several research fields to tackle these challenges and to develop the technological and social aspects of a personal aerial transportation system.
|•||Bülthoff HH , Mohler BJ and Volkova EP (November-2012) Abstract Talk: Motion Capture of Emotional Body Language in Narrative Scenarios, 13th Conference of the Junior Neuroscientists of Tübingen (NeNA 2012), Schramberg, Germany13 9. |
We interact with the world we live in by moving in it. The interaction is versatile and includes communications through speech and gestures, which serve as media to transmit ideas and emotions. A narrator, be it a professional actor on the stage or a friend telling an anecdote, expresses her ideas (the content) and feelings (the emotional colouring) through the choice of words and syntactical structures, her prosody, facial expressions and body language. Our present focus is on emotional body language, which became a field of intensive research several decades ago. Before psychophysical experiments or trajectory analysis can take place, a set of mocap (motion capture) data has to be accumulated. This can be done with different equipment setups and by now human motion can be captured fairly precisely at a high frame rate. One of the major decisions for the researchers, however, is the choice of scenarios according to which the actors are to perform motion. This question is especially tricky when we deal with emotions, since the problems of sincerity and naturalness come into play. There are several ways to induce emotions and moods in people, but for motion capture the so-called imagination technique has been used most frequently. The actors are asked to evoke an emotion in themselves by recalling a past event. The main drawbacks of this technique in mocap are the following: (1) it is still impossible to ensure that the emotions are sincere and the motion is natural and not artificial or exaggerated; (2) the emotional categories often rapidly succeed each other in random fashion; (3) the emotional scenarios can be very abstract and taken out of context. We have developed an experimental setup where the emotional body language can be captured in a maximally natural yet controlled manner. The participants are asked to imagine they are narrating a fairy-tale to children. They perform several tasks on the text before their acting is recorded. 
The setup allows the actors to narrate the story at their own pace and move freely, and does not require them to learn the text by heart, yet the recorded data can be easily extracted and processed after the motion capture session. The resulting extracted data can then be analysed for various features or used in perceptual experiments.
|•||Bülthoff HH and Nieuwenhuizen F (October-24-2012): myCopter – Enabling Technologies for Personal Aerial Transportation Systems , Joint EU–US Workshop on Small Aircraft and Personal Planes Systems, Brussels, Belgium. |
|•||Bülthoff HH (October-15-2012) Keynote Lecture: A Cybernetics Approach to Perception and Action, IEEE International Conference on Systems, Man, and Cybernetics (SMC 2012), Seoul, South Korea. |
|•||Bülthoff HH (October-14-2012) Invited Lecture: The MPI View on Shared Control, SMC 2012 Workshop on Shared Control, Seoul, South Korea. |
|•||Bülthoff HH (October-10-2012) Invited Lecture: Flying Robots and Flying Cars, College of Information and Communications: Korea University, Seoul, South Korea. |
We all know that our brain is constantly processing a vast amount of sensory and intrinsic information with which our behavior is coordinated accordingly. Interestingly, how the brain actually does it is less well understood. At the Max Planck Institute for Biological Cybernetics in Germany we aim to best model human perception and action and to test these models to predict human action for example in the context of driving and flying. To this end, we use systems and control theory, computer vision, and psychophysical techniques while conducting experiments with the most advanced state of the art motion simulators. In my talk I will present two examples that illustrate our research philosophy: (1) a telepresence scenario with flying robots (quadcopters) in which we study new ways to interface human operators and teams of autonomous remote robots in a shared bilateral control architecture. (2) a futuristic transportation scenario based on a European project (www.mycopter.eu) in which we are studying the enabling technologies for flying between homes and work place in swarms at low altitude. Our efforts are guided by the vision that in the future humans and machines will seamlessly cooperate in shared or remote spaces, and thus robots or flying cars become an integral part of our daily life.
|•||Nolan H, Butler JS , Whelan R, Bülthoff HH , Desanctis P, Reilly O and Foxe J (October-2012) Abstract Talk: High-density electrical mapping during active and passive self-motion , 42nd Annual Meeting of the Society for Neuroscience (Neuroscience 2012), New Orleans, LA, USA42 (828.06) . |
The perception of self-motion is a product of the integration of information from both visual and nonvisual cues, to which the vestibular system is a central contributor. It is well documented that self-motion dysfunction leads to impaired movement and balance, dizziness and falls, and yet our knowledge of the neuronal processing of self-motion signals remains relatively sparse. Here we present two studies extending an emerging line of research trying to obtain electroencephalographic (EEG) recordings while participants engage in real-world tasks. The first study investigated the feasibility of acquiring high-density event-related brain potential (ERP) recordings during treadmill walking. Participants performed a visual response inhibition task - designed to evoke a P3 component for correct response inhibitions and an error-related negativity (ERN) for incorrect commission errors - while speed of walking was experimentally manipulated. Robust P3 and ERN components were obtained under all experimental conditions - while participants were stationary, walking at moderate speed (2.4 km/hour), or walking rapidly (5 km/hour). Signal-to-noise ratios were remarkably similar across conditions, pointing to the feasibility of high-fidelity ERP recordings under relatively vigorous activity regimens. In the second study, high-density electroencephalographic recordings were deployed to investigate the neural processes associated with vestibular detection of changes in heading. Participants were translated linearly 7.8 cm on a motion platform using a one-second motion profile, at a 45° angle leftward or rightward of straight ahead. These headings were presented with a stimulus probability of 80-20%. Participants responded via button-press when they detected the infrequent direction change. Statistical parametric mapping showed that ERPs to standard and target movements differed significantly from 490 to 950 ms post-stimulus. 
Topographic analysis showed that this difference had a typical P3 topography. These studies provide highly promising methods for gaining insight into the neurophysiological correlates of self-motion in more naturalistic environmental settings.
|•||Bülthoff HH , Venrooij J and Nieuwenhuizen FM (September-26-2012): What if we simply fly to work? myCopter – Enabling Technologies for Personal Aerial Transportation Systems, Deutsch-Italienische Handelskammer: Workshop zur Investorengewinnung, Vicenza, Italy. |
|•||Bülthoff HH , Venrooij J and Nieuwenhuizen FM (September-25-2012): What if we simply fly to work? myCopter – Enabling Technologies for Personal Aerial Transportation Systems, Deutsch-Italienische Handelskammer: Workshop zur Investorengewinnung, Torino, Italy. |
|•||Bülthoff HH and Nieuwenhuizen F (September-18-2012) Invited Lecture: „Und wenn wir einfach zur Arbeit fliegen“ – Sind fliegende Autos ein Verkehrsmittel der Zukunft?, 127. Versammlung der Gesellschaft Deutscher Naturforscher und Ärzte e.V. (GDNÄ), Göttingen, Germany. |
A scenario that repeats every morning: traffic jams on the motorways, the main roads of the cities are clogged, and trains and buses are hopelessly overcrowded. Commuter traffic has long since reached its limits, and expanding the existing transport network can offer only limited relief. In many places there is simply not enough space for new roads, and even maintaining the existing ones already costs vast sums. But what are the alternatives? Quite simply: individual transport takes off into the third dimension! Prof. Heinrich Bülthoff of the Max Planck Institute for Biological Cybernetics in Tübingen is pursuing this vision with the EU project "myCopter". The aim is not to build a flying car, but rather to clarify the technical and societal conditions under which such vehicles could become a useful means of transport accepted by society. This would make our way to work more relaxed again, hopefully in the not too distant future. Besides the MPI for Biological Cybernetics, the consortium includes the University of Liverpool, the École Polytechnique in Lausanne, ETH Zürich, the Karlsruhe Institute of Technology and the German Aerospace Center (Deutsches Zentrum für Luft- und Raumfahrt).
|•||Bülthoff HH (September-14-2012) Invited Lecture: What do we read from a face? The role of culture and expertise, 2012 World Class University International Conference (WCU IC), Seoul, South Korea. |
|•||Bülthoff HH , Mohler BJ and Dobricki M (September-7-2012) Abstract Talk: The ownership of a virtual body induced by visuo-tactile stimulation indicates the alteration of self-boundaries, 5th International Conference on Spatial Cognition (ICSC 2012), Roma, Italy, Cognitive Processing 13 (Supplement 1) S18. |
|•||Bülthoff HH , Curio C , Giese M and de la Rosa S (September-2012) Abstract Talk: Motor-visual effects in the recognition of dynamic facial expressions, 35th European Conference on Visual Perception, Alghero, Italy, Perception 41 (ECVP Abstract Supplement) 44. |
Current theories on action understanding suggest a cross-talk between the motor and the visual system during the recognition of other persons' actions. We examined the effect of motor execution on the visual recognition of dynamic emotional facial expressions using an adaptation paradigm. Previous research on facial expression adaptation has shown that the prolonged visual exposure to a static facial expression biases the percept of an ambiguous static facial expression away from the adapted facial expression. We used a dynamic 3D computational face model (Curio et al, 2010, MIT Press, 47-65) to examine motor-visual interactions in the recognition of happy and fearful facial expressions. During the adaptation phase participants (1) looked for a prolonged amount of time at a facial expression (visual adaptation); (2) repeatedly executed a facial expression (motor adaptation); (3) imagined the emotion corresponding to a facial expression (imagine adaptor). In the test phase participants always had to judge an ambiguous facial expression as either happy or fearful. We found an adaptation effect in the visual adaptation condition, and the reversed effect (priming effect) in the motor and imagine conditions. Inconsistent with simple forms of motor resonance, this shows antagonistic influences of visual and motor adaptation.
|•||Bülthoff HH, Schultz J, Kaulard K, de la Rosa S and Fernandez Cruz AL (September-2012) Abstract Talk: How are facial expressions represented in the human brain?, 35th European Conference on Visual Perception, Alghero, Italy, Perception 41 (ECVP Abstract Supplement) 38. |
The dynamic facial expressions that we encounter every day can carry a myriad of social signals. What are the neural mechanisms allowing us to decode these signals? A useful basis for this decoding could be representations in which the facial expressions are set in relation to each other. Here, we compared the behavioral and neural representations of 12 facial expressions presented as pictures and videos. Behavioral representations of these expressions were computed based on the results of a semantic differential task. Neural representations of these expressions were obtained by multivariate pattern analysis of functional magnetic resonance imaging data. The two kinds of representations were compared using correlations. For expression videos, the results show a significant correlation between the behavioral and neural representations in the superior temporal sulcus (STS), the fusiform face area, the occipital face area and the amygdala, all in the left hemisphere. For expression pictures, a significant correlation was found only in the left STS. These results suggest that of all tested regions, the left STS contains the neural representation of facial expressions that is closest to their behavioral representation. This confirms the predominant role of STS in coding changeable aspects of faces, which includes expressions.
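The comparison of behavioral and neural representations "using correlations" can be illustrated with a minimal sketch. The abstract does not state which dissimilarity measure or correlation type was used, so the choices here (1 minus Pearson r between expression patterns, then a Pearson correlation between the two dissimilarity structures) are assumptions, and all function names are hypothetical.

```python
import numpy as np

def dissimilarity_matrix(patterns):
    """Pairwise dissimilarity (1 - Pearson r) between rows.

    `patterns` has one row per facial expression; the columns are
    rating scales (behavioral) or voxels (neural). Assumed layout.
    """
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def representation_similarity(behavioral, neural):
    """Correlate the behavioral and neural dissimilarity structures."""
    b = upper_triangle(dissimilarity_matrix(behavioral))
    n = upper_triangle(dissimilarity_matrix(neural))
    return float(np.corrcoef(b, n)[0, 1])

# Hypothetical data: 12 expressions, 5 rating scales, 40 voxels.
rng = np.random.default_rng(0)
behavioral = rng.normal(size=(12, 5))
neural = rng.normal(size=(12, 40))
similarity = representation_similarity(behavioral, neural)
```

Only the off-diagonal upper triangle is used because each dissimilarity matrix is symmetric with a zero diagonal, so the remaining entries carry no extra information.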
|•||Bülthoff HH, Mohler BJ and Dobricki M (June-22-2012) Abstract Talk: The structure of self-experience during visuo-tactile stimulation of a virtual and the physical body, 13th International Multisensory Research Forum (IMRF 2012), Oxford, UK, Seeing and Perceiving 25 (0) 214. |
The simultaneous visuo-tactile stimulation of an individual’s body and a virtual body (avatar) is an experimental method used to investigate the mechanisms of self-experience. Studies incorporating this method found that it elicits the experience of bodily ownership over the avatar. Moreover, as part of our own research we found that it has also an effect on the experience of agency, spatial presence, as well as on the perception of self-motion, and thus on self-localization. However, it has so far not been investigated whether these effects represent distinct categories within conscious experience. We stroked the back of 21 male participants for three minutes while they watched an avatar getting synchronously stroked within a virtual city in a head-mounted display setup. Subsequently, we assessed their avatar and their spatial presence experience with 23 questionnaire items. The analysis of the responses to all items by means of nonmetric multidimensional scaling resulted in a two-dimensional map (stress=0.151) on which three distinct categories of items could be identified: a cluster (Cronbach’s alpha=.89) consisting of all presence items, a cluster (Cronbach’s alpha=.88) consisting of agency-related items, and a cluster (Cronbach’s alpha=.93) consisting of items related to body ownership as well as self-localization. The reason that spatial presence formed a distinct category could be that body ownership, self-localization and agency are not reported in relation to space. Body ownership and self-localization belonged to the same category which we named identification phenomena. Hence, we propose the following three higher-order categories of self-experience: identification, agency, and spatial presence.
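The internal consistency values reported for the three item clusters are Cronbach's alpha coefficients. As a minimal sketch of the standard formula, alpha = k/(k-1) · (1 − Σ item variances / variance of the summed score), the function below assumes a participants-by-items score matrix; the example data are made up and are not the study's questionnaire responses.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (participants x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    # Sample variance of each item across participants.
    item_variances = scores.var(axis=0, ddof=1)
    # Sample variance of each participant's summed score.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return n_items / (n_items - 1) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical responses: 3 participants, 2 items.
example_alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```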
|•||Bülthoff HH, Robuffo Giordano P, Soyka F and Barnett-Cowan M (June-22-2012) Abstract Talk: Temporal processing of self-motion: Translations are processed slower than rotations, 13th International Multisensory Research Forum (IMRF 2012), Oxford, UK, Seeing and Perceiving 25 (0) 207-208. |
Reaction times (RTs) to purely inertial self-motion stimuli have only infrequently been studied, and comparisons of RTs for translations and rotations, to our knowledge, are nonexistent. We recently proposed a model  which describes direction discrimination thresholds for rotational and translational motions based on the dynamics of the vestibular sensory organs (otoliths and semi-circular canals). This model also predicts differences in RTs for different motion profiles (e.g., trapezoidal versus triangular acceleration profiles or varying profile durations). In order to assess these predictions we measured RTs in 20 participants for 8 supra-threshold motion profiles (4 translations, 4 rotations). A two-alternative forced-choice task, discriminating leftward from rightward motions, was used and 30 correct responses per condition were evaluated. The results agree with predictions for RT differences between motion profiles as derived from previously identified model parameters from threshold measurements. To describe absolute RT, a constant is added to the predictions representing both the discrimination process, and the time needed to press the response button. This constant is approximately 160ms shorter for rotations, thus indicating that additional processing time is required for translational motion. As this additional latency cannot be explained by our model based on the dynamics of the sensory organs, we speculate that it originates at a later stage, e.g. during tilt-translation disambiguation. Varying processing latencies for different self-motion stimuli (either translations or rotations) which our model can account for must be considered when assessing the perceived timing of vestibular stimulation in comparison with other senses [2,3].
|•||Bülthoff HH, Robuffo Giordano P, Soyka F and Barnett-Cowan M (June-2012) Abstract Talk: Translations are processed slower than rotations: reaction times for self-motion stimuli predicted by vestibular organ dynamics, 27th Bárány Society Meeting, Uppsala, Sweden 27 (0151). |
Reaction times (RTs) to purely inertial self-motion stimuli have only infrequently been studied, and comparisons of RTs for translations and rotations, to our knowledge, are nonexistent. We recently proposed a model  which describes direction discrimination thresholds for rotational and translational motions based on the dynamics of the vestibular sensory organs. This model also predicts differences in RTs for different motion profiles (e.g., trapezoidal versus triangular acceleration profiles or varying profile durations). The model calculates a signal akin to the change in firing rate in response to a self-motion stimulus. In order to correctly perceive the direction of motion the intrinsic noise level of the firing rate has to be overcome. Based on previously identified model parameters from perceptual thresholds, differences in RTs between varying motion profiles can be predicted by comparing the times at which the firing rate overcomes the noise level. To assess these predictions we measured RTs in 20 participants for 8 supra-threshold motion profiles (4 translations, 4 rotations). A two-alternative forced-choice task, discriminating leftward from rightward motions, was used and 30 correct responses per condition were evaluated. The results are in agreement with predictions for RT differences between motion profiles. In order to describe absolute RT, a constant is added to the predictions representing both the discrimination process, and the time needed to press the response button. This constant is calculated as the mean difference between measurements and predictions. It is approximately 160ms shorter for rotations, thus indicating that additional processing time is required for translational motion. As this additional latency cannot be explained by our model based on the dynamics of the sensory organs, we speculate that it originates at a later stage, e.g. during tilt-translation disambiguation.
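The two steps described above, finding the time at which the model signal overcomes the noise level and then adding a constant computed as the mean difference between measurements and predictions, can be sketched as follows. This is not the authors' model: the vestibular dynamics that produce the signal are not given in the abstract, so the signal is taken as an already sampled array, and the function names and numbers are hypothetical.

```python
import numpy as np

def crossing_time(times, signal, noise_level):
    """First sample time at which |signal| exceeds the noise level.

    Returns None if the signal never overcomes the noise.
    """
    above = np.abs(np.asarray(signal)) > noise_level
    if not above.any():
        return None
    # argmax on a boolean array gives the index of the first True.
    return float(np.asarray(times)[np.argmax(above)])

def rt_offset(measured_rts, predicted_times):
    """Constant added to predictions: mean of (measured - predicted)."""
    measured = np.asarray(measured_rts, dtype=float)
    predicted = np.asarray(predicted_times, dtype=float)
    return float(np.mean(measured - predicted))

# Hypothetical ramp-shaped signal sampled every 0.1 s.
times = np.linspace(0.0, 1.0, 11)
signal = times.copy()
predicted = crossing_time(times, signal, noise_level=0.45)
offset = rt_offset([0.9, 1.1], [0.5, 0.7])
```

Computing the offset separately for translations and rotations is what would expose a difference such as the roughly 160 ms reported in the abstract.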
|•||Bülthoff HH (May-23-2012) Invited Lecture: The Cybernetic Approach to Perception and Action, CITEC Colloquium "Vision Science" - Universität Bielefeld, Bielefeld, Germany. |
|•||Bülthoff HH (March-7-2012) Invited Lecture: myCopter: Enabling Technologies for Personal Aerial Transportation Systems, Abu Dhabi Air Expo: Helicopter Conference, Abu Dhabi, United Arab Emirates. |
The helicopter is man’s best friend. Used all over the world for life-saving missions, the helicopter is evolving rapidly toward operation within city limits. The Max Planck Institute for Biological Cybernetics is leading a research project funded by the European Union to identify future technologies that would allow the use of helicopters within cities. In this lecture, Bülthoff shares the main goals of this research project. The audience will be able to interact with the various helicopter experts on all aspects of the use of rotary-wing aircraft today and tomorrow.
|•||Bülthoff HH (March-1-2012) Invited Lecture: Flying Robots and Flying Cars, 5th Schunk International Expertdays: Service Robotics, Hausen, Germany. |
|•||Bülthoff HH and Nieuwenhuizen FM (November-4-2011) Abstract Talk: myCopter: Enabling Technologies for Personal Aerial Transportation Systems, 3rd International HELI World Conference 2011 "HELICOPTER Technologies and Operations", Frankfurt a.M., Germany. |
|•||Bülthoff HH (November-2011): New Concepts for Personal Aerial Transportation Systems, 3rd International HELI World Conference 2011, Frankfurt a.M., Germany. |
|•||Bülthoff HH (October-5-2011) Invited Lecture: Science and Science Fiction: closing the loop between Perception and Technology, Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea. |
|•||Bülthoff HH, Wallraven C, Gaissert N, Waterkamp S and van Dam L (October-2011) Abstract Talk: Efficient cross-modal transfer of shape information in visual and haptic object categorization, 12th International Multisensory Research Forum (IMRF 2011), Fukuoka, Japan, i-Perception 2 (8) 822. |
Categorization has traditionally been studied in the visual domain with only a few studies focusing on the abilities of the haptic system in object categorization. During the first years of development, however, touch and vision are closely coupled in the exploratory procedures used by the infant to gather information about objects. Here, we investigate how well shape information can be transferred between those two modalities in a categorization task. Our stimuli consisted of amoeba-like objects that were parametrically morphed in well-defined steps. Participants explored the objects in a categorization task either visually or haptically. Interestingly, both modalities led to similar categorization behavior suggesting that similar shape processing might occur in vision and haptics. Next, participants received training on specific categories in one of the two modalities. As would be expected, training increased performance in the trained modality; however, we also found significant transfer of training to the other, untrained modality after only relatively few training trials. Taken together, our results demonstrate that complex shape information can be transferred efficiently across the two modalities, which speaks in favor of multisensory, higher-level representations of shape.
|•||Bülthoff HH (September-27-2011) Keynote Lecture: Plenary II: BioRobotics, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), San Francisco, CA, USA. |
|•||Bülthoff HH (September-22-2011) Keynote Lecture: Perceptual Graphics: closing the loop between Perception, Graphics and Computer Vision, 19th Pacific Conference on Computer Graphics and Applications (Pacific Graphics 2011), Kaohsiung, Taiwan. |
In our Perceptual Graphics group at the Max Planck Institute for Biological Cybernetics we integrate methods from psychophysics, computer graphics and computer vision in order to understand fundamental perceptual and cognitive processes. The fusion of methods from these research areas has the potential to greatly advance our understanding of perception and cognition. Highly controllable, yet realistic computer-generated stimuli offer novel ways for psychophysical investigations. The results from those experiments can in turn be used to derive perceptual "shortcuts" to more efficient rendering approaches. Computer vision and machine learning algorithms can be used to model human cognition and action while conversely, the results from perceptual experiments can inform computer scientists how the brain solves problems and thus can lead to more efficient solutions of hard problems like recognition and categorization. In this presentation, I will highlight how the latest tools in computer vision, computer graphics, and virtual reality technology can be used to systematically understand the factors that determine how humans behave and solve tasks in realistic scenarios.
|•||Bülthoff HH, Mohler BJ and Linkenauger S (September-2011) Abstract Talk: Welcome to wonderland: The apparent size of the self-avatar hands and arms influences perceived size and shape in virtual environments, 34th European Conference on Visual Perception, Toulouse, France, Perception 40 (ECVP Abstract Supplement) 46. |
According to the functional approach to the perception of spatial layout, angular optic variables that indicate extents are scaled to the body and its action capabilities [cf Proffitt, 2006 Perspectives on Psychological Science 1(2) 110–122]. For example, reachable extents are perceived as a proportion of the maximum extent to which one can reach, and the apparent sizes of graspable objects are perceived as a proportion of the maximum extent that one can grasp (Linkenauger et al, 2009 Journal of Experimental Psychology: Human Perception and Performance; 2010 Psychological Science). Therefore, apparent sizes and distances should be influenced by changing scaling aspects of the body. To test this notion, we immersed participants in a full-cue virtual environment. Participants’ head, arm and hand movements were tracked and mapped onto a first-person, self-representing avatar in real time. We manipulated the participants’ visual information about their body by changing aspects of the self-avatar (hand size and arm length). Perceptual verbal and action judgments of the sizes and shapes of virtual objects (spheres and cubes) varied as a function of the hand/arm scaling factor. These findings provide support for a body-based approach to perception and highlight the impact of self-avatars’ bodily dimensions for users’ perceptions of space in virtual environments.
|•||Bülthoff HH, Thornton IM, Caniard F and Mamassian P (September-2011) Abstract Talk: Exploring motion-induced illusory displacement using interactive games, 34th European Conference on Visual Perception, Toulouse, France, Perception 40 (ECVP Abstract Supplement) 27-28. |
Motion-induced illusory displacement occurs when local motion within an object causes its perceived global position to appear shifted. Using two different paradigms, we explored whether active control of the physical position of the object can overcome this illusion. In Experiment 1, we created a simple joystick game in which participants guided a Gabor patch along a randomly curving path. In Experiment 2, participants used the accelerometer-based tilt control of the iPad to guide a Gabor patch through a series of discrete gates, as might be found on a slalom course. In both experiments, participants responded to local motion with overcompensating movements in the opposite direction, leading to systematic errors. These errors scaled with speed but did not vary in magnitude either within or across trials. In conclusion, we found no evidence that participants could adapt or compensate for illusory displacement given active control of the target.
|•||Bülthoff HH (August-31-2011): Brain and Cognitive Engineering: What can Engineers learn from Cognitive Scientists?, The 3rd International Symposium on Brain and Cognitive Engineering, Seoul, South Korea. |
|•||Bülthoff HH (August-11-2011) Keynote Lecture: Towards Artificial Systems: What Can We Learn From Human Perception?, Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, USA. |
Recent progress in learning algorithms and sensor hardware has led to rapid advances in artificial systems. However, their performance continues to fall short of the efficiency and plasticity of human behavior. In many ways, a deeper understanding of how humans process and act upon physical sensory information can contribute to the development of better artificial systems. In this presentation, Bülthoff will highlight how the latest tools in computer vision, computer graphics, and virtual reality technology can be used to systematically understand the factors that determine how humans behave and solve tasks in realistic scenarios.
|•||Bülthoff HH (July-30-2011) Invited Lecture: Wie kommt die Welt in den Kopf?: Von der Grundlagenforschung zur Anwendung, Lingelbachs Scheune – Optische Phänomene e.V., Abtsgmünd, Germany. |
|•||Bülthoff HH (July-6-2011) Keynote Lecture: Wahrnehmen, begreifen und handeln: Die Kommunikation des Menschen mit seinen Hilfsmitteln, Tübinger Innovationstage 2011 der Industrie- und Handelskammer Reutlingen, Tübingen, Germany. |
|•||Bülthoff HH, Thornton IM, Mamassian P and Caniard F (July-2011) Abstract Talk: Active control does not eliminate motion-induced illusory displacement, 7th Asia-Pacific Conference on Vision (APCV 2011), Hong Kong, i-Perception 2 (4) 209. |
When the sine-wave grating of a Gabor patch drifts to the left or right, the perceived position of the entire object is shifted in the direction of local motion. In the current work we explored whether active control of the physical position of the patch overcomes such motion-induced illusory displacement. In Experiment 1 we created a simple computer game and asked participants to continuously guide a Gabor patch along a randomly curving path using a joystick. When the grating inside the Gabor patch was stationary, participants could perform this task without error. When the grating drifted to either left or right, we observed systematic errors consistent with previous reports of motion-induced illusory displacement. In Experiment 2 we created an iPad application where the built-in accelerometer tilt control was used to steer the patch through a series of “gates”. Again, we observed systematic guidance errors that depended on the direction and speed of local motion. In conclusion, we found no evidence that participants could adapt or compensate for illusory displacement given active control of the target.
|•||Bülthoff HH, Wallraven C, Armann R, Bülthoff I and Lee RK (July-2011) Abstract Talk: Investigating the other-race effect in different face recognition tasks, 7th Asia-Pacific Conference on Vision (APCV 2011), Hong Kong, i-Perception 2 (4) 355. |
Faces convey various types of information like identity, ethnicity, sex or emotion. We investigated whether the well-known other-race effect (ORE) is observable when facial information other than identity varies between test faces. First, in a race comparison task, German and Korean participants compared the ethnicity of two faces sharing similar identity information but differing in ethnicity. Participants reported which face looked more Asian or Caucasian. Their behavioral results showed that Koreans and Germans were equally good at discriminating ethnicity information in Asian and Caucasian faces. The nationality of participants, however, affected their eye-movement strategy when the test faces were shown sequentially, thus, when memory was involved. In the second study, we focused on ORE in terms of recognition of facial expressions. Korean participants viewed Asian and Caucasian faces showing different facial expressions for 100ms to 800ms and reported the emotion of the faces. Surprisingly, under all three presentation times, Koreans were significantly better with Caucasian faces. These two studies suggest that ORE does not appear in all recognition tasks involving other-race faces. Here, when identity information is not involved in the task, we are not better at discriminating ethnicity and facial expressions in same race compared to other race faces.
|•||Bülthoff HH (June-20-2011) Invited Lecture: Science and Science Fiction: closing the loop between Cognition and Application, Università degli Studi di Genova, Genova, Italy. |
|•||Bülthoff HH and Nieuwenhuizen F (March-31-2011): myCopter: Enabling Technologies for Personal Aerial Transportation Systems, Sixth European Aerodays 2011, Madrid, Spain. |
|•||Bülthoff HH (January-11-2011): What can computer scientists learn from cognitive scientists?, Symposium “Defining Cognitive Informatics”, Wien, Austria. |
|•||Bülthoff HH (November-25-2010) Invited Lecture: Towards artificial systems: what can we learn from human perception, The University of Hong Kong: Department of Psychology Seminar, Hong Kong, China. |
The question of how we perceive and interact with the world around us has been at the heart of cognitive and neuroscience research for the last decades. Despite tremendous advances in the field of computational vision, made possible by the development of powerful learning techniques as well as the existence of large amounts of labeled training data for harvesting, artificial systems have yet to reach human performance levels and generalization capabilities. In this talk I want to highlight some recent results from perceptual studies that could help to bring artificial systems a few steps closer to this grand goal. In particular, I focus on the issue of spatio-temporal object representations (dynamic faces), face synthesis, as well as the need for taking into account multisensory data in models of object categorization. Having understood the important role of haptic feedback for human perception, we also explored new ways of exploiting it for helping humans (pilots) in solving difficult control tasks. This recent work on human machine interfaces naturally extends to the case of autonomous or intelligent machines such as robots that are currently envisioned to be pervasive in our society and closely cooperate with humans in their tasks. In all of these perceptual research lines, the underlying research philosophy was to combine the latest tools in computer vision, computer graphics, and virtual reality technology in order to gain a deeper understanding of biological information processing. Conversely, I discuss how the perceptual results can feed back into the design of better and more efficient tools for artificial systems.
|•||Bülthoff HH, Wallraven C, de la Rosa S and Kaulard K (October-2010) Abstract Talk: Cognitive categories of emotional and conversational facial expressions are influenced by dynamic information, 11th Conference of Junior Neuroscientists of Tübingen (NeNa 2010), Heiligkreuztal, Germany 11 16. |
Most research on facial expressions focuses on static, ‘emotional’ expressions. Facial expressions, however, are also important in interpersonal communication (‘conversational’ expressions). In addition, communication is a highly dynamic phenomenon and previous evidence suggests that dynamic presentation of stimuli facilitates recognition. Hence, we examined the categorization of emotional and conversational expressions using both static and dynamic stimuli. In a between-subject design, 40 participants were asked to group 55 different facial expressions (either static or dynamic) of ten actors in a free categorization task. Expressions were to be grouped according to their overall similarity. The resulting confusion matrix was used to determine the consistency with which facial expressions were categorized. In the static condition, emotional expressions were grouped as separate categories while participants confused conversational expressions. In the dynamic condition, participants uniquely categorized basic and sub-ordinate emotional, as well as several conversational facial expressions. Furthermore, a multidimensional scaling analysis suggests that the same potency and valence dimensions underlie the categorization of both static and dynamic expressions. Basic emotional expressions represent the most effective categories when only static information is available. Importantly, however, our results show that dynamic information allows for a much more fine-grained categorization and is essential in disentangling conversational expressions.
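The abstract does not say how the confusion matrix was built from the free categorization data. One common way to summarize free grouping, shown here purely as an assumed illustration, is a co-occurrence matrix: for every pair of expressions, the proportion of participants who placed both in the same group. The function name and the toy labels are hypothetical.

```python
import numpy as np

def cooccurrence_matrix(groupings):
    """Proportion of participants who grouped each pair of items together.

    `groupings` is a (participants x items) array of group labels;
    labels only need to be consistent within a participant.
    """
    groupings = np.asarray(groupings)
    # same[p, i, j] is True when participant p put items i and j together.
    same = groupings[:, :, None] == groupings[:, None, :]
    return same.mean(axis=0)

# Hypothetical data: 2 participants sorting 3 expressions.
matrix = cooccurrence_matrix([[0, 0, 1],
                              [0, 1, 1]])
```

A matrix of this kind (or one minus it, as a dissimilarity) is also a typical input to a multidimensional scaling analysis.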
|•||Bülthoff HH (September-28-2010) Invited Lecture: Brain and Cognitive Engineering: What can Engineers learn from Cognitive Scientists?, Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea. |
This presentation will give an overview of current topics in the Biological Cybernetics labs at the Max Planck Institute in Tübingen and the Department of Brain and Cognitive Engineering at Korea University. Recent examples from our research on face and object recognition will highlight the importance of dynamic and multi-sensory information as well as active vision for recognition and show how perceptual research can contribute towards the development of better artificial systems.
|•||Bülthoff HH (September-13-2010) Invited Lecture: Towards artificial systems: what can we learn from human perception, Asia Pacific Center for Theoretical Physics (APCTP) Headquarters, Pohang, South Korea (Lecture 1442). |
|•||Bülthoff HH (September-9-2010) Invited Lecture: Towards Artificial Systems: What can we learn from human perception?, Seoul National University, School of Computer Science and Engineering, Seoul, South Korea. |
|•||Bülthoff HH (September-8-2010) Invited Lecture: The Cybernetics Approach to Cognitive Engineering, Distinguished Lecture Series, Korea University, Seoul, South Korea. |
|•||Bülthoff HH (August-30-2010) Keynote Lecture: Towards artificial systems: what can we learn from human perception, 11th Pacific Rim International Conference on Artificial Intelligence (PRICAI 2010), Daegu, South Korea. |
|•||Bülthoff HH, Curio C, Engel D, Kottler VA, Malisi CU, Röttig M, Schultheiss SJ and Willing EM (August-2010) Abstract Talk: Optimizing minimal sketches of visual object categories, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 11. |
We present an iterative optimization scheme for obtaining minimal line sketches of object categories. Minimal sketches are introduced as a tool to derive the most important visual properties of a visual object category and can potentially provide useful constraints for automatic classification algorithms. We define the minimal sketch of an object category as the minimal number of straight lines necessary to lead to a correct recognition by 75% of naïve participants after one second of presentation. Nine participants produced sketches of 30 object categories. We displayed the three sketches with the lowest number of lines for each category to 24 participants who freely named them. In consecutive rounds the sketchers had to optimize their drawings independently based on sketches and responses of the previous rounds. The optimized sketches were subsequently rated again by 24 new subjects. The average number of lines used in the sketches decreased from 8.8 to 7.9 between the two trials while the average recognition rate increased from 57.3% to 67.9%. 27 of the 30 categories had at least one sketch that was recognized by more than 75% of subjects. For most of the categories, the sketches converged to an optimum within two drawing-rating rounds.
|•||Bülthoff HH, Fleming RW and Barnett-Cowan M (August-2010) Abstract Talk: Perceived object stability is affected by the internal representation of gravity, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 109. |
Knowing an object's physical stability affects our expectations about its behaviour and our interactions with it. Objects topple over when the gravity-projected centre-of-mass (COM) lies outside the support area. The critical angle (CA) is the orientation for which an object is perceived to be equally likely to topple over or right itself, which is influenced by global shape information about an object's COM and its orientation relative to gravity. When observers lie on their sides, the perceived direction of gravity is tilted towards the body. Here we test the hypothesis that the CA of falling objects is affected by this internal representation of gravity. Observers sat upright or lay left- or right-side-down, and observed images of objects with different 3D mass distributions that were placed close to the right edge of a table in various orientations. Observers indicated whether the objects were more likely to fall back onto or off the table. The subjective visual vertical was also tested as a measure of perceived gravity. Our results show the CA increases when lying right-side-down and decreases when left-side-down relative to an upright posture, consistent with estimating the stability of rightward falling objects as relative to perceived and not physical gravity.
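The physical toppling rule stated above (an object falls once its gravity-projected centre of mass lies outside the support area) has a simple geometric form for an object tipping about one edge of its base. The sketch below computes that physical tipping angle only; it assumes a rigid object pivoting about a single edge and says nothing about the perceived critical angle measured in the study. The function and parameter names are hypothetical.

```python
import math

def physical_critical_angle(com_offset, com_height):
    """Tilt in degrees at which the COM is directly above the pivot edge.

    com_offset: horizontal distance from the COM to the pivot edge
                when the object stands upright.
    com_height: height of the COM above the support surface.
    """
    return math.degrees(math.atan2(com_offset, com_height))

def topples(tilt_deg, com_offset, com_height):
    """True if the object, tilted by tilt_deg about the edge, falls over."""
    return tilt_deg > physical_critical_angle(com_offset, com_height)

# Hypothetical object whose COM is as high as it is far from the edge.
angle = physical_critical_angle(com_offset=1.0, com_height=1.0)
```

A higher centre of mass or a narrower base lowers this angle, which is why objects with different mass distributions tip at different orientations.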
|•||Bülthoff HH, Wallraven C, de la Rosa S and Kaulard K (August-2010) Abstract Talk: Cognitive categories of emotional and conversational facial expressions are influenced by dynamic information, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 157. |
Most research on facial expressions focuses on static, ‘emotional’ expressions. Facial expressions, however, are also important in interpersonal communication (‘conversational’ expressions). In addition, communication is a highly dynamic phenomenon and previous evidence suggests that dynamic presentation of stimuli facilitates recognition. Hence, we examined the categorization of emotional and conversational expressions using both static and dynamic stimuli. In a between-subject design, 40 participants were asked to group 55 different facial expressions (either static or dynamic) of ten actors in a free categorization task. Expressions were to be grouped according to their overall similarity. The resulting confusion matrix was used to determine the consistency with which facial expressions were categorized. In the static condition, emotional expressions were grouped as separate categories while participants confused conversational expressions. In the dynamic condition, participants uniquely categorized basic and sub-ordinate emotional, as well as several conversational facial expressions. Furthermore, a multidimensional scaling analysis suggests that the same potency and valence dimensions underlie the categorization of both static and dynamic expressions. Basic emotional expressions represent the most effective categories when only static information is available. Importantly, however, our results show that dynamic information allows for a much more fine-grained categorization and is essential in disentangling conversational expressions.
|•||Bülthoff HH, de la Rosa S and Choudhery R (August-2010) Abstract Talk: Social interaction recognition and object recognition have different entry levels, 33rd European Conference on Visual Perception, Lausanne, Switzerland, Perception 39 (ECVP Abstract Supplement) 12. |
Objects can be recognized at different levels of abstraction, e.g. basic-level (e.g. flower) and subordinate level (e.g. rose). The entry level refers to the abstraction level for which object recognition is fastest. For objects, this is typically the basic-level. Is the basic-level also the entry level for social interaction recognition? We compared basic-level and subordinate recognition of objects and social interactions. Because social interaction abstraction levels are unknown, Experiment 1 determined basic-level and subordinate categories of objects and social interactions in a free grouping and naming experiment. We verified the adequacy of our method to identify abstraction levels by replicating previously reported object abstraction levels. Experiment 2 used the object and social interaction abstraction levels of Experiment 1 to examine the entry levels for social interaction and object recognition by means of recognition speed. Recognition speed was measured (reaction times, accuracy) for each combination of stimulus type and abstraction level separately. Subordinate recognition of social interactions was significantly faster than basic-level recognition while the results were reversed for objects. Because entry levels are associated with faster recognition, the results indicate different entry levels for object and social interaction recognition, namely the basic-level for objects and possibly the subordinate level for social interactions.
|•||Bülthoff HH (June-18-2010) Invited Lecture: The MPI CyberMotion Simulator: A new concept for ab initio helicopter flight training, Institut für Hirnforschung, Bremen University, Bremen, Germany. |
|•||Bülthoff HH (June-11-2010) Invited Lecture: The MPI CyberMotion Simulator: Development of a novel helicopter trainer, ILA Helikopter Forum, Berlin, Germany. |
|•||Bülthoff HH (March-25-2010) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Health and Life Sciences, Private Universität im Fürstentum Liechtenstein, Triesen, Liechtenstein. |
The superiority of natural over artificial intelligence lies in the human brain's ability to combine the different kinds of sensory information in order to enable meaningful actions. Understanding these capabilities of our brain and transferring them to technical systems requires the combined efforts of several disciplines, including biology, computer science, mathematics, physics, psychology and robotics. In behavioural experiments, the new methods of virtual reality make it possible to create a sensory realism that largely matches the experience of the real world. At the same time, these methods allow precise control of the stimulus parameters, which is necessary for psychophysical investigation. Moreover, perceptual performance is not considered in isolation but is studied in the closed control loop of perception and action.
|•||Bülthoff HH (January-29-2010) Keynote Lecture: The Cybernetics Approach to Perception, Cognition and Action, 2nd European Network for the Advancement of Artificial Cognitive Systems, Interaction and Robotics, Zürich, Switzerland. |
The question of how we perceive and interact with the world around us has been at the heart of cognitive and neuroscience research for the last decades. Despite tremendous advances in the field of computational vision, made possible by the development of powerful learning techniques as well as the existence of large amounts of labeled training data for harvesting, artificial systems have yet to reach human performance levels and generalization capabilities. In this contribution we want to highlight some recent results from perceptual studies that could help to bring artificial systems a few steps closer to this grand goal. In particular, we focus on the issue of spatio-temporal object representations (dynamic faces), face synthesis, as well as the need for taking into account multi-sensory data in models of object categorization. In all of these perceptual research lines, the underlying research philosophy was to combine the latest tools in computer vision, computer graphics, and computer simulations in order to gain a deeper understanding of recognition and categorization in the human brain. Conversely, we discuss how the perceptual results can feed back into the design of better and more efficient tools for artificial systems.
|•||Bülthoff HH (January-18-2010): Towards artificial systems: what can we learn from human perception?, 45th Winter Seminar 2010, Klosters, Switzerland. |
|•||Bülthoff HH and Robuffo Giordano P (December-3-2009): Providing vestibular cues to a human operator for a new generation of human-machine interfaces, 2nd Workshop for Young Researchers on Human-Friendly Robotics (HFR 2009), Sestri Levante, Italy. |
|•||Bülthoff HH (November-4-2009) Invited Lecture: What can Computers learn from Human Perception, Distinguished Lecturer Series, WCU Research Division for Brain and Cognitive Engineering, Korea University, Seoul, South Korea. |
|•||Bülthoff HH, Cunningham DW, Wallraven C and Kaulard K (November-2009) Abstract Talk: Laying the foundations for an in-depth investigation of the whole space of facial expressions, 10th Conference of Junior Neuroscientists of Tübingen (NeNa 2009), Ellwangen, Germany, 10 11. |
Compared to other species, humans have developed highly sophisticated communication systems for social interaction. One of the most important communication systems is based on facial expressions, which are used both for expressing emotions and for conveying intentions. Starting already at birth, humans are trained to process faces and facial expressions, resulting in a high degree of perceptual expertise for face perception and social communication. To date, research has mostly focused on the emotional aspect of facial expression processing, using only a very limited set of "generic" or "universal" expressions, such as happiness or sadness. The important communicative aspect of facial expressions, however, has so far been largely neglected. Furthermore, the processing of facial expressions is influenced by dynamic information (e.g. Fox et al., 2009). However, almost all studies so far have used static expressions and thus were studying facial expressions in an ecologically less valid context (O’Toole et al., 2004). In order to enable a deeper understanding of facial expression processing it therefore seems crucial to investigate the emotional and communicative aspects of facial expressions in a dynamic context. For these investigations it is essential to first construct a database that contains such material using a well-controlled setup. In this talk, we will present the novel MPI facial expression database, which to our knowledge is the most extensive database of this kind to date. Furthermore, we will briefly present psychophysical experiments with which we investigated the validity of our database, as well as the recognizability of a large set of facial expressions.
|•||Bülthoff HH (October-28-2009) Invited Lecture: Human Shape Perception, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea. |
One aspect in which human shape estimation is better than state-of-the-art computer vision algorithms is that it is extremely stable across a wide range of complex lighting and reflectance conditions. For example, while most stereo and shape-from-shading algorithms require minimal specular reflections, the human brain, by contrast, appears to be well aware of the physics of specular reflections, to the extent that highlights actually improve human shape perception. Similarly, it is common for shape-from-shading algorithms to assume known illumination, and often collimated light (which is rarely encountered during the daytime). By contrast, human shape perception works best under complex illumination patterns. I will present a review of some of the findings from our research group in which human shape perception is evaluated under conditions that are particularly challenging for many computer systems, including complex lighting conditions and spatially varying or non-Lambertian BRDFs. In general we find that the more complex and naturalistic the viewing conditions, the better human perception is, suggesting that there are many sources of information within shading still to be discovered. I will present the community with a few key findings from human vision that I believe any biologically motivated machine vision system should emulate.
|•||Bülthoff HH (October-26-2009) Invited Lecture: Biologically Motivated Computer Graphics, Korea Institute of Science and Technology (KIST), Seoul, South Korea. |
|•||Bülthoff HH (October-9-2009) Invited Lecture: Biologically Motivated Computer Graphics, Korean Computer Graphics Society Meeting (KCGS-2009), Jeju Island, South Korea. |
|•||Bülthoff HH (September-30-2009): Recent Advances in Perception, Cognition and Action Research, International Symposium on Brain and Cognitive Engineering, Seoul, South Korea. |
|•||Bülthoff HH, Wallraven C and Gaissert N (August-2009) Abstract Talk: Exploring visual and haptic object categorization, 32nd European Conference on Visual Perception, Regensburg, Germany, Perception 38 (ECVP Abstract Supplement) 159. |
Humans combine visual and haptic shape information in object processing. To investigate commonalities and differences of these two modalities for object categorization, we performed similarity ratings and three different categorization tasks visually and haptically and compared them using multidimensional scaling techniques. As stimuli we used a 3-D object space of 21 complex, parametrically defined, shell-like objects. For haptic experiments, 3-D plastic models were freely explored by blindfolded participants with both hands. For visual experiments, 2-D images of the objects were used. In the first task, we gathered pair-wise similarity ratings for all objects. In the second, unsupervised task, participants freely categorized the objects. In the third, semi-supervised task, participants had to form exactly three groups. In the fourth, supervised task, participants learned three prototype objects and had to assign all other objects accordingly. For all tasks we found that within-category distances were smaller than across-category distances. Categories form clusters in perceptual space with increasing density from unsupervised to supervised categorization. In addition, the unconstrained similarity ratings predict the categorization behavior of the unsupervised categorization task best. Importantly, we found no differences between the modalities in any task, showing that the processes underlying categorization are highly similar in vision and haptics.
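The comparison of within-category and across-category distances can be sketched as follows. This is a minimal illustration, assuming that each object already has coordinates in a perceptual space (for example from a multidimensional scaling solution) and a category label; the function name and the toy coordinates are hypothetical and not taken from the study.

```python
import math
from itertools import combinations


def mean_within_and_across(points, labels):
    """Return (mean within-category distance, mean across-category distance)."""
    within, across = [], []
    for i, j in combinations(range(len(points)), 2):
        distance = math.dist(points[i], points[j])
        # Pairs sharing a label are within-category; all others are across.
        (within if labels[i] == labels[j] else across).append(distance)
    return sum(within) / len(within), sum(across) / len(across)


# Hypothetical 2-D perceptual coordinates and category assignments.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
labels = ["A", "A", "B", "B"]
within, across = mean_within_and_across(points, labels)
```

A smaller within-category mean than across-category mean corresponds to categories forming clusters in the perceptual space.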
|•||Bülthoff HH, Campos J and Butler J (August-2009) Abstract Talk: The importance of body-based cues for travelled distance perception, 9th Annual Meeting of the Vision Sciences Society (VSS 2009), Naples, FL, USA, Journal of Vision 9 (8) 1144. |
When moving through space, both dynamic visual information (i.e. optic flow) and body-based cues (i.e. proprioceptive and vestibular) jointly specify the extent of a travelled distance. Little is currently known about the relative contributions of each of these cues when several are simultaneously available. In this series of experiments participants travelled a predefined distance and subsequently reproduced this distance by adjusting a visual target until the self-to-target distance matched the distance they had moved. Visual information was presented through a head-mounted display and consisted of a long, richly textured, virtual hallway. Body-based cues were provided either by A) natural walking in a fully-tracked free walking space (proprioception and vestibular), B) being passively moved by a robotic wheelchair (vestibular), or C) walking in place on a treadmill (proprioception). Distances were either presented through vision alone, body-based cues alone, or both visual and body-based cues combined. In the combined condition, the visually-specified distances were either congruent (1.0x) or incongruent (0.7x/1.4x) with distances specified by body-based cues. Incongruencies were created by either changing the visual gain or changing the proprioceptive gain (during treadmill walking). Further, in order to obtain a measure of “perceptual congruency” between visual and body-based cues, participants were asked to adjust the rate of optic flow during walking so that it matched the proprioceptive information. This value was then used as the basis for later congruent cue trials. Overall, results demonstrate a higher weighting of body-based cues during natural walking, a higher weighting of proprioceptive information during treadmill walking, and an equal weighting of visual and vestibular cues during passive movement. These results were not affected by whether visual or proprioceptive gain was manipulated. Adopting the obtained measure of perceptual congruency for each participant also did not change the conclusions, such that proprioceptive cues continued to be weighted higher.
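To illustrate how relative cue weights can be read off from incongruent trials, the sketch below assumes a simple linear weighted-average model in which the reproduced distance is a mixture of the visually specified and the body-specified distance. The model, the function name `visual_weight` and the numbers are assumptions for illustration and are not necessarily the analysis used in these experiments.

```python
def visual_weight(response, visual_distance, body_distance):
    """Estimate the visual cue weight under a linear weighted-average model.

    Assumes response = w_v * visual_distance + (1 - w_v) * body_distance.
    """
    if visual_distance == body_distance:
        raise ValueError("weights are only identifiable when the cues conflict")
    return (response - body_distance) / (visual_distance - body_distance)


# Hypothetical trial: body-based cues specify 10 m, a visual gain of 1.4
# specifies 14 m, and the participant reproduces 11 m.
w_visual = visual_weight(response=11.0, visual_distance=14.0, body_distance=10.0)
w_body = 1.0 - w_visual
```

Under this model, a response close to the body-specified distance yields a small visual weight, which is the pattern described as a higher weighting of body-based cues.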
|•||Bülthoff HH (August-2009): Multisensory integration for perception and action in virtual environments, 32nd European Conference on Visual Perception, Regensburg, Germany, Perception 38 (ECVP Abstract Supplement) 2. |
Understanding vision has always been at the centre of research in perception and cognition. Experiments on vision, however, have usually been conducted with a strong focus on perception, neglecting the fact that in most natural tasks sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed by the sensory system, so that perception and action are complementary parts of a dynamic control system. Additionally, the human sensory system receives input from multiple senses which have to be integrated in order to solve tasks ranging from standing upright to controlling complex vehicles. In our Cybernetics research group we use psychophysical, physiological, modeling, and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive, act in, and interact with the real world. In psychophysical studies, we could show that humans integrate multimodal sensory information often, but not always, in a statistically optimal way such that cues are weighted according to their reliability. In this talk, I will present results from our studies on multisensory integration of perception and action in both natural and simulated environments for different tasks using our latest simulator technologies, the Cyberwalk omnidirectional treadmill and the MPI Motion Simulator based on a large industrial robot arm.
|•||Bülthoff HH and Wallraven C (August-2009): Beyond vision: multi-sensory processing in humans and machines, Second International Workshop on Shape Perception in Human and Computer Vision (SPHCV-ECVP 2009), Regensburg, Germany. |
The question of how humans learn to categorize objects and events has been at the heart of cognitive and neuroscience research for the last decades. In recent years, much work also in computer vision has focused on this topic and by now has generated multiple challenges, databases, and novel approaches. In this talk, I will argue that there is more to "vision" than "bags of words". Recent work in our lab has focused on using state-of-the-art computer graphics and simulation technology in order to advance our understanding of the role vision plays in the "ultimate cognitive system" - the human. In particular, in my talk I will discuss the need for spatio-temporal object representations, as well as why we need a notion of shape and material properties in object interpretation that goes far beyond most current computer vision approaches. Most importantly, however, I will focus on multi-modal/multi-sensory aspects of object processing as one of the key elements of learning about the world through interaction. Evidence from several studies of haptic object processing, for example, has shown that the sense of touch is sometimes surprisingly acute in representing complex shape spaces. I will finish by showing how some of these perceptual and cognitive results can be integrated into novel, more efficient and effective vision systems.
|•||Bülthoff HH and Wallraven C (July-2009): Beyond vision: multi-sensory processing in humans and machines, Workshop on Trends in Computer Vision 2009, Praha, Czech Republic. |
|•||Bülthoff HH (June-24-2009): Multi-sensory navigation in Virtual Reality, International Conference on Vision in 3D Environments (CVR 2009), Toronto, Canada. |
|•||Bülthoff HH (June-3-2009) Keynote Lecture: What can machine vision learn from human perception?, 3rd IAPR/IEEE International Conference on Biometrics (ICB 2009), Sassari, Italy. |
|•||Bülthoff HH (February-5-2009): Effect of lateral motion on drivers' performance in the MPI motion simulator, Driving Simulation Conference Europe (DSC 2009), Monte Carlo, Monaco. |
|•||Bülthoff HH, Ernst MO, Robuffo Giordano P, Souman JL, Mattone R and Luca AD (October-24-2008) Abstract Talk: The CyberWalk Platform: Human-Machine Interaction Enabling Unconstrained Walking through VR, First Workshop for Young Researchers on Human-friendly robotics, Napoli, Italy (12). |
In recent years, Virtual Reality (VR) has become increasingly realistic and immersive. Both the visual and auditory rendering of virtual environments have been improved significantly, thanks to developments in both hardware and software. In contrast, the possibilities for physical navigation through virtual environments (VE) are still relatively rudimentary. Most commonly, users can ‘move’ through high-fidelity virtual environments using a mouse or a joystick. Of course, the most natural way to navigate through VR would be to walk. For small scale virtual environments one can simply walk within a confined space. The VE can be presented by a cave-like projection system, or by means of a head-mounted display combined with head-tracking. For larger VEs, however, this quickly becomes impractical or even impossible.
|•||Bülthoff HH (October-5-2008): Recognition and Categorization in Man and Machine, Fyssen Colloquium "From Objects to Categories: Visual Categorization in Big Brains, Small Brains and Machines", Saint Germain en Laye, France. |
|•||Bülthoff HH (September-16-2008) Keynote Lecture: Virtual reality as a valuable research tool for studying spatial cognition, Spatial Cognition 2008 (SC '08), Freiburg, Germany. |
|•||Bülthoff HH (August-2008): Learning System Dynamics: Transfer of Training in a Helicopter Hover Simulator, AIAA Guidance, Navigation and Control Conference, Honolulu, HI, USA. |
|•||Bülthoff HH, Butler JS and Smith ST (July-2008) Abstract Talk: The role of stereo vision in visual and vestibular cue integration, 9th International Multisensory Research Forum (IMRF 2008), Hamburg, Germany, 9 179. |
Self-motion through an environment is a composite of signals such as visual and vestibular cues. Recently, it has been shown that visual-auditory cues and visual-haptic cues combine in a statistically optimal fashion. We asked what role stereo vision plays in the optimal integration of visual and vestibular cues for linear heading. Participants performed the task in visual alone, vestibular alone or combined visual-vestibular (self-motion) conditions. The conditions were grouped into two experiments: a bi-ocular, 2-D experiment and a stereo, 3-D experiment. Participants were seated on a Stewart motion platform and presented with two motions, a standard heading of straight ahead and a comparison heading, and judged which movement was more to the right. From the responses, individual JNDs were calculated (i.e., a reliability measure). In the 2-D experiment, the self-motion reliability of 40% of participants was worse than that of their most reliable unimodal cue, thus violating optimal cue combination. In the 3-D experiment, all subjects' self-motion reliability was not statistically different from the optimal predicted self-motion reliability and was therefore better than that of either unimodal cue. These results can be evaluated with respect to a neuronal population model. These findings show that visual-vestibular cues combine in a statistically optimal fashion, provided that stereo visuals are available.
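The optimal prediction referred to above is commonly derived from the standard maximum-likelihood cue combination model, in which each cue is weighted by its inverse variance. The sketch below assumes that model and treats each measured threshold as a standard deviation of the underlying estimate; the function names and the numbers are hypothetical and only illustrate the arithmetic.

```python
import math


def optimal_combined_sigma(sigma_visual, sigma_vestibular):
    """Predicted combined-cue sigma under maximum-likelihood integration."""
    var_v, var_b = sigma_visual ** 2, sigma_vestibular ** 2
    return math.sqrt(var_v * var_b / (var_v + var_b))


def reliability_weights(sigma_visual, sigma_vestibular):
    """Weights proportional to each cue's reliability (inverse variance)."""
    r_v, r_b = 1.0 / sigma_visual ** 2, 1.0 / sigma_vestibular ** 2
    return r_v / (r_v + r_b), r_b / (r_v + r_b)


def violates_optimality(sigma_observed, sigma_visual, sigma_vestibular):
    """True if the combined estimate is worse than the best single cue."""
    return sigma_observed > min(sigma_visual, sigma_vestibular)


# Hypothetical unimodal thresholds (in degrees of heading).
predicted = optimal_combined_sigma(3.0, 4.0)
w_visual, w_vestibular = reliability_weights(3.0, 4.0)
```

Because the predicted combined sigma is never larger than the smaller unimodal sigma, an observer whose combined threshold exceeds the best unimodal threshold cannot be combining the cues optimally under this model.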
|•||Bülthoff HH (July-2008): Visual, proprioceptive, and inertial cue-weighting in travelled distance perception, XXIX International Congress of Psychology (ICP 2008), Berlin, Germany. |
|•||Bülthoff HH (June-23-2008) Keynote Lecture: Perceptual Graphics: Integrating Perception, Computer Graphics, and Computer Vision, 19th Eurographics Symposium on Rendering (EGSR 2008), Sarajevo, Bosnia and Herzegovina. |
In our Perceptual Graphics group at the Max Planck Institute in Tübingen we combine state-of-the-art computer graphics and computer vision technology with perceptual research. This integration has two goals: first of all, the technology allows us to conduct perceptual experiments with highly controlled, yet very realistic stimuli that advance our understanding of basic perceptual phenomena such as material perception or the recognition of facial expressions. Second, the results from these perceptual experiments can be used to improve the technology and to design novel applications that are perceptually effective; examples include an intuitive material editor for creation of arbitrary materials in computer graphics or a perceptually realistic facial animation. The human face is capable of producing an astounding variety of facial movements that are able to transport a large range of communicative meanings. To date, it is largely unclear, however, which information (including visual as well as auditory information) humans use to decipher the language of the face. In order to investigate this question systematically, one needs to have a highly flexible yet at the same time very realistic computer animation system. We are currently developing such a system in our group using state-of-the-art computer graphics and computer vision methods. This animation system is then used to create stimuli for experiments on perception of facial expressions which allow us, for example, to manipulate the spatio-temporal properties of single regions of the face in order to determine their importance for recognition of expressions. In addition (and this constitutes the second aspect of perceptual graphics), we have also used these and similar perceptual experiments to determine the perceptual quality of computer graphics. The results have given us insights into specific parameters that need to be improved in order to provide an even higher level of realism and effectiveness.
|•||Bülthoff HH and Wallraven C (May-20-2008): Multi-sensory Integration for Perception and Action, ICRA 2008 Workshop on Future Directions in Visual Navigation, Pasadena, CA, USA. |
|•||Bülthoff HH (May-13-2008) Keynote Lecture: Going beyond vision: multisensory integration for perception and action, 6th International Conference on Computer Vision Systems, Vision for Cognitive Systems (ICVS 2008), Santorini, Greece. |
Understanding vision has always been at the centre of research in both cognitive and computational sciences. Experiments on vision, however, have usually been conducted with a strong focus on perception, neglecting the fact that in most natural tasks sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. Additionally, the human sensory system receives input from multiple senses which have to be integrated in order to solve tasks ranging from standing upright to controlling complex vehicles. In our Cybernetics research group at the Max Planck Institute in Tuebingen, we use psychophysical, physiological, modeling, and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive, act in, and interact with the real world. In psychophysical studies, we could show that humans integrate multimodal sensory information often but not always in a statistically optimal way, such that cues are weighted according to their reliability. In this talk, I will present results from our studies on multisensory integration of perception and action in both natural and simulated environments in different task contexts - from object recognition, to navigation, to vehicle control.
|•||Bülthoff HH (March-30-2008): Multisensory integration for action in natural and virtual environments, Workshop on Natural Environments Tasks and Intelligence (NETI 2008), Austin, TX, USA. |
Many experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. In our cybernetics research group at the Max Planck Institute in Tuebingen, we use psychophysical, physiological, modeling and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive and act in the real world. In psychophysical studies, we could show that humans integrate multimodal sensory information often but not always in a statistically optimal way, such that cues are weighted according to their reliability. In this talk I will also present our latest simulator technology using an omni-directional treadmill and a new type of flight simulator based on an anthropomorphic robot arm.
|•||Bülthoff HH (March-11-2008): Locomotion in VR: State-of-the-art & Psychophysics, IEEE Virtual Reality Conference 2008 (VR '08), Reno, NV, US. |
|•||Bülthoff HH (January-25-2008) Invited Lecture: The Cybernetic Approach to Perception and Action, 43rd Winter Seminar 2008, Klosters, Switzerland. |
|•||Bülthoff HH and Wertheimer J (October-30-2007) Invited Lecture: Wie wirklich ist die Illusion?: Ein Dialog zwischen Natur- und Literaturwissenschaft, Studium Generale der Universität Tübingen, Tübingen, Germany. |
|•||Bülthoff HH and Wallraven C (October-14-2007): Multimodal Categorization, Eleventh IEEE International Conference on Computer Vision (ICCV 2007), Rio de Janeiro, Brazil. |
The question of how the human brain "makes sense" of the sensory input it receives has been at the heart of cognitive and neuroscience research for the last decades. One of the most fundamental perceptual processes is categorization, the ability to compartmentalize knowledge for efficient retrieval. Recent advances in computer graphics and computer vision have made it possible both to produce highly realistic stimulus material for controlled experiments in life-like environments and to enable highly detailed analyses of the physical properties of real-world stimuli.
|•||Bülthoff HH (October-5-2007) Invited Lecture: Was wir zu sehen denken. Wahrnehmung und Handlung in realen und virtuellen Welten, Symposium: Nicht wahr?! Sinneskanäle, Hirnwindungen und Grenzen der Wahrnehmung, Germanisches Nationalmuseum Nürnberg, Germany. |
The sensory organs and the associated processing areas in the brain form our "perceptual apparatus". It does not merely reproduce the outside world within us but, as it were, interprets it for us. Perceptual processes are based on the filtering, integration and evaluation of sensory data. Which illusions can result from this, and on which mechanisms are they based? What evolutionary survival advantage have these mechanisms offered? Is there knowledge about the outside world beyond our sensory perception?
|•||Bülthoff HH (September-2007): The MPI Motion Simulator: A new approach to motion simulation with an anthropomorphic robot arm, 2nd Motion Simulator Conference 2007, Braunschweig, Germany. |
|•||Bülthoff H (August-2007): The Role of Visual Cues and Whole-Body Rotations in Helicopter Hovering Control, AIAA Modeling and Simulation Technologies Conference and Exhibit 2007, Hilton Head, SC, USA. |
|•||Bülthoff HH, Butler JS and Smith S (July-2007) Abstract Talk: Integration of visual and vestibular cues to heading, 8th International Multisensory Research Forum (IMRF 2007), Sydney, Australia, 8 (61). |
Accurate perception of one's self-motion through the environment requires the successful integration of visual, vestibular, proprioceptive and auditory cues. We have applied Maximum Likelihood Estimation analysis to visual alone, vestibular alone and visual-vestibular linear self-motion (heading) estimation tasks. Using the 2IFC method of constant stimuli and fitting the resulting psychometric data with the Matlab toolbox psychofit (Wichmann and Hill, 2001), we quantified perceptual uncertainty of heading discrimination by the standard deviation of the cumulative Gaussian fit. Our data show that when the uncertainties of visual and vestibular heading discrimination are matched in the combined information condition, there are two distinct classes of observers: those whose heading uncertainty is significantly reduced in the combined condition and those whose combined heading uncertainty is significantly increased. Our results are discussed in relation to monkey behavioural and neurophysiological heading estimation data recently obtained by Angelaki and colleagues.
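The idea of reading perceptual uncertainty from the standard deviation of a fitted cumulative Gaussian can be sketched as follows. This is not the Matlab toolbox used in the study: as a stand-in it uses a plain grid-search maximum-likelihood fit with the Python standard library, and the function names, grid values and response counts are hypothetical.

```python
import math


def cumulative_gaussian(x, mu, sigma):
    """Probability of a 'rightward' response at heading difference x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_cumulative_gaussian(levels, n_right, n_trials, mus, sigmas):
    """Grid-search maximum-likelihood fit; returns the best (mu, sigma)."""
    best, best_ll = None, -math.inf
    for mu in mus:
        for sigma in sigmas:
            ll = 0.0
            for x, k, n in zip(levels, n_right, n_trials):
                # Clamp to avoid log(0) at extreme stimulus levels.
                p = min(max(cumulative_gaussian(x, mu, sigma), 1e-9), 1 - 1e-9)
                ll += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if ll > best_ll:
                best, best_ll = (mu, sigma), ll
    return best


# Hypothetical data: heading differences in degrees, 100 trials per level.
levels = [-4.0, -2.0, 0.0, 2.0, 4.0]
n_right = [2, 16, 50, 84, 98]
n_trials = [100] * 5
fitted_mu, fitted_sigma = fit_cumulative_gaussian(
    levels, n_right, n_trials,
    mus=[-1.0, -0.5, 0.0, 0.5, 1.0],
    sigmas=[1.0, 1.5, 2.0, 2.5, 3.0],
)
```

The fitted sigma plays the role of the perceptual uncertainty; a coarse grid is used here only to keep the example short, and a real analysis would use a proper optimizer and goodness-of-fit checks.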
|•||Bülthoff HH (July-2007) Keynote Lecture: Multisensory Integration for Perception and Action, 8th International Multisensory Research Forum (IMRF 2007), Sydney, Australia (129). |
Many experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. In our cybernetics research group at the Max Planck Institute in Tübingen, we use psychophysical, physiological, modeling and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive and act in the real world. In psychophysical studies, we could show that humans can integrate multimodal sensory information in a statistically optimal way, such that cues are weighted according to their reliability. A better understanding of multimodal sensory fusion will allow us to build new virtual reality platforms in which the design effort for simulating the relevant modalities (visual, auditory, haptic, vestibular and proprioceptive) is influenced by the weight of each. In this talk we will discuss which of these characteristics would be necessary to allow valuable improvements in high-fidelity simulator design.
|•||Bülthoff HH (July-2007) Keynote Lecture: Multisensory Integration in Virtual Environments, 8th International Multisensory Research Forum (IMRF 2007), Sydney, Australia, 8 (129). |
Many experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. In our cybernetics research group at the Max Planck Institute in Tübingen, we use psychophysical, physiological, modeling and simulation techniques to study how cues from different sensory modalities are integrated by the brain to perceive and act in the real world. In psychophysical studies, we could show that humans can integrate multimodal sensory information in a statistically optimal way, such that cues are weighted according to their reliability. A better understanding of multimodal sensory fusion will allow us to build new virtual reality platforms in which the design effort for simulating the relevant modalities (visual, auditory, haptic, vestibular and proprioceptive) is influenced by the weight of each. In this talk we will discuss which of these characteristics would be necessary to allow valuable improvements in high-fidelity simulator design.
|•||Bülthoff HH (July-2007) Invited Lecture: An image-based approach to perception and action, Queensland Brain Institute, Neuroscience Seminar Series, Brisbane, Australia. |
|•||Bülthoff HH (June-14-2007) Invited Lecture: From insect vision to human perception: A long journey with many friends to understand the brain, A Journey Through Computation, Genova, Italy. |
|•||Bülthoff HH (April-18-2007) Invited Lecture: Erkennen ist mehr als Sehen, Biozentrumskolloquium Universität Würzburg, Würzburg, Germany. |
|•||Bülthoff HH (March-29-2007): What is missing in high-fidelity motion simulation?, SIMONA Symposium, Delft, Netherlands. |
|•||Bülthoff HH (February-12-2007): Perception and Action in Virtual Environments, The Lausanne Neuroscience Seminars, Lausanne, Switzerland. |
|•||Bülthoff HH (October-24-2006) Invited Lecture: Sehen in Natur und Technik oder Wie kommt die Welt in den Kopf und was können Architekten damit anfangen, Aussenstellentagung der MPG-Bauabteilung, Grassau, Germany. |
|•||Bülthoff HH, Butler JS and Smith ST (October-2006) Abstract Talk: Multisensory self-motion estimation, 36th Annual Meeting of the Society for Neuroscience (Neuroscience 2006), Atlanta, GA, USA, 36 (12.6). |
Navigation through the environment is a naturally multisensory task involving a coordinated set of sensorimotor processes that encode and compare information from visual, vestibular, proprioceptive, motor-corollary, and cognitive inputs. The extent to which visual information dominates this process is no better demonstrated than by the compelling illusion of self-motion generated in the stationary participant by large-field visual motion stimuli. The importance of visual inputs for estimation of self-motion direction (heading) was first recognised by Gibson (1950) who postulated that heading could be recovered by locating the focus of expansion (FOE) of the radially expanding optic flow field coincident with forward translation. A number of behavioural studies have subsequently shown that humans are able to estimate their heading to within a few degrees using optic flow and other visual cues. For simple linear translation without eye or head rotations, Warren and Hannon (1988) report accurate discrimination of visual heading direction of about 1.5°. Despite the importance of visual information in such tasks, self-motion also involves stimulation of the vestibular end-organs which provide information about the angular and linear accelerations of the head. Our research (Smith et al 2004) has previously shown that humans with intact vestibular function can estimate their direction of linear translation using vestibular cues alone with as much certainty as they do using visual cues. Here we report the results of an ongoing investigation of self-motion estimation which shows that visual and vestibular information can be combined in a statistically optimal fashion. We discuss our results from the perspective that successful execution of self-motion behaviour requires the computation of one’s own spatial orientation relative to the environment.
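As an illustration of what "statistically optimal" combination usually means in this literature, the sketch below implements the standard minimum-variance rule for independent cues with Gaussian noise: each cue is weighted by its reliability (the inverse of its variance). The abstract does not state the model used, so this rule, the function name `combine_cues`, and the heading values are assumptions chosen for illustration only.

```python
def combine_cues(estimates, sigmas):
    """Reliability-weighted average of independent Gaussian cue estimates.

    estimates: single-cue estimates (e.g. heading in degrees).
    sigmas: standard deviation of each cue's noise.
    Returns (combined_estimate, combined_sigma, weights).
    """
    reliabilities = [1.0 / (s * s) for s in sigmas]  # reliability = 1 / variance
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]  # weights sum to 1
    combined = sum(w * x for w, x in zip(weights, estimates))
    combined_sigma = (1.0 / total) ** 0.5  # never larger than the best single cue
    return combined, combined_sigma, weights


# Hypothetical heading estimates in degrees (illustrative values, not data).
visual_heading, visual_sigma = 2.0, 1.5
vestibular_heading, vestibular_sigma = 4.0, 1.5
heading, sigma, weights = combine_cues(
    [visual_heading, vestibular_heading], [visual_sigma, vestibular_sigma]
)
print(heading, sigma, weights)
```

With equally reliable cues, as in this made-up example, the weights are equal and the combined standard deviation is smaller than either single-cue value by a factor of the square root of two.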
|•||Bülthoff HH (September-11-2006): Object Recognition in Man and Machine, Summer School: Visual Neuroscience - from Spikes to Awareness, Rauischholzhausen, Germany. |
|•||Bülthoff HH (September-11-2006): Multimodal Integration for Perception and Action, Summer School: Visual Neuroscience - from Spikes to Awareness, Rauischholzhausen, Germany. |
|•||Bülthoff HH (August-29-2006) Invited Lecture: Multisensory Integration during Active Control, École polytechnique fédérale de Lausanne: Brain and Mind Institute, Lausanne, Switzerland. |
Most experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. To get a better understanding of how different senses interact in self-motion, we study the control of self-motion in a closed perception-action loop. Here we investigated how cues from different sensory modalities (visual cues and body cues) are used when humans stabilize a simulated helicopter at a target location.
|•||Bülthoff HH (August-8-2006) Invited Lecture: Das Rätsel der Wahrnehmung: Eine Einführung, Wissenschaftsnacht, Tübingen, Germany. |
|•||Bülthoff HH and Wertheimer J (August-8-2006) Invited Lecture: Wie kommt die Welt in den Kopf und wieder heraus: Ein Dialog, Wissenschaftsnacht, Tübingen, Germany. |
|•||Bülthoff HH and Schultz J (August-2006) Abstract Talk: Attentional modulation by trial history, 29th European Conference on Visual Perception, St. Petersburg, Russia, Perception, 35 (ECVP Abstract Supplement) 128. |
Temporal patterning of stimuli can affect performance and be critical for perceptual learning. We tested whether trial history can explain target detection time even when target occurrence is unpredictable. 12 volunteers were presented with streams of stimuli of variable color, shape, and motion direction, and had to attend to all stimulus dimensions simultaneously to report Poisson-determined, 1-back repetitions in any dimension. Response times decreased exponentially with the number of successive targets (group means for 1 to 4 targets in succession: 1050, 763, 717, 722 milliseconds; 2-way repeated measures ANOVA: F(3,33) = 195, p << 0.0001, no main effect of stimulus dimension but interaction between dimension and number of successive targets: F(6,66) = 5.11, p < 0.001). Response times were well explained by a leaky integrator of trial history with fast exponential decay (half-life = 1.21 trials; correlation coefficients significant at p < 0.0002 for all dimensions and subjects; group mean correlation coefficients for color, shape and motion targets: 0.57(0.03), 0.57(0.02), 0.47(0.03)). Our results show that target detection times are altered by trial history and can be explained by a fast-decaying integration of trial history. We propose that trial history modulates attention resulting in response time changes; we are currently investigating this hypothesis using functional neuroimaging.
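A leaky integrator of trial history can be sketched as a running sum in which every past trial's contribution shrinks by a fixed factor per trial. The abstract gives only the half-life (1.21 trials), not the equation or how the integrator output maps onto response time, so the update rule and the name `leaky_history` below are assumptions for illustration.

```python
def leaky_history(targets, half_life):
    """Leaky integration of a binary target sequence.

    targets: iterable of 0/1 flags, one per trial (1 = target trial).
    half_life: number of trials after which a past trial's contribution halves.
    Returns the integrator state *before* each trial, i.e. the accumulated
    history that a response on that trial could depend on.
    """
    decay = 0.5 ** (1.0 / half_life)  # per-trial retention factor
    state = 0.0
    trace = []
    for is_target in targets:
        trace.append(state)
        state = decay * state + is_target
    return trace


# Four successive targets followed by two non-target trials (made-up sequence).
trace = leaky_history([1, 1, 1, 1, 0, 0], half_life=1.21)
print([round(v, 3) for v in trace])
```

With a short half-life the state rises quickly over the first few successive targets and then levels off, which is qualitatively the shape of the reported response-time decrease.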
|•||Bülthoff HH, Berger D and Terzibas C (June-15-2006): From virtual images to actions, Fifteenth Seminar "Virtual Images", Paris, France. |
Most experiments which study the mechanisms by which different senses interact in humans focus on perception. In most natural tasks, however, sensory signals are not ultimately used for perception, but rather for action. The effects of the action are sensed again by the sensory system, so that perception and action are complementary parts of a dynamic control system. To get a better understanding of how different senses interact in self-motion, we study the control of self-motion in a closed perception-action loop. Here we investigated how cues from different sensory modalities (visual cues and body cues) are used when humans stabilize a simulated helicopter at a target location.
|•||Bülthoff HH, Cunningham DW, Wallraven C and Nusseck M (May-2006) Abstract Talk: Perception of accentuation in audio-visual speech, 2nd Enactive Workshop at McGill University, Montreal, Canada. |
Introduction: In everyday speech, auditory and visual information are tightly coupled. Consistent with this, previous research has shown that facial and head motion can improve the intelligibility of speech (Massaro et al., 1996; Munhall et al., 2004; Saldana & Pisoni 1996). The multimodal nature of speech is particularly noticeable for emphatic speech, where it can be exceedingly difficult to produce the proper vocal stress patterns without producing the accompanying facial motion. Using a detection task, Swerts and Krahmer (2004) demonstrated that information about which word is emphasized exists in both the visual and acoustic modalities. It remains unclear what the differential roles of visual and auditory information are for the perception of emphasis intensity. Here, we validate a new methodology for acquiring, presenting, and studying verbal emphasis. Subsequently, we can use the newly established methodology to explore the perception and production of believable accentuation. Experiment: Participants were presented with a series of German sentences, in which a single word was emphasized. For each of the 10 base sentences, two factors were manipulated. First, the semantic category was varied: the accent-bearing word was either a verb, an adjective, or a noun. Second, the intensity of the emphasis was varied (no, low, and high). The participants' task was to rate the intensity of the emphasis using a 7-point Likert scale (with a value of 1 indicating weak and 7 strong). Each of the 70 sentences was recorded from 8 Germans (4 male and 4 female), yielding a total of 560 trials. Results and Conclusion: Overall, the results show that people can produce and recognize different levels of accentuation. All "high" emphasis sentences were rated as being more intense (5.2, on average) than the "low" emphasis sentences (4.1, on average). Both conditions were rated as more intense than the "no" emphasis sentences (1.9).
Interestingly, "verb" sentences were rated as being more intense than either the "noun" or "adjective" sentences, which were remarkably similar. Critically, the pattern of intensity ratings was the same for each of the ten sentences strongly suggesting that the effect was solely due to the semantic role of the emphasized word. We are currently employing this framework to more closely examine the multimodal production and perception of emphatic speech.
|•||Bülthoff HH and Wallraven C (May-2006): Multimodal Recognition and Categorization, Vision Science Society Panel Presentation, Sarasota, FL, USA. |
|•||Bülthoff HH (January-25-2006): Perception and Action in Virtual Environments, 41st Winter Seminar 2006, Klosters, Switzerland. |
|•||Bülthoff HH (January-16-2006): Integration of visual, auditory and vestibular information in spatial orientation and control tasks, Bayesian Cognition Workshop, Paris, France. |
|•||Bülthoff HH, Thornton IM, Vuong QC and Chuang L (November-9-2005) Abstract Talk: Recognising novel deforming objects, 13th Annual Workshop on Object Perception, Attention, and Memory (OPAM 2005), Toronto, Canada, 13 3. |
Current theories of visual object recognition tend to focus on static properties, particularly shape. Nonetheless, visual perception is a dynamic experience as a result of active observers or moving objects. Here, we investigate whether dynamic information can influence visual object-learning. Three learning experiments were conducted that required participants to learn and subsequently recognize different non-rigid objects that deformed over time. Consistent with previous studies of rigid depth-rotation, our results indicate that human observers do represent object-motion. Furthermore, our data suggest that dynamic information could compensate when static cues are less reliable, for example, as a result of viewpoint variation.
|•||Bülthoff HH (September-15-2005): Towards a better understanding of motion simulation: a human perspective, 8th Driving Simulator Conference Europe (DSC 2005 Europe), Guyancourt, France. |
|•||Bülthoff HH, Blanz V, Breidt M, Krimmel M, Schmiedeberg T, Straub-Duffner S, Scherbaum K and Reinert S (August-30-2005) Abstract Talk: 3D Facial Growth in Healthy Caucasian Infants, 17th International Conference on Oral & Maxillofacial Surgery (ICOMS 2005), Wien, Austria. |
|•||Bülthoff HH and Fleming RW (August-2005) Abstract Talk: Fourier cues to 3-D shape, 28th European Conference on Visual Perception, A Coruña, Spain, Perception, 34 (ECVP Abstract Supplement) 53. |
If you pick up a typical vision text, you'll learn there are many cues to 3-D shape, such as shading, linear perspective, and texture gradients. Much work has been done to study each cue in isolation and also how the various cues can be combined optimally. However, relatively little work has been devoted to finding commonalities between cues. Here, we present theoretical work that demonstrates how shape from shading, texture, highlights, perspective, and possibly even stereopsis could share some common processing strategies. The key insight is that the projection of a 3-D object into a 2-D image introduces dramatic distortions into the local image statistics. It does not matter much whether the patterns on a surface are due to shading, specular reflections, or texture: when projected into the image, the resulting distortions reliably cause anisotropies in the local Fourier spectrum. Globally, these anisotropies are organised into smooth, coherent patterns, which we call 'orientation fields'. We have argued recently [Fleming et al, 2004 Journal of Vision 4(9) 798 - 820] that orientation fields can be used to recover shape from specularities. Here we show how orientation fields could play a role in a wider range of cues. For example, although diffuse shading looks completely unlike mirror reflections, in both cases image intensity depends on 3-D surface orientation. Consequently, derivatives of surface orientation (curvature) are related to derivatives of image intensity (intensity gradients). This means that both shading and specularities lead to similar orientation fields. The mapping from orientation fields to 3-D shape is different for other cues, and we exploit this to create powerful illusions. We also show how some simple image-processing tricks could allow the visual system to 'translate' between cues. Finally, we outline the remaining problems that have to be solved to develop a 'unified theory' of 3-D shape recovery.
|•||Bülthoff HH, von der Heyde M and Berger DR (July-25-2005): Cognitive influences on self-rotation perception, 1st International Conference on Augmented Cognition (HCI International 2005), Las Vegas, NV, USA. |
In this study we examined the types of information that can influence the perception of upright (yaw) rotations. Specifically, we examined the influence of stimulus magnitude, task-induced attention and awareness of inter-sensory conflicts on the weights of visual and body cues. Participants had to reproduce rotations that were presented as simultaneous physical body turns (via a motion platform) and visual turns displayed as a rotating scene. During the active reproduction stage, conflicts between the body and visual rotations were introduced by means of gain factors. Participants were instructed to reproduce either the visual scene rotation or the body rotation. After each trial participants reported whether or not they had perceived a conflict. We found significant influences of the magnitude of the rotation, attention condition (instruction to reproduce platform or scene rotation), and reported awareness of a sensory conflict during the reproduction phase. Attention had a larger influence on the response of the participants when they noticed a conflict compared to when they did not perceive a conflict. Attention biased their response towards the attended modality. Our results suggest that not only the stimulus characteristics, but also cognitive factors play a role in the estimation of the size of a rotation in an active turn reproduction task.
|•||Bülthoff HH (June-19-2005) Keynote Lecture: Multimodal Sensor Fusion in Man and Machine, Robotics: Science and Systems I (RSS 2005), Cambridge, MA, USA. |
|•||Bülthoff HH and Berger D (June-2005) Abstract Talk: Effects of Attention and Cue Conflict Awareness on Multimodal Integration in Self-Rotation Perception, 6th International Multisensory Research Forum (IMRF 2005), Trento, Italy, 6 18-19. |
We investigated how the influence of visual and body cues on the perception of yaw rotations depends on focusing attention to either cue, and on becoming aware of conflicts between the two modalities. Participants experienced passive whole-body yaw rotations and concurrent visual rotations on a motion platform. They then had to turn back actively, while attending to either visual rotation or body rotation. During return we introduced a conflict between visual and body rotation by means of a gain factor. After each return, participants had to respond whether or not they had noticed a conflict. We found that the weight of the visual cue on the response was significantly higher for small than for large rotations. It was also significantly higher when participants attended to the visual rotation compared to platform rotation, showing that attention has a significant influence on the weights in the integration. Further analysis revealed that the effect of attention on the cue weights was significantly larger if participants noticed conflicts than if they did not. We conclude that participants can use attention to bias the cue weights in self-motion perception towards the attended modality, and that this effect is increased when a conflict between the cues is noticed.
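One simple way to read a cue weight off such a gain-conflict trial is sketched below. It assumes a linear model that the abstract does not spell out: during the return, the visual rotation equals the gain times the physical body turn, the perceived rotation is a weighted sum of the two, and the participant stops when the perceived rotation matches the target angle. The function name `visual_weight` and the numbers are hypothetical.

```python
def visual_weight(target_angle, body_turn, gain):
    """Visual cue weight implied by one turn-reproduction trial.

    Assumed model (illustrative, not taken from the abstract):
        visual_turn = gain * body_turn
        perceived   = w * visual_turn + (1 - w) * body_turn
        the participant stops when perceived == target_angle
    Solving for w gives the expression returned below.
    """
    if gain == 1.0:
        raise ValueError("weight is undefined without a cue conflict (gain == 1)")
    return (target_angle / body_turn - 1.0) / (gain - 1.0)


# Hypothetical trial: 90 degree target, visual gain of 2, body turned 60 degrees.
w = visual_weight(target_angle=90.0, body_turn=60.0, gain=2.0)
print(w)
```

Under this model a participant who relies only on vision would stop after a 45 degree body turn (weight 1), and one who relies only on body cues would stop after 90 degrees (weight 0).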
|•||Bülthoff HH (May-16-2005) Invited Lecture: Perception and Action in Virtual Environments, Department of Psychology, Trinity College, Dublin, Ireland. |
|•||Bülthoff HH (May-6-2005): Novel Egomotion Simulators, Fifth Annual Meeting of the Vision Sciences Society (VSS 2005), Sarasota, FL, USA. |
|•||Bülthoff HH (April-27-2005): Object Recognition in Man and Machine, ICTP Workshop on Genes, Development and the Emergence of Behaviour, Psychophysics of Higher Cognitive Functions, Trieste, Italy. |
|•||Bülthoff HH (April-6-2005): Psychophysics in the 21. Century, FhG-MPG Workshop "Mathematik / Informatik", Sankt Augustin, Germany. |
|•||Bülthoff HH (March-4-2005) Invited Lecture: Wie kommt die Welt in den Kopf? Sehen und Erkennen in Natur und Technik, Verband Deutscher Maschinen- und Anlagenbau (VDMA) Mitgliederversammlung, Dresden, Germany. |
|•||Bülthoff HH (February-28-2005) Invited Lecture: Einführung in die Wahrnehmungsforschung, Blockpraktikum Psychophysik, Tübingen, Germany. |
|•||Bülthoff HH (January-21-2005) Invited Lecture: Perception and Action in Virtual Environments, NASA Ames Research Center, Moffett Field, CA, USA. |
|•||Bülthoff HH (January-20-2005) Invited Lecture: Perception and Action in Virtual Environments, Valve Workshop, Electronic Imaging 2005, San Jose, CA, USA. |
|•||Bülthoff HH (October-12-2004) Invited Lecture: Perspektiven der Wahrnehmungsforschung, Lions Club, Pforzheim, Germany. |
|•||Bülthoff HH (September-17-2004): Object Recognition, European Summer School "Visual Neuroscience: From Spikes to Awareness", Schloss Rauischholzhausen, Germany. |
|•||Bülthoff HH (August-6-2004) Keynote Lecture: Object Recognition in Man and Machine, International Workshop on Object Recognition, Attention, and Action, Kyoto, Japan. |
|•||Bülthoff HH, Welchman AE and Maier SJ (August-2004) Abstract Talk: The Role of Extra-Retinal Cues in Velocity Constancy, 5. Neurowissenschaftliche Nachwuchskonferenz Tübingen (NeNa '04), Oberjoch, Germany, 5 13. |
To estimate the real world speed of an object the velocity of the retinal projection must be scaled by the perceived distance. If observers perceive objects travelling with the same speed at different distances from the eye as equally fast, they are said to exhibit velocity constancy. However, not all studies examining velocity constancy support the idea that observers can scale speeds for the viewing distance. In fact they suggest that subjects perceive angular rather than objective velocities (McKee & Welch 1989). The degree to which velocity constancy is observed depends on the information provided by the stimulus and its surround (Wallach 1939, Epstein 1978, Zohary & Sittig 1993). So far, studies on velocity constancy and distance have not considered the separate contribution of vergence as a cue to distance. Here, we specifically investigate whether eye vergence (as an extra-retinal cue to distance) contributes to velocity constancy. Subjects viewed two sequentially-presented rotating wire-frame spheres moving horizontally in the frontoparallel plane. They were required to report whether or not the speed of the second sphere exceeded the objective velocity of the first one. By varying the disparity of the second sphere with respect to the background plane, we could investigate the constancy of velocity judgments at different disparity defined distances. Under conditions of vergence to the plane of the presentation screen, observers produced data consistent with velocity constancy.
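The scaling step described in the first sentence can be made concrete with a small-angle sketch: an object moving in the frontoparallel plane near the line of sight has an angular speed of roughly its physical speed divided by its distance, so multiplying the angular speed by the perceived distance recovers the physical speed. The function names and values are assumptions for illustration and are not taken from the study.

```python
import math


def retinal_angular_speed(object_speed, distance):
    """Angular speed in deg/s of an object moving at object_speed (m/s)
    in the frontoparallel plane at the given distance (m).
    Small-angle approximation near the line of sight."""
    return math.degrees(object_speed / distance)


def scaled_speed(angular_speed_deg, perceived_distance):
    """Physical speed estimate (m/s) obtained by scaling the angular
    speed by the perceived distance."""
    return math.radians(angular_speed_deg) * perceived_distance


# The same 0.1 m/s object at two hypothetical viewing distances.
near = retinal_angular_speed(0.1, 0.5)
far = retinal_angular_speed(0.1, 1.0)
print(near, far)                                    # angular speeds differ
print(scaled_speed(near, 0.5), scaled_speed(far, 1.0))  # scaled speeds agree
```

An observer who judged speed from the angular values alone would see the nearer object as twice as fast; velocity constancy corresponds to judging from the scaled values.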
|•||Bülthoff HH, Newell FN, Hansen PC, Steven MS and Calvert GA (June-2004) Abstract Talk: An fMRI investigation of visual, tactile and visuo-tactile “what” and “where” dissociations, 5th International Multisensory Research Forum (IMRF 2004), Barcelona, Spain, 5 (70). |
Visual information about the shape and location of objects is processed with different but interrelated pathways. Considerably less is understood about the existence of similar pathways in the tactile domain and how the tactile and visual domains converge to form a coherent multisensory percept. The present fMRI study was conducted to determine how the tactile and visual modalities interact during both shape ("what") and location ("where") tasks. In the visual-visual condition, the "what" task activated a large number of brain areas not observed in the "where" task including hippocampus, fusiform and lingual gyrus, middle and inferior frontal gyri. No additional brain areas were stimulated in the "where" than "what" tasks suggesting that "where" tasks recruit a subset of those brain areas involved in matching information relating to identity. In contrast, activity during the tactile-tactile condition differed according to task. Activity during the "what" task was greater in the right superior temporal gyrus, and during the "where" task in the left inferior parietal lobule. Brain areas activated during visuo-tactile object recognition included areas previously implicated in visuo-tactile object matching tasks. These areas were not similarly active during the visuo-tactile "where" task suggesting they may be specific for crossmodal object recognition.
|•||Bülthoff HH (May-29-2004): Artificial and Natural Vision, 3rd Peter Wallenberg Symposium Sensing and Feeling, Helsinki, Finland. |
|•||Bülthoff HH (May-26-2004): Categorization and Recognition of Structures, Events and Objects, Final Review Meeting of the EU IST Project CogVis, Stockholm, Sweden. |
|•||Bülthoff HH (March-12-2004) Invited Lecture: Einführung in die Wahrnehmungsforschung, Blockpraktikum Psychophysik, Tübingen, Germany. |
|•||Bülthoff HH (December-1-2003): Die hohe Kunst des Sehens. Oder: Was können die Computer noch vom Menschen lernen?, Siemens Stiftung, München, Germany. |
|•||Bülthoff HH (November-28-2003): Perception and Action in Virtual Environments, MPG-Sektionssymposium, Berlin, Germany. |
|•||Bülthoff HH, Wallraven C and Schwaninger A (November-2003) Abstract Talk: Computational modeling of face recognition, 44th Annual Meeting of The Psychonomic Society, Vancouver, Canada, 44 26. |
Recent psychophysical results on face recognition (Schwaninger et al., 2002) support the notion that processing of faces relies on two separate routes. The first route processes high-detail components of the face (such as eyes, mouth, etc.), whereas the second route processes the configural relationship between these components. This model was successfully used to explain several aspects of face recognition, such as the Thatcher Illusion or the stimuli composed by Young et al. (1987). We discuss a computational framework, in which we implemented configural and component processing using image fragments and their spatial layout. Using the stimuli from the original psychophysical study, we were able to model the recognition performance. In addition, large-scale tests with highly realistic computer-rendered faces from the MPI database show better performance and robustness than do other computational approaches using one processing route only.
|•||Bülthoff HH (October-27-2003) Keynote Lecture: Multimodal Sensor Fusion in the Human Brain, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, NV, USA. |
|•||Bülthoff HH (October-7-2003): Perception and Action in Virtual Environments, Telepresence and Teleaction, München, Germany. |
|•||Bülthoff HH, von der Heyde M, Riecke BE and Schulte-Pelkum J (October-2003) Abstract Talk: Circular vection is facilitated by a consistent photorealistic scene, 6th Annual Workshop on Presence (Presence 2003), Aalborg, Denmark, 6 37. |
It is well known that large visual stimuli that move in a uniform manner can induce illusory sensations of self-motion in stationary observers. This perceptual phenomenon is commonly referred to as vection. The prevailing notion of vection is that the illusion arises from bottom-up perceptual processes and that it mainly depends on physical parameters of the visual stimulus (e.g., contrast, spatial frequency etc.). In our study, we investigated whether vection can also be influenced by top-down processes: We tested whether a photorealistic image of a real scene that contains consistent spatial information about pictorial depth and scene layout (e.g., linear perspective, relative size, texture gradients etc.) can induce vection more easily than a comparable stimulus with the same image statistics where information about relative depth and scene layout has been removed. This was done by randomly shuffling image parts in a mosaic-like manner. The underlying idea is that the consistent photorealistic scene might facilitate vection by providing the observers with a convincing mental reference frame for the simulated environment so that they can feel "spatially present" in that scene. That is, the better observers accept this virtual scene instead of their physical surrounding (i.e., the simulation setup) as the primary reference frame, the less conflict between the two competing reference frames should arise and therefore spatial presence and ego-motion perception in the virtual scene should be enhanced. In a psychophysical experiment with 18 observers, we measured vection onset times and convincingness ratings of sensed ego-rotations for both visual stimuli. Our results confirm the hypothesis that cognitive top-down processes can influence vection: On average, we found 50% shorter vection onset times and 30% higher convincingness ratings of vection for the consistent scene.
This finding suggests that spatial presence and ego-motion perception are closely related to one another. The results are relevant both for the theory of ego-motion perception and for ego-motion simulation applications in Virtual Reality.
|•||Bülthoff HH (September-20-2003): State of the art lecture, 6. Bamberger Morphologietage, Bamberg, Germany. |
|•||Bülthoff HH (July-3-2003): Virtuelle Welten: Ein neuer Weg zur Erforschung des Gehirns, Neurobiologisches Kolloquium der Universität Oldenburg, Oldenburg, Germany. |
|•||Bülthoff HH, Cunningham DW, Wallraven C and Breidt M (July-2003) Abstract Talk: Facial Animation Based on 3D Scans and Motion Capture, Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH 2003), San Diego, CA, USA. |
One of the applications of realistic facial animation outside the film industry is psychophysical research in order to understand the perception of human facial motion. For this, an animation model close to physical reality is important. Through the combination of high-resolution 3D scans and 3D motion capture, we aim for such a model and provide a prototypical example in this sketch. State-of-the art 3D scanning systems deliver very high spatial resolution but usually are too slow for real-time recording. Motion capture (mocap) systems on the other hand have fairly high temporal resolution for a small set of tracking points. The idea presented here is to combine these two in order to get high resolution data in both domains that is closely based upon real-world properties. While this is similar to previous work, for example [Choe et al. 2001] or [Pighin et al. 2002], the innovation of our approach lies in the combination of precision 3D geometry, high resolution motion tracking and photo-realistic textures.
|•||Bülthoff HH, Newell FN and Ernst M (June-2003) Abstract Talk: Multisensory perception of actively explored objects, 4th International Multisensory Research Forum (IMRF 2003), Hamilton, Canada, 4 (76). |
Many objects in our world can be picked up and freely manipulated, thus allowing information about an object to be available to both the visual and haptic systems. However, we understand very little about how object information is shared across the modalities. Under constrained viewing, cross-modal object recognition is most efficient when the same surface of an object is presented to the visual and haptic systems (Newell et al. 2001). Here we tested cross-modal recognition under active manipulation and unconstrained viewing of the objects. In Experiment 1, participants were allowed 30 seconds to learn unfamiliar objects visually or haptically. Haptic learning resulted in relatively poor haptic recognition performance relative to visual recognition. In Experiment 2, we increased the learning time for haptic exploration and found equivalent haptic and visual recognition, but a cost in cross-modal recognition. In Experiment 3, participants learned the objects using both modalities together, vision alone or haptics alone. Recognition performance was tested using both modalities together. We found that recognition performance was significantly better when objects were learned by both modalities than by either of the modalities alone. Our results suggest that efficient cross-modal performance depends on the spatial correspondence of object information across modalities.
|•||Bülthoff HH (May-6-2003): Human Psychophysics and Presence, Telecom Italia Future Center, Venezia, Italy. |
|•||Bülthoff HH (January-23-2003): What Computers can't do yet: See and Feel, 38th Manfred Eigen Winter Seminar, Klosters, Switzerland. |
|•||Bülthoff HH (January-21-2003) Invited Lecture: Wie kommt die Welt in den Kopf? Sehen und Erkennen in Natur und Technik, Fachhochschule Darmstadt, Fachbereich Mathematik und Naturwissenschaften, Darmstadt, Germany. |
|•||Bülthoff HH (January-14-2003) Keynote Lecture: Biomorphic Robotics, FET Information Event "Beyond Robotics", Brussels, Belgium. |
|•||Bülthoff HH (January-8-2003): Le codage égocentrique dans la perception visuelle et haptique des objets, Chaire de Physiologie de la Perception et de l'Action M. Alain Berthoz, Institut de Mathématiques, College de France, Paris, France. |
|•||Bülthoff HH (November-30-2002): Virtual Reality as a Tool to Study Human Perception and Cognition, IEEE Conference on Visualization 2002 (VIS '02), Boston, MA, USA. |
|•||Bülthoff HH (November-28-2002) Invited Lecture: Wie kommt die Welt in den Kopf? - Sehen und Erkennen in Natur und Technik, Ambassador Club, Bamberg, Germany. |
|•||Bülthoff HH, Tjan BS, Kourtzi Z, Grodd W and Lestou V (November-2002) Abstract Talk: Human fMRI Studies of Visual Processing in Noise, 32nd Annual Meeting of the Society for Neuroscience (Neuroscience 2002), Orlando, FL, USA, 32 (721.1). |
Processing of visual information entails the extraction of features from retinal images that mediate visual perception. In the human ventral cortex, retinotopic and higher visual areas (e.g. Lateral Occipital Complex-LOC) have been implicated in the analysis of simple and more complex features respectively. To test how processing of complex natural images progresses across the human ventral cortex, we used images of scenes and added visual noise that matched the signal in spatial-frequency power spectrum. The resulting images were rescaled to ensure constant mean luminance and r.m.s. contrast across all noise levels. We localized individually in each observer the retinotopic regions and the LOC and measured event-related BOLD response in these regions during a scene discrimination task performed at 4 noise levels. Behavioral accuracy increased with increasing signal-to-noise ratio (SNR). We found that log %BOLD signal change from fixation baseline vs. log SNR is well-described by a straight line for all visual areas. The regression slope increased monotonically from lower to higher visual areas along the ventral stream. For example, changes by a factor of 8 in SNR produced little or no change to the BOLD response in V1/V2, but resulted in progressively larger increases in V4v, posterior, and anterior sub-regions of the LOC. These findings suggest that the use of visual noise can reveal the progression in complexity of the natural-image features that are processed across the human visual areas.
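The straight-line relationship between log BOLD change and log SNR corresponds to an ordinary least-squares fit in log-log coordinates, where the slope measures how strongly an area's response grows with signal-to-noise ratio. The sketch below fits such a line to synthetic values generated from a known power law; the numbers are made up to exercise the code and are not the study's data, and the function name `loglog_slope` is hypothetical.

```python
import math


def loglog_slope(snr, bold_change):
    """Least-squares slope and intercept of log10(bold_change) vs log10(snr)."""
    xs = [math.log10(v) for v in snr]
    ys = [math.log10(v) for v in bold_change]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic example: four noise levels spanning a factor of 8 in SNR,
# with responses following bold = 0.5 * snr ** 0.3 (an arbitrary power law).
snr_levels = [1.0, 2.0, 4.0, 8.0]
synthetic_bold = [0.5 * s ** 0.3 for s in snr_levels]
slope, intercept = loglog_slope(snr_levels, synthetic_bold)
print(slope, intercept)
```

In these terms, the reported pattern would appear as a slope near zero for early areas such as V1/V2 and progressively larger slopes for areas further along the ventral stream.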
|•||Bülthoff HH (October-23-2002): Objekterkennung in Biologie und Technik, Kolloquium des Instituts für Kognitionswissenschaft, Osnabrück, Germany. |
|•||Bülthoff HH (October-4-2002): Wie kommt die Welt in unseren Kopf?, "Salon", Tübingen, Germany. |
|•||Bülthoff HH (September-19-2002): High-level Vision in Man and Machine, Eidgenössische Technische Hochschule Zürich, Zürich, Switzerland. |
|•||Bülthoff HH (August-14-2002) Keynote Lecture: View-Based Dynamic Object Recognition Based on Human Perception, 16th International Conference on Pattern Recognition (ICPR 2002), Québec, Canada. |
|•||Bülthoff HH, Fahle M and Franz VH (August-2002) Abstract Talk: Are motor effects of visual illusions caused by different mechanisms than the perceptual illusions?, 25th European Conference on Visual Perception, Glasgow, UK, Perception 31 (ECVP Abstract Supplement) 144. |
In previous studies, we found effects of the Ebbinghaus (or Titchener) illusion on grasping. This contradicts the notion that the motor system uses visual transformations which are (a) different from the perceptual transformations and (b) unaffected by visual illusions [Milner and Goodale, 1995 The Visual Brain in Action (Oxford: Oxford University Press)]. Here, we tested whether the grasp effects are generated independently from the perceptual illusions. This could be the case if the motor system treated the illusion-inducing context elements as obstacles and tried to avoid them. To test this hypothesis, we varied the distance between context elements and target. Aluminum discs (31, 34, or 37 mm in diameter) were surrounded by small or large context circles (10 or 58 mm in diameter) at one of two distances (24 or 31 mm midpoint target disc to nearest point on context circles). In the perceptual task, fifty-two participants adjusted the size of a comparison stimulus to match the size of the target disc. In the grasping task, participants grasped the target disc. The trajectories were recorded and the maximum grasp apertures determined. The motor illusion responded to the variation of distance between context elements and target disc in exactly the same way as the perceptual illusion. This suggests that the same neuronal signals are responsible for the perceptual and for the motor illusion.
|•||Bülthoff HH, Thornton IM and Vuong QC (August-2002) Abstract Talk: Direction asymmetries for incidentally processed walking figures, 25th European Conference on Visual Perception, Glasgow, UK, Perception 31 (ECVP Abstract Supplement) 151. |
Recently we have begun to explore the incidental processing of biological motion. We ask whether walking figures that an observer is told to ignore still affect performance on a primary task. Using a number of different paradigms, we have shown that to-be-ignored walkers are still processed and can affect behaviour. During the course of these studies we have observed that such incidental effects are often modulated by the left - right orientation of the ignored walkers. More specifically, the extent of interference tends to be much larger when the to-be-ignored figures are shown in left profile versus right profile. Furthermore, the magnitude of the asymmetry tends to be much larger when the primary task itself is attentionally demanding. Here, we present data from two paradigms, an Eriksen flanker task and a novel 'checkerboard' task. In the latter, alternate display squares contain either a walking figure or a patch of randomly moving dots. Observers are told to ignore the walkers and have to make judgments on the relative phase of the dot patterns. Data from both tasks are used to illustrate the aforementioned direction asymmetry and the results are discussed in terms of canonical viewpoints for attentional sprites.
|•||Bülthoff HH (July-9-2002): Virtuelle Welten: Ein neuer Weg zur Erforschung des Gehirns, Universität Mainz: Studium Generale, Mainz, Germany. |
|•||Bülthoff HH (July-1-2002): San Bernardino Tunnel, Gestaltung und Tunnelsicherheit, Hochschule für Technik und Wirtschaft Chur, Chur, Switzerland. |
|•||Bülthoff HH (June-1-2002): Image-based object recognition, International Symposium at the Hanse Wissenschaftskolleg: SFB 517, Delmenhorst, Germany. |
|•||Bülthoff HH (March-19-2002): Image-based object recognition in man and machines, University of Southern California, Los Angeles, CA, USA. |
|•||Bülthoff HH (March-18-2002): Image-based object recognition in man and machines, California Institute of Technology (Caltech), Pasadena, CA, USA. |
|•||Bülthoff HH (December-8-2001): Recognition with local features under illumination changes, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, USA. |
|•||Bülthoff HH (November-26-2001) Invited Lecture: Biologische und maschinelle Objekterkennung, Universität Bremen, Montagskolloquium SFB 517 (Neurokognition), Bremen, Germany. |
|•||Bülthoff HH (November-20-2001): Dynamic Facial Expressions, EU Comic Meeting, Bruxelles, Belgium. |
|•||Bülthoff HH (November-7-2001): Object and Face Recognition in Man and Machines, Mathematisches Forschungsinstitut, Oberwolfach, Germany. |
|•||Bülthoff HH (October-18-2001): Object and Face Recognition in Man and Machines, Universität Berlin, Institut für Psychologie: Graduiertenkolleg, Berlin, Germany. |
|•||Bülthoff HH (October-11-2001) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Zeppelin Museum, Friedrichshafen, Germany. |
|•||Bülthoff HH (August-24-2001): Object and Face Recognition in Man and Machines, Stanford University, Department of Psychology, Stanford, USA. |
|•||Bülthoff HH (August-7-2001): Object and Face Recognition in Man and Machines, University of Berkeley, USA. |
|•||Bülthoff HH, Thornton IM and Knappmeyer B (August-2001) Abstract Talk: Characteristic motion of human face and human form, Twenty-fourth European Conference on Visual Perception, Kusadasi, Turkey, Perception 30 (ECVP Abstract Supplement) 33. |
Do object representations contain information about characteristic motion as well as characteristic form? To address this question we recorded face and body motion of human actors and applied these patterns to computer models. During an incidental learning phase observers were asked to make trait judgments about these animated faces (experiment 1) or characters (experiment 2). During training, the faces and characters always moved with the motion of one particular actor. For example, face A was always animated with actor A's motion, and face B with actor B's motion. In tests, stimuli were either consistent (face A/actor A) or inconsistent (face A/actor B) relative to training. In addition, we systematically introduced ambiguity to the form of the stimuli (eg morphing between face A and face B). Results indicate that as form becomes less informative, observers' responses become biased by the incidentally learned motion patterns. We conclude that information about characteristic motion seems to be part of the representation of these objects. As shape and motion information can be combined independently with this technique, future studies will allow us to quantify the relative importance of characteristic motion versus characteristic form.
|•||Bülthoff HH (July-30-2001): Image-based object recognition in man and machines, Workshop on Vision Based Object Recognition in Robotics, 2001 IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA 2001), Banff, Canada. |
|•||Bülthoff HH (July-30-2001): Dynamic Aspects of Object and Face Recognition, Stockholm Workshop on Computational Vision, Rosenön Island, Sweden. |
|•||Bülthoff HH (June-28-2001): Sehen und Erkennen in Natur und Technik (und Kunst), Dissertationswettbewerb, Max-Planck-Institut für Psychologische Forschung, München, Germany. |
|•||Bülthoff HH (November-29-2000): Image-based object recognition, Gemeinsames Forschungskolloquium "Theoretische und Experimentelle Kognitions-Psychologie" des Max Planck Instituts für Psychologische Forschung und der Allgemeinen und Experimentellen Psychologie der Ludwig-Maximilians-Universität München LMU, München, Germany. |
|•||Bülthoff HH (November-3-2000): Image-based Object Recognition, University of Glasgow, Psychology Department, Glasgow, UK. |
|•||Bülthoff HH (October-27-2000): Image-based Object Recognition, University of Zürich, Institute of Neuroinformatics, Zürich, Switzerland. |
|•||Bülthoff HH (August-27-2000) Keynote Lecture: Visual, haptic, and vestibular cue integration, 23rd European Conference on Visual Perception (ECVP 2000), Groningen, Netherlands, Perception 29 (ECVP Abstract Supplement) 3-4. |
In the past we have studied the integration of different visual cues for depth perception. Recently we have begun to study the interaction between cues from different sensory modalities. In a recent paper (Ernst et al, 2000 Nature Neuroscience 3 69 - 73) we could show that active touch of a surface can change the visual perception of surface slant. Apparently, haptic information can change the weights assigned to different visual cues for surface orientation. In another multisensory integration project we could show that visual and haptic information about the shape of objects can lead to a common representation with cross modal access [Bülthoff et al, 1999 Investigative Ophthalmology & Visual Science 40(4) 398]. Now we are investigating another input into our spatial representation system. Using a 6-DOF Motion Platform we are studying the interaction between the vestibular and the visual system for recognition. I report on first experiments that show that we can derive reliable information about position and velocity of a moving observer from the vestibular system. This information could be used for spatial updating in recognition tasks where the recognition of objects or scenes is facilitated by knowing the position and viewing direction of the observer.
|•||Bülthoff HH, Cunningham DW and Chatziastros A (August-2000) Abstract Talk: Can we be forced off the road by the visual motion of snowflakes? Immediate and longer-term responses to visual perturbations, 23rd European Conference on Visual Perception (ECVP 2000), Groningen, Netherlands, Perception 29 (ECVP Abstract Supplement) 118. |
Several sources of information have been proposed for the perception of heading. Here, we independently varied two such sources (optic flow and viewing direction) to examine the influence of perceived heading on driving. Participants were asked to stay in the middle of a straight road while driving through a snowstorm in a simulated, naturalistic environment. Subjects steered with a force-feedback steering wheel in front of a large cylindrical screen. The flow field was varied by translating the snow field perpendicularly to the road, producing a second focus of expansion (FOE) with an offset of 15°, 30°, or 45°. The perceived direction was altered by changing the viewing direction 5°, 10°, or 15°. The onset time, direction, and magnitude of the two disturbances were pseudo-randomly ordered. The translating snow field caused participants to steer towards the FOE of the snow, resulting in a significant lateral displacement on the road. This might be explained by induced motion. Specifically, the motion of the snow might have been misperceived as a translation of the road. On the other hand, changes in viewing direction resulted in subjects steering towards the road's new vantage point. While the effect of snow persisted over repeated exposures, the viewing-direction effect attenuated.
|•||Bülthoff HH (June-26-2000) Keynote Lecture: Computer Graphics Psychophysics, 11th Eurographics Workshop on Rendering Techniques, Brno, Czech Republic. |
|•||Bülthoff HH (June-23-2000) Invited Lecture: Image-based Object Recognition, Ecole Polytechnique & Laboratoire de Physiologie pour la Perception et l'Action: Collège de France (LPPA), Paris, France. |
|•||Bülthoff HH (May-15-2000) Invited Lecture: Image-based Object Recognition and Example-based Face Synthesis, First IEEE International Conference on Biologically Motivated Computer Vision (BMCV 2000), Seoul, Korea. |
|•||Bülthoff HH (February-25-2000): Multisensory Recognition of Objects, 3. Tübinger Wahrnehmungskonferenz (TWK 2000), Tübingen, Germany, 24. |
these representations are useful, if not essential, in a wide variety of cognitive tasks such as identification of objects, guiding actions and in directing spatial awareness and attention. Determining the properties of this representation has long since been a contentious issue. One method of probing the nature of human representation is by determining the extent to which it can surpass or go beyond visual (or sensory) experience. From a strictly empiricist standpoint what cannot be seen cannot be represented; except as a combination of things that have been experienced. In this case representation is always limited by experience and one such limitation on experience is that we always perceive the world from a specific viewpoint determined by our position in space. We show that going beyond experience is extremely difficult to do. This is demonstrated mainly by the learning and recognition of objects, both novel and familiar. However, from a psychological standpoint it is pointless discussing representation devoid of the functional role it plays in facilitating cognitive tasks. In considering the functional role of representation we must shed the simplifying assumption of an independent and modular visual system that reconstructs distal space and replace it with a functional definition which depends on the cognitive task and which is limited by attention. We therefore also present an overview of a new series of ‘old-fashioned’ object and scene recognition studies carried out within realistic, interactive, contexts. We find the most flexible means of looking at the functional role of representation is within virtual (computer generated) contexts. Computer simulations can now provide both highly realistic visual contexts as well as realistic interactivity, including feedback. We demonstrate how this new technology can be used to address an old problem. 
In cases where this technology is not advanced enough to provide multisensory information about the shape of objects, we used real objects made out of Lego™ bricks. With these objects we studied how the brain exchanges visual and haptic information to build a more complete representation of object shape learned in one orientation and tested in a different orientation. We found that visual as well as haptic recognition strongly depends on the orientation difference between training and testing. Interestingly, we found that recognition across modalities was best for rotations that involved an exchange between the front and back of an object. Taken together, we conclude that the visual and haptic systems code view-specific representations of objects, but each system has its own "view" of an object. For the visual system it is the surface of the object facing the observer; for the haptic system, it is the surface of the object that the fingers explore more extensively, namely, the backside of the object.
|•||Bülthoff HH (February-12-2000): Recognition and Navigation in Virtual Environments, Institute for Hearing Accessibility Research (IHEAR) Workshop on Acoustic Ecology, Vancouver, Canada. |
|•||Bülthoff HH (January-22-2000): Image-based Recognition in Man, Monkey and Machines, Interdisziplinäres Kolloquium, Klosters, Switzerland. |
|•||Bülthoff HH (November-25-1999) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Universität Kaiserslautern, Studium Integrale, Kaiserslautern, Germany. |
|•||Bülthoff HH (November-12-1999): How to cheat and get away with it or what computer graphics can learn from human psychophysics, Eberhard-Karls Universität, Wilhelm-Schickard Institut für Informatik (WSI-GRIS), Tübingen, Germany. |
|•||Bülthoff HH (October-3-1999) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Symposium "Turm der Sinne", Nürnberg, Germany. |
|•||Bülthoff HH (August-16-1999): Recognition of objects and scenes in virtual and real environments, Smith-Kettlewell Institute, San Francisco, CA, USA. |
|•||Bülthoff HH (August-10-1999): Image-based strategies in man, monkeys, and machines, 26th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 99), Los Angeles, CA, USA. |
|•||Bülthoff HH (July-21-1999): Multisensory recognition of objects and scenes, ATR Symposium on Face and Object Recognition, Kyoto, Japan. |
|•||Bülthoff HH (June-30-1999): Virtuelle Realität: ein methodisches Werkzeug bei Untersuchungen des Sehsystems, Neurologische Klinik, Freiburg, Germany. |
|•||Bülthoff HH (May-19-1999): Using virtual reality technology to study the human representation of space and objects, Werner Reimers Stiftung, Bad Homburg, Germany. |
|•||Bülthoff HH (March-23-1999): Die hohe Kunst des Sehens: Erkennen in Natur und Technik, Hospitalhof Stuttgart: Evangelisches Bildungswerk, Stuttgart, Germany. |
|•||Bülthoff HH (October-30-1998): Sixth Kanizsa Lecture: Perception and Action: Controlling the loop using Virtual Realities, University of Trieste, Trieste, Italy. |
|•||Bülthoff HH and Christou CG (August-26-1998) Abstract Talk: Vision in a Natural Environment, 21st European Conference on Visual Perception, Oxford, UK, Perception 27 (ECVP Abstract Supplement) 18. |
It has been twenty years since David Marr produced his ground-breaking framework of vision as a hierarchical combination of distinct modules, each performing its own computation on retinal input. This modular theory is a computational simplification that treats the goal of vision as the extraction of visual cues. Researchers have been addressing how each of the modules could possibly operate in isolation. To this end we have had many ingenious inventions such as the random-dot stereogram, intricate plaid patterns, and colourful Mondrians. However, the simplifications afforded by such thinking are often offset by the difficulties they introduce. First, the world does not consist of plaid patterns--it's more complex than that. Second, isolation of visual information almost inevitably leads to ambiguity in the reconstruction of the real world. The ill-posedness of vision with isolated cues can be resolved by the combination of cues: disparity, shading, texture, motion, etc. Using statistical methods such as the Bayesian framework allows for the maximisation of the information derived from various sources. But it still seems not to be enough. Perhaps seeing can be reformulated in a better way: vision does not start at the retina. Vision starts when a particular task has to be performed. The role of vision is not one of reconstruction of the real world in the brain but one of serving the needs of a mobile active being that functions in the real world. The talks presented in this session perhaps give a flavour of how it has been in Vision and also perhaps a flavour of how it will be in the future.
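One common textbook instance of the statistical cue combination mentioned above is reliability-weighted averaging, sketched below. It assumes independent Gaussian noise on each cue and a flat prior, which the abstract does not state; the function name `combine_cues` and the numbers are hypothetical and serve only as illustration.

```python
import math

def combine_cues(estimates, sigmas):
    """Combine single-cue estimates, weighting each by its reliability (1 / variance).

    Assumes independent Gaussian noise per cue and a flat prior.
    Returns the combined estimate, its standard deviation, and the weights.
    """
    precisions = [1.0 / (s ** 2) for s in sigmas]
    total_precision = sum(precisions)
    weights = [p / total_precision for p in precisions]
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_sigma = math.sqrt(1.0 / total_precision)
    return combined, combined_sigma, weights

# Hypothetical surface-slant estimates in degrees from two cues (made-up numbers):
# a more reliable cue (sigma = 2) and a less reliable cue (sigma = 4).
slant, slant_sigma, cue_weights = combine_cues([30.0, 40.0], [2.0, 4.0])
print(slant, slant_sigma, cue_weights)
```

Under these assumptions the combined standard deviation is never larger than that of the most reliable single cue, which is the sense in which combining cues reduces ambiguity.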
|•||Bülthoff HH (August-5-1998): View-based Recognition and Navigation in Natural Environments, 1998 Stockholm Workshop on Computational Vision, Rosenön, Sweden. |
|•||Bülthoff HH (June-25-1998): View-based Strategies for Recognition and Navigation, ENA Workshop on Neuroinformatics, Potsdam, Germany. |
|•||Bülthoff HH (June-19-1998): Wahrnehmen und Agieren im Raum, Universität Zürich. Psychologisches Institut, Zürich, Switzerland. |
|•||Bülthoff HH (April-20-1998): Gehirn und Wahrnehmung: Neueste Erkenntnisse aus der Hirnforschung, Heinz Nixdorf MuseumsForum, Paderborn, Germany. |
|•||Bülthoff HH (April-5-1998): Vision in the Perception Action Framework, Symposium "The Neurology of Vision: New Vistas", Tübingen, Germany. |
|•||Bülthoff HH (February-18-1998) Invited Lecture: Die Welt in unseren Köpfen: Sehen und Erkennen in Natur und Technik, Deutsches Museum, München, Germany. |
|•||Bülthoff HH (February-12-1998): Sehen und Erkennen in Technik und Biologie, Naturforschende Gesellschaft Graubünden, Chur, Switzerland. |
|•||Bülthoff HH (January-23-1998): Bild-basierte Objekterkennung, Universität Marburg, Marburg, Germany. |
|•||Bülthoff HH (October-12-1997): Computational theory of vision, Summerschool Graduierten Kolleg (GKN), Konstanz, Germany. |
|•||Bülthoff HH (October-3-1997): Scene recognition in virtual environments, Conference on Vision for Reach and Grasp, Minneapolis, MN, USA. |
|•||Bülthoff HH (September-3-1997): View-based object recognition, 4eme Assises de Programme de Recherche en Sciences Cognitives de Toulouse, Toulouse, France. |
|•||Bülthoff HH (April-4-1997): View-based shape representation, Spring School Conference, Utrecht, Netherlands. |
|•||Bülthoff HH (March-25-1997): The view-based approach to object recognition, scene perception and biological motion, Conference on Active Vision in Animals and Machines, Berlin, Germany. |
|•||Bülthoff HH (January-8-1997): View-based representations, navigation and biological motion perception, AVM Workshop on "Models, Views and Appearances: Contrasting Approaches to Representation?", St. Francois Longchamp, France. |
|•||Bülthoff HH, Bülthoff I and Edelman S (September-1996) Abstract Talk: Features of the representation space for 3D objects, 19th European Conference of Visual Perception, Strasbourg, France, Perception 25 (ECVP Abstract Supplement) 49-50. |
To explore the nature of the representation space of 3-D objects, we studied human performance in forced-choice classification of objects composed of four geon-like parts, emanating from a common centre. The two class prototypes were distinguished by qualitative contrasts (cross-section shape; bulge/waist), and by metric parameters (degree of bulge/waist, taper ratio). Subjects were trained to discriminate between the two prototypes (shown briefly, from a number of viewpoints, in stereo) in a 1-interval forced-choice task, until they reached a 90% correct-response performance level. In experiment 1, eleven subjects were tested on shapes obtained by varying the prototypical parameters both orthogonally (Ortho), and in parallel (Para) to the line connecting the prototypes in the parameter space. For the eight subjects who performed above chance, the error rate increased with the Ortho parameter-space displacement between the stimulus and the corresponding prototype: F(1,68) = 3.6, p < 0.06 (the effect of the Para displacement was marginal). Clearly, the parameter-space location of the stimuli mattered more than the qualitative contrasts (which were always present). To find out whether both prototypes or just the nearest neighbour of the test shape influenced the decision, in experiment 2 eight new subjects were tested on a fixed set of shapes, while the test-stage distance between the two classes assumed one of three values (Far, Intermediate, or Near). For the six subjects who performed above chance, the error rate (on physically identical stimuli) in the Near condition was higher than in the other two conditions: F(1,89) = 3.7, p < 0.06. The results of the two experiments contradict the prediction of theories that postulate exclusive reliance on qualitative contrasts, and support the notion of a metric representation space with the subjects' performance determined by distances to more than one reference point or prototype (cf Edelman, 1995 Minds and Machines 5 45 - 68).
|•||Bülthoff HH (July-22-1996): Integration of Visual Cues, Neuroinformatik Symposium, Schloss Reisensburg, Germany. |
|•||Bülthoff HH (May-28-1996): Psychophysik des Sehens, Bundesministerium für Bildung und Forschung, Bonn, Germany. |
|•||Bülthoff HH (January-26-1996): Object and Face Recognition, NHK Corporation, Tokyo, Japan. |
|•||Bülthoff HH (January-26-1996): View-based object recognition and navigation, IEICE Technical Meeting, Tokyo, Japan. |
|•||Bülthoff HH (January-23-1996): View-based object recognition: the role of parts, symmetry and illumination, ATR Symposium on Face and Object Recognition, Kyoto, Japan. |
|•||Bülthoff HH (December-7-1995): Psychophysical support for image-based object recognition, Second Asian Conference on Computer Vision, Singapore. |
|•||Bülthoff HH (September-15-1995): Recognition and navigation in virtual realities, British Association - Annual Festival of Science, University of Newcastle, Newcastle upon Tyne, England. |
|•||Bülthoff HH, Zabinski M, Blanz V and Tarr MJ (August-1995) Abstract Talk: To what extent do unique parts influence recognition across changes in viewpoint?, 18th European Conference on Visual Perception, Tübingen, Germany, Perception 24 (ECVP Abstract Supplement) 3. |
We investigated how varying the number of unique three-dimensional parts within an object influenced recognition across changes in viewpoint. Stimuli were realistically-shaded images of objects composed of five three-dimensional volumes linked end-to-end. Of the five parts within each object, either zero, one, three, or five were qualitatively distinct from other members of the recognition set (e.g., brick versus cone). Non-distinct parts were cylindrical tubes. Independent of the number of distinct parts, the three-dimensional angles between components were different for each object as in Bülthoff and Edelman (1992). In both sequential matching and naming tasks we compared the impact of depth rotations on recognition performance. Separate between-subject conditions were defined based on the number of distinct parts for each member of the recognition set. The No-Parts condition was run on all subjects and served as a baseline for the other conditions. For both tasks, three major results stand out. First, regardless of the number of qualitatively distinct parts there was an effect of viewpoint on recognition performance. Second, the impact of viewpoint change in the One-Part condition was less than that in each of the other conditions. Third, the addition of parts beyond a single unique part produced strong viewpoint-dependent recognition performance that was comparable to that obtained for objects with no distinct parts. Taken together these findings indicate that visual recognition may be accounted for by view-based models in which image-based representations include some qualitatively-defined features.
|•||Bülthoff HH and Troje NF (August-1995) Abstract Talk: Viewpoint invariance in face recognition: a closer look, 18th European Conference on Visual Perception, Tübingen, Germany, Perception 24 (ECVP Abstract Supplement) 13. |
|•||Bülthoff HH (June-29-1995): Objekterkennung und Raumorientierung ohne drei-dimensionale Repräsentation, Universität Ulm: Fakultät für Informatik, Abteilung Neuroinformatik, Ulm, Germany. |
|•||Bülthoff HH (June-22-1995): Sprache, Sehen, Gedächtnis: Neue Methoden der Hirnforschung, Hauptversammlung der Max-Planck Gesellschaft, Potsdam, Germany. |
|•||Bülthoff HH (June-12-1995): Objekterkennung und Raumorientierung ohne drei-dimensionale Repräsentation, Universität Bremen: Institut für Hirnforschung, Bremen, Germany. |
|•||Bülthoff HH (March-13-1995): How are three-dimensional objects represented in the brain?, AT&T, Bell Laboratories, Holmdel, NJ, USA. |
|•||Bülthoff HH (March-12-1995): Image-based Object Recognition, NECI Workshop, Princeton, NJ, USA. |
|•||Bülthoff HH (December-14-1994): Drei-dimensionale Objekterkennung ohne drei-dimensionale Repräsentation, Universität Bremen, Informatik-AG KI, Bremen, Germany. |
|•||Bülthoff HH (December-12-1994): Drei-dimensionale Objekterkennung ohne drei-dimensionale Repräsentation, Institut für Biologie II, Aachen, Germany. |
|•||Bülthoff HH (November-24-1994): Drei-dimensionale Objekterkennung ohne drei-dimensionale Repräsentation, Max-Planck Institut für psychologische Forschung, München, Germany. |
|•||Bülthoff HH (October-2-1994) Invited Lecture: Psychophysical support for a Bayesian framework for depth-cue integration, Annual Meeting of the Optical Society of America (OSA 1994), Dallas, TX, USA. |
|•||Bülthoff HH (September-6-1994): Image-based Object Recognition: Psychophysics, 17th Annual Meeting of the European Neuroscience Association (ENA 1994), Vienna, Austria, European Journal of Neuroscience 6 (Supplement 7) 67. |
|•||Bülthoff HH (July-11-1994): A Bayesian Framework for the Integration of Depth Cues, A&P Conference, Kyoto, Japan. |
|•||Bülthoff HH (July-7-1994): A Bayesian Framework for the Integration of Depth Cues, Stereo-Workshop, Tübingen, Germany. |
|•||Bülthoff HH (June-30-1994): Virtual Reality: Ein Werkzeug in der psychophysischen Gehirnforschung, Studium Generale, Tübingen, Germany. |
|•||Bülthoff HH (April-9-1994): How are three-dimensional objects represented in the brain?, Object Recognition Symposium, Syracuse, NY, USA. |
|•||Bülthoff HH (January-27-1994): Does the Seeing Brain know Physics?, Neurokolloquium, Tübingen, Germany. |
|•||Bülthoff HH (April-26-1993): 3D Objekterkennung ohne 3D Repräsentation, University of Bremen, Bremen, Germany. |
|•||Bülthoff HH (January-6-1993): Ideal observers and psychophysics: shape from texture, Chatham Meeting on "Perception as Bayesian Inference", Cape Cod, MA., USA. |
|•||Bülthoff HH (January-5-1993): A Bayesian approach to sensor fusion: strong coupling and competitive priors, Chatham Meeting on "Perception as Bayesian Inference", Cape Cod, MA., USA. |
|•||Bülthoff HH (October-28-1992): 3D Object Recognition without 3D Object Representation, University of Western Ontario, London, Ontario. |
|•||Bülthoff HH (April-20-1992): Psychophysical support for a 2D view interpolation theory of object recognition, Harvard University, Cambridge, MA., USA. |
|•||Bülthoff HH (April-3-1992): Integration of Visual Modules, Boston University, Boston, MA., USA. |
|•||Bülthoff HH (January-30-1992): 3D Object Recognition by 2D View Interpolation: more evidence from human and monkey psychophysics, Weizmann Institute, Rehovot, Israel. |
|•||Bülthoff HH (January-28-1992): Computer Graphics Psychophysics of early, middle and high-level vision, IAICV conference plenary talk, Ramat-Gan, Israel. |
|•||Bülthoff HH (January-10-1992): Learning to Recognize 3D Objects from a small set of 2D Images, M.I.T. Endicott House Learning Meeting, Boston, MA, USA. |
|•||Bülthoff HH (December-6-1991): Psychophysical support for a 2D view interpolation theory of object recognition, Neural Information Processing Workshop on Self-Organization and Unsupervised Learning in Vision, Vail, CO., USA. |
|•||Bülthoff HH (October-21-1991): Computer Graphik Psychophysik: Ein neuer Ansatz zur Aufklärung kognitiver Sehleistungen, Max Planck Institut für biologische Kybernetik, Tübingen, Germany. |
|•||Bülthoff HH (September-29-1991): Evaluating Object Recognition Theories by Computer Graphics Psychophysics, Dahlem Workshop on Exploring Brain Functions: Models in Neuroscience, Berlin, Germany. |
|•||Bülthoff HH (September-25-1991): Learning and Object Recognition: from Computation to Psychophysics, Caltech, Pasadena, CA., USA. |
|•||Bülthoff HH (May-17-1991): 3D Object Recognition without 3D Object Representation, Baylor College of Medicine, Houston, TX., USA. |
|•||Bülthoff HH (April-25-1991): 3D Object Recognition without 3D Object Representation, Yale University, Department of Psychology, New Haven, CT., USA. |
|•||Bülthoff HH (March-6-1991): 3D Object Recognition without 3D Object Representation, MIT, Department of Brain and Cognitive Sciences, Cambridge, MA., USA. |
|•||Bülthoff HH (November-6-1990): Shape from X: psychophysics and computation, SPIE Conference on Sensor Fusion III: 3-D Perception and Recognition, Boston, MA., USA. |
|•||Bülthoff HH (November-3-1990): Bildzentrierte Repräsentationen in dreidimensionaler Objekterkennung, Universität Ulm, Lehrstuhl für Informatik, Ulm, Germany. |
|•||Bülthoff HH (September-3-1990): Integration von Modulen zur Wahrnehmung von Oberflächen und Objekten, Max Planck Institut für biologische Kybernetik, Tübingen, Germany. |
|•||Bülthoff HH (August-30-1990): Integration von Modulen zur Wahrnehmung von Oberflächen und Objekten, Ruhr-Universität Bochum. Lehrstuhl für Neuroinformatik, Bochum, Germany. |
|•||Bülthoff HH (July-26-1990): Integration of various cues to depth, THE RANK PRIZE FUNDS, Neural Representation of 3-D Space, Grasmere, UK. |
|•||Bülthoff HH (July-12-1990): Integration of Depth Modules, Robotics System Design Department of Computer Science Industrial Partners Program, Brown University, Providence, RI., USA. |
|•||Bülthoff HH (March-28-1990): Integration of Depth Information, Conference on "Computational Models in Vision'', Trieste, Italy. |
|•||Bülthoff HH (March-14-1990): Does the Seeing Brain know Physics, Department of Applied Mathematics, Brown University, Providence, RI., USA. |
ISBN 978-3-927091-77-1. 10th Tübingen Perception Conference: TWK 2007, 10th Tübinger Wahrnehmungskonferenz, 163, Knirsch, Kirchentellinsfurt, Germany, (July-2007).
ISBN 3-927091-73-1. 9th Tübingen Perception Conference: TWK 2006, 9th Tübinger Wahrnehmungskonferenz, 177, Knirsch, Kirchentellinsfurt, Germany, (March-2006).
ISBN 3-927091-70-7. 8th Tübingen Perception Conference: TWK 2005, 8th Tübinger Wahrnehmungskonferenz, 202, Knirsch, Kirchentellinsfurt, Germany, (February-2005).
ISBN 3-89838-059-9. Dynamic Perception: Workshop of the GI Section "Computer Vision", 5th Workshop on Dynamic Perception 2004, 253, Akademische Verlagsgesellschaft, Berlin, Germany, (November-2004).
ISBN 978-3-540-22945-2. Pattern Recognition: 26th DAGM Symposium, 26th Pattern Recognition Symposium, 581, Springer, Berlin, Germany, (August-2004).
ISBN 3-927091-68-5. 7th Tübingen Perception Conference: TWK 2004, TWK 2004, 198, Knirsch, Kirchentellinsfurt, Germany, (February-2004).
ISBN 3-927091-62-6. 6. Tübinger Wahrnehmungskonferenz, Sixth Perception Conference at Tübingen (TWK 2003), 183, Knirsch, Kirchentellinsfurt, Germany, (February-2003).
ISBN 3-540-00174-3. Biologically Motivated Computer Vision: Second International Workshop, 2nd International Workshop on Biologically Motivated Computer Vision (BMCV 2002), 662, Springer, Berlin, Germany, (November-2002).
ISBN 3-927091-56-1. TWK 2002: Beiträge zur 5. Tübinger Wahrnehmungskonferenz, 5. Tübinger Wahrnehmungskonferenz (TWK 2002), 222, Knirsch, Kirchentellinsfurt, Germany, (February-2002).
ISBN 3-927091-54-5. TWK 2001: Beiträge zur 4. Tübinger Wahrnehmungskonferenz, 4. Tübinger Wahrnehmungskonferenz (TWK 2001), 184, Knirsch, Kirchentellinsfurt, Germany, (March-2001).
ISBN 978-3-540-67560-0. Biologically Motivated Computer Vision: First IEEE International Workshop on Biologically Motivated Computer Vision (BMCV 2000), First IEEE International Workshop on Biologically Motivated Computer Vision (BMCV 2000), 656, Springer, Berlin, Germany, (May-2000).
ISBN 3-927091-49-9. TWK 2000: Beiträge zur 3. Tübinger Wahrnehmungskonferenz, 3. Tübinger Wahrnehmungskonferenz (TWK 2000), 169, Knirsch, Kirchentellinsfurt, Germany, (February-2000).
ISBN 3-927091-45-6. Beiträge zur 2. Tübinger Wahrnehmungskonferenz, 2. Tübinger Wahrnehmungskonferenz (TWK 99), 134, Knirsch, Kirchentellinsfurt, Germany, (February-1999).
ISBN 3-927091-40-5. Visuelle Wahrnehmung: Beiträge zur 1. Tübinger Wahrnehmungskonferenz, 1. Tübinger Wahrnehmungskonferenz (TWK 1998), 170, Knirsch, Kirchentellinsfurt, Germany, (February-1998).
(October-2014) Contributions of visual and proprioceptive information to travelled distance estimation during changing sensory congruencies. Experimental Brain Research 232(10) 3277-3289.
(September-2014) A Framework for Biodynamic Feedthrough Analysis Part I: Theoretical Foundations. IEEE Transactions on Cybernetics 44(9) 1686-1698.
(September-2014) A Framework for Biodynamic Feedthrough Analysis Part II: Validation and Application. IEEE Transactions on Cybernetics 44(9) 1699-1710.
(September-2014) Pilot Adaptation to Different Classes of Haptic Aids in Tracking Tasks. Journal of Guidance, Control, and Dynamics, Epub ahead.
(August-2014) Owning an Overweight or Underweight Body: Distinguishing the Physical, Experienced and Virtual Body. PLoS ONE 9(8) 1-13.
(August-2014) The eyes grasp, the hands see: Metric category knowledge transfers between vision and touch. Psychonomic Bulletin & Review 21(4) 976-985.
(July-2014) A Biodynamic Feedthrough Model Based on Neuromuscular Principles. IEEE Transactions on Cybernetics 44(7) 1141-1154.
(July-2014) A Novel Overactuated Quadrotor Unmanned Aerial Vehicle: Modeling, Control, and Experimental Validation. IEEE Transactions on Control Systems Technology, Epub ahead.
(July-2014) Mathematical Biodynamic Feedthrough Model Applied to Rotorcraft. IEEE Transactions on Cybernetics 44(7) 1025-1038.
(June-2014) Active In-Hand Object Recognition on a Humanoid Robot. IEEE Transactions on Robotics, Epub ahead.
(June-2014) Emotion categorization of body expressions in narrative scenarios. Frontiers in Psychology 5(623) 1-11.
(April-2014) A comparison of geometric- and regression-based mobile gaze-tracking. Frontiers in Human Neuroscience 8(200) 1-12.
(April-2014) Intersegmental Eye-Head-Body Interactions during Complex Whole Body Movements. PLoS ONE 9(4) 1-15.
(April-2014) Motor-visual neurons and action recognition in social interactions. Behavioral and Brain Sciences 37(2) 197-198.
(April-2014) The importance of stimulus noise analysis for self-motion studies. PLoS ONE 9(4) 1-8.
(March-2014) Local and global reference frames for environmental spaces. Quarterly Journal of Experimental Psychology 67(3) 542-569.
(February-2014) Interactive Multiple Object Tracking (iMOT). PLoS ONE 9(2) 1-19.
(January-2014) A key region in the human parietal cortex for processing proprioceptive hand feedback during reaching movements. NeuroImage 84 615-625.
(January-2014) A psychophysical evaluation of haptic controllers: viscosity perception of soft environments. Robotica 32(1) 1-17.
(January-2014) Human sensitivity to vertical self-motion. Experimental Brain Research 232(1) 303-314.
(January-2014) Putting Actions in Context: Visual Action Adaptation Aftereffects Are Modulated by Social Contexts. PLoS ONE 9(1) 1-10.
(January-2014) Decentralized Rigidity Maintenance Control with Range-only Measurements for Multi-Robot Systems. International Journal of Robotics Research, submitted.
(January-2014) Methods for Multi-Loop Identification of Visual and Neuromuscular Pilot Responses. IEEE Transactions on Cybernetics, submitted.
(December-2013) A practical biodynamic feedthrough model for helicopters. CEAS Aeronautical Journal 4(4) 421-432.
(December-2013) Visual capture and the experience of having two bodies: evidence from two different virtual reality techniques. Frontiers in Psychology 4(946) 1-15.
(November-2013) Integration of visual and inertial cues in the perception of angular self-motion. Experimental Brain Research 231(2) 209-218.
(October-2013) Human path navigation in a three-dimensional world. Behavioral and Brain Sciences 36(5) 544-545.
(October-2013) Learning to navigate: Experience versus maps. Cognition 129(1) 24-30.
(October-2013) Saccade reaction time asymmetries during task-switching in pursuit tracking. Experimental Brain Research 230(3) 271-281.
(October-2013) View dependencies in the visual recognition of social interactions. Frontiers in Psychology 4(752) 1-10.
(September-2013) Verbal Shadowing and Visual Interference in Spatial Memory. PLoS ONE 8(9) 1-9.
(August-2013) Naturalistic Stimulus Structure Determines the Integration of Audiovisual Looming Signals in Binocular Rivalry. PLoS ONE 8(8) 1-8.
(August-2013) Psychological influences on distance estimation in a virtual reality environment. Frontiers in Human Neuroscience 7(580) 1-7.