Research directions

PLAR: Prompt Learning for Action Recognition

PLAR: We present a new general learning approach, Prompt Learning for Action Recognition (PLAR), which leverages the strengths of prompt learning to guide the learning process. Our approach is designed to predict the action label by helping the models focus on the descriptions or instructions associated with actions in the input videos. Our formulation uses various prompts, including learnable prompts, auxiliary visual information, and large vision models, to improve recognition performance.

PMI Sampler: Patch Similarity Guided Frame Selection for Aerial Action Recognition

PMI Sampler: We present a new algorithm for selecting informative frames in video action recognition. Our approach is designed for aerial videos captured using a moving camera, where human actors occupy only a small region of each video frame. Our algorithm utilizes the motion bias within aerial videos, which enables the selection of motion-salient frames. We introduce the concept of a patch mutual information (PMI) score to quantify the motion bias between adjacent frames by measuring the similarity of patches.
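To make the idea of a patch-level similarity score concrete, here is a minimal sketch. The abstract does not give the PMI formula, so everything below is an assumption: a generic histogram-based mutual information estimate computed per patch and averaged, with hypothetical helper names (`mutual_information`, `patches`, `pmi_score`, `select_frames`). It treats a low score between adjacent frames as a sign of motion and keeps the frames that differ most from their predecessor.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Histogram-based mutual information (in nats) of two equal-length pixel lists."""
    width = max_value // bins
    qa = [min(v // width, bins - 1) for v in a]  # quantize intensities into bins
    qb = [min(v // width, bins - 1) for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    count_a = Counter(qa)
    count_b = Counter(qb)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
    return mi


def patches(frame, size):
    """Yield flattened, non-overlapping size x size patches of a 2D list."""
    height, width = len(frame), len(frame[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield [frame[r][c]
                   for r in range(top, top + size)
                   for c in range(left, left + size)]


def pmi_score(frame_a, frame_b, patch_size=4):
    """Mean patch-wise mutual information between two grayscale frames."""
    scores = [mutual_information(p, q)
              for p, q in zip(patches(frame_a, patch_size),
                              patches(frame_b, patch_size))]
    return sum(scores) / len(scores)


def select_frames(frames, k, patch_size=4):
    """Indices of the k frames least similar to their predecessor (assumed selection rule)."""
    scored = sorted((pmi_score(frames[i - 1], frames[i], patch_size), i)
                    for i in range(1, len(frames)))
    return sorted(i for _, i in scored[:k])
```

One caveat of this simplified estimate: a patch with constant intensity has zero entropy, so its mutual information with anything is zero even when nothing moved. A practical sampler would need to handle such flat regions separately.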

PORCA: Modeling and planning for autonomous driving among many pedestrians.

Abstract This project investigates a planning system for autonomous driving among many pedestrians. A key ingredient of our approach is a motion prediction model for pedestrians and vehicles. It accounts for both a pedestrian’s global navigation intention and local interactions with the vehicle and other pedestrians. Unfortunately, the autonomous vehicle does not know the pedestrians’ intentions a priori and requires a planning algorithm that hedges against the uncertainty in pedestrian intentions.
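One common way to represent uncertainty about an unknown intention is a probability distribution over candidate goals that is updated as the pedestrian moves. The sketch below illustrates that idea only; it is not the PORCA model. The function name `update_intention_belief`, the goal dictionary, the cosine-based likelihood, and the `noise` parameter are all assumptions made for illustration.

```python
import math


def update_intention_belief(belief, position, next_position, goals, noise=0.5):
    """Bayes update of P(goal) after observing one step of pedestrian motion.

    belief: dict mapping goal name -> prior probability
    goals:  dict mapping goal name -> (x, y) location (hypothetical candidates)
    """
    step = (next_position[0] - position[0], next_position[1] - position[1])
    step_len = math.hypot(step[0], step[1])
    posterior = {}
    for name, prior in belief.items():
        gx, gy = goals[name]
        to_goal = (gx - position[0], gy - position[1])
        dist = math.hypot(to_goal[0], to_goal[1])
        if step_len == 0 or dist == 0:
            likelihood = 1.0  # standing still or already at the goal: uninformative
        else:
            cos = (step[0] * to_goal[0] + step[1] * to_goal[1]) / (step_len * dist)
            # Assumed likelihood: highest when the step points straight at the goal
            likelihood = math.exp((cos - 1.0) / noise)
        posterior[name] = prior * likelihood
    total = sum(posterior.values())
    return {name: p / total for name, p in posterior.items()}


# Usage: a pedestrian at the origin steps along +x, toward the "crosswalk" goal.
goals = {"crosswalk": (10.0, 0.0), "sidewalk": (0.0, 10.0)}
belief = {"crosswalk": 0.5, "sidewalk": 0.5}
belief = update_intention_belief(belief, (0.0, 0.0), (1.0, 0.0), goals)
```

A planner that hedges against intention uncertainty would evaluate candidate vehicle actions against this whole distribution rather than committing to the single most likely goal.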

PedVR: Simulating Gaze-Based Interactions between a Real User and Virtual Crowds

Abstract We present a novel interactive approach, PedVR, to generate plausible behaviors for a large number of virtual humans, and to enable natural interaction between the real user and virtual agents. Our formulation is based on a coupled approach that combines a 2D multi-agent navigation algorithm with 3D human motion synthesis. The coupling can result in plausible movement of virtual agents and can generate gazing behaviors, which can considerably increase the believability.

Pedestrian Dominance Modeling for Socially-Aware Robot Navigation

Abstract We present a Pedestrian Dominance Model (PDM) to identify the dominance characteristics of pedestrians for robot navigation. Through a perception study on a simulated dataset of pedestrians, PDM models the perceived dominance levels of pedestrians with varying motion behaviors corresponding to trajectory, speed, and personal space. At runtime, we use PDM to identify the dominance levels of pedestrians to facilitate socially-aware navigation for the robots. PDM can predict dominance levels from trajectories with approximately 85% accuracy.

Perceptual Thresholds for Radial Optic Flow Distortion in Near-Eye Stereoscopic Displays

Abstract We provide the first perceptual quantification of users’ sensitivity to radial optic flow artifacts and demonstrate a promising approach for masking this artifact via blink suppression. Near-eye HMDs allow users to feel immersed in virtual environments by providing visual cues, like motion parallax and stereoscopy, that mimic how we view the physical world. However, these systems exhibit a variety of perceptual artifacts that can limit their usability and the user’s sense of presence in VR.

Physically-Based Modeling

Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes

Abstract We present a novel method for placing a 3D human animation into a 3D scene while maintaining any human-scene interactions in the animation. We use the notion of computing the most important meshes in the animation for the interaction with the scene, which we call “keyframes.” These keyframes allow us to better optimize the placement of the animation into the scene such that interactions in the animations (standing, laying, sitting, etc.