Research directions

GraphRQI: Classifying Driver Behaviors Using Graph Spectrums

Paper: GraphRQI; Code: GitHub Code; Dataset: TRAF / Argoverse; Supplementary Material: Full paper + proofs

We present a novel algorithm (GraphRQI) to identify driver behaviors from road-agent trajectories. Our approach assumes that the road-agents exhibit a range of driving traits, such as aggressive or conservative driving. Moreover, these traits affect the trajectories of nearby road-agents as well as the interactions between road-agents.

HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation

Abstract In this paper, we introduce HALO, a novel Offline Reward Learning algorithm that quantifies human intuition in navigation into a vision-based reward function for robot navigation. HALO learns a reward model from offline data, leveraging expert trajectories collected from mobile robots. During training, actions are uniformly sampled around a reference action and ranked using preference scores derived from a Boltzmann distribution centered on the preferred action, and shaped based on binary user feedback to intuitive navigation queries.
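The sampling-and-ranking step described above can be sketched as follows. This is a minimal illustration, not HALO's implementation: the 2D action layout (linear and angular velocity), the half-width `delta`, the temperature `tau`, and the use of squared distance to the preferred action as the Boltzmann energy are all assumptions made for the example.

```python
import math
import random

def sample_actions(reference, delta, n, rng):
    """Uniformly sample n actions in a box of half-width delta around reference."""
    return [tuple(r + rng.uniform(-delta, delta) for r in reference) for _ in range(n)]

def boltzmann_scores(actions, preferred, tau):
    """Normalized Boltzmann weights; energy = squared distance to the preferred action (assumed)."""
    energies = [sum((a - p) ** 2 for a, p in zip(action, preferred)) for action in actions]
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / tau) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
reference = (0.5, 0.0)   # hypothetical reference action (linear, angular velocity)
preferred = (0.6, 0.1)   # hypothetical preferred action
actions = sample_actions(reference, delta=0.3, n=8, rng=rng)
scores = boltzmann_scores(actions, preferred, tau=0.05)

# Rank sampled actions from most to least preferred.
ranking = sorted(range(len(actions)), key=lambda i: scores[i], reverse=True)
```

A smaller `tau` concentrates the preference scores on the actions closest to the preferred one; a larger `tau` flattens the ranking.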

HTRON: Efficient Outdoor Navigation with Sparse Rewards via Heavy Tailed Adaptive Reinforce Algorithm

Abstract We present a novel approach to improve the performance of deep reinforcement learning (DRL) based outdoor robot navigation systems. Most existing DRL methods rely on carefully designed dense reward functions to learn efficient behavior in an environment. We circumvent the need for such reward design by working only with sparse rewards (which are easy to design), and propose a novel adaptive Heavy-Tailed Reinforce algorithm for Outdoor Navigation called HTRON. Our main idea is to utilize heavy-tailed policy parametrizations, which implicitly induce exploration in sparse reward settings.
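To illustrate why a heavy-tailed policy explores more, the sketch below compares a Gaussian action distribution with a Cauchy one. The abstract does not name a specific distribution; Cauchy is used here only as a familiar heavy-tailed example, and the function names are hypothetical.

```python
import math
import random

def sample_gaussian_action(loc, scale, rng):
    """Light-tailed baseline: action drawn from a normal distribution."""
    return rng.gauss(loc, scale)

def sample_cauchy_action(loc, scale, rng):
    """Heavy-tailed alternative (assumed): Cauchy draw via inverse-CDF sampling."""
    u = rng.random()
    return loc + scale * math.tan(math.pi * (u - 0.5))

def far_fraction(sampler, n, threshold, rng):
    """Fraction of sampled actions whose magnitude exceeds threshold."""
    return sum(abs(sampler(0.0, 1.0, rng)) > threshold for _ in range(n)) / n

rng = random.Random(0)
gauss_far = far_fraction(sample_gaussian_action, 10_000, 3.0, rng)
cauchy_far = far_fraction(sample_cauchy_action, 10_000, 3.0, rng)
print(f"Gaussian: {gauss_far:.4f}  Cauchy: {cauchy_far:.4f}")
```

With the same location and scale, the Cauchy policy proposes far-from-mean actions much more often, which is the kind of implicit exploration that helps when rewards are sparse.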

Haptic Rendering in Mixed Reality

HighlightMe: Detecting Highlights from Human-Centric Videos

Abstract We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos. Our method works on the graph-based representation of multiple observable human-centric modalities in the videos, such as poses and faces. We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions based on these modalities. We train our network to map the activity- and interaction-based latent structural representations of the different modalities to per-frame highlight scores based on the representativeness of the frames.

How Diverse is Driving Style in Trajectory Forecasting Datasets?

Overview Trajectory forecasting has become a popular deep learning task due to its relevance to scenario simulation for autonomous driving. Specifically, trajectory forecasting predicts the short-horizon future trajectory of specific human drivers in a particular traffic scenario.

Human Animation Placement

Overview The human animation placement research direction aims to place 3D human animations into complex indoor 3D scenes such that interactions with the scene present in the animation are preserved. The current state-of-the-art in PACE is able to place long-form animations into scenes while subtly adjusting these animations to better fit the unique features of the environment. This work has use cases in VR and AR, game design, and HRI, as it allows users to place realistic-looking humans into arbitrary environments.

Human Trajectory Prediction via Neural Social Physics

Abstract Trajectory prediction has been widely pursued in many fields, and many model-based and model-free methods have been explored. The former include rule-based, geometric, or optimization-based models, and the latter mainly comprise deep learning approaches. In this paper, we propose a new method combining both methodologies based on a new Neural Differential Equation model. Our new model (Neural Social Physics or NSP) is a deep neural network within which we use an explicit physics model with learnable parameters.
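As a rough illustration of "an explicit physics model with learnable parameters", the sketch below uses a goal-attraction term with a single parameter `tau` (a relaxation time) and recovers it from a toy trajectory. The abstract does not specify the physics model or the training procedure: the force term, the constants, and the grid search (standing in for gradient-based learning inside a network) are all assumptions.

```python
import math

def step(pos, vel, goal, desired_speed, tau, dt):
    """One semi-implicit Euler step of an assumed goal-attraction model."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy) or 1.0  # avoid division by zero at the goal
    # Acceleration relaxes the velocity toward the desired velocity over time tau.
    ax = (desired_speed * dx / dist - vel[0]) / tau
    ay = (desired_speed * dy / dist - vel[1]) / tau
    vel = (vel[0] + ax * dt, vel[1] + ay * dt)
    pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
    return pos, vel

def rollout(tau, steps=20, dt=0.1):
    """Integrate the model from a fixed toy start state toward a fixed goal."""
    pos, vel, goal = (0.0, 0.0), (0.0, 1.0), (5.0, 0.0)
    traj = []
    for _ in range(steps):
        pos, vel = step(pos, vel, goal, 1.3, tau, dt)
        traj.append(pos)
    return traj

def trajectory_error(predicted, observed):
    """Mean squared displacement between two trajectories of equal length."""
    return sum((px - ox) ** 2 + (py - oy) ** 2
               for (px, py), (ox, oy) in zip(predicted, observed)) / len(observed)

observed = rollout(tau=0.5)  # toy "ground truth" generated with tau = 0.5
candidates = [0.25, 0.5, 1.0, 2.0]
errors = {tau: trajectory_error(rollout(tau), observed) for tau in candidates}
best_tau = min(errors, key=errors.get)  # parameter that best explains the data
```

In a learned model the parameter would be predicted or optimized by the network rather than searched over a fixed list; the point here is only that the physics stays explicit while its parameters are fit to data.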

INTENT-O-METER: Determining Perceived Human Intent in Multimodal Social Media Posts using Theory of Reasoned Action

Abstract We propose INTENT-O-METER, a perceived human intent prediction model for multimodal (image and text) social media posts. INTENT-O-METER models ideas from psychology and cognitive modeling literature, in addition to using the visual and textual features for an improved perceived intent prediction model. INTENT-O-METER leverages the Theory of Reasoned Action (TRA), factoring in (i) the creator’s attitude towards sharing a post, and (ii) the social norm or perception towards the post in determining the creator’s intention.

IR-GAN: Room Impulse Response Generator for Far-field Speech Recognition

Abstract We present a Generative Adversarial Network (GAN) based room impulse response generator (IR-GAN) for generating realistic synthetic room impulse responses (RIRs). IR-GAN extracts acoustic parameters from captured real-world RIRs and uses these parameters to generate new synthetic RIRs. We use these generated synthetic RIRs to improve far-field automatic speech recognition in new environments that are different from the ones used in training datasets. In particular, we augment the far-field speech training set by convolving our synthesized RIRs with a clean LibriSpeech dataset.
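The augmentation step, convolving a clean utterance with a room impulse response, can be sketched as below. The RIR here is a toy stand-in (a direct path followed by exponentially decaying noise), not an IR-GAN output, and a sine tone replaces a LibriSpeech utterance; both are assumptions made to keep the example self-contained.

```python
import math
import random

def convolve(signal, kernel):
    """Full discrete linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def toy_rir(length, decay, rng):
    """Stand-in RIR (assumed): unit direct path plus exponentially decaying noise."""
    rir = [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0
    return rir

rng = random.Random(0)
sample_rate = 8000  # Hz, chosen only for this toy example
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(400)]
rir = toy_rir(length=200, decay=0.03, rng=rng)

# Simulated far-field signal: the clean signal as heard through the room response.
far_field = convolve(clean, rir)
```

For real audio, an FFT-based convolution is the usual choice for speed; the direct loop is used here only to avoid external dependencies.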