Research directions

Towards Improved Room Impulse Response Estimation for Speech Recognition

Abstract We propose to characterize and improve the performance of blind room impulse response (RIR) estimation systems in the context of a downstream application scenario, far-field automatic speech recognition (ASR). We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators. We then propose a GAN-based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features, and uses a novel energy decay relief loss to optimize for capturing energy-based properties of the input reverberant speech.
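The energy decay relief (EDR) that the loss targets is the per-frequency-band backward integration of the short-time Fourier transform energy of the RIR. A minimal NumPy sketch of such a loss (the function names, STFT parameters, and log scaling here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def edr(rir, n_fft=256, hop=64):
    """Energy Decay Relief: per-band backward integration of STFT energy."""
    win = np.hanning(n_fft)
    frames = [rir[i:i + n_fft] * win
              for i in range(0, len(rir) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.stack(frames), axis=1)) ** 2  # (time, freq) energy
    # Reverse-cumsum over time: energy remaining in each band after each frame
    return np.cumsum(spec[::-1], axis=0)[::-1]

def edr_loss(rir_est, rir_ref, eps=1e-8):
    """MSE between log-scaled EDRs of an estimated and a reference RIR."""
    e_est = 10 * np.log10(edr(rir_est) + eps)
    e_ref = 10 * np.log10(edr(rir_ref) + eps)
    return np.mean((e_est - e_ref) ** 2)
```

Matching EDRs rather than raw waveforms penalizes mismatches in how energy decays over time in each frequency band, which is the reverberation property the abstract emphasizes.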

TraPHic: Predicting Trajectories of Road-Agents in Dense and Heterogeneous Traffic.

Paper: TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions, CVPR 2019. Rohan Chandra, Uttaran Bhattacharya, Aniket Bera, and Dinesh Manocha. Code and the TRAF dataset are available on GitHub. We present a new algorithm for predicting the near-term trajectories of road-agents in dense traffic videos. Our approach is designed for heterogeneous traffic, where the road-agents may correspond to buses, cars, scooters, bicycles, or pedestrians.

Traffic-Aware Autonomous Driving with Differentiable Traffic Simulation

Paper: ICRA 2023. Code available on GitHub.
Authors: Laura Zheng, Sanghyun Son, Ming Lin. Abstract: While there have been advances in both autonomous driving control and traffic simulation, little to no work has explored unifying the two with deep learning. Work in each area tends to focus on entirely separate problems, yet traffic and driving have inherent semantic relations in the real world.

TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents

Abstract To safely and efficiently navigate in complex urban traffic, autonomous vehicles must make responsible predictions about the surrounding traffic-agents (vehicles, bicycles, pedestrians, etc.). A challenging and critical task is to model the movement patterns of the different traffic-agents and predict their future trajectories accurately, helping the autonomous vehicle make reasonable navigation decisions. To solve this problem, we propose a long short-term memory-based (LSTM-based) real-time trajectory prediction algorithm, TrafficPredict.

Tutorial 1 : Editing MultiUAV script

Tutorial 1 : Writing example programs

Tutorial 2 : Editing d-orca roslaunch file

For example, to launch only 4 nodes on the same machine (assuming there are only 4 drones in the simulation), change line 5 of the dorca.launch file in the launch directory to <arg name="nr" default="3"/> (nr counts down from 3 to 0, giving 4 nodes), and make sure line 11 of dorca.launch looks as follows: <include file="$(find dorca)/launch/dorca.launch" if="$(eval arg('nr') - 1 >= 0)">. Note: make sure the total number of agents in the config file matches the total number of nodes launched in the launch file.
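The recursive launch pattern described above can be sketched as follows (an illustrative fragment only; the actual node name, package entry point, and remaining arguments come from the d-orca repository):

```xml
<launch>
  <!-- Index of the current drone node; default="3" launches drones 3, 2, 1, 0 -->
  <arg name="nr" default="3"/>
  <!-- Hypothetical node entry; the real pkg/type names are defined in the repo -->
  <node pkg="dorca" type="dorca_node" name="drone_$(arg nr)"/>
  <!-- Recurse: include this same file with nr - 1 until nr reaches 0 -->
  <include file="$(find dorca)/launch/dorca.launch" if="$(eval arg('nr') - 1 >= 0)">
    <arg name="nr" value="$(eval arg('nr') - 1)"/>
  </include>
</launch>
```

With this pattern, changing the single default value on line 5 is enough to scale the number of launched nodes up or down.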

Tutorial 3 : dorcacircle

UAV-Sim: NeRF-based Synthetic Data Generation for UAV-based Perception

UAV-Sim: Tremendous variations coupled with large degrees of freedom in UAV-based imaging conditions lead to a significant lack of data for adequately learning UAV-based perception models. Using various synthetic renderers in conjunction with perception models to create synthetic data is prevalent for augmenting learning in the ground-based imaging domain. However, severe challenges in the austere UAV-based domain require distinctive solutions to image synthesis for data augmentation. In this work, we leverage recent advancements in neural rendering to improve static and dynamic novel-view UAV-based image synthesis, especially from high altitudes, capturing salient scene attributes.

UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting

Despite significant advancements in dynamic neural rendering, existing methods fail to address the unique challenges posed by UAV-captured scenarios, particularly those involving monocular camera setups, top-down perspective, and multiple small, moving humans, which are not adequately represented in existing datasets. In this work, we introduce UAV4D, a framework for enabling photorealistic rendering for dynamic real-world scenes captured by UAVs. Specifically, we address the challenge of reconstructing dynamic scenes with multiple moving pedestrians from monocular video data without the need for additional sensors.