This week’s blog post will provide a brief overview of the project, plus news about a breakthrough (!!!) that Dr. Sprague made this morning.
LET’S TALK ABOUT BATS. Bats have strong echolocating abilities that allow them to sense the physical world around them without having to use their eyes. Different bats emit calls at different frequencies; the echoes that their sensitive ears hear back can tell the bats many things, such as the proximity of prey, the speed of flapping wings, and the presence of other bats. There exist more than 900 species of bats, all of which are equipped with intricate but varying ear structures suited to different habitats. Since bats have such powerful ears, they almost don’t need their eyes.
CAN HUMANS ECHOLOCATE? Actually, yes: there are people who are able to echolocate, despite our ears being less sensitive than bat ears. In particular, echolocation seems to be useful for people with visual impairments. By listening to short, quick clicking noises (typically emitted by the person), trained human ears can sense physical objects without invoking the eyes’ help. So if bats can echolocate and humans can echolocate…
IS IT POSSIBLE FOR MACHINES TO ECHOLOCATE? More technically, could a portable device be trained to simulate echolocation via a neural network? This question is the center of my summer research. To the right is an image of the apparatus Dr. Sprague built to gather data. It consists of a ROS Kinect depth sensor, a portable speaker, and 3D-printed rubber “bat ears” taped to stereo microphones. The speaker emits a continuous series of high-frequency chirps, which are heard by the microphones through the bat ears. The echoes caused by the chirps bouncing off various surfaces are also heard by the microphones. This collection of audio samples is used as the neural network input (after being digitized or converted into spectrograms). As the microphones collect audio data, the Kinect simultaneously collects depth maps of the surfaces that the sounds are bouncing off of. The depth maps serve as the ground truths for the neural network, which trains to make acceptable depth map predictions given the audio input.
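To make the spectrogram half of that pipeline concrete, here is a minimal sketch of how a stereo chirp recording might be turned into per-channel spectrogram arrays for the network. The function name and all parameters (sample rate, window sizes) are my own illustrative choices, not the project’s actual settings:

```python
import numpy as np
from scipy import signal

def audio_to_spectrograms(audio, sample_rate=44100):
    """Turn a stereo recording (samples x channels) into one power
    spectrogram per microphone channel. All parameters here are
    hypothetical placeholders, not the project's real configuration."""
    specs = []
    for channel in range(audio.shape[1]):
        # scipy's short-time Fourier analysis: returns frequency bins,
        # time bins, and a (freq x time) power spectrogram
        freqs, times, spec = signal.spectrogram(
            audio[:, channel], fs=sample_rate, nperseg=256, noverlap=128)
        specs.append(spec)
    return np.stack(specs)  # shape: (channels, freq_bins, time_bins)
```

In a real pipeline the two channels would come from the left and right “bat ears,” and the stacked array would be paired with the Kinect depth map captured at the same moment.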
THIS IS THE CURRENT STATE OF THE PROJECT. There are many tunable parameters within this neural network, such as the input type (digitized audio vs. spectrograms), the number of hidden layers, the characteristics of the convolutions, etc. A major step of this project is determining the most successful network architecture for this specific task. Before this morning, every version of my neural network produced one identical depth map as its prediction for any and all audio inputs, regardless of whether or not the network had seen the input before. I was stuck on this issue for what seemed like a while. Dr. Sprague discovered this morning that my spectrogram inputs had not been normalized correctly. After he divided the log of my input spectrogram arrays by twenty, the network started making varied predictions!!! Seriously. That was the answer to my problem.
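For anyone curious, the fix itself is tiny. This is a sketch of the normalization as described above (log, then divide by twenty); the epsilon constant is my own addition to avoid taking log of zero:

```python
import numpy as np

def normalize_spectrogram(spec, eps=1e-10):
    """Normalize a power spectrogram before feeding it to the network:
    take the log, then divide by 20 to compress the dynamic range.
    The factor of 20 is the one from the blog post (it resembles a
    decibel-style scaling); eps is a hypothetical guard against log(0)."""
    return np.log(spec + eps) / 20.0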
Below are some results from the neural network. On the left side are the true depth maps for spectrogram inputs from the test set (i.e., samples the network did not see during training), and on the right side are the corresponding predictions the neural network made. While the predictions are not perfect, the neural network did pretty well! The loss associated with these results is a mean squared error of 0.0431 (measured in millimeters). Blue represents a closer surface; red represents a surface that is farther away.
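The loss number above is just the average squared difference between predicted and true depth maps. A minimal sketch of that metric (the helper name is mine, and the units depend on how the depths are stored):

```python
import numpy as np

def depth_map_mse(pred, truth):
    """Mean squared error between a predicted depth map and the
    ground-truth depth map from the Kinect (hypothetical helper;
    both arrays must share the same shape and units)."""
    return np.mean((pred - truth) ** 2)
```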
In the distant future, we hope to have created a portable device for people with visual impairments that assists them in sensing the area in front of them via echolocation-inspired neural networks. The next step in this project is to learn more about how bat ears work and to collect a lot of “strong” training data using detailed, artificial bat ears. For this step, Team Sprague is traveling to Virginia Tech to speak with researchers who know a lot about bat ears. Cheers to a Friday full of science and collaboration!