Despite our best efforts, hospitals are noisy places: pumps beep, beds alarm, babies cry, patients moan in pain, and others struggle to breathe. We present a system that detects these “acoustic events” in real time using a convolutional neural network trained for acoustic event detection and classification. The bedside device is our medical data aggregation device running a specialized version of Linux; it can also record RS-232 data from traditional monitors and Bluetooth data from patient wearables. However, our acoustic classification algorithms can be ported to nearly any other system with sufficient computational power to process incoming audio and compute predictions from the neural model. We start with audio clips in a variety of formats and label the regions within each clip that contain classification targets. We then pre-process each clip, cleaning and normalizing the audio and applying our custom algorithm to identify and extract the best features with which to train a convolutional neural network. The model generated from this training set is then deployed to our medical data aggregation device, where an array of microphones processes incoming audio in real time and applies the trained model to generate acoustic classification predictions.
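The pre-processing stage described above can be sketched as follows. This is an illustrative NumPy example, not the authors' proprietary feature-extraction algorithm: it peak-normalizes a clip, frames it with a Hann window, and computes log-power spectral features of the kind typically fed to a CNN classifier; all function and parameter names are hypothetical.

```python
import numpy as np

def extract_features(audio, frame_len=400, hop=160, n_bins=64):
    """Sketch of audio pre-processing for CNN training (illustrative only):
    normalize, slice into overlapping windowed frames, and compute
    log-power spectral features."""
    # Peak-normalize so clips recorded at different levels are comparable.
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak
    # Slice into overlapping frames, each tapered by a Hann window.
    n_frames = 1 + (len(audio) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([audio[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Log-power spectrum, truncated to the first n_bins frequency bins.
    spectrum = np.abs(np.fft.rfft(frames, axis=1))[:, :n_bins]
    return np.log(spectrum + 1e-8)  # shape: (n_frames, n_bins)

# One second of synthetic 16 kHz audio stands in for a labeled clip.
features = extract_features(np.random.randn(16000))
```

In a real pipeline these feature matrices, together with their labels, would form the training set for the convolutional network.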

In the use case we will demo at AIMed, the system listens for infant cries and performs a set of actions when a defined confidence threshold is reached. These actions include playing music, turning on attached peripheral devices, turning on lights, and sending an alert to associated users. By continuing to listen for the presence of a cry, we capture whether the action resolved the acoustic event (the cry). Once sufficient action-response data has been captured, we use it to train an action-response model that predicts the best action to perform in a given situation (inputs include infant age, time of day, and responses to prior actions). The system also reports results to a cloud-based management console, although all processing is performed on the device itself. The model improves automatically through re-training at specified intervals, driven by user reports of false-positive and false-negative detections, each of which sends a small labeled audio clip to the server. User-level permissions for each device are managed on the cloud-based server and passed to each device via a secure mechanism using encrypted JSON Web Tokens generated and stored on the device itself.
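The detect-act-verify loop above can be summarized in a short sketch. This is a minimal illustration under stated assumptions, not the deployed implementation: class, method, and action names are hypothetical, and a random action stands in for the trained action-response model.

```python
import random
from dataclasses import dataclass, field

@dataclass
class CryResponder:
    """Illustrative detect-act-verify loop: when the cry probability
    crosses a threshold, perform an action, then keep listening to
    record whether the cry resolved (the action-response data)."""
    threshold: float = 0.8
    actions: tuple = ("play_music", "power_peripheral", "turn_on_lights", "alert_user")
    log: list = field(default_factory=list)

    def on_prediction(self, cry_probability):
        if cry_probability < self.threshold:
            return None
        # A trained action-response model would choose here, using
        # inputs such as infant age, time of day, and prior responses.
        action = random.choice(self.actions)
        self.log.append({"action": action, "resolved": None})
        return action

    def on_followup(self, cry_probability):
        # Continued listening records whether the last action worked.
        if self.log and self.log[-1]["resolved"] is None:
            self.log[-1]["resolved"] = cry_probability < self.threshold

responder = CryResponder()
action = responder.on_prediction(0.93)  # cry detected above threshold
responder.on_followup(0.12)             # cry subsided: action logged as effective
```

The accumulated `log` entries are the action-response pairs that, once sufficient, would be used to train the action-selection model.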

While the device we will demo is specific to infant cries, the system we have developed can identify a wide range of acoustic events. In the hospital setting, this system will allow a rich yet currently untapped source of data to be captured and used to improve patient safety, patient satisfaction, and hospital efficiency.



Author: Jack Neil

Coauthor(s): Jack Neil

Status: Completed Work