Works on recorded conversations between doctors and patients. Noise reduction is performed on the audio, which is then converted to text. The text is further processed into a tabular summary of the essential details discussed in the conversation.

Role

Deep Learning Developer

Project Duration

Sept 2021- Oct 2021

What did I do?

Storyboarding

Defining the workflow

Coding the module(s)

Co-authored the research paper

Tools

Visual Studio Code

Google Colaboratory

GitHub

Whiteboard

Pen and Paper

Team Members

Avani Shrivastava

ND Bhavana

Judy Simon


A research paper based on this work was written and presented at the 5th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) on 11th November 2021.


Deep Learning based Transcribing and Summarizing Clinical Conversations


Problem Statement

To build a pipeline that combines a supervised deep learning technique for noise suppression using a convolutional network, the Google Speech-to-Text API for transcription of the conversation, and a basic SVM module that categorizes text based on the given tags and the relative frequency of word occurrence, producing a tabular summary of the doctor-patient verbal exchange.
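The three stages of this pipeline can be sketched end to end as below. This is an illustrative skeleton only: every function name is hypothetical, and each stage is a stub standing in for the real component (the convolutional denoiser, the Google Speech-to-Text API call, and the SVM tagger).

```python
# Illustrative pipeline skeleton; each stage is a stub standing in for
# the real component (convolutional denoiser, Speech-to-Text API, SVM tagger).

def suppress_noise(audio_samples):
    # Placeholder: the real system runs a convolutional autoencoder here.
    return audio_samples

def transcribe(audio_samples):
    # Placeholder: the real system calls the Google Speech-to-Text API here.
    return "patient reports mild fever since monday"

def tag_sentences(transcript, tags):
    # Placeholder: the real system classifies text with an SVM over the
    # given tags and word frequencies; here each sentence is tagged by a
    # simple keyword match instead.
    rows = []
    for sentence in transcript.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        label = next((t for t in tags if t in sentence), "other")
        rows.append((label, sentence))
    return rows

def summarize(audio_samples, tags=("fever", "medication")):
    """Noise suppression -> transcription -> tabular (tag, sentence) rows."""
    clean = suppress_noise(audio_samples)
    transcript = transcribe(clean)
    return tag_sentences(transcript, tags)
```

The point of the sketch is the data flow: raw audio in, a list of (tag, sentence) rows out, which is what gets rendered as the tabular summary.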



💻What is the project about?

An automated mechanism that suppresses noise to eliminate environmental disturbance, then transcribes and summarizes the recorded conversation between the doctor and the patient(s), focusing only on the essential information, since abridging the entire conversation as a whole may be counterproductive. The tabular summary obtained at the end of the process can be used by doctors and patients alike to understand the patient history, prognoses and/or diagnoses.

🎯Objectives

👩‍⚕️ 🧑‍⚕️Target Audience


Research Findings

Existing systems for noise suppression were discussed: one uses a combination of RNN-N and RNN-R, and another uses SEGAN to tackle some of the challenges mentioned in “Challenges of developing a digital scribe to reduce clinical documentation burden” relating to poor quality of recorded sound.

While the former network is easy to implement, it is also a primitive one; we therefore opt for a more efficient convolutional autoencoder architecture inspired by the latter.
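A convolutional autoencoder for denoising pairs a strided encoder convolution with a mirrored transposed convolution. The NumPy forward pass below is a minimal single-channel sketch with untrained random weights and an illustrative kernel size and stride; it only demonstrates that the encoder/decoder pair reconstructs a signal of the original length, not the trained denoiser itself.

```python
import numpy as np

def conv1d(x, w, stride):
    """Valid 1-D convolution with stride (encoder downsampling)."""
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(0, len(x) - k + 1, stride)])

def conv1d_transpose(h, w, stride):
    """Transposed 1-D convolution (decoder upsampling)."""
    k = len(w)
    out = np.zeros(stride * (len(h) - 1) + k)
    for i, v in enumerate(h):
        out[i * stride:i * stride + k] += v * w
    return out

def autoencode(x, w_enc, w_dec, stride=2):
    """One encoder layer + ReLU bottleneck + one decoder layer."""
    h = np.maximum(conv1d(x, w_enc, stride), 0.0)  # latent code
    return conv1d_transpose(h, w_dec, stride)      # reconstruction

rng = np.random.default_rng(0)
x = rng.standard_normal(100)          # one noisy waveform frame
w_enc = rng.standard_normal(4) * 0.1  # untrained kernels, for shape checks
w_dec = rng.standard_normal(4) * 0.1
y = autoencode(x, w_enc, w_dec)
```

In training, the kernels would be learned (typically stacked in several layers) by minimizing reconstruction error between noisy inputs and their clean targets.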

The transcription systems studied do not use noise suppression; our system therefore incorporates a noise-suppression algorithm to attempt to improve the accuracy of transcription even in noisy environments. One existing system uses Google’s Speech-to-Text API to obtain the transcription, which our system also employs because of its success.
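A minimal call to the Speech-to-Text step looks like the sketch below, assuming the `google-cloud-speech` client library is installed and credentials are configured (e.g. via `GOOGLE_APPLICATION_CREDENTIALS`); the sample rate and language code are illustrative.

```python
def transcribe_clip(wav_bytes: bytes, sample_rate: int = 16000) -> str:
    """Send one LINEAR16 audio clip to Google Speech-to-Text and return
    the top transcript. Requires the google-cloud-speech package and
    valid credentials; imported lazily so the rest of the pipeline can
    be used without them."""
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate,
        language_code="en-US",
    )
    audio = speech.RecognitionAudio(content=wav_bytes)
    response = client.recognize(config=config, audio=audio)
    # Take the highest-confidence alternative of each recognized segment.
    return " ".join(r.alternatives[0].transcript for r in response.results)
```

The synchronous `recognize` call shown here suits short clips; longer doctor-patient recordings would use the API's long-running (asynchronous) recognition instead.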

Papers focusing on the classification of medical data were also reviewed; the insights were used in the formulation of our labeling module.
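The labeling idea, tagging text by the relative frequency of tag-related words, can be illustrated with a pure-Python stand-in for the SVM module. The tag names and keyword lists below are hypothetical; the real module learns an SVM over the given tags rather than matching fixed keyword sets.

```python
from collections import Counter
import re

# Hypothetical tag -> keyword sets, for illustration only; the real
# module trains an SVM over the given tags and word frequencies.
TAG_KEYWORDS = {
    "symptom": {"fever", "cough", "pain", "headache"},
    "medication": {"paracetamol", "tablet", "dose", "prescribe"},
    "history": {"previous", "last", "year", "surgery"},
}

def tag_utterance(utterance: str) -> str:
    """Assign the tag whose keywords occur most often in the utterance."""
    words = Counter(re.findall(r"[a-z]+", utterance.lower()))
    scores = {
        tag: sum(words[w] for w in kws) for tag, kws in TAG_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

# Each (tag, utterance) pair becomes one row of the tabular summary.
table = [(tag_utterance(u), u) for u in [
    "I have had a fever and a bad cough since Monday",
    "Take one paracetamol tablet twice a day",
]]
```

Swapping this scorer for a trained classifier leaves the surrounding pipeline unchanged: either way, each utterance maps to one tag, and the (tag, utterance) pairs form the rows of the final table.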