Works on recorded conversations between doctors and patients: noise reduction is performed on the audio, which is then transcribed to text, and the text is further processed into a tabular summary of the essential details discussed in the conversation.
Role:
Deep Learning Developer
Project Duration:
Sept 2021 – Oct 2021
What did I do?
Storyboarding
Defining the workflow
Coding the module(s)
Co-authored the research paper
Tools
Visual Studio Code
Google Colaboratory
GitHub
WhiteBoard
Pen and Paper
Team Members
Judy Simon
A research paper based on this project was written and presented at the 5th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) on 11th November 2021.
Deep Learning based Transcribing and Summarizing Clinical Conversations
Problem Statement
To use a supervised deep learning technique (a convolutional network) for noise suppression, the Google Speech-to-Text API for transcribing the conversation, and a basic SVM module that categorizes text based on the given tags and the relative frequency of word occurrence, in order to create a tabular summary of the doctor–patient verbal exchange.
💻What is the project about?
An automated pipeline that suppresses noise to eliminate environmental disturbance, then transcribes and summarizes the recorded conversation between the doctor and patient(s). The summary focuses only on the essential information, since abridging the entire conversation as a whole may be counterproductive. The tabular summary obtained at the end of the process can be used by doctors and patients alike to understand patient history, prognoses and/or diagnoses.
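The end-to-end flow described above can be outlined as three stages feeding into each other. The sketch below is illustrative only: the function names, placeholder transcript, and the two summary tags are assumptions, not the project's actual module names.

```python
from typing import Dict, List

def suppress_noise(audio: bytes) -> bytes:
    """Stage 1: clean the raw recording (the real stage applies the denoising network)."""
    return audio  # placeholder pass-through

def transcribe(audio: bytes) -> str:
    """Stage 2: convert cleaned audio to text (the real stage calls a speech-to-text service)."""
    return "patient reports headache. doctor prescribes paracetamol."  # placeholder transcript

def summarize(transcript: str) -> Dict[str, List[str]]:
    """Stage 3: tag each sentence and group it under its summary-table column."""
    table: Dict[str, List[str]] = {"symptom": [], "medication": []}
    for sentence in transcript.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        # Toy rule standing in for the real tag classifier.
        key = "medication" if "prescribes" in sentence else "symptom"
        table[key].append(sentence)
    return table

def pipeline(audio: bytes) -> Dict[str, List[str]]:
    """Noise suppression -> transcription -> tabular summary."""
    return summarize(transcribe(suppress_noise(audio)))
```

Running `pipeline(b"")` with these placeholders yields a table with one sentence per tag, showing the shape of the final output even before the real stages are plugged in.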
🎯Objectives
👩⚕️ 🧑⚕️Target Audience
Research Findings
Existing noise-suppression systems were reviewed: one uses a combination of RNN-N and RNN-R, and another uses SEGAN to tackle some of the challenges relating to poor recorded-sound quality raised in “Challenges of developing a digital scribe to reduce clinical documentation burden”.
While the former is an easily implementable network, it is also a primitive one. We therefore opt for a more efficient convolutional autoencoder architecture inspired by the latter.
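A convolutional autoencoder for denoising can be sketched as below. This is a minimal PyTorch illustration, not the paper's exact architecture: the layer sizes, 64×64 spectrogram shape, and MSE objective are assumptions, and training would use paired noisy/clean recordings.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Maps noisy magnitude spectrograms to clean ones."""
    def __init__(self) -> None:
        super().__init__()
        # Encoder: compress the spectrogram, discarding noise detail.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: reconstruct a clean spectrogram at the original resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = DenoisingAutoencoder()
noisy = torch.rand(4, 1, 64, 64)   # batch of noisy spectrograms (dummy data)
clean = torch.rand(4, 1, 64, 64)   # corresponding clean targets (dummy data)
denoised = model(noisy)

# One supervised training step: minimize reconstruction error against the clean target.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(model(noisy), clean)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The stride-2 convolutions halve each spatial dimension twice and the transposed convolutions restore them, so the output spectrogram has the same shape as the input.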
The transcription systems studied do not use noise suppression, so our system incorporates a noise-suppression algorithm to improve transcription accuracy even in noisy environments. One existing system uses Google’s Speech-to-Text API to obtain the transcription; because of its proven success, it is also employed in our system.
Papers focusing on the classification of medical data were also reviewed; their insights informed the design of our labeling module.
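A labeling module of this kind can be illustrated with scikit-learn, where TF-IDF stands in for the "relative frequency of occurrence" feature and a linear SVM assigns each utterance to a tag. The tags and training utterances below are hypothetical, not the project's dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative training set: utterances labeled with summary-table tags.
utterances = [
    "I have had a headache for three days",
    "take one tablet of paracetamol twice daily",
    "the blood test results look normal",
    "my stomach hurts after eating",
    "continue the antibiotics for five more days",
    "your x-ray shows no fracture",
]
tags = ["symptom", "medication", "diagnosis",
        "symptom", "medication", "diagnosis"]

# TF-IDF weights words by relative frequency; LinearSVC picks the tag.
classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
classifier.fit(utterances, tags)

prediction = classifier.predict(["I feel a sharp pain in my chest"])[0]
```

In the full system, each tagged utterance would then be placed under the matching column of the tabular summary.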