
CLASSIFYING GENE VARIATIONS USING NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING

BVCOE, BEIT
• Karan Solanki
• Anish Shelte

CURRENT WORKFLOW

1. A molecular pathologist selects a list of genetic variations of interest that he/she wants to analyse.

2. The molecular pathologist searches the medical literature for evidence that is relevant to the genetic variations of interest.

3. Finally, the molecular pathologist spends a huge amount of time analysing the evidence related to each of the variations in order to classify them.

THE PROCESS OF GOING THROUGH STEP 1 AND STEP 2 IS FAIRLY EASY, BUT STEP 3 IS SUPER TIME-CONSUMING.

THEREFORE, OUR GOAL IS TO REPLACE STEP 3 WITH A MACHINE LEARNING MODEL.

PROBLEM STATEMENT

• Classifying the given genetic variation based on evidence from text-based clinical literature/research papers.

DATASET

There are nine different classes into which a genetic mutation can be classified.

Therefore, this is a multi-class classification problem.

This is not a trivial task, since interpreting clinical evidence is very challenging even for human specialists.
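
To make the multi-class setup concrete, here is a minimal sketch of a text classifier that returns a probability for each of the nine classes. The slides do not name a model, so the choice of a multinomial Naive Bayes classifier is an assumption made purely for illustration, as are the function names `train` and `predict_proba`, the numbering of the classes from 1 to 9, and the three toy sentences with their arbitrary labels (they are not taken from the dataset).

```python
import math
from collections import Counter, defaultdict

# Nine class labels; numbering them 1..9 is an assumption for this sketch.
CLASSES = list(range(1, 10))

def train(samples, alpha=1.0):
    """Fit word counts per class from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    doc_counts = Counter()              # label -> number of documents
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts[label].update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return {"word_counts": word_counts, "doc_counts": doc_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(samples)}

def predict_proba(model, text):
    """Return {label: probability} over all nine classes."""
    alpha, vocab = model["alpha"], model["vocab"]
    log_scores = {}
    for label in CLASSES:
        # Smoothed prior, so classes unseen in training keep nonzero probability.
        prior = (model["doc_counts"][label] + alpha) / (
            model["n_docs"] + alpha * len(CLASSES))
        total = sum(model["word_counts"][label].values())
        score = math.log(prior)
        for word in text.lower().split():
            if word in vocab:  # out-of-vocabulary words are ignored
                count = model["word_counts"][label][word]
                score += math.log((count + alpha) / (total + alpha * len(vocab)))
        log_scores[label] = score
    # Normalise the log scores into probabilities (shift by max for stability).
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}

# Toy, made-up training sentences with arbitrary labels (illustration only).
toy = [
    ("truncating mutation loss of protein function", 1),
    ("activating mutation increased kinase activity", 7),
    ("variant retains wild type activity", 5),
]
model = train(toy)
probs = predict_proba(model, "mutation increased kinase activity")
best = max(probs, key=probs.get)  # most probable class for the query text
```

Because the model is built from per-class word counts, the words that push a prediction toward a class can be inspected directly, which is one reason a simple model of this kind is a reasonable first baseline for this problem.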

OBJECTIVES AND CONSTRAINTS

• Interpretability of the model is important.

• There is no low-latency requirement.

• Errors are very costly.

• Class probabilities are needed, rather than the model only stating the class label.
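
Since the model is expected to output probabilities and errors are costly, one natural way to score it is multi-class log loss, which penalises confident wrong predictions heavily. The slides do not name an evaluation metric, so treating log loss as the metric is an assumption here; the function name `multiclass_log_loss` and the example probability vectors are likewise hypothetical. A minimal sketch:

```python
import math

def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log probability assigned to the true class.

    y_true: list of true class indices (0..8 for nine classes).
    y_prob: list of probability lists, one list of nine values per sample.
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        # Clamp to avoid log(0) when a model assigns zero probability.
        p = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)

# A uniform guess over nine classes scores exactly log(9).
uniform = [1 / 9] * 9
uniform_loss = multiclass_log_loss([0], [uniform])

# A confident prediction for the wrong class is penalised far more.
confident_wrong = [0.01] + [0.99 / 8] * 8
wrong_loss = multiclass_log_loss([0], [confident_wrong])
```

A model whose log loss is not clearly below the uniform-guess value is doing no better than assigning equal probability to all nine classes, which makes that value a useful sanity baseline.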