Arushi Melanoma
Arushi Raghuvanshi
In the United States in 2010 there were only about 70,000 new cases of melanoma but 10,000 deaths, making it one of the deadliest types of cancer. Melanoma can be cured, so why is it so deadly? The biggest problem with melanoma is that it is not diagnosed early enough.
I created a computer program for at-home use to help with the early diagnosis of melanoma.
Abstract
Melanoma is one of the most dangerous and potentially deadly types of skin cancer; however, if diagnosed early, it is nearly one hundred percent curable [UnderstMel09]. Here I propose an efficient system that helps with the early diagnosis of melanoma. Different image processing techniques and machine learning algorithms are evaluated to distinguish between cancerous and noncancerous moles. Two image feature databases were created: one compiled from a dermatologist-training tool for melanoma from Hosei University, and the other created by extracting features from digital pictures of lesions using a software tool called Skinseg. I then applied various machine learning techniques to the image feature databases using a Python-based tool called Orange. The experiments suggest that, among the methods tested, the combination of Bayes machine learning with Hosei image feature extraction is the best method for detecting cancerous moles. Using this method, a computer tool was then developed to return the probability that an image shows a cancerous mole. This is a very practical application, as it allows the probability that a mole is cancerous to be estimated at home. This does not replace visits to a
Procedure (1)
Step 1. Acquire Images
Acquire images of both cancerous and non-cancerous moles from dermatologists and from the internet.
Step 2. Create feature databases using different image feature extraction software tools
i) Skinseg tool: Skinseg segments a given image to isolate the portion of interest (i.e. the mole) and extracts a set of features from this segment.
After compiling a set of images, open each image individually and segment it. If automatic segmentation does not work, use semi-automatic or manual segmentation.
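As a rough illustration of what automatic segmentation does, the sketch below separates a dark mole from lighter skin by Otsu thresholding on a tiny grayscale grid. This is not Skinseg's algorithm, which is not described here; the function names and the toy image are assumptions made only for illustration.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best splits pixels into two classes."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_level, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for level in range(256):
        w_bg += hist[level]
        sum_bg += level * hist[level]
        w_fg = total - w_bg
        if w_bg == 0:
            continue
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_level = between, level
    return best_level


def segment(image):
    """Return (threshold, mask); mask is 1 where the pixel is dark (assumed mole)."""
    flat = [p for row in image for p in row]
    threshold = otsu_threshold(flat)
    mask = [[1 if p <= threshold else 0 for p in row] for row in image]
    return threshold, mask


# Hypothetical 4x4 grayscale image: a dark 2x2 mole on light skin.
toy_image = [
    [200, 198, 201, 199],
    [202, 60, 55, 200],
    [199, 58, 62, 201],
    [200, 201, 199, 198],
]
threshold, mask = segment(toy_image)
mole_area = sum(sum(row) for row in mask)
```

When a single threshold fails, for example on low-contrast lesions, the semi-automatic or manual route described above is the fallback.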
Procedure (2)
Once all of the feature files are saved, use a Python script to create a database of the following features:
Dominant HSI intensity
Asymmetry
Entropy
Boundary irregularity
Energy
Average RGB intensity
Inertia
Dominant RGB intensity
Homogeneity
Average HSI intensity
This creates a database of the features in a TAB-delimited file.
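A minimal sketch of such a conversion script follows. The layout of the Skinseg feature files is not given here, so the "name: value" line format, the three-column subset, and the function names are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical subset of feature columns; the real database has ten features.
FEATURES = ["asymmetry", "entropy", "energy"]


def parse_feature_text(text):
    """Parse 'name: value' lines (an assumed format) into a dict of floats."""
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, raw = line.split(":", 1)
        values[name.strip().lower()] = float(raw)
    return values


def build_tab(feature_texts):
    """Turn {file name: file contents} into TAB-delimited text, one row per file."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["image"] + FEATURES)
    for image_name in sorted(feature_texts):
        values = parse_feature_text(feature_texts[image_name])
        writer.writerow([image_name] + [values[f] for f in FEATURES])
    return out.getvalue()


# Made-up feature files standing in for saved Skinseg output.
example_files = {
    "mole2.txt": "Asymmetry: 0.40\nEntropy: 3.1\nEnergy: 0.20\n",
    "mole1.txt": "Asymmetry: 0.10\nEntropy: 2.5\nEnergy: 0.30\n",
}
tab_text = build_tab(example_files)
```

Reading the files from disk and writing `tab_text` to a `.tab` file is left out to keep the sketch short.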
ii) Hosei tool: The Hosei tool has predetermined features for given images. Most likely, these features were determined by physician inspection.
Using this dermatologist-training website, compile a set of the following features: Globules Symmetry Atypical Pigment Borders Blue Whitish Veil Color Atypical Vascular Pattern Pigment Network Irregular Streaks Branched Streaks Irregular Pigmentation Homogenous Regression Structures Dots
Procedure (3)
Step 3. Machine Learning
The databases created in step 2 are now used for machine learning. There are various methods for machine learning. I worked with the following:
Majority Learning
Bayes Learning
Decision Trees
kNN (Nearest Neighbor)
Evaluate the above methods using a Python-based software tool called Orange. I wrote code in this program to test the percent accuracy on different sets of data for a given machine learning method and feature extraction method.
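The kind of comparison described above can be sketched in plain Python. This does not use Orange's API, and it is not the code written for this project: the two classifiers, the leave-one-out test, and the made-up two-feature data are assumptions chosen to show how a percent accuracy is obtained for each method.

```python
from collections import Counter


def majority_classifier(train):
    """Always predict the most common label in the training data."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common


def knn_classifier(train, k=3):
    """Predict by majority vote among the k nearest training rows."""
    def predict(features):
        def distance(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], features))
        nearest = sorted(train, key=distance)[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def leave_one_out_accuracy(make_classifier, data):
    """Percent of rows predicted correctly when each is held out in turn."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if make_classifier(train)(features) == label:
            correct += 1
    return 100.0 * correct / len(data)


# Made-up (feature vector, label) rows; not data from this project.
toy_data = [
    ((0.10, 0.20), "benign"), ((0.20, 0.10), "benign"),
    ((0.15, 0.25), "benign"), ((0.20, 0.30), "benign"),
    ((0.25, 0.15), "benign"),
    ((0.80, 0.90), "melanoma"), ((0.90, 0.80), "melanoma"),
    ((0.85, 0.75), "melanoma"),
]
majority_accuracy = leave_one_out_accuracy(majority_classifier, toy_data)
knn_accuracy = leave_one_out_accuracy(knn_classifier, toy_data)
```

Majority learning serves as a baseline here: a method is only useful if it beats the accuracy obtained by always guessing the most common class.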
Image Acquisition
Taken by normal digital camera
Acquired images from multiple sources:
Dr. Kristin Stevens, MD, Dermatologist, Providence Medical Group, Portland, OR
Dr. Sandhya Koppula, MD, Dermatologist, Cornell Dermatology Clinic, Beaverton, OR
Internet
Machine Learning
Machine Learning Tools
I used Orange, a Python-based tool with libraries of multiple machine learning algorithms. I wrote programs in Orange to test a variety of machine learning algorithms (listed below). MVSIS is another machine learning tool, which I explored but did not test.
Bayes Learning
In Bayes learning, Bayesian networks are created which represent the relationship between a given feature and the probability that the mole is cancerous. Combined, these networks can give a probability for whether or not the mole is cancerous.
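One simple way to combine per-feature evidence into a single probability is a naive Bayes classifier, sketched below. This is an assumption about the general approach, not the Orange code used in this project; the feature names, the yes/no values, the add-one smoothing, and the training rows are made up for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows):
    """rows: list of (feature dict, label). Returns the counts used for prediction."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> counts of each value
    values_seen = defaultdict(set)       # feature -> set of observed values
    for features, label in rows:
        for name, value in features.items():
            value_counts[(name, label)][value] += 1
            values_seen[name].add(value)
    return class_counts, value_counts, values_seen


def probability(model, features, target="melanoma"):
    """Return P(target | features), treating features as independent given the class."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior probability of the class
        for name, value in features.items():
            count = value_counts[(name, label)][value]
            # Add-one smoothing so an unseen value never gives a zero probability.
            score *= (count + 1) / (n + len(values_seen[name]))
        scores[label] = score
    return scores[target] / sum(scores.values())


# Made-up training rows with two yes/no features.
toy_rows = [
    ({"veil": "yes", "streaks": "yes"}, "melanoma"),
    ({"veil": "yes", "streaks": "no"}, "melanoma"),
    ({"veil": "no", "streaks": "no"}, "benign"),
    ({"veil": "no", "streaks": "no"}, "benign"),
    ({"veil": "no", "streaks": "yes"}, "benign"),
]
model = train_naive_bayes(toy_rows)
p_suspicious = probability(model, {"veil": "yes", "streaks": "yes"})
p_plain = probability(model, {"veil": "no", "streaks": "no"})
```

With these toy rows, a mole showing both features receives a much higher melanoma probability than one showing neither, which is the behavior the paragraph above describes.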
Decision Trees
This machine learning method creates a tree based on the training data. There are a variety of techniques for building the best tree and for determining which features are important and which are not.
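One common way to decide which feature matters most at a given node of the tree is information gain, sketched below. The source does not say which splitting criterion was used, so this choice, the feature names, and the training rows are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, feature):
    """Reduction in label entropy obtained by splitting rows on one feature."""
    labels = [label for _, label in rows]
    groups = {}
    for features, label in rows:
        groups.setdefault(features[feature], []).append(label)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows):
    """Return the feature whose split gives the highest information gain."""
    return max(rows[0][0], key=lambda f: information_gain(rows, f))


# Made-up rows: "veil" separates the classes perfectly, "streaks" not at all.
tree_rows = [
    ({"veil": "yes", "streaks": "yes"}, "melanoma"),
    ({"veil": "yes", "streaks": "no"}, "melanoma"),
    ({"veil": "no", "streaks": "yes"}, "benign"),
    ({"veil": "no", "streaks": "no"}, "benign"),
]
gain_veil = information_gain(tree_rows, "veil")
gain_streaks = information_gain(tree_rows, "streaks")
root_feature = best_feature(tree_rows)
```

A tree builder would place `root_feature` at the root and repeat the same selection on each resulting subset of rows.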
Flowchart (figure not reproduced): Features, then Machine Learning, then Diagnosis.
Building the database: each original image is segmented and its features are extracted to a text file. A Python script (convert.py) takes all of the feature text files as input and creates a TAB file, skinsegdb.tab, in which each input file is represented by one row.
Diagnosing a user-submitted image: the image is segmented and its features are extracted. A Python script in Orange then applies the machine learning methods (Decision Trees, Majority Learning, Bayes Learning, Nearest Neighbor) using skinsegdb.tab to provide a probability for whether or not the user-submitted image shows a cancerous mole.
Results
Acknowledgements
Dr. A. Goshtasby, Wright State University, on the Skinseg image processing tool
Dr. Scott E. Umbaugh, Southern Illinois University Edwardsville, on CVIP tools
Dr. Alan Mishchenko, University of California Berkeley, for MVSIS
Mr. Holger Lüdtke, founder of MoleExpert, for providing access to an evaluation version of the MoleExpert Micro tool
Ms. Iris Cheng, University of California Berkeley, for sharing her research on image processing
Dr. B.J. Shrestha, Missouri University of Science and Technology
Dr. Kristin Stevens, MD, Dermatologist, Providence Medical Group, Portland: field expert for current methods of diagnosis; also provided images of melanoma
Dr. Sandhya Koppula, MD, Dermatologist, Cornell