
Brain Tumor Analysis of the LGG Segmentation Dataset

T. Bala Saatvik
Coimbatore Institute of Technology
Coimbatore, India
71762108005@cit.edu.in

A. Karthikeyan
Coimbatore Institute of Technology
Coimbatore, India
71762208202@cit.edu.in

G.S. Deepakkumar
Coimbatore Institute of Technology
Coimbatore, India
71762108006@cit.edu.in

D. Sri Surya
Coimbatore Institute of Technology
Coimbatore, India
71762108044@cit.edu.in

Abstract—This report presents an analysis of the LGG
Segmentation Dataset, which includes brain MR images and
manual FLAIR abnormality segmentation masks. The
dataset consists of 110 patients from The Cancer Genome
Atlas (TCGA) lower-grade glioma collection, with
corresponding fluid-attenuated inversion recovery (FLAIR)
sequences and genomic cluster data. Additionally, tumor
genomic clusters and patient information are provided in the
data.csv file. The report covers data preprocessing,
descriptive statistics, visualization techniques, correlation
analysis, and a Power BI dashboard for comprehensive
analysis of the dataset.

Keywords—Brain tumor analysis, LGG Segmentation
Dataset, MR images, FLAIR abnormality segmentation
masks, data preprocessing, descriptive statistics,
visualization techniques, correlation analysis, Power BI
dashboard.

I. Introduction
The LGG Segmentation Dataset is a valuable resource
for brain tumor analysis. It comprises brain MR images and
corresponding FLAIR abnormality segmentation masks
obtained from The Cancer Imaging Archive (TCIA). This
report aims to analyze the dataset and provide insights into
brain tumor characteristics and segmentation.

II. Dataset Description
The dataset consists of brain MR images and
segmentation masks for FLAIR abnormality. The key details
are as follows:

Data Source: The Cancer Imaging Archive (TCIA)
Patient Information: The dataset includes 110 patients
from the TCGA lower-grade glioma collection.
Image Format: The images are provided in the .tif format
with 3 channels per image.
Image Sequences: For 101 cases, the dataset provides
three sequences: pre-contrast, FLAIR, and post-contrast (in
this order of channels). However, for 9 cases, the
post-contrast sequence is missing, and for 6 cases, the
pre-contrast sequence is missing. To standardize the dataset,
the missing sequences are replaced with the FLAIR sequence,
resulting in all images having 3 channels.
Mask Format: The masks are binary, 1-channel images
that segment the FLAIR abnormality present in the FLAIR
sequence, which is available for all cases.

III. Data Preprocessing
To prepare the dataset for analysis, various preprocessing
steps were performed, including data cleaning, handling
missing sequences, normalization, and alignment of images
and masks.

IV. Descriptive Statistics
Descriptive statistics were computed to gain insights into
the dataset's characteristics. Statistical measures such as
mean, standard deviation, minimum, maximum, and quartiles
were calculated for image dimensions, tumor size, and other
relevant attributes.

V. Visualization Techniques
Visualization techniques were employed to provide a
visual representation of the dataset. Histograms, box plots,
scatter plots, and heatmaps were generated to visualize image
intensity distributions, tumor size distributions, spatial
relationships, and correlations between variables.

VI. Correlation Analysis
Correlation analysis was performed to examine the
relationships between different variables in the dataset.
Correlation coefficients were computed, and correlation
matrices were visualized to identify potential associations
and dependencies.

VII. Brain Tumor Prediction using UNet Model
In addition to the analysis mentioned above, a UNet
model was developed to predict brain tumor presence based
on the LGG Segmentation Dataset. This section highlights
the steps involved in training and evaluating the UNet model
for tumor prediction.

VIII.A. Model Architecture: UNet
The UNet model architecture is a popular choice for
medical image segmentation tasks. It consists of an
encoder-decoder structure with skip connections, enabling
precise localization of tumor regions. The model takes the
FLAIR MRI images as input and predicts the corresponding
tumor segmentation masks.

VIII.B. Data Preparation
To train the UNet model, the dataset was split into
training and validation sets. Data augmentation techniques
such as rotation, flipping, and scaling were applied to
increase the dataset size and improve model generalization.
The FLAIR MRI images were used as input, while the
corresponding binary tumor segmentation masks served as
the ground truth labels.
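
The split and augmentation steps can be illustrated with a small,
dependency-free sketch. It is not the code used for this report: the
function names, the 20% validation fraction, the fixed seed, and the
choice to split by patient rather than by slice are all assumptions
made for illustration. Images and masks are represented as plain 2-D
lists of pixels, and scaling is omitted for brevity. The point it
shows is that every geometric transform must be applied identically
to an image and its mask so the two stay aligned.

```python
import random


def split_patients(patient_ids, val_fraction=0.2, seed=0):
    """Shuffle patient IDs and return (train_ids, val_ids).

    Splitting by patient (an assumption here) keeps slices from one
    patient out of both sets at the same time.
    """
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    n_val = max(1, round(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]


def hflip(grid):
    """Mirror a 2-D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate a 2-D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        image, mask = rot90(image), rot90(mask)
    return image, mask
```

Because the grid helpers never inspect pixel values, the same code
works when each pixel is a 3-channel tuple rather than a single number.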

VIII.C. Model Training


The UNet model was trained using a combination of
pixel-wise binary cross-entropy loss and Dice coefficient loss
to optimize both the tumor shape and location. The training
process involved feeding batches of FLAIR images into the
model, comparing the predicted tumor masks with the
ground truth masks, and updating the model parameters using
backpropagation. The training was performed for a specified
number of epochs, with a suitable learning rate and
optimizer.
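
A minimal sketch of such a combined loss is given below, written in
plain Python over flattened masks so that the arithmetic is easy to
follow. The report does not state how the two terms were weighted or
smoothed, so the equal weighting (bce_weight=0.5), the smoothing
constant, and the function names are assumptions. A real training loop
would compute the same quantities on tensors in a deep learning
framework so that gradients can be backpropagated.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean pixel-wise binary cross-entropy over flattened masks."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + s) / (sum(P) + sum(T) + s)."""
    overlap = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)


def combined_loss(probs, targets, bce_weight=0.5):
    """Weighted sum of the two terms; the weight is an assumed value."""
    return (bce_weight * bce_loss(probs, targets)
            + (1.0 - bce_weight) * dice_loss(probs, targets))
```

The cross-entropy term scores every pixel independently, while the
Dice term scores the overlap of the predicted and true regions as a
whole, which helps when tumor pixels are a small share of the image.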

VIII.D. Model Evaluation


To evaluate the performance of the UNet model, the
trained model was applied to the validation set. The predicted
tumor masks were compared with the ground truth masks
using evaluation metrics such as the Dice coefficient, Jaccard
index, and accuracy. Additionally, qualitative analysis was
conducted by visually inspecting the predicted tumor
segmentations against the actual tumor regions in the
validation images.
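
For reference, the three metrics named above can be computed from
binary masks as in the following sketch. The function names are
illustrative, and the convention of returning 1.0 when both masks are
empty is an assumption; the report does not say how empty slices were
scored.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two flattened binary masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice_coefficient(pred, truth):
    """Dice = 2*TP / (2*TP + FP + FN); 1.0 if both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def jaccard_index(pred, truth):
    """Jaccard = TP / (TP + FP + FN); 1.0 if both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the truth."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + fp + fn + tn)
```

For a single pair of masks the two overlap metrics are related by
Dice = 2J / (1 + J), so they rank predictions in the same order.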

VIII.E. Results and Discussion


The results of the UNet model for brain tumor prediction
are presented, including evaluation metrics and qualitative
analysis. The model's performance in terms of accuracy,
sensitivity, and specificity is discussed, highlighting its
ability to accurately identify tumor regions in brain MR
images. Limitations and potential areas for improvement are
also discussed.
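
When reading accuracy, sensitivity, and specificity for a segmentation
model, it helps to see how they behave when most pixels are
background. The sketch below uses a made-up 100-pixel mask (not data
from this report) and a hypothetical helper name to show that a
prediction containing no tumor pixels at all can still reach high
accuracy and specificity while its sensitivity is zero.

```python
def sensitivity_specificity(pred, truth):
    """Return (sensitivity, specificity) for flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative mask: 2 tumor pixels out of 100, predicted as all background.
truth = [1, 1] + [0] * 98
pred = [0] * 100

accuracy = sum(p == t for p, t in zip(pred, truth)) / len(truth)
sens, spec = sensitivity_specificity(pred, truth)
print(accuracy, sens, spec)
```

This is one reason overlap metrics such as the Dice coefficient are
usually reported alongside accuracy for tumor segmentation.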

IX. Power BI Dashboard


To facilitate interactive exploration and analysis, a Power
BI dashboard was created. The dashboard incorporates
visualizations, filters, and interactive elements to provide a
comprehensive view of the dataset and enable users to gain
deeper insights into the patient data.

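X. Conclusion
In conclusion, this report provides a comprehensive
analysis of the LGG Segmentation Dataset, including data
preprocessing, descriptive statistics, visualization techniques,
correlation analysis, the development of a UNet model for
brain tumor prediction, and the creation of a dashboard using
Microsoft Power BI. The UNet model demonstrates
promising results in accurately segmenting brain tumors
based on FLAIR MRI images. The findings contribute to the
field of medical image analysis and provide insights for
further research and advancements in brain tumor detection
and treatment.

XI. Sample Visualization

XII. Power BI Dashboard Image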

XIII. Acknowledgment
We would like to acknowledge the support and resources
provided by The Cancer Imaging Archive (TCIA) for the
LGG Segmentation Dataset. Additionally, we extend our
gratitude to the research community and contributors
involved in making this dataset available for scientific
research and development.
