• Objective: Multimodal Feature-Wise Attention Model for Visual Question Answering (Jun’21-Present)
Guide: Prof. Biplab Banerjee (CSRE), IIT Bombay | Guide: Prof. Leena Vachhani (SysCon), IIT Bombay
◦ Objective: Develop a deep-learning-based model that extracts image and text features for answer prediction.
◦ Ongoing work: Trained a CNN (VGGNet) and an RNN (LSTM) to extract image and text features.
◦ Concatenating the outputs of these CNN and RNN models for answer prediction on the MS-COCO dataset.
◦ Future work: Introduce a feature-wise attention mechanism to learn cross-modal attention between
the image and question modalities.
HOBBIES