Pawan Rajpoot

· NLP | KNOWLEDGE GRAPHS | MULTI-MODAL | DEEP LEARNING | LLMS


03-03, 10 Shelford, 288353, Singapore
 (+65) 9077-0324 |  pawan.rajpoot2411@gmail.com |  pawan2411 |  pawanrajpoot | Google Scholar

Summary
8+ years of experience in core NLP research and development, currently working on multi-modal knowledge graphs and their applications in search
and vision-language models at scale. Solved core e-commerce problems at scale using product knowledge graphs. Deep knowledge of
core NLP problems such as verb sense disambiguation, WSD, and semantic role labeling, with state-of-the-art results. Worked on machine translation,
language models, first-order logic, paraphrase detection, web ontologies, and knowledge bases. Built project LOGICNET, a state-of-the-art
inference engine that forms the backbone of IPsoft’s cognitive agent Amelia. Research and hands-on experience with LLMs, including GPT-2/3/4 and ChatGPT.

Work Experience
Huawei Singapore
STAFF ENGINEER, POISSON SEARCH LABS July 2022 - PRESENT
• Research focus on multi-modal knowledge graphs and their application in general search.
• Conditional Image Retrieval for Fashion domain using state-of-the-art multi-modal VL frameworks.
• State-of-the-art results on "Outside Knowledge Visual Question Answering" and "Image Text Matching" on large web-scale data (paper in writing).
• Integrating large language models with state-of-the-art vision-language models to build a logical inference engine, using LangChain, chain-of-thought
prompting, and prompt engineering.
Rakuten, India Bangalore, India
LEAD DATA SCIENTIST, RAKUTEN CATALOGUE PLATFORM December 2020 - June 2022
• Product Knowledge Graph: Led the effort to create and enrich the graph for Rakuten Rewards partner data, including information extraction
pipelines with attribute discovery using various ML/DL methods. High-impact project that led to a 1B+ yen turnaround in gross sales for
2020-21.
• Attribute Knowledge Graph: Product attribute data across all Rakuten partner websites, enriched with attribute details such as category-specific
attribute definitions, synonyms, and value normalization. The project unified all e-commerce data AI pipelines at Rakuten, with an average load of
1 million extraction requests per day.
Bangalore, India
SENIOR DATA SCIENTIST, RAKUTEN CATALOGUE PLATFORM September 2019 - December 2020
• Product Matching: Led the project and managed a team of 2 data scientists, 2 engineers, and 3 interns. The project serves as a platform for product
matching over all of Rakuten's and its partners' catalogue data; key pipelines include text/image attribute extraction, knowledge enrichment,
and matching.
• Scaling Deep Learning Model workflow using Kubeflow, Google AI Hub, Kafka Streams, Tensor Serving.
Tact.AI Bangalore, India
SOFTWARE ENGINEER, AI TEAM May 2018 - September 2019
• Developed a scaled BiLSTM-CRF based intent classifier and entity extraction model with ELMo embeddings.
• Developed Model deployment workflow on JFrog artifact repository for different environments.
IPsoft Bangalore, India
RESEARCH & DEVELOPMENT ENGINEER, AMELIA SCIENCE TEAM Apr. 2016 - May 2018
• Worked on project LOGICNET from POC to production. The project forms the backbone of Amelia (cognitive agent): an inference engine built
on top of RDF, responsible for semantic parsing of utterances.
• Verb Sense Disambiguation, open-domain solution based on an unsupervised algorithm using the Google n-gram dataset.
• Verb Sense Disambiguation, open domain solution based on Deep Averaging Networks.
• Question Classifier, based on semantic role labeling and SVM, with a state-of-the-art F1 score of 0.96.
• Google N Gram, Analysis of historical book data using Lucene based dictionary. Added NER, POS tagging and did a comparison with Wikipedia
Dependency Dump and SVO triplet data by NELL.
Factset Research Systems Hyderabad, India
SOFTWARE ENGINEER, MACHINE LEARNING AND LANGUAGE TRANSLATION TEAM Jul. 2015 - Mar. 2016
• Developed Ukrainian to English and Portuguese to English Translator under Moses framework with 80 percent average accuracy.
• Worked on improving the accuracy of project Table Handler, which extracts tables from PDF documents and structures the information in a
logical manner.
• Worked on improving the pipeline for Translation framework.

Education
IIIT-B (International Institute of Information Technology) Bangalore, India
M.TECH. IN INFORMATION TECHNOLOGY. Jul. 2013 - Jul. 2015
• Specialization: Computer Sciences. Overall CGPA: 3.04/4.0

MAY 9, 2023 PAWAN RAJPOOT · RÉSUMÉ 1


Honors & Awards
HONORS
2021/22 Rakuten E-Rank, Member of the E-Rank community, Rakuten's club of its top 200 employees worldwide, two years in a row Bangalore, India
2020 Rakuten Eureka Award (Team), Yearly award Bangalore, India
2020 Rakuten Star Award - 2 Nos., June & November Bangalore, India
2015 Infosys Scholarship, IIIT B, M. Tech. 2013-15 Batch. Bangalore, India
2011 Microsoft Student Partner, Represented Microsoft at State/National Level Seminars. Noida, India
2010 Mozilla Student Rep, FOSS member and active organizer of Mozilla Events (Link). Noida, India

HACKATHONS
2017 2nd Place, NLP Skill Test by Analytics Vidhya (800+ participants) (Link) Bangalore, India
2016 2nd Place, APIcalypse by doselect (Link) Hyderabad, India
2015 5th Place, Source Awaken by doselect (Link) Hyderabad, India

Skills
Natural Language Processing 8+ Years
SEMANTIC PARSING
• RDF, Neo4j, JanusGraph, Gremlin, GQL.
• VerbNet, WordNet, FrameNet, Entity Recognition, Dependency Parsing.
• First Order Logic, Attempto, Ontologies, Knowledge Base.
• PCA, LSI, LDA, TF-IDF, ELMo, GloVe, word2vec, fastText etc.
Machine / Deep Learning 8+ Years
FRAMEWORKS AND ALGORITHMS
• Linear Regression, Logistic Regression, SVM, Neural Networks.
• Scikit, DL4J, Tensorflow, Keras, Torch, PyTorch, Distributed GPU training.
• Snorkel, Data Programming, Kafka, Tensor serving, Kubeflow
Development 8+ Years
LANGUAGE AND TOOLS
• C, Java, Python, Shell, R, Scala.
• IntelliJ, PyCharm, Eclipse, Git, Maven.
• SQL, RDF, Lucene, Elasticsearch, FAISS, Hive.

Patent and Publications


Lexical Simplification using Multi-Level and Modular Approach
EMPIRICAL METHODS IN NLP 2022 TSAR

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
THE NORTHERN EUROPEAN JOURNAL OF LANGUAGE TECHNOLOGY 2022

E-commerce Product Matching at Internet Scale
13TH INTERNATIONAL CONFERENCE ON E-BUSINESS, MANAGEMENT AND ECONOMICS, 2022

Visual Word Sense Disambiguation using zero shot multimodal approach
ACL 2023 SEMEVAL

Dual Task Framework for Crowdsourcing Low-Resourced Languages: Conversational-Long Form Generation
ACM CONFERENCE ON CONVERSATIONAL USER INTERFACES

E-commerce attribute extraction at scale
PATENT (UNDER REVIEW)
