
Shubham Gupta
mastershubham@gmail.com | +91 84316-24358 | Shubham | shubham | Shubham

EDUCATION

Indian Institute of Technology (IIT) Bhilai, India
B.Tech. Computer Science (2019-2023)
CGPA – 9.16/10 (5th, 6th, 7th Sem: SGPA 10/10)
PSBB Learning Leadership Academy, Bangalore, India
SSCE 2019 | CBSE | 94.6%
High School 2017 | CBSE | CGPA – 10/10

ACHIEVEMENTS

• ICPC 2022 Regionalist (Kanpur Site)
• Top Contributor – NumPy (v1.22.0) and Pandas (v1.3.4)
• de.ci.phe.red Labs CTF’21 | 3rd position
• Google Code Jam 2021 – Qualified till Round 2
• Google Hash Code 2021 – AIR 1333
• CodeChef sed_123 – 2094 (5 Star)
• Codeforces sh1194 – 1700 (Expert)
• Leetcode – 550+ Solved (Beats 96%)
• IIT JEE Adv 2019 – AIR 5146
• JEE Main 2019 – 99.37 percentile
• KVPY IISc – Fellowship Holder, AIR 353 Rank List
• KCET – State Rank 99
• NSTSE Rank Holder – AIR 98

SKILLS

Areas: System Design, Cloud, Security, VLSI, Software Engineering, ML (Computer Vision, NLP, Information Retrieval, Text Analytics), Open Source, Full-Stack, Embedded Systems
Libraries/Frameworks: Kubernetes, TensorFlow, Keras, NLTK, SciPy, PyPlot, TextBlob, spaCy, Selenium, Pandas, RegEx, scikit-learn, speech_to_text, OpenCV, Mypy, PyTorch, Matplotlib, TorchVision, pytest, pytz, Cython, Django, bnlearn, pgmpy
Web: HTML, CSS, EJS, JavaScript, React.js, Angular, Next.js, Node.js, JSON, XML, Ajax, PHP, Bootstrap, LAMP, MERN
Languages: C, C++, C#, Python, Java, PHP, Kotlin, System Verilog, Apps Script, Solidity, MATLAB, Shell script, gnuplot, SQL, Erlang, Golang, Scala, PDDL, Lex, Yacc
Databases: MySQL, MongoDB Atlas, Redis, DynamoDB, Elasticsearch, Apache Derby
Tools: AWS, MS Azure, GCP, HDFS, CI/CD, Docker, Workbench, Bootloader, TPM, Wireshark, NS-3, Waf, GDB, Valgrind, Maven, Jar, Linux, Shell, Kibana
Platforms: Raspberry Pi, Android Studio, Firebase, Heroku, Colab, Flutter, Visual Studio, Remix, QEMU, GitHub, Power Automate
QC Simulators: Qiskit, Intel Quantum Simulator (IQS), Google Cirq
Miscellaneous: xv6, Reverse Engineering, Image Processing

PROFESSIONAL EXPERIENCE

Microsoft India (R&D) Private Limited: Engineering Intern (Cloud + AI) (June-July 2022): Control Plane Alert System Automation
• Built an alerting system that proactively monitors and flags Microsoft Azure Data Centre servers that are not performing to their full potential and returns a complete Root Cause Analysis (RCA) of their failures before they lead to a customer escalation.
• The alert system automates the debugging process for failed servers by generating automated workflows/incidents and is currently deployed on 80,000 servers with an estimated overall cost of $8 billion.
Tech: C#, Python, React, Power Automate, Power BI, JSON, Azure Data Explorer (Kusto)

C-DAC: Quantum Research Intern (May-July 2021): Implementation of Quantum Computing (QC) Algorithms
• Explored and shortlisted quantum simulators and algorithms.
• Performed extensive coding, rigorous code testing, simulation, and hardware study on the PARAM (India's first indigenous supercomputer) Shrestha computing node and the ATOS QLM machine to produce a comparative study across the shortlisted simulators.
• Worked on detailed error analysis marked by error codes, hardware optimization, the behaviour of circuits under noise and environmental factors, and transpilers.
• Ran advanced tests on real quantum processors provided by Google QC Service.

KEY PROJECTS

Computer System Design: Design Space Exploration of URL Shortening Service
• Built a complete URL Shortening Service designed to scale up to 500M writes per month, based on collected requirements and capacity estimates (200 writes and 20K reads per second).
• Generated a queuing model using Markov Chains for incoming requests and their processing at the cache, network, load balancer, application server and database.
• Analysed 3 design choices (Golang + DynamoDB, NodeJS + MySQL/MongoDB) and generated loads on them to contrast throughput, cache rate, latency and cost per transaction.
Tech: AWS, Golang, DynamoDB, NodeJS, MySQL, MongoDB, Python, K6 (load testing), HTML, CSS, EJS, chart.js, Shell Script, C++ | GitHub Project | Research Paper

Computer Vision: Image Super-Resolution using Deep CNNs (SRCNN) and Upsampling
• Implemented a 3-layer CNN architecture (128 x 64) from scratch that generates High-Resolution (HR) images from distorted Low-Resolution (LR) images using MSE loss, the SGD optimizer and ReLU activation, and achieved state-of-the-art results | Video-Demo
• Applied post-upsampling on HR images to obtain Ultra-HD images that are sharp down to the last pixel, using iterative deblurring, brightness/contrast adjustment and bicubic upsampling | Video-Demo
Tech: PyTorch, TorchVision, OpenCV, Image Reader, image-similarity-measures, SciPy, scikit-image, NumPy, GitHub, Google Colab, NVIDIA CUDA 4.0 | Repository

Information Retrieval Search Engine | GitHub Project
• Constructed an Elastic Search index of the Netflix Dataset and processed queries on GCP.
• Crawled and indexed Wiki pages and added skip pointers for restrictive merging of terms while answering queries, with TF-IDF scoring under the Vector Space Model.
• Computed Precision, Recall, MAP and Pseudo-Relevance feedback on the CIIR UMass Dataset.
• Ranked documents using TF-IDF, Binary Independence and Unigram Language Model.
• Classified documents for given queries using Rocchio and KNN classifiers (K = 1, 3, 5) and reported Precision, Recall and F1 Score (Train : Test – 70:30).
Tech: GCP, Elastic Search, NLTK, Python, Pandas, JSON, Kibana, scikit-learn, matplotlib

Artificial Intelligence Heuristics: Graphic Maze Solver | Heroku App
• Built a graphic maze solver leveraging BFS, DFS, A*, DFBB, Dijkstra, Greedy Best First Search, IDS and IDA* algorithms with the Manhattan Distance heuristic.
• Built Bayesian Belief Networks for discrete probability distributions and SATPLANs using MiniSAT for block movement. Leveraged Min-Max, Genetic and Alpha-Beta Pruning algorithms to solve CSPs, N-Puzzle and TSP | GitHub Project
• Solved logical planning problems (Robot Movement, Logistics) using PDDL with STRIPS.
Tech: CSS, Next.js, PDDL, MiniSAT, Python, bnlearn, pomegranate, pgmpy

Network Communication Applications - Socket Programming | Video-Demo | Description
• Implemented an IPERF application, Reverse Shell, FTP, and end-to-end RSA 256-bit encryption between client and server over a TCP stream using Python constructs from scratch.
• Extended the standard UDP Echo Client-Server to implement IP protocol independence and Real-Time Communication (RTC) with multiple clients within the same network | Report
Tech: Python, C, C++, Bash, PowerShell, Linux | GitHub Repository

Language Processing: Semantic Parsing of C and C++ Program | Video Demo
• Generated tokens for compiled languages to build a Lexical/Syntax Analyser and a Three Address Code (TAC) generator for C programming constructs | Lex | Syntax | TAC
Tech: C, C++, Lex, Yacc, Flex, Bison, Linux, Shell script
OPEN-SOURCE CONTRIBUTION

Pandas, NumPy | GitHub - Pandas | GitHub - NumPy
Opened multiple PRs and contributed to the Python repositories pandas and NumPy across multiple releases by solving multiple release/integration bugs. (Counted among the top contributors for the NumPy 1.22.0 release.)
Tech: Mypy, pytest, pytz, python-dateutil, Cython, Python, Linux, CI/CD

Checkstyle: GSoC | GitHub - Checkstyle
Solved production issues and GSoC-related bugs while debugging large Java codebases by creating multiple PRs.
Tech: Java, Maven, Jar, Linux, CI/CD

EVENTS HOSTED

1. Python Code-Lab for DSC, IIT Bhilai
2. de.ci.phe.red CTF in 2020
3. Code-Ingenuity – Competitive Coding contest by Ingenuity
4. IIT Bhilai’s 1st Annual Quizzing Fest, ExQuizite

POSITIONS OF RESPONSIBILITY

1. Core Member – DSC, IIT Bhilai, run by Google Developers
2. Ingenuity (Coding Club, IIT Bhilai): Core Member
• Created a culture of competitive programming (CP) to guide and mentor students to go into the depths of coding on Codeforces and Leetcode.
• Enhanced DS Algo, Dynamic Programming (DP), and Graph Theory
3. OpenLake – Mentor Community
• Driving multiple Open-Source Projects at Foss Overflow (Insti-App)
4. de.ci.phe.red Labs: Core Member
• Driving research projects on Hashing, Lightweight Crypto, Quantum Cryptanalysis, QKD (BB84).
• Specialise in Reverse-Engineering, Image Processing, CTF | Shubham - Core
5. Core Member – Quiz and Drama
• Represented the IIT Bhilai Quiz club at the 4th Inter-IIT Cultural Meet (Dec 2019) organized by IIT Bombay.

RELEVANT COURSES

Computer System Design, Information Security, AI, Information Retrieval, ML, NLP, VLSI, Blockchain, Cryptography, Quantum Computing, Cloud Computing, VMs, OS, Language Processing, Distributed Systems, ML/DevOps, DBMS, Advanced Graph Theory, Theory of Computation, DSA, Algorithms, Computer Networks, Computer Hardware Architecture, OOPs, Programming Language Principles, Full Stack Development, Digital Logic Design, Discrete Structures, Programming, Entrepreneurship and Start-ups, Linear Algebra, Concurrent Programming, Software Tools/Tech, Calculus, Game Theory, Discrete Math, Astronomy - Galaxies

ACADEMIC PROJECTS

Security: Secure boot in embedded devices (Raspberry Pi)
• Optimized and enhanced the Secure/Verified Boot and U-Boot functionality of the Raspberry Pi to build an end-point authentication system with end-to-end encryption.
Tech: Raspberry Pi (3 and 4), Bootloader, TPM, Linux Kernel, C++, Bash, Shell script

Cloud: Docker Network to link Local Host with Database | GitHub | Description
• Stitched a containerised Docker network with an open-source Apache database to deploy a website on localhost.
Tech: Python, PHP, Docker, Apache Derby, Linux

OS Constructs | GitHub - OS Constructs | GitHub - CoW Fork
• Built a Linux shell and associated mechanisms for Pthread synchronization, dynamic memory management, and process scheduling on the 64-bit Linux architecture.
• Implemented a host of OS operations including file systems, paging, locking, concurrency, networking subsystems, trap handling, semaphores and threading.
• Added a priority copy-on-write to the fork() system call that lets parent and child share the same memory and creates copies on page modification.
Tech: xv6, C, C++, Bash

NLP: Word Embedding Modelling, Visualization with Applications | GitHub
• Built 6 Word2Vec models (CBOW, Skip-Gram, GloVe – 0/1 negative sampling) from scratch to reconstruct the linguistic context of Wikipedia Dumps and obtained a semantic correlation score of 0.29 (on SimLex-999) with p-value of 7.158.
• Implemented a 0/1 classifier that applies word vectors to NER (Information Extraction) on the CoNLL-2003 dataset, achieving accuracy – 94.22%, F1 – 0.978.
• Implemented a Neural Transition Dependency Parser on GloVe 300-dim vectors to vectorize POS Tags/Relations. Results on the EWT English TreeBank: UAS – 78, LAS – 74.7345.
Tech: t-SNE Visualization, Gensim, PyTorch, NLTK, Pandas, NumPy, SciPy, NetworkX

Blockchain: Ethereum Live Project - Remix | Video-Demonstration | GitHub Repository
• Implemented a mechanism to transfer digital assets by changing NFT ownership via an open auction with multiple users bidding, with a 5% royalty transferred to the artist.
• Implemented multiple smart contract applications (Auction, Contract Tender, Lottery, and Voting) over the Ethereum blockchain | Ropsten Deployment
Tech: Solidity, Remix IDE, Ethereum, MetaMask, Ropsten, Etherscan

NLP: Multi-Class Hindi Speech Profanity Detection: Lexicon Approach + Bi-LSTM
• Enhanced paper implementation: lexicon with sentiment polarity identification on Hindi Subjectivity WordNet and Theme-Nouns, fed into a lexicon classifier (linear, O(n) time complexity) and tested on real-world web discourse to obtain an improved Precision – 48.58%, Recall – 91.84%, F1 – 63.54% (Recall >> Precision) | GitHub
• Novel implementation of a Bi-LSTM architecture on a Hindi Hate Speech Dataset of Fake/Defamation/Hostile Tweets. Accuracy: 58%, Loss: 1.326.
Tech: Pandas, PyTorch, NLTK

Networks – Application and Transport Layer with NS-3 | Description | GitHub Repo
• Built a full network simulation program across 2 localhosts to transfer large files (> 1 MB) and study how network throughput and congestion window vary with bandwidth changes.
• Generated PCAP files to study the Application (HTTP, FTP, DNS) and Transport (TCP, UDP, ICMP) layer protocols, detailing the TCP variants Cubic, NewReno, Vegas and Africa.
• Explored various networking commands (ping, netstat/dig) and their functionalities.
Tech: NS-3, NetAnim, Wireshark, tshark, tcpdump, gnuplot, Cisco Packet Tracer, RFC

NLP: Language Modelling (LM) | GitHub
• Built a Bigram Spelling Correction probabilistic LM trained on the Gutenberg Corpus and evaluated against Non-Vocabulary (Edit Distance < 2) and Vocabulary words to obtain accuracy of 75.388% and 15.02% respectively (Train : Test – 80:20).
• Built a Sequence Recovery Trigram LM to obtain the correct word sequence from a character sequence using the Brown Corpus (Accuracy: 74.97%, Train : Test – 80:20).
• Perplexity evaluation of LMs on the Webtext Corpus (Scores - Bi: 16.18, Tri: 2.15, Quad: 0.98)
• Latin and English LMs for language identification of a given sentence: Word Probability
Tech: NLTK, spaCy

MySQL Database | GitHub - MySQL | Live Project Demo
• Created a full-fledged database with an Entity-Relationship Model to deal with 1000+ queries on large high-speed databases.
Tech: MySQL Workbench, Shell script, MySQL

VCS – Family Tree | Family Tree Demonstration
• Designed a graphical family tree based on a user-based Git VCS commit history.
Tech: Git, Shell script, PyPlot, Git Graph

Advanced Data Structures – CLL, DLL, Stack-Permutation, RBT
• Constructed a host of functions for insertion, deletion, updating, etc. to cover all the functionality exhibited by these data structures.
Tech: C, Shell script, Linux, Bash

Q/A: Reading Passage/Comprehension (Closed-Domain) | GitHub | Dataset
• Performed POS Tagging and Normalization on Questions/Passage with Coreference Resolution to find the passage sentence with maximum question overlap.
• Extracted relevant Noun Phrases (NP) from matched sentences to retrieve the answer.
• Obtained Results: Precision: 0.37, Recall: 0.66

WEB/APP DEVELOPMENT

Website Dev | GitHub - Jekyll
• Built student information profiles in the institute cryptography club website.
Tech: Jekyll-Theme, HTML, CSS, React.js, Node.js
