
Class of 2018

STUDENT
RESUME
BOOK

msia.careers@northwestern.edu
CLASS OF 2018 PROFILE

44% WOMEN
41 CLASS SIZE
1.5 AVERAGE YEARS PRIOR WORK EXPERIENCE

PRIOR DEGREE CONCENTRATIONS
STEM 69.38%
    Mathematics & Statistics 36.82%
    Engineering 22.09%
    Science 5.81%
    Computer Science 4.65%
ECONOMICS 25.19%
BUSINESS & FINANCE 11.24%
OTHER 2.33%
SOCIAL SCIENCES 1.16%

INTERNSHIP PLACEMENTS
A.T. KEARNEY
ABC SUPPLY
AIRBNB
APPLE, INC
BALYASNY ASSET MANAGEMENT
BUZZFEED
CME GROUP
ENOVA
EXPEDIA
FORD MOTOR COMPANY
GODADDY
HCSC
KPMG
LAZARD
LINKEDIN
MICROSOFT
MOLEX, INC
NASA
NORDSTROM, INC
ON POINT TECHNOLOGY
OPEX ANALYTICS
PROCTER & GAMBLE CO
SCHNEIDER
SOCIAL FINANCE
TRANSUNION, LLC
TRELLO
UNIVERSITY OF CHICAGO
ZURICH AMERICAN INSURANCE CO

PATRICK CHANG JAMIE CHEN JERRY CHEN JOHNNY CHIU

LUCA COLOMBO GRACE CUI ANISHA DUBHASHI JILL FAN

MATT GALLAGHER MICHAEL GAO LAUREN GARDINER JOE GILBERT

SARAH GREENWOOD VARUN GUPTA VERONICA HSIEH WENZE HU

RISHABH JOSHI BROOKE KENNEDY ARVIND KOUL TUCKER LEWIS

WEI LI EMMA LI ZILI LI JUNXIONG LIU



YUQING LIU DANIEL LÜTOLF-CARROLL SPENCER MOON ERIC PAN

MICHAEL PAULEEN CHRIS ROZOLIS WILL SONG CHRISTA SPIETH

PENNY SUN PHYLLIS SUN SAURABH TRIPATHI VINCENT WANG

LOGAN WILSON HAO XIAO WENJING YANG TONG YIN

ETHEL ZHANG
PATRICK CHANG
Product Manager by Trade
Data Scientist in Training
510.710.7317 | patrickchang@gmail.com

EDUCATION
Northwestern University
Master of Science in Analytics
Sep 2017 – Dec 2018
Honors: Graduate Fellowship
GPA: 3.83
The Master of Science in Analytics is a cross-disciplinary master’s degree with an applied curriculum exploring data science, machine learning, and business informatics.
Developed profiling clusters and predictive pricing model for the Chicago Parks District Day Camp which accounts for 33% of the Parks District’s revenue.
Topics: Unsupervised Learning, Predictive Analytics, Text Analytics, Deep Learning, Big Data, Optimization, Healthcare Analytics, Data Mining, GIS for Public Health, Databases

Columbia University
Bachelor of Science
Aug 2007 – May 2011
Major: Industrial Engineering - Minor: Sociology - Dean’s List Honors
Activities: Engineering Class Council, Asian American Alliance Exec Board
Conducted research in both Political Science and Public Health. Credited in:
• “Costly Jobs: Trade-related Layoffs, Government Compensation, and Voting in U.S. Elections”, Yotam Margalit, American Political Science Review.
• “Systems biology of human benzene exposure”, Luoping Zhang, Chem Biol Interact.

WORK EXPERIENCE
Urban Labs (UChicago)
Chicago
Jun 2018 – Present
Data Research Intern – Poverty Lab
Developing scope and building a model predicting families at risk of first-time homelessness so that LA County can optimize distribution of homelessness prevention services. For another project, crafting an interactive dashboard displaying the results of a model predicting the likelihood that college freshmen drop out before graduation.
Nielsen
San Francisco & New York
Jul 2012 – Sep 2017
Senior Product Manager – Nielsen Marketing Cloud
Managed the growth and activation of Nielsen datasets for use in programmatic environments, including: Television Viewership and Credit Card Spend.
• Grew product revenue 200% YOY to $40MM in 2014 and continued growth 150% YOY through 2015; brokered partnerships with 30 new platforms.
• Oversaw team composed of two data scientists and three offshore analysts.
• Led the activation of Nielsen datasets on mobile advertising platforms; at launch, Mobile Precision Marketing was the first mobile ad targeting solution utilizing television viewership data.
Emerging Leaders Associate
Member of the Nielsen Emerging Leaders Program, Class of 2012. The Emerging Leaders Program is an 18-month management accelerator program, which provides knowledge and experience through exposure to a diversity of projects over four business units.

CS Technology
New York
Jun 2011 – May 2012
Associate Consultant
Oversaw multi-million dollar infrastructure technology projects including the JP Morgan Chase company wide data center refresh and Thomson Reuters global operating system upgrade. Created an internal resource forecasting and allocation tool.

SKILLS
R, Python, SQL, GIS, Spark, Java, Hadoop, Tableau, STATA, SAS, HTML5
Conversational Mandarin
Drums, Photography, Climbing, Cooking

ACTIVITIES
UChicago Civic Scopathon – Organizing Committee
Onboarding non-profit partners for solution building event.
Techsoup – Volunteer Consultant
Advised on projects including Techsoup’s free phone rollout.
Eagle Scout
Bay Area Rescue Mission – Volunteer
Coordinated volunteer and resource allocation of tech and food services for the shelter.
Jamie Chen
805.405.3924 | jamie.chen@u.northwestern.edu | linkedin.com/in/jamie-chen
github.com/jchen0529 | public.tableau.com/profile/jamie

EDUCATION
Northwestern University, McCormick School of Engineering Evanston, IL
Master of Science in Analytics, Merit-based Fellowship Expected Dec 2018
Courses: Predictive Analytics, Database & Information Retrieval, Data Mining, Deep Learning, Optimization

College of William and Mary Williamsburg, VA
Bachelor of Science in Mathematics, Marketing Double Major, Magna Cum Laude May 2014

PROFESSIONAL EXPERIENCE
Microsoft Corporation Redmond, WA
Data Scientist Intern June 2018 – Aug 2018
• Defined the objectives of improving bot traffic detection for Microsoft News to protect revenue and improve user
experience. Built complex data pipelines to query, join, and preprocess daily clickstream data (>1B records) for
analyzing bot traffic; performed exploratory data analysis on anomalous bot users’ behaviors
• Trained predictive models to classify bots; designed and tested bot detection rules that can be applied at the
hourly level, improving warm path analytics and saving daily data storage on over 3.7M page views

Ernst & Young, Quantitative Economics and Statistics (QUEST) Washington, D.C.
Senior Analyst July 2016 – July 2017
• Led a team of four to publish the 2016 US Investment Monitor Study that analyzed investment trends by industry
• Quantified companies’ economic impact as a result of planned investments. Final reports helped multiple
companies qualify for state tax incentives and aided their communication with stakeholders
• Led marketing initiatives with cross-functional teams to expand service offerings and won two projects
Analyst Aug 2014 – June 2016
• Designed stratified statistical samples in SAS to estimate companies’ tax deductions and provide litigation support
• Launched 25+ web surveys, analyzed survey responses and created dynamic Tableau visualization dashboards

Harman International Industries, AHA Radio Palo Alto, CA
Strategic Planning Intern June – Aug 2013
• Forecasted 3G mobile user growth in the target market and influenced management’s decision to enter market

TECHNICAL SKILLS & PUBLICATIONS
Skills: Python • PySpark • R • SQL • Java • Hadoop • Hive • C# • SAS • Tableau • SPSS • Microsoft Office (advanced)
Publications: Ernst & Young, Quantitative Economics and Statistics (QUEST)
• Viewpoints on paid family and medical leave Mar 2017
• 2016 US Investment Monitor Study – EY Aug 2016
• Impact of the Orphan Drug Tax Credit on treatments for rare diseases June 2015

PROJECT WORK & INTERESTS
Chicago Botanic Garden, IBM Analytics Jan 2018 – May 2018
• Identified data scope and joined multiple datasets together to extract insights on CBG members. Created new
segments and behavioral profiles for 48k members using unsupervised learning methods (K-Means, PCA, and
Gaussian mixture models), recommended strategies for increasing donation based on member segments
Shopify, Industry Practicum Project Oct 2017 – May 2018
• Analyzed clickstream data by developing classification and clustering models to understand malicious bot traffic
Compass Pro Bono Consulting, Compass DC Oct 2016 – May 2017
• Developed a 3-year strategic plan for a non-profit in Virginia to increase annual philanthropy event participation

Interests: Swim, kickboxing, marathons, travel, karaoke, tango, hiking, music and food exploration
Zheyuan (Jerry) Chen
zheyuanchen2018@u.northwestern.edu · (347) 604-0819

Education
Northwestern University – Evanston, IL December 2018 (Expected)
Master of Science in Analytics GPA: 3.98/4.0
Columbia University – New York, NY August 2011
Ph.D. in Chemical Physics
University of Science and Technology of China (USTC) – Hefei, China July 2006
B.S. in Chemical Physics, Admitted to Special Class for the Gifted Young

Professional Experience
NASA Jet Propulsion Laboratory (JPL) Pasadena, CA
Data Science Intern June 2018 – August 2018
• Built a recommender that helped engineers fill out categorical fields in failure reports, which was estimated to
save 40 hours per year
• Implemented classification models in Python for form fields with predictions of top-3 options and reduced
model prediction time by a factor of 60 using NumPy
• Trained random forest, naïve Bayes and support vector machine models with recall larger than 0.8 using
engineered features (text and recency/frequency)

Projects
Zurich Insurance Practicum Project September 2017 – June 2018
• Built an improved pricing model (neural network) for workers’ compensation insurance to help Zurich optimize
its premium pricing strategy
• Converted in-production model from SAS to Python (using H2O library) with a replication error of 0.02%
• Calculated model scores from more than 0.6 million policies using a generalized linear model (GLM)
• Assessed the model performance in terms of accuracy, execution and business impacts
Embiggening The Simpsons Dialogues April 2018 – June 2018
• Created deep learning models that wrote dialogues in the style of The Simpsons
• Achieved human-level performance in identifying the 4 main characters by applying a character-level
language model with long short-term memory (LSTM) networks
• Selected as the best poster in the deep learning class
ShopRunner Repurchasing Analysis January 2018 – March 2018
• Applied a discrete-time survival model (logistic regression) to analyze the repurchasing behavior of customers in
the ShopRunner network
• Identified factors that affected repurchasing: recency/frequency, holiday season, and network effect
• Found a positive network effect which showed that healthy competition boosted repurchasing
Catalog Mailing Marketing Analysis September 2017 – December 2017
• Predicted future purchases to maximize the total actual purchases (payoff) of top 1000 prospects
• Achieved 44% of the theoretical maximum payoff in a highly imbalanced dataset, 11 times higher than the
payoff of the baseline method
• Developed a two-step model by combining logistic and multiple regression models in R
• Performed substantial data cleaning and feature engineering

Technical Skills
Python, R, SQL, Java, TensorFlow, Hadoop, Spark, AWS, Tableau, D3

Research Experience
Columbia University New York, NY
Postdoctoral Research Associate September 2011 – May 2013
Graduate Research Assistant September 2006 – August 2011
• Deployed nanocrystals and thin carbon film in next-generation photovoltaics
• Initiated, designed and executed experiments that pinpointed a key issue limiting photovoltaic performance
• Published results in high-impact journals and presented findings at prestigious conferences
SHIH-CHUAN (JOHNNY) CHIU
(847) 262-7315 | johnnychiu@u.northwestern.edu | https://github.com/johnnychiuchiu
EDUCATION
NORTHWESTERN UNIVERSITY Evanston, IL
Master of Science in Analytics Dec 2018 (expected)
Coursework: Predictive Analytics, Data Mining, Data Visualization, Deep Learning, Analytics for Big Data,
Data Warehouse, Text Analytics, Social Network Analysis
NATIONAL TAIWAN UNIVERSITY Taipei, Taiwan
Bachelor of Mathematics/Bachelor of Economics Jun 2013
Relevant Courses: Computer Programming, Computational Mathematics, Statistics, Probability Theory,
Mathematical Software, Economic Forecasting
ULM UNIVERSITY Ulm, Germany
Exchange Program, Baden-Württemberg Scholarship Mar 2012-Jul 2012

SKILLS & CERTIFICATIONS
Patent: Marketing Intelligence Platform, Utility Model Patent in Taiwan, Dec 1, 2017-Jun 22, 2027
Certification Number M552625
Programming & Software: Python, R, Spark, Hive, Pig, SQL, Git, Bash, Java, Tableau, Microsoft Office
Languages: Native speaker of Mandarin & Taiwanese; fluent in English; intermediate German
Certifications: Google Analytics Certification, AdWords Certification, DoubleClick Bid Manager Certification
PROFESSIONAL EXPERIENCE
LinkedIn Mountain View, CA
Data Science, Analytics Intern Jun 2018-Sep 2018
• Provided product suggestions to improve app retention by analyzing early user actions that
determine users’ long-term engagement; presented findings to key stakeholders to influence the product
roadmap.
• Built automated insight-extraction tools to produce actionable insights from multi-dimensional data.

urAD (marketing & advertising agency) Taipei, Taiwan
Data Analytics Specialist Jul 2014-Jun 2017
• Enhanced data management platform’s functionality by generating user segments based on website
behavior; resulted in an additional $20k of quarterly revenue.
• Built Python API for the data analytics features of company's marketing intelligence platform; produced
insightful results including ad performance ranking, ad design and campaign optimization suggestions.
• Wrote data analysis report helping marketers make business decisions & increase sales using website
clickstream data; generated 10 new partnerships with e-commerce clients in 2 months.
• Managed in-house mobile performance tracking platform; completed server-to-server integration with 30
companies; resulted in 120k user downloads for 40 apps in 6 months.
• Led company ERP platform’s full development cycle from ideas, system flow, to refining user experience
and leading cross-departmental communication; cut internal workflow processing time by 50%.

PROJECT WORK
BP NORTH AMERICA, Northwestern University Oct 2017-Jun 2018
Generated customer segmentation and built a consumer lifetime value model to drive incremental revenue.
ShopRunner, Northwestern University Feb 2018-Mar 2018
Implemented algorithms to generate personalized recommendations for ShopRunner’s members.
Luca Colombo
(224) 334-5010 • LucaColombo2018@u.northwestern.edu • www.linkedin.com/in/lucacolombo1

Education
Northwestern University. Evanston IL Anticipated Dec 2018
Master of Science in Analytics. GPA: 4.0/4.0
• Coursework: Predictive Analytics, Data Mining, Big Data, Java & Python Programming, Databases &
Data Warehouses, Data Visualization, Machine Learning Model Deployment, Deep Learning
• Future coursework: Optimization & Heuristics, Reinforcement Learning, Text Analytics
• ABC Supply Hackathon, 3rd Place. Enova Data Smackdown, 3rd Place
Università Bocconi (Italy) and Université catholique de Louvain (Belgium) Apr 2016
Joint Master of Science in Economics. Graduated magna cum laude
Università Bocconi. Italy Oct 2013
Bachelor’s Degree in Economics. Graduated cum laude
University of Chicago. Chicago IL. Exchange quarter Sept 2012 – Dec 2012

Skills & Certifications
Computer skills: Python, R, SQL, Spark, MapReduce, Java, JavaScript (D3), Git, Tableau, SAS, AWS, Excel
Languages: English, Italian, French, Spanish

Work Experience
Blue Cross and Blue Shield of Illinois. Chicago IL Jun 2018 – Sep 2018
Data Science Sr. Intern
• Deployed to production a Python module that automates the update of a database table (currently run
on a weekly basis), saving man-hours, improving scalability and mitigating risk of errors
• Increased write performance by 10 times and parallelized execution of hundreds of SQL queries
• Enabled non-technical colleagues to run the tool by focusing on traceability,
reproducibility and user friendliness (using logging, Sphinx, batch files and YAML files)
Citadel. Chicago IL Mar 2017 – Aug 2017
Electronic Communications Surveillance Analyst
• Conducted case-driven surveillance of e-communications to detect inappropriate and illicit behavior
• Evaluated proofs of concept for a new holistic employee surveillance machine learning software
MAPP Economics, an economic consulting firm
Analyst. Brussels, Belgium Jan 2016 – Jul 2016
• Worked closely with managing director in the context of antitrust investigations and merger control
involving the European Commission and National Competition Authorities
• Used statistical learning and data visualization to study competitive behavior of firms
Intern. Paris, France Jul 2015 – Dec 2015
• Performed data cleansing, exploratory data analysis and data visualization
• Supported analysts and economists in the preparation of reports and client presentations
Banco Desio, a publicly traded commercial bank. Desio, Italy Jun 2014 – Sept 2014
Summer intern, Office of Strategic Planning
• Analyzed financial statements of competing banks to benchmark quarterly earnings
Università Bocconi. Milan, Italy Nov 2011 – Jul 2014
Teaching Assistant for the courses “IT skills for economics” and “Introduction to STATA”

Projects
Industry Practicum: Zurich North America Nov 2017 – Jun 2018
• Monitored business impact of in-production pricing model for workers’ compensation insurance
• Developed machine learning model to help Zurich optimize its pricing strategy
NBA History Visualization. Github repo: https://bit.ly/2sTMwuZ May 2018 – Jun 2018
• Created an interactive dashboard using D3 to explore the evolution of the NBA’s style of play
Employee Turnover Analysis. Github repo: https://bit.ly/2KqOLw3 Feb 2018 – Mar 2018
• Trained a Random Forest classifier to predict probability of quitting with 95% recall
• Built a web-app to interact with the model and deployed it to Amazon Web Services

Interests
Swimming, cycling and running. Member of the Northwestern Triathlon Club and Bocconi Running Club
Yue (Grace) Cui
(626)203-1729 | yuecui2018@u.northwestern.edu
EDUCATION
Northwestern University, Evanston Dec 2018 (Expected)
M.S. in Analytics, McCormick School of Engineering GPA: 3.85/4.00
Coursework: Predictive Analysis, Data Mining, Big Data Analytics, Deep Learning, Data Visualization, Data Management, Databases
& Information Retrieval, Text Analytics, Optimization & Heuristics, A/B Testing, Time Series, Statistical Consulting
University of California, Los Angeles Jun 2017
B.S. Statistics; B.A. Economics Cumulative GPA: 3.83/4.00, Statistics Major GPA: 3.90/4.00
Honors and Awards: Magna Cum Laude; Winner - Best Use of External Data, Marketing Optimization, ASA DataFest;
1st Place, Target Corp. Case Competition; 2nd Place, PWC Challenge Case Competition

TECHNICAL SKILLS
R, Python, SQL, Spark, Hadoop, Impala, Hive, MapReduce, Tableau, AWS, Java, D3.js, HTML/CSS, Git, Unix, SAS, TensorFlow

EXPERIENCE
Procter & Gamble, Cincinnati Jun 2018 – Aug 2018
Data Science Intern, Global Skin & Personal Care
• Initiated and led a project that focused on using NLP to recognize specific vocabulary involving P&G product ingredients from
7M+ call center conversations (Python).
• Identified and visualized trends in consumer concerns about ingredients with text analytics methods such as TF-IDF scoring and
Word2Vec (pandas, numpy, nltk, scikit-learn, textblob, seaborn, matplotlib) and developed algorithms to detect anomalies.
• Created data pipelines and performed sales lift analysis using Big Data tools (Impala, Hive, Spark) on 14B+ transaction records.
• Improved due diligence process for $300,000/year external data purchase by converting data cleansing process and exploratory
analysis from Excel to R and Tableau (dplyr, tidyr, plotly), reducing evaluation timeline from 3 months to 3 weeks.
British Petroleum, Chicago Oct 2017 – Jun 2018
Data Science Student Consultant
• Identified key factors affecting customers’ gas purchasing behavior with customer segmentation analysis (R, Python).
• Built a Consumer Lifetime Value model and developed an interactive Tableau dashboard tool to help BP determine optimal
spending in marketing activities.
• Translated technical algorithms and methodologies into client deliverables and effectively communicated the results to clients.
Junction of Statistics and Biology Lab, UCLA Jul 2016 – Feb 2017
Research Assistant
• Led two other students in building an R package to help users compare tissue or cell types based on chromatin states.
• Wrote functions that allow users to convert bigwig files into csv files and turn the data into formats suitable for statistical analysis.
Credit Reference Center, The People’s Bank of China, Beijing, China Jun 2015 - Aug 2015
Intern, Research and Development Department
• Designed marketing plans for the company’s social media platform with knowledge of the credit reporting system in order to
attract more followers. Followers of the WeChat platform rose from 20,000 to 90,000.

PROJECTS
Gaming Analytics Project, Northwestern University Dec 2017 – Jun 2018
• Designed a novel team-based Player-versus-Player recommender system framework for players to boost performance using 16M+
matches’ data provided by Destiny 2, a modern massively multi-player online game by Bungie.
• Constructed players’ profiles by applying various clustering methods such as K-means, GMM, and Archetypal analysis (Python).
• Selected teams and players with similar playstyles but higher performance or faster improvement for recommendation using KNN.
• Submitted paper to the Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE 2018).
Grocery Sales Forecasting Project, Northwestern University Jan 2018 – Mar 2018
• Applied supervised learning methods such as Logistic Regression, GAM, Decision Tree, Random Forests, GBMs, and Neural
Network to forecast products’ unit sales for a large Ecuador-based grocery retailer.
• Developed a front-end interactive web app using Python, HTML, JavaScript and Flask and deployed it on AWS EC2.
Predictive Analytics Class Project, Northwestern University Oct 2017 – Dec 2017
• Built logistic regression, lasso regression, and multiple linear regression models to identify the top 1000 customers with the highest
expected dollar purchase from catalog mailing for a retail company.

VOLUNTEER AND INTERESTS
• Volunteer: Recorded electronic books in both English and Mandarin for visually-impaired people from China.
• Interests: Tennis, ping-pong, badminton, photography, cooking, baking
ANISHA DUBHASHI
949-981-2132 | anishadubhashi2018@u.northwestern.edu
EDUCATION
Northwestern University 9/2017 - 12/2018 (Expected)
Master of Science, Analytics GPA: 3.98
Coursework: Predictive Analytics (Supervised Learning), Data Mining (Unsupervised Learning), Deep Learning, Big Data
Analytics, Data Visualization, Text Analytics, Optimization and Heuristics
Honors: Graduate Fellowship

University of California, Los Angeles 9/2009 - 6/2013
Bachelor of Science, Mathematics-Economics

TECHNICAL SKILLS
R, Python, SQL, Java, Tableau, Git, HTML/CSS, JavaScript (D3.js), Hadoop, Spark, Hive, AWS (Redshift, S3, EC2), VBA

WORK EXPERIENCE
Nordstrom, Seattle, WA 6/2018 - 8/2018
Data Science Intern, Marketing Analytics
• Implemented clustering algorithm using R to segment millions of customers in order to gain insight into behaviors,
measure marketing effectiveness, and better allocate future marketing spend. Presented results to leadership.
• Created a model that classifies new customers into a cluster for incorporation into ETL of productionized model.

Synchrony Financial, Chicago, IL 9/2017 - 6/2018
Graduate Student Consultant
• Collaborated in a team to create a chatbot prototype in Python that handles various customer service tasks
utilizing natural language processing and a Naïve Bayes machine learning model on customer service e-chat data.

Conversant Media/Epsilon, Chicago, IL
Senior Analyst, Technical Account Analytics 3/2016 - 8/2017
Analyst, Technical Account Analytics 12/2014 - 3/2016
• Queried a SQL database with nine billion daily bids and over eight billion client retail transactions to optimize
success of campaigns; presented actionable insights in Tableau to clients to win five new business targets.
• Designed and launched A/B tests to determine optimal promotional and marketing strategies for clients.
• Created a monitoring system for test and control messaging using Python to detect operational issues.
• Led new hire trainings and biweekly knowledge sessions for the Analytics department up to the VP level on
company database specifics, table structures, and mapping across profiles.

UnitedHealth Group, Irvine, CA
Analyst, Analytics 7/2014 - 11/2014
Analyst, Pricing 7/2013 - 7/2014
• Built, enhanced, and troubleshot complex models used by the pricing team to improve accuracy and efficiency.
• Conceptualized new, competitive pricing rates while maintaining profit levels by analyzing competitive intelligence
and historical forecast data. Presented a study of standard rates to demonstrate technique effectiveness.

PROJECTS AND AWARDS
1st place, Enova Data Smackdown, Evanston, IL 10/2017
• Trained a boosted tree model in R to predict the future value of properties for Enova’s data science competition.

Gaming Analytics Research, Evanston, IL 12/2017 - 6/2018
• Developed an archetypal analysis machine learning model in R to detect anomalies and exotic playstyles in the
eSports game League of Legends based on Riot’s API data of previously recorded matches.

Cuisine Prediction Web App Development, Evanston, IL 1/2018 - 3/2018
• Built a text classification web app for cuisine prediction with Python, Flask, and HTML deployed on AWS EB.
QINGJIN (JILL) FAN
qingjinfan2018@u.northwestern.edu § (217) 898-0681 § www.linkedin.com/in/jill-fan

EDUCATION
NORTHWESTERN UNIVERSITY EVANSTON, IL
Master of Science in Analytics EXPECTED: DEC 2018
§ GPA: 3.83/4.00
§ Relevant Coursework: Predictive Analytics, Big Data Analytics, Data Mining, Machine Learning, Deep Learning, Optimization,
Databases, Data Warehousing, A/B Testing, Recommender System, Text Analytics, Data Visualization

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN URBANA-CHAMPAIGN, IL
Bachelor of Science in Civil Engineering; Minor in Computer Science AUG 2013 – MAY 2017
§ GPA: 3.85/4.00, High Honors, Engineering Dean’s List

SKILLS
§ Programming: Python, R, SQL, Hadoop, Spark, MapReduce, Hive, Java, JavaScript (d3.js), C/C++, Bash
§ Tools: Tableau, Git, Amazon Web Services, Azure, Matlab, MAPR, Mathematica, Minitab

PROFESSIONAL EXPERIENCE
SCHNEIDER NATIONAL GREEN BAY, WI
Advanced Analytics Intern JUN 2018 – AUG 2018
Associate Engagement Survey Analytics
§ Applied an RNN model for sentiment analysis of text data, leading to a 20% increase in test accuracy with data augmentation.
§ Extracted linguistic features using NLP techniques, visualized word embeddings using nltk, gensim and HoloViews.
Work Experience Text Analytics
§ Built an ensemble of various SVM models to classify candidates based on their previous work experience with 80% accuracy.
§ Derived confidence scores from SVM models as factors for employee turnover prediction using scikit-learn.
Driver Message Analytics
§ Constructed an ETL process to identify freeform notification messages from drivers with Spark, Python and Hive.
§ Leveraged big data infrastructure to store driver messages and enhance analytics capabilities in the system.

BOSCH WUXI, CHINA
Human Capital Analytics Intern JUN 2017 – AUG 2017
§ Migrated KPI calculation tools from Excel to Python (pandas, NumPy) and built a user-friendly interface that reduced KPI
reporting process from 3 hours to 20 minutes.
§ Identified correlation among company’s medical clinic visiting rate, workforce illness rate and absence rate, interpreted the results
and presented deliverables to senior manager to improve productivity and reduce absenteeism.
§ Acquired, cleaned, and structured massive data of 300k+ rows from multiple sources and built a master database in Excel for
future workforce analysis used by the HR department.

PRICEWATERHOUSECOOPERS SHANGHAI, CHINA
Technology Consulting Intern JUN 2016 – AUG 2016
§ Designed a weight-adjusted grading scheme to evaluate system suppliers for a core banking platform transformation project.
§ Re-engineered business processes to improve processing efficiency by avoiding human errors, and achieved a
smooth transition between two distinct electronic back-end processing systems.

PROJECT EXPERIENCE
ANALYTICS HACKATHON 1ST PLACE - NORTHWESTERN UNIVERSITY & ABC SUPPLY MAY 2018
§ Implemented K-Means clustering to segment fleet vehicles and to determine potential replaceable commercial vehicles.
§ Identified significant factors for vehicle replacement in Random Forest with 93% accuracy.
§ Developed a Holt-Winters time series model to predict sales loss due to vehicle replacement.

BRAND RECOMMENDATION ENGINE - SHOPRUNNER JAN 2018 – MAR 2018
§ Designed a hybrid recommender system to personalize a top list of brands for customers based on their purchase behavior.

CUSTOMER PURCHASE PREDICTION - NORTHWESTERN UNIVERSITY NOV 2017 – DEC 2017
§ Classified and identified highly responsive customers resulting from catalog mailing and built predictive models for dollar purchase
amount using logistic regression, multiple linear regression, and lasso regression in R.
§ Selected the best model based on statistical significance, goodness of fit, parsimony, interpretability, and financial payoff.
Matthew Todd Gallagher
matthewgallagher2018@u.northwestern.edu | (302)-241-6027

Technical Skills / Recent Projects:
Languages: Python, Java, R, SQL, JavaScript, D3.js
Libraries: scikit-learn, pandas, NumPy, Matplotlib, Flask, PySpark
Software: AWS Cloud Stack (Redshift, Lambda, Athena, Kinesis), SSIS, Tableau, Hadoop
Industry Practicum: BP North America | Northwestern University
• Performed customer segmentation analysis from consumer purchase data using R and Python. Created clusters
using Gaussian mixture models and designed a recency-frequency customer lifetime value model to identify
segments for targeted marketing. Ultimately designed a Tableau dashboard and presented findings to BP North
America marketing stakeholders.
Predicting Food Inspection Outcomes | Northwestern University
• Built and trained a logistic regression model to predict restaurant food inspection failures. Utilized Python, Flask, and
AWS to stand up a web app allowing real time API pulls, analysis, and retraining.
Education:
Northwestern University | McCormick School of Engineering September 2017–December 2018
Master of Science in Analytics
Relevant Coursework: Optimization and Heuristics, Predictive Analytics, Data Mining,
Text Analytics, Analytics for Big Data, Deep Learning
Lewis and Clark College September 2010–May 2014
Majors: Economics and Mathematics
Member of Pi Mu Epsilon (Mathematics Honor Society)
Honors Economic Thesis: Determining The Effect Of Increased Access To Higher Education On The Skill Premium
Varsity Track and Field
Professional Experience:
Expedia Group | Bellevue, WA June 2018–September 2018
BI Engineer Intern
• Deployed a reporting and visualization pipeline designed to monitor and analyze the booking transaction data flow. The
pipeline leveraged the AWS cloud stack, specifically: Kinesis, Lambda, Athena, and Redshift
• Created a metric to capture both overall data flow health and the health of individual microservices
• Built an interactive Tableau dashboard from the reporting pipeline, which acts as the source of truth for overall
microservice ecosystem health.
• Presented visualization and reporting infrastructure to division and organization leadership
GetInsured (acquired Array Health) | Seattle, WA December 2016–September 2017
Data Analyst
• Improved billing integration scalability through designing ETL scripts, automating data processing in SQL and
PowerShell, and writing detailed documentation
• Took ownership of insurer billing integrations for our largest client, involving monthly flow of $2-3 million in premiums
• Managed client expectations, vendor responsibilities, and coordinated the multi-team effort
• Led weekly billing integration meetings with clients, vendors, and GetInsured department leadership
• Created custom decision support features for sales department
Array Health | Seattle, WA February 2016–November 2016
Data Operations Support Analyst
• Researched and resolved flagged consumer-facing data discrepancies, requiring clear communication with our Customer
Support, Product, and DevOps departments
• Automated outdated Excel manual procedures using SQL, reducing time required for data QA and cleansing
• Developed ETL solutions to standardize and process external enrollment data, allowing easy merging with our schema
• Performed QA testing to validate production data processing tools
MICHAEL (YIFEI) GAO
(610) 517-3696 · yifeigao2019@u.northwestern.edu
EDUCATION
NORTHWESTERN UNIVERSITY Evanston, IL
Master of Science in Analytics Expected Dec. 2018
• GPA 3.9
LEHIGH UNIVERSITY Bethlehem, PA
Bachelor of Science in Accounting with CPA Eligibility (Cum Laude) May 2016
Integrated Degree of Computer Science and Business (Magna Cum Laude) May 2016
• GPA 3.6 – Top 10%
TECHNICAL SKILLS
SKILLS: Python, R, SQL, Spark, Hadoop, Tableau, AWS, Java (MapReduce), JavaScript (d3), Flask, TensorFlow, Hive
COURSES: Predictive Analytics, Data Mining, Machine Learning, Data Visualization, Big Data, Deep Learning,
Optimization, Text Analytics, A/B Testing, Data Warehousing, Bloomberg Professional Certificates
WORK EXPERIENCE
A.T. KEARNEY Chicago, IL
Summer Business Analyst – Analytics Practice Jun. 2018 – Aug. 2018
• Designed, modeled, and piloted carrier performance program for Fortune 10 automotive manufacturer that is
expected to create $6M annual saving and significantly improve operational efficiency by reducing dwell time
• Created an interactive geospatial dashboard visualizing the vehicle distribution network for use in carrier
contract negotiation, and presented it to 6 partners and client executive directors
• Performed baseline analysis on ocean shipping capacity to validate $5M saving potential after web-scraping
JEFFERIES New York, NY
Risk Analyst May 2016 – Aug. 2017
• Elevated operational efficiency by 30% through automating new account onboarding process
• Streamlined client relationship management by creating KYC-based macros that helped reduce over $1M charges
• Mitigated risk through fail analysis, position coverage, and client negotiation on fixed-income and equity products
DELOITTE New York, NY
Risk Consulting Intern Jun. 2015 – Aug. 2015
• Conducted 4 field interviews with plant management to identify potential risks in internal controls
• Wrote an operational risk report for a leading insurance firm to streamline work processes for the audit team
• Tested internal controls quantitatively on 60 samples with supporting materials to identify control weaknesses
BANK OF CHINA (Global Fortune 100) Shenzhen, China
Software Engineer Intern – PMO Resource Management System Jun. 2014 – Aug. 2014
• Led 7 interns in the design of the system, which won 3rd place in the 2014 Creativity Contest (out of 60 teams)
• Developed and presented prototype to the CTO which was adopted, serving 1200+ employees at 3 offices
DATA SCIENCE PROJECTS
CHICAGO BOTANIC GARDEN (NGO Project with IBM Analytics) Feb. 2018 – Jun. 2018
• Identified $50K annual revenue growth opportunity in upselling from segmenting 1M members with GMM
• Visualized donation trend and segment profiles with Tableau to lead digital marketing and event redesign
PRINCIPAL FINANCIAL GROUP Sep. 2017 – Jun. 2018
• Built regime-switching machine learning models to forecast returns to power factor-based investment strategy
SHOPRUNNER Jan. 2018 – Mar. 2018
• Developed NLP-based collaborative filtering brand recommendation engine to engage users and retailer network
• Designed and implemented front-end web infrastructure with Flask hosted on AWS and RDS
LEADERSHIP & INTERESTS
LEADERSHIP: Co-founder of Phi Delta Theta fraternity, Student Leadership Council Board Member, Peer Mentor
INTERESTS: Visual Design, Travel (20+ countries), Physical Fitness (ACE certificate in progress), Films, NGOs
LAUREN GARDINER
MS IN ANALYTICS CANDIDATE

CONTACT
(859) 492-7381
LaurenGardiner2018@u.northwestern.edu
linkedin.com/in/legardiner
github.com/legardiner

EDUCATION
MASTER OF SCIENCE IN ANALYTICS Expected Dec 2018
Northwestern University / Evanston, IL GPA 3.95/4.00
Expected Coursework: Predictive Analytics I & II, Deep Learning, Big Data, Text Analytics, Data Visualization, Data Mining,
Reinforcement Learning

BACHELOR OF SCIENCE IN DATA SCIENCE May 2017
Lipscomb University / Nashville, TN GPA 3.97/4.00
Minors: Pure Mathematics, Internet and Social Media Marketing
Relevant Coursework: Information Structures, Research Methods, Linear Algebra, Statistical Analysis & Decision Modeling,
Data Mining & Analysis

SKILLS
Python, Spark, Scala, SQL, R, Java

CERTIFICATIONS
R Programming
Getting and Cleaning Data
Exploratory Data Analysis

WORK EXPERIENCE
SIRI SOFTWARE ENGINEER INTERN Jun-Sep 2018
• Identified upstream factors affecting the joinability of Siri usage log data streams by building an explanatory model
in Spark MLlib
• Engineered a strongly typed derived dataset of Siri usage data in Spark and Scala to ensure data quality for financial
reporting

DATA SCIENTIST INTERN May-Aug 2017
• Provided a data science perspective to the Applied Research team’s on-going deep learning automatic speech
recognition project
• Improved the language model by manipulating 100M lines of text training data to mimic conversational speech and
creating reproducible script
• Created tooling for Expert Services to assess client data and model transferability using pre-trained word embeddings
and Tensorboard

DATA SCIENCE INTERN Jun-Aug 2016
• Built campus classifier with Flask, Sci-kit Learn, Fiona, & Shapely Python packages to predict a user’s university and
on/off campus status
• Performed a quantitative analysis in Excel and text analysis in Python for declined B2B contracts that prompted
reporting process changes
• Created a nightly script in Databricks using SQL, Python, and Spark to enable the measurement of ROI on order inserts
for brand partners

IMPLEMENTATIONS INTERN Jan-May 2016, Sep 2016-Apr 2017
• Designed, implemented, and documented Juicebox data visualizations

PROJECT WORK
INDUSTRY PRACTICUM: SHOPIFY Oct 2017-May 2018
• Develop algorithm to classify users as bots or real shoppers for e-commerce platform based on click-stream data

HONORS
Nashville Technology Council Student of the Year 2017
College of Computing and Technology Outstanding Senior 2017
Joseph Gilbert
josephgilbert2018@u.northwestern.edu | 512-964-5229 | github.com/jl-gilbert

Education
Northwestern University, McCormick School of Engineering - Evanston, IL Expected December 2018
Master of Science in Analytics
Cumulative GPA: 3.9/4.0
Current Coursework in: Text Analytics, Optimization & Heuristics

University of Minnesota, College of Liberal Arts – Minneapolis, MN May 2015
Bachelor of Science in Economics, Magna Cum Laude with Distinction
Minors in Mathematics and Management
Cumulative GPA: 3.8/4.0
Dean’s List, Gold Scholar Award, National Scholarship, Presidential Scholarship

Skills
R, Python, Java, Git, Bash, Spark, Hive, SQL, Tensorflow, HTML, D3.js, Tableau, AWS

Relevant Experiences
Ford Motor Company - Dearborn, MI June-September 2018
Global Data Insight & Analytics Intern
• Initiated efforts to enhance forecasting models built with Excel by using Python and PySpark to incorporate bigger and more
complex data sources
• Forged new relationships between various business units to build a more comprehensive understanding of the autonomous vehicle
industry outlook in the next decade, leading to more informed long-term strategy decisions
• Mined consumer data from mTAB to gain insight into risk and opportunity in Ford’s highest impact segment

Boom Lab - Minneapolis, MN June 2015-July 2017
Business Analysis Senior Associate
• Promoted from Associate to Senior Associate in March 2017 based on performance feedback from Fortune 50 online retail client
• Brainstormed with business and engineering leaders to determine most important features to add to or improve on ecommerce
search application, resulting in prioritization of future work and conceptual development of projects
• Contributed to sprint planning and standup facilitation on an Agile team; filled in as scrum master at times
• Provided detailed analysis and feedback to engineers to increase effectiveness and functionality of programs in development
• Managed a software engineering intern during Summer 2016 to develop a new spellchecker application from the ground up,
resulting in a 30% increase in spelling correction precision
• Applied econometric techniques to data in RStudio to calibrate algorithms used by guest-facing applications

Contata Solutions, Ltd. - Minneapolis, MN June-August 2014
Data Analytics Intern
• Conducted initial meetings with consulting clients to determine needs and evaluate datasets
• Built slideshow presentations for communicating results of text analytics projects to clients
• Verified accuracy of data scraped from the web to ensure effectiveness of analysis

Projects
We Energies – Industry Practicum October-June 2018
• Developed models in R tying overtime to productivity and employee attrition in customer care centers
• Visualized patterns in employee behavior with Python and Tableau
• Recommended HR strategy based on observed trends, designed to improve employee productivity, attendance, and satisfaction
• Presented project and findings at annual Northwestern University Analytics Exchange conference

NBA Stat Predictions Web App January-March 2018
• Envisioned, designed, and developed self-updating Flask web app to make predictions for NBA stat lines with automated machine
learning algorithms
• Hosted app and database on AWS EC2 and RDS platforms

Sarah Greenwood 
248.953.5227 | SARAHGREENWOOD2018@U.NORTHWESTERN.EDU

EDUCATION 
Northwestern University, Evanston, IL — Master of Science in Analytics DECEMBER 2018 (EXPECTED)
GPA: 3.95/4.0
Master of Science in Analytics Fellowship Recipient
Coursework: Databases, Data Mining, Predictive Analytics, Data Visualization, Big Data, Deep Learning, Text
Analytics, Machine Learning Model Deployment
Venture for America — Fellow JUNE 2015 - JUNE 2017
Received training from a variety of firms, such as McKinsey, IDEO, Flatiron School, Kaplan, and Goldman Sachs
University of Michigan, Ann Arbor, MI — Bachelor of Science, Data Mining and Information Analysis MAY 2015
Minor in Spanish Language and Culture
GPA: 3.74/4.0 with University Honors
Coursework: Statistical Principles for Problem Solving, Statistical Computing, Quantitative Research Methods,
Effective Communication in Statistics, Data Mining
SKILLS 
Programming: R, Python, Java, SQL, C++, Spark, MapReduce, Hive, JavaScript (d3)
Software: Git, AWS, Google Analytics, Excel, Tableau, Salesforce, Pardot, Adobe Marketing Cloud, Brandwatch
EXPERIENCE  
Ford Motor Company, Dearborn, MI — Smart Mobility, Machine Learning Intern JUNE 2018 - SEPTEMBER 2018
• Investigated root causes of spark plug fouling for 5,000 vehicles by applying survival analysis and
temporal data mining on millions of rows of embedded modem and diagnostic trouble code data
• Conducted cluster analysis using modem data and connected vehicle data to identify driver behavior
segments using KMeans and Gaussian Mixture Models
• Created a data pipeline using Hive, Spark, and Hadoop to combine modem data, repair data, and
dealership data for millions of vehicles
• Optimized data processing pipeline for weekly dataset acquisition using PySpark, which resulted in a 40%
reduction in runtime and a 60% reduction in required processing power
Rendia, Baltimore, MD — Business Analyst and Strategist AUGUST 2015 - AUGUST 2017
• Built and maintained logistic regression and random forest models which predicted customers’ likelihood
to renew with 85-90% accuracy based on customer usage patterns and characteristics
• Collaborated across departments to harness statistical models to proactively target at-risk accounts and
forecast cash flow
• Performed ad-hoc analysis to determine best practices for improving customer success and re-engaging
expired customers
• Developed data tracking methods to improve the cleanliness and quality of data, resulting in clearer,
faster, and more accurate reporting
• Identified prospective customers and planned and iterated on targeted outreach through market research
and search engine marketing
MRM//McCann, Birmingham, MI — Marketing Analytics Intern JUNE 2014 - AUGUST 2014
• Queried, analyzed, summarized, and presented insights from large social-media data sets to better
understand the public opinion of General Motors’ Credit Card competitors
• Analyzed variables associated with GM Card’s credit card programs to determine the circumstances under
which each program is most beneficial, and reported these findings internally
PROJECT WORK
Instacart Recommendation Engine, Northwestern University JANUARY 2018 - MARCH 2018
• Developed a proof of concept collaborative filtering recommendation engine using Scikit-Learn Surprise
• Built and maintained front-end web application infrastructure as well as RDS and AWS components

 
VARUN GUPTA
+1-312-964-0262 | varungupta2018@u.northwestern.edu
EDUCATION
Northwestern University, Evanston IL Dec 2018 (Expected)
Master of Science in Analytics (MSiA) GPA 3.90/4.00
Relevant Coursework: Predictive Analytics 1&2, Data Mining, Deep Learning, Data Visualization, Java and Python
Programming, Analytics for Big Data, Databases and Information Retrieval, Reinforcement Learning, Text Analytics
Nanyang Technological University, Singapore Jun 2015
B.Eng. (Mechanical Engineering) with Business Minor. GPA 4.41/5.00
SKILLS AND LANGUAGES
• Programming Languages: Python, R, Java, SQL, HTML/CSS, Javascript (D3.js)
• Software: Spark, MapReduce, Hive, Hadoop, Tensorflow, NetworkX, AWS (E3, EMR, RDS), Unix, Tableau, Git, Office

PROJECTS
Analytics Consultant, TransUnion (Chicago, IL) – Practicum Project Sep 2017 – Jun 2018
• Created graph structures composed of shared personal identity attributes between customers to help predict the occurrence of
Synthetic Identity Fraud within a well-studied sample of 500,000 customers.
• Engineered graph features manually using parallel-processing to reduce the time consumed, and then fed these to an
XGBoost Tree model to appraise the value of using this graph data and to identify suspicious graph features.
• Developed a graph-kernel based Convolutional Neural Network model to bypass manual user-defined feature creation.
• Achieved test-set Recall above 70% and Precision above 90% with both models, far outperforming initial expectations. Used
these graphs to capture numerous suspicious under-performing customers that were undetected by previous models.
Gaming Analytics, Digital Creativity Labs – PvP Recommender System Framework for Teams Jan 2018 – Jun 2018
• Developed a novel, flexible framework for providing different types of recommendations to improve the performance of
teams in a Player versus Player (PvP) environment. Evaluated it by building a working recommender system for Destiny 2.
• Created multiple individual and team profiles by using Gaussian Mixture Models to find clusters of equipment-choice and
playstyle features. Shared the strategies employed by better or faster-learning teams among pools of nearest-neighbors.
Deep Learning Project – Object Detection for Autonomous Vehicles Apr 2018 – Jun 2018
• Adapted the Mask-RCNN model code to use it for an object detection and segmentation dataset provided by Baidu, with the
goal of quickly and accurately detecting object boundaries and correctly classifying them.
• Achieved a pixelwise accuracy (IOU) of over 50% on average, and over 90% on images with many close-range objects.
Big Data Analytics – Venmo Transaction Analysis Apr 2018 – Jun 2018
• Classified over 7 million Venmo transactions stored on Hadoop using emojis in their descriptions, and analyzed the
popularity of each transaction category over various time-frames using PySpark.
• Conducted network analysis to study in/out degree growth and the proportion of reciprocal relationships over time.
Full Stack Web App Development Jan 2018 – Mar 2018
• Built and deployed a Chicago crime-forecasting web application using Flask and HTML on an AWS EC2 instance.
• Automated regular weather-forecast API pulls to store parsed data in an RDS instance, which then fed these as inputs along
with historical crime data and a user-selected zipcode to a simple Python-based Random Forest model for predictions.
PROFESSIONAL EXPERIENCE
Analytics Intern, Enova International (Chicago) Jun 2018 – Aug 2018
• Integrated unused customer data from other Enova credit products to find groups of customers that strongly overperform
relative to the existing model’s expectations for longer-term, large principal UK loans using classification tree-based models.
• Investigated the process by which customers are purchased from aggregators by exploring the feature-space to find customer
groups that are either rarely or very frequently purchased by competitors, or far more likely to have an issued loan with one
Enova product than another, to maximize profit or volume through smarter acquisitions.
• Designed tests to sample applicants and measure the impact of suggested changes for both analyses.
Advanced Analyst, Ernst & Young LLP (Bangalore) Nov 2015 – Jun 2017
• Performed valuation as well as difference resolution procedures using analytical and simulation-based pricing models for
various vanilla and complex FX, interest rate, equity, and commodity derivative products as a member of the Derivatives
Valuation Centre (DVC) in EY’s Risk Advisory division.
• Led a team of Analysts performing valuation procedures, while handling all client communications.
• Developed a comprehensive tool, using VBA for Access and Excel, that automates the standard DVC process for a variety of
vanilla products by interfacing with the user via Access forms and Excel spreadsheets, and serves as an accurate one-stop
market data repository. Won the tri-annual “Extra-Miler Award” for potentially saving thousands of man-hours.
• Worked onsite in market risk model development and documentation, specializing in FX risk, for a global investment bank.
EXTRA-CURRICULAR ACTIVITIES
Sports Editor, The Tribune by NTU Students Union Aug 2013 – Aug 2014
• Selected articles, directed journalists, edited their writing, authored complete articles, and finalized the newspaper’s
appearance for each issue. Systemic content-choice changes resulted in improved readership of the Sports section.
Veronica Hsieh
DATA SCIENCE
Passionate about combining data with natural curiosity to
understand people & solve business questions creatively;
graduate student with prior consulting experience in
implementing technology solutions

CONTACT
(408) 507-0639
veronica.hsieh@gmail.com
linkedin.com/in/veronica-hsieh

EDUCATION
MS in Analytics, McCormick School of Engineering
Northwestern University | Expected December 2018
GPA: 3.8
BS Economics Mathematics, Business Administration
University of Southern California | Aug 2010 – Sept 2014

LANGUAGES
• Python
• R
• Spark
• Java (MapReduce)
• D3
• HTML/CSS
• SQL

TOOLS & FRAMEWORKS
• Tableau
• Git
• Flask
• MS Excel, Project
• Jira
• Distributed computing

RELEVANT COURSEWORK
Predictive Analytics
Machine Learning
Optimization
Data Mining
Data Visualization
Analytics for Big Data
Deep Learning
Text Analytics
Probability Theory
Mathematical Statistics

EXPERIENCE
Spotify – New York, NY Jun 2018 – Aug 2018
Data Science Intern, User Growth & Markets
• Developed a clustering model to identify similar markets based on product feature usage, listening behavior, and
content preferences
• Conducted ad-hoc analysis to support strategy and ops teams on evaluating growth opportunities in various
markets; created a product transition model to analyze changes in user engagement over time

BP North America – Chicago, IL Oct 2017 – Jun 2018
Customer Segmentation & Lifetime Value Analysis
• Built a customer segmentation model using purchasing behavior, fuel preferences, and station demographic data for
the driver loyalty program; wrote a script to automate the data processing and feature engineering for the model
• Created a Tableau dashboard for the marketing team to measure the impact on customer lifetime value and perform
scenario analysis on marketing campaigns

Optimity Advisors – New York, NY Sept 2014 – July 2017
Senior Associate, Associate
• AMC Networks – Managed the development and delivery of a web API which passed show and movie metadata to various
consumer applications
• Moody’s Investor Services – Led a team of UX/UI designers and software developers on creating an enterprise survey tool
which automated the distribution and ingestion of financial data collected from 500+ companies

OTHER PROJECTS & PREVIOUS WORK
• Cryptocurrency Analysis
• ShopRunner – Recommender system
• Beats by Dre – Global sales analysis

INTERESTS
iMentor ~ Oct 2017 – Present
Music Discovery
Outdoor Activities
Fashion Editorials
Collecting Condiments
Wenze Hu
wenzehu2016@u.northwestern.edu | (425)-589-7187

EDUCATION
Northwestern University Evanston IL, USA
Master of Science in Analytics Dec 2018 (expected)
Select Courses: Deep Learning, Social Network Analysis, Text Analytics, Data Visualization, Analytics for Big
Data, Analytics Value Chain, Data Mining, Predictive Analytics, Optimization & Heuristics

Tsinghua University Beijing, China
Master of Engineering Jul 2004
2nd Prize – Tsinghua-Guanghua Scholarship; Tsinghua Excellent Freshman Award (top 1%)

SELECT DATA SCIENCE PROJECTS
Unemployment Insurance Fraud Detection (Internship Project) Jun 2018 – Aug 2018
• Converted a social network of 2 million unemployment insurance claims to Graph Kernel matrices.
• Designed and implemented a Convolutional Neural Network model that takes the Graph Kernel
matrices as input to detect fraud.
Northwestern Admissions Data (Natural Language Processing) Sep 2017 – Dec 2017
• Analyzed archived MSiA admission emails using NLTK (Python, Natural Language Processing).
• Generated FAQ and reference answers using Okapi BM25 ranking model (Gensim).
Personalized Restaurant Recommendation (Web & Android App, Recommendation) Nov 2017 – Jan 2018
• Designed and implemented Java Servlets to allow users to explore and get recommended restaurants by
applying a collaborative filtering & sorting algorithm on Yelp data (Apache Tomcat, JSON, MySQL).
• Designed and implemented an LBS (location-based service) Android app for users to explore, visit, and get recommended restaurants.
• Integrated Google Map API to display restaurant locations.

PROFESSIONAL EXPERIENCE
On Point Technology Chicago, IL
Data Scientist Intern Jun 2018 - Aug 2018
• Analyzed 2 million unemployment insurance claims using Graph Kernels and a Convolutional Neural
Network to detect fraud; used AWS GPU instances to accelerate training.
Everbridge (Nasdaq: EVBG) Beijing, China
Technical Project Manager | Scrum Evangelist Sep 2015 - Aug 2017
• Coached 6 Scrum teams (40 people) to accomplish the organizational Scrum transformation from traditional
workflow (accelerated the release cadence from quarterly to monthly).
• Led cross-functional 10-person Scrum team to achieve all release milestones with 100% fulfillment.
Yottaa Beijing, China
Product Manager | Certified Scrum Master Jan 2014 - Mar 2015
• Supervised the Traffic Management product line and led a cross-functional 5-person Scrum team to
increase team productivity by 50% (measured in the story points completed) in 2015 Q1.
France Telecom Beijing, China
Product Manager | Certified Scrum Master Dec 2007 - Dec 2013
• Supervised customization & delivery of signature Android devices (6M+ per year).
• Led 2 Scrum teams covering 5 feature mobile applications and managed a 20-person outsourced QA team.
Siemens Beijing, China
R&D Engineer Jul 2004 - Dec 2007
• Implemented and maintained the protocol stack of Siemens carrier-grade WAP gateway in Java.
• Met with clients as technical specialist and worked across organization to support Pre-Sales, Product
Management, QA, and Customer Service.

SKILLS
Programming: Java, Python, R, JavaScript, Android SDK
Machine Learning Package: TensorFlow, PyTorch
Distributed Computing: AWS, Hadoop, Hive
Rishabh Joshi
rishabh@u.northwestern.edu | github.com/rishabh-joshi | 224-565-3647

Education
Northwestern University, Evanston, IL Expected December 2018
Master of Science in Analytics GPA: 4.00/4.00
Coursework: Predictive Analytics, Databases, Data Mining, A/B Testing, Deep Learning, Big Data Analytics (Hadoop, Spark & Hive)
Indian Institute of Technology (IIT) Guwahati, India June 2017
Bachelor of Technology in Mathematics and Computing GPA: 9.19/10.00
Honors: Institute Merit Scholarship 2014 (Rank 1/50 in the department)
Coursework: Statistical Methods & Time Series Analysis, Probability, Linear Algebra, Optimization, Data Structures and Algorithms
Technical Skills
Python (tensorflow, keras, scikit-learn, xgboost, pandas, seaborn, networkx, numpy), R (xgboost, caret, dplyr, tidyr, ggplot2), C++, Java,
SQL, MATLAB, Hadoop, MapReduce, Spark (pyspark, sparklyr), Hive, HBase, Amazon Web Services, d3.js, Tableau, Bash, Git
Experience
TransUnion, Chicago, IL | Financial Services Analytics Intern June 2018 – September 2018
• Worked closely with the Sr. Director of Analytics to design and productionize a Mortgage Interest Rate Estimator from end-to-end
using 1.2 trillion records which is estimated to generate more than $1M in revenue.
• Collaborated with the sales and credit teams to collect information on previously unknown motorcycle loans; identified these
loans from 119 billion trades using SQL, Hive (Hadoop) and clustering on Spark MLlib (85% accuracy) to build a risk score model.
• Automated and scaled audit reports generation with effective visualizations such as Maps and Waterfall Charts to spot trends and
anomalies in auto loans; enabled rapid product iteration to prescreen customers for auto loans targeted marketing.
Digital Creativity Labs, Chicago, IL | Data Science Consultant December 2017 – June 2018
• Developed a novel team-based recommender system for MMOG video games to improve player performance.
• Created multiple behavioral profiles for 100,000 players of Destiny II using unsupervised learning methods including Gaussian
mixture models, K-Means, and Archetype Analysis to recommend appropriate weapons and optimal playing strategies for players.
TransUnion, Chicago, IL | Fraud Analytics Consultant December 2017 – June 2018
• Engineered graph structures and novel features for 500,000 customers using parallel processing to represent shared identity
attributes and predicted synthetic identity fraud by training an XGBoost model on those features.
• Automated a time-consuming manual feature creation process by training a graph kernel based Convolutional Neural Network
(CNN) model and further improved upon the XGBoost model.
• Elevated the performance of the original productionized model by 25%; uncovered several suspicious customers and previously
undetected sharing patterns; presented to the entire analytics department of more than 70 people.
Saarland University, Germany | Machine Learning Intern May 2016 – July 2016
• Built an author citation network (1.3 million nodes, 172 million edges) to distinguish between authors with the same name by
comparing their affiliation, specialization, and co-authors using social network analysis and fuzzy matching on 1.2 million papers.
Heritage Institute of Technology, India | Data Science Intern May 2015 – July 2015
• Developed three algorithms in C++ to solve the community detection problem in dynamic social networks that out-performed the
state-of-the-art algorithm in optimizing the community structures.
Project Work
Activity Classification in Videos with Deep Learning for Better Ad Placement April 2018 – June 2018
• Built an application for real-time human activity classification in videos to improve the allocation of ads with 90% accuracy.
• Generated features from the video frames using the Inception Model and passed the features to an LSTM Model to utilize the
temporal flow of the frames and identify one of the 101 possible activities; built the models using TensorFlow and Keras on GPU.
Full-Stack AWS Web App Development, Character Recognizer January 2018 – March 2018
• Designed a handwritten character recognition web app following agile methodology with Flask, HTML, and JavaScript, deployed
on the AWS EC2 Cloud Computing Platform, with a CNN model trained on 800,000 images stored in the Amazon RDS database.
Customer Purchase Activity Prediction November 2017
• Predicted customer purchases by creating features to capture the recency, frequency, and monetary value using regression
models; successfully identified the top 1,000 customers with highest expected payoff and lowest RMSE among 10 teams.
Extracurricular Activities and Volunteer Work
• Certified KPMG Six Sigma Green Belt holder.
• Coursera Mentor for four data science courses since Sep ’17.
• Top 50 finalist in the national level ITA Coding Challenge ’17.
• Tutor for underprivileged children from Dec ’14 to Dec ’15.
BROOKE KENNEDY
brookekennedy2018@u.northwestern.edu · (270) 617-1700 · www.linkedin.com/in/brooke-kennedy

EDUCATION
DECEMBER 2018 (EXPECTED)
MASTER OF SCIENCE IN ANALYTICS, NORTHWESTERN UNIVERSITY, EVANSTON, IL
Coursework: Predictive Analytics • Data Mining, Big Data • Java and Python Programming • Deep Learning • Data
Visualization • Databases and Data Warehousing • Analytics Consulting • Optimization • Text Analytics | Awards:
Enova Data Smackdown Team Competition – 1st Place

MAY 2017
BACHELOR OF ARTS IN COMPUTER SCIENCE, BELLARMINE UNIVERSITY, LOUISVILLE, KY
Honors: Summa Cum Laude | Minor: Mathematics | Awards: Computer Science Faculty Merit Award • National
Science Foundation STEM Scholar • Dean’s List • Clare Boothe Luce Undergraduate Research Scholar • Kentucky
Academy of Science Annual Meeting 1st place undergraduate engineering poster
TECHNICAL PROFICIENCIES
Python • Java • R • SQL • Tableau • Hadoop • Spark • Hive • JavaScript (D3) • HTML/CSS • PHP • Git • Linux
EXPERIENCE
JUNE 2018 – SEPTEMBER 2018
DATA SCIENCE INTERN, OPEX ANALYTICS, EVANSTON, IL
Analytics firm that combines machine learning and optimization to answer complex questions
o Developed full stack solution utilizing Python, PostgreSQL, Tableau, and internal deployment platform to
help Fortune 150 quick service restaurant identify and reduce global supply chain risks by recommending
backup options in supply chain for times of contingency
o Led initiative on reproducible data science through company-wide presentation and published article on
company’s public blog, leading to adoption of techniques in new client projects

OCTOBER 2017 – JUNE 2018
STUDENT DATA SCIENCE CONSULTANT, SHOPIFY, EVANSTON, IL
Canadian e-commerce company offering a platform for online stores and retail point-of-sale systems
o Used Python and SQL to analyze Shopify clickstream data
o Utilized supervised machine learning algorithms to distinguish between sneaker bots and human
shoppers

OCTOBER 2017 – DECEMBER 2017
STUDENT DATA SCIENCE CONSULTANT, GREENWICH.HR, EVANSTON, IL
Company offering real-time labor market intelligence data
o Analyzed Greenwich.HR labor market data and developed multiple predictive models to recommend job
roles and salaries based on skills and presented analysis to Greenwich.HR CEO

MAY 2017 – JULY 2017
QA TECHNOLOGY INTERN, THE LEARNING HOUSE INC., LOUISVILLE, KY
Academic program management company offering technology-enabled education solutions
o Performed quality assurance testing including weekly regression tests on software deployments
o Automated test scripts in Java and Python using Selenium WebDriver to improve testing speed
o Created an upgrade test plan for internal enterprise resource planning software (ERP) by documenting
test cases
o Used PHP and MySQL to create a script that logs needed information to files on a monthly basis
Arvind Koul
arvindkoul1995@gmail.com
(773) 961-4829
linkedin.com/in/arvindkoul

EDUCATION
Master of Science in Analytics
Northwestern University
09/2017 – 12/2018 Evanston, IL

Post Graduate Diploma in Applied Statistics
IGNOU
08/2016 – 07/2017 New Delhi, India

Bachelor's in Management Studies
University of Delhi
07/2013 – 06/2016 New Delhi, India

SKILLS
R (caret, xgboost, keras, tidyverse family)
Python (tensorflow, scikit-learn, numpy, pandas, matplotlib)
SQL, Hive
Hadoop, Spark
Tableau

RELEVANT COURSEWORK
Predictive Analytics
Data Mining
Deep Learning
Big Data Analytics
Data Visualization
Statistical Inference
Probability Theory
Corporate Finance

WORK EXPERIENCE
Analytics Intern
TransUnion
06/2018 – 09/2018 Chicago, IL
Achievements
Implemented challenger xgboost model to predict credit card bust-outs (3M records) resulting in a 26%
reduction of annual cumulative bad debt.
Used auto-encoders to automate feature generation from raw transaction data to feed into the xgboost
model.
Improved interpretability of xgboost model by tracing the path of each observation along all the trees in the
model thereby decomposing each prediction into impacts due to an observation's feature values.

Analytics Consultant
Principal Financial Group
10/2017 – 06/2018 Des Moines, IA
Achievements
Simulated current Random Forest model used by the firm to predict factor returns in Russell 1000 market.
Conducted feature engineering, developed lasso-logistic model, and implemented regime modeling (using
Markov chains) on weekly Russell 1000 market data (1994 - 2017) to improve the factor-based investment
process.

Co-founder
Fishdeal (subsidiary of YesLife Enterprises)
03/2014 – 02/2017 New Delhi, India
Achievements
Co-founded Fishdeal, a retail company for all kinds of aquaria. Relevant analytics experience included
conducting A/B tests to increase customer engagement (improved response rate by 15%).

ACADEMIC PROJECTS
Image Colorization (04/2018 – 06/2018)
Used a sequence of convolutional neural networks in Python for colorization of photos of different bird species, with initial
layers used for feature extraction, and deeper layers for colorization.

Movie Recommender Web App (01/2018 – 03/2018)
Built a recommendation system using collaborative filtering in Python on the MovieLens dataset. Designed and developed
Flask web app, which was hosted on AWS Elastic Beanstalk.

Customer Response Prediction (11/2017 – 12/2017)
Predicted customer response to a catalog mailing using logistic (probability of purchase) and linear (total amount of
purchase) regression in R.
Matthew Tucker Lewis
MatthewLewis2018@u.northwestern.edu
(720) 290-0372
EDUCATION
Northwestern University – Evanston, IL Anticipated December, 2018
Master of Science in Analytics
Cornell College - Mount Vernon, IA May, 2017
Bachelor of Arts in Business Analytics
Bachelor of Arts in Psychology
Thesis: Mobile Phone Applications for Behavior Change
PROFESSIONAL EXPERIENCE
ABC Supply – Chicago, IL
Data Science Intern June 2018 – September 2018
 Leveraged in-house and external data sources for feature engineering to develop a predictive model
for new markets and branch productivity
 Utilized advanced web scraping techniques with Selenium to leverage new data sources for
predictive modelling and D3 visualizations
Synchrony Financial – Chicago, IL
Data Science Consultant November 2017 – Present
 Engineered a chatbot solution for expediting customer service interaction, reducing call center load
and costs using Python, NLTK, and other NLP tools
 Led the team through stakeholder meetings and technical advisor sessions to ensure
alignment of product and expectations
 Design and implement an interactive personal shopper, building upon the chatbot
infrastructure. Integrate natural language processing to ensure seasonally appropriate,
personalized recommendations for cross-selling opportunities.
ProMazo Inc. – Chicago, IL
Project Manager November 2017 – February 2018
 Managed and worked with students to capture and prioritize the enterprise analytics demand of over
100 stakeholders to facilitate the efficient adoption of advanced analytics stack
Cornell College Quantitative Reasoning Studio - Mount Vernon, IA
Peer Consultant August 2015 – May 2017
• Guided students seeking help with quantitative subjects: homework and exam reviews in Statistics,
Economics, Calculus; Experimental Design and Paper Review in Biology, Psychology, Analytics
TECHNICAL SKILLS
Languages: R, Python, Java, SQL, HTML, CSS, D3, JavaScript, Flask Applications
Tools/Skills: Tableau, Spark, Hive, MapReduce, HDFS, AWS (S3, EBS, EC2), Git, SPSS, Minitab, Selenium, NLP,
Web Scraping, Network Analysis, Linear Programming, Analytic Solver
RELEVANT PROJECTS
Google Landmark Recognition
- Designed a Deep Learning model based on Xception CNN to classify landmark photos, reaching
93% accuracy with 100 classes
Venmo Transaction Classification
- Used Spark RDDs and Spark data frames to cluster and classify both emoji and text transactions
- Modelled transactional relationships between users to better understand customer activity
Xiaowei (Wei) Li
xiaoweili2018@u.northwestern.edu (415) 606-1185
EDUCATION
Northwestern University, Evanston, IL Sept 2017 – Dec 2018 (Expected)
M.S. in Analytics (MSiA) GPA: 3.82/4.0
Relevant Coursework: Predictive Analytics I & II, Data Mining, Analytics for Big Data, Deep Learning, Text Analytics,
Data Visualization, Modern Database & Data Warehouse, Analytical Value Chain, Analytical Consulting Leadership

St Mary’s College of California, Moraga, CA Sept 2009 – Feb 2013
B.S. Mathematics, summa cum laude, Minor in Economics GPA: 3.89/4.0

SKILLS
Python (pandas, Seaborn, sklearn, H2O, Keras), R, SQL, Spark, MapReduce, Java (Hadoop), Tableau, Git, AWS, D3.js

EXPERIENCE
SoFi, Inc. San Francisco, CA Data Science Intern, Marketing Analytics Jun 2018 – Sept 2018
• Created data pipeline to clean, transform, and combine over 200 million prospects’ credit attributes, card transactions,
and demographic data stored in AWS S3 and data warehouse with Hive and PySpark
• Developed xgboost uplift model for direct mail campaign cadence with an AUC of 0.83 and improved lift of top 2
deciles by 12%; scored prospects by propensity to fund personal loan; extracted recommendation for monthly A/B
test experiment and re-allocated marketing budget to reduce cost-per-start by 3%
• Built regression models using ensemble method to predict and profile members and applicants by their potential
values in SoFi Money product to facilitate marketing with lead prioritization; deployed the model and integrated
results by collaborating with warehouse team

Gridsum, Inc. Beijing, China Data Scientist, Data Center Dec 2015 – Jul 2017
• Segmented mobile gaming app users with k-means clustering based on activity ranking, in-app behavior pattern,
and purchase history; automated weekly Tableau reports to facilitate design of activities, resulting in increased
ARPU (average revenue per user) of 21% over a quarter
• Formulated quantitative approach to exploring mobile app user retention: developed solution using decision tree
based on users’ in-app behavioral data; collaborated with marketing and operations teams to design A/B test
strategies that improved 30-day retention rate by 6%
• Oversaw web/app user behavior analytics projects that assisted clients in enhancing online marketing effectiveness,
including user segmentation, churn analysis, customer lifetime value (CLV), lead generation, and campaign
optimization
• Leveraged data science tools and techniques such as regression and classification models, A/B testing, k-means
clustering, recommender systems in Python and R using data from disparate data sources (online, offline,
transactional, aggregated, qualitative)

Berkeley Research Group, LLC San Francisco, CA Senior Associate Jul 2013 – Oct 2015
• Conducted statistical and probabilistic studies of claims value and potential insurance recovery in light of factual
and legal uncertainties; performed allocations of liability costs to insurance coverage by analyzing historical claims
and estimating future liabilities; helped client successfully recover 90% of $1.6 billion in insurance claim
payments from 120+ insurers
• Developed quantitative financial models in R and Excel using structured data on pricing, sales volumes, costs, and
competing products to assist clients in damage analysis relating to disputes in antitrust, intellectual property,
securities, financial reporting, and mergers and acquisitions

SELECTED CASES/PROJECTS
ShopRunner – Retailer Network Analysis (Github: https://bit.ly/2LQnPu1) Jan 2018 – Jun 2018
• Evaluated retailers’ in-network value by aggregating over 5 million members’ transactions to generate retailer
network; constructed retailer segmentation model with network features using PCA and Gaussian mixture models;
profiled 100+ active retailers to strengthen network connection and facilitate cross-sell; developed an interactive
dashboard with D3.js and deployed it on AWS to visualize active retailers’ interactions and sales trends

Zurich North America Insurance – Workers’ Compensation Pricing Model Sept 2017 – Jun 2018
• Migrated current GLM model from SAS to Python; created benchmark metrics to measure and monitor model
performance in terms of accuracy, execution, and business impacts; profiled and segmented over 90,000 companies
according to their risk level; developed new pricing models with boosted trees, random forest, and RNN models
and improved pricing tool accuracy by 10%
Zili Li
zilili2018@u.northwestern.edu • (406)208-7878
EDUCATION
Northwestern University – GPA 3.92 Expected Graduation: December 2018
Master of Science in Analytics
 Coursework: Predictive Analytics, Databases, Data Mining, Analytics Value Chain (A/B Testing), Deep Learning, Data
Visualization, Analytics for Big Data
 Award: ABC Supply Hackathon 1st Place – Identified delivery vehicles that needed to be replaced, estimated the potential
loss due to replacement using time-series analysis, and proposed recommendations for reallocating resources
The Ohio State University – GPA 3.97 May 2017
Bachelor of Science in Business Administration with Specialization in Accounting, summa cum laude
 Honors Thesis: The Impacts of the Affordable Care Act on Preventive Services among Racial Groups
Bachelor of Science in Actuarial Science, summa cum laude
 Exams: Probability (November 2014), Financial Mathematics (April 2015), Models for Financial Economics (July 2015)

TECHNICAL SKILLS
 Programming Languages: Python (pandas, numpy, scikit-learn, tensorflow), R (dplyr, tidyr), Java, JavaScript (D3.js), SQL
 Tools: Hadoop, Spark, Hive, Amazon Web Services (AWS), Azure, Linux, Tableau
WORK EXPERIENCE
Expedia Group Chicago, Illinois
Air Shopping Optimization Intern June 2018 – September 2018
 Conducted quantitative analysis for 160 million searches to understand the difference in conversion rates by device types
 Used K-means clustering to segment mobile and traditional browser users based on their search attributes
 Identified the key factors that drove flight booking conversion rates, quantified their importance by building tree-based
models with AUC of 0.78, and provided implementable recommendations for optimizing the flight search results
State of Ohio Columbus, Ohio
Research Intern June 2017 – August 2017
 Utilized data analytics to monitor childcare providers’ activities, detect potential fraud, and improve investigation process
 Developed a proactive monitoring dashboard by manipulating and analyzing data from multiple sources, identifying
patterns and anomalies, and communicating with subject matter experts for continuous improvement
The Dow Chemical Company Midland, Michigan
Tax Accounting Intern May 2016 – August 2016
 Identified and analyzed the discrepancies between two software systems in deferred taxes for more than 400 foreign entities
and contacted foreign tax managers to understand the discrepancies and determine appropriate adjustments
PROJECTS
Chicago Park District Practicum Project, Northwestern University October 2017 – June 2018
 Worked with the client to develop a tiered pricing system and remove the need for manual price adjustments with a
regression tree model
 Created interactive dashboards that assisted the CFO and pricing team to visualize socio-economic data and update prices
Game Analytics Research, League of Legends, Northwestern University January 2018 – June 2018
 Assisted broadcasters in real-time storytelling by using archetype analysis to detect anomalous events and exotic plays in the
massive esports game League of Legends
 Performed exploratory data analysis in MySQL, identified sequences of events, and extracted features from raw data
ShopRunner Repurchase Analysis, Northwestern University January 2018 – March 2018
 Engineered features (RFM) from top 10 retailers’ data for survival analysis in R, evaluated the network effects and the
impacts of time-dependent covariates on retention rates, and presented actionable marketing strategies to the Analytics team
Book Recommender System, Northwestern University January 2018 – March 2018
 Built a recommender system for 10K books based on 6 million user reviews using content-based and collaborative filtering
methods in Python and deployed the model as a web app on AWS EC2
Xinyue (Emma) Li
Emeryville CA 94608 · (530) 564-2418 · xinyueli2018@u.northwestern.edu
EDUCATION
Northwestern University Evanston, IL
MS in Analytics, GPA: 3.80/4.00 Expected 12/2018
• Relevant Courses: Predictive Analytics, Data Mining, Data Visualization, Analytics for Big Data (Hadoop/Spark/Hive),
Database Design and Information Retrieval, A/B Testing
University of California, Davis Davis, CA
B.S. Applied Statistics, GPA: 3.91/4.00; B.S. Managerial Economics, GPA: 3.76/4.00 06/2017

SKILLS
R, Python (Pandas, NumPy, Scikit-learn, Keras), SQL (MySQL, Hive, Netezza), Spark, Hadoop, Java, D3.js, HTML/CSS,
Tableau, AWS, SAS, C, Stata, Matlab

WORKING EXPERIENCE
TransUnion Chicago, IL
Data Science Intern 06/2018 – 09/2018
• Constructed risk score models to review personal loan applications using various methods such as tree-based models
(XGBoost, Gradient Boosting, C5.0, Random Forest), SVM and Artificial Neural Networks
• Researched and implemented methods of variable interpretation in Neural Networks for adverse selection
• Performed quantitative analysis on 1B+ trade records to identify the customers’ capacity to absorb ongoing credit products
as the interest rate increases, which improved its cycle readiness through early identification of a shift in consumers’ debt
Graduate Analytics Consultant 09/2017 - 06/2018
• Created graph components and generated features on 500k+ credit accounts with shared identity information
• Trained Boosting Trees with XGBoost to identify the fraudulent accounts and achieved 95.9% precision
• Researched and applied a Convolutional Neural Network using Graph Kernels to improve the fraud detection performance
• Detected undiscovered suspicious accounts and improved the previous model by 25%
Agricultural Issues Center Davis, CA
Undergraduate Researcher 07/2016 - 06/2017
• Analyzed the effect of legalization on cannabis prices in California through Hypothesis Testing in R
• Performed exploratory data analysis and analyzed and researched reasons for price change across different agricultural
commodities through Functional Principal Component Analysis on the 1995-2015 California agricultural exports
Standard Chartered Bank Shanghai
Intern 08/2016 - 09/2016
• Predicted key indicators (revenue growth, EBITDA, taxable profit, etc.) of the client to ensure liquidity for debt issuance
• Explored potential collaboration opportunities between Standard Chartered Bank and Alibaba through analyzing Alibaba’s
operational model and cash flow

PROJECTS
Predictive Modeling on Clothing Sales 10/2017 - 12/2017
• Cleaned data inconsistencies and imputed missing values with MICE algorithm and KNN methods
• Generated features measuring the recency and frequency of consumers’ purchasing behaviors
• Evaluated the efficacy of catalog-driven marketing through predicting customers’ future purchases with stacking Logistic
Regression and Multiple Linear Regression models
• Estimated the expected profit to assist the company’s marketing strategy decision
Gaming Analytics on Destiny II 01/2018 - 06/2018
• Designed a Player versus Player recommendation system framework based on team play to improve team performance
• Implemented clustering analysis through K-means, GMM, Archetype Analysis on 16M+ matches from Destiny II to create
player profiles
• Produced team profiles accordingly and provided recommendations via K-Nearest Neighbor method
• Submitted paper to AIIDE (Artificial Intelligence and Interactive Digital Entertainment Conference) 2018
Venmo Transaction Study 04/2018 - 06/2018
• Conducted quantitative analysis with effective visualizations on 7M+ transactions via PySpark and SparkSQL to summarize
Venmo's social network
• Analyzed different emoji use patterns in various time frames to learn users' spending habits using RDD and Spark data frame
• Clustered the transaction messages with PySpark using text-based attributes to improve the text classification algorithm in
each segmentation

ACTIVITIES AND LEADERSHIP
• Vice President of Career Development Department at CSSA – Davis, CA 01/2016 - 06/2017
• Academic Coordinator at UC Davis Statistics Club – Davis, CA 06/2015 - 12/2016
• Volunteer at NYBL Foundation of America – Sacramento, CA 01/2015 - 03/2016
YUQING LIU
C: (530) 574-9428 | E: YuqingLiu2018@u.northwestern.edu | https://www.linkedin.com/in/yuqing-liu11
EDUCATION
Northwestern University | Evanston, IL Expected Dec 2018
Master of Science in Analytics
University of California, Davis | Davis, CA Sep 2013 – Jun 2017
Bachelor of Science in Computational Statistics, Major GPA: 3.87/4.00
University Honor Program, Outstanding Undergraduate Research Scholar, Outstanding Performance in Statistics
SKILLS
Programming: R, Python, SQL, C, C++, Java, MATLAB, SPSS, SCIP, HTML/CSS/Bootstrap, JavaScript, D3
Data Science: Machine Learning, A/B Testing, Recommender System, Time Series Analysis, Latent Class Analysis,
Consumer Analysis, Visualization, Model Lifecycle, Deep Learning, Data Warehousing
Tools: AWS, Tableau, Qlik, Plotly, Microsoft Azure, tensorflow, scikit-learn, MapReduce, Spark, Hive
Expected by graduation: Text Analysis, Social Network Analysis
INDUSTRY EXPERIENCE
The Boeing Company St. Louis, MO
Data Science Intern Jun 2018 – Aug 2018
• Applied word clustering based on syntagmatic analysis, LSTM, and classification models to find patterns in red
status project updates written by managers, diagnosed pain points in the production process, and provided
data-supported insights of censoring to management.
• Identified causes of safety incidents at Boeing with text analysis of incident reports and categorized injury levels.
• Improved the current time series model for finance expenses prediction by constructing stacked models, acquiring
market data and feature engineering.
Principal Financial Group Evanston, IL
Student Researcher Sep 2017 – June 2018
• Used a Regime Switching Model to predict economic factors’ forward returns to support factor-based investing.
Lam Research Corporation Fremont, CA
Data Science Intern Jun 2016 – Jun 2017
• Built classification tools for 30,000+ semiconductor parts using Support Vector Machine and Natural
Language Processing methods in R and Python; performed analysis and created dashboards in Qlik.
• Improved inventory planning by predicting demand for parts with regression model and survival analysis; reduced
average delivery time from 8 days to 6 days.
• Presented model performance with data visualization techniques to both technical and nontechnical audiences;
beat the accuracy of the Microsoft Azure team.
RESEARCH
Manifold Learning with Outliers, Davis, CA Sep 2015 – Jun 2016
• Researched the current algorithms and tuning parameters applied in Principal Curve method and analyzed
disadvantages when abnormalities exist in data.
• Improved the implemented Principal Curve method in R by making the fitted curve less sensitive to outliers.
PROJECTS
Moving Object Detection – CVPR 2018 Challenge Apr 2018 – June 2018
• Performed image segmentation of moving objects in street-view images with Mask-RCNN and YOLO algorithms.
Zillow Home Value Prediction Web Application Jan 2018 – Mar 2018
• Predicted future home value based on Zillow Research data and managed the model following production level
lifecycle and agile software development requirements.
• Created a web application providing visualization and model results with Java, HTML and AWS.
Consumer Analysis – ShopRunner Jan 2018 – Mar 2018
• Estimated consumer lifetime value with retention models for retailer segmentation.
• Developed metrics to measure customer and retailer similarities; built recommendation system for top retailers
with collaborative filtering and dimensionality reduction methods.
AWARDS
iidata Convention 1st Place, Stephen Curry Shooting Prediction | University of California, Davis Jan 2017
Hackathon 2nd Place, Optimizing Fleet Productivity | Northwestern University MSiA & ABC Supply May 2018
JUNXIONG LIU
jliu18@u.northwestern.edu  https://www.linkedin.com/in/junxiongliu  https://github.com/junxiongliu  612-356-8798

EDUCATION
Northwestern University, Evanston, IL Expected December 2018
Master of Science in Analytics GPA: 3.87/4.00
Coursework: Big Data, Data Mining, Data Visualization, Data Warehouse, Deep Learning, Predictive Analytics, Text Analytics
Carleton College, Northfield, MN March 2017
Bachelor of Arts in Mathematics/Statistics, cum laude GPA: 3.78/4.00
Honors & Activities: 2013-14 Dean’s List, Chess Club President, 2014 Pan-Am Intercollegiate Chess Championship

SKILLS

Programming R (tidyverse, data.table, Shiny), Python (pandas, numpy, sklearn, seaborn), SQL, Java, JavaScript (D3.js)
Tools & Systems Git, Hadoop, MapReduce, Spark, Hive, HBase, Bash, Amazon Web Services, Tableau, Linux

EXPERIENCE
Data Science Intern - Predictive Analytics, Zurich North America, Schaumburg, IL June - August 2018
· Re-examined and improved geographical pricing strategies for commercial auto line with Python, R, Spark, and Hive
· Designed reproducible PySpark pipelines to reduce dimensions of 20GB geographical data with more than 2,000 features
· Collaborated closely with business partners and constructed novel model evaluation metrics that best served business needs
· Built machine learning models (XGBoost, GLM, etc) that outperformed Zurich’s current model by 10%. Results and
recommendations will be fully utilized in next generation pricing models
Graduate Student Data Science Consultant, Shopify, Evanston, IL October 2017 - June 2018
· Researched bot behaviors with Shopify’s web clickstream data and prototyped bot detection methods in Python and R
· Developed reproducible scripts to clean unstructured URL data and engineered important features relevant to bot behaviors
· Implemented algorithms to propagate bot behaviors and built a Random Forest model achieving 99% accuracy (93% baseline)
Graduate Student Data Science Consultant, ShopRunner, Chicago, IL January - June 2018
· Segmented and visualized 146 active retailers in ShopRunner’s e-commerce network in team of 5 with R and D3.js [GitHub]
· Generated valuable features with PCA and performed analysis with Gaussian Mixture and Ordinal Logistic Regression models
· Designed an interactive dashboard incorporating KPI and segmentation analysis results for ShopRunner’s internal use
Data Analysis Intern, Delos, New York, NY April - July 2017
· Conducted literature review on statistics-related health and building science papers as part of team’s core research projects
· Scraped, cleansed, and visualized data from various sources with Python and R and delivered research recommendations
Statistical Consultant, Carleton College, Northfield, MN January 2016 - March 2017
· Participated in and led 4 data-oriented consulting projects for clients from Hennepin County or Northfield community
· Cleaned (dplyr, tidyr), analyzed (regression and clustering models), and visualized (ggplot2, Shiny) massive data (generally
200K+ observations) with R. Generated 100+ pages of reports and delivered 5+ presentations

PROJECTS

Graduate School Projects, Northwestern University, Evanston, IL September 2017 - Present


· eSports Analytics: Researching unsupervised and supervised techniques to detect anomalous events in eSports game Dota 2
· College Recommendation: Built a web app to guide students’ college application processes with Flask and HTML [GitHub]
· Labor Analysis: Analyzed job market data (4M observations) for an HR company and presented insights to the CEO
Multiple Data Competitions, DrivenData/Kaggle, Online February 2018 - Present
· Santander Challenge: Building models to predict customers’ value of transactions with R and Python (currently top 25%)
· Pump it Up: Used supervised methods to predict Tanzania waterpoint conditions in Python and R. Finished top 8% [GitHub]
· Power Laws: Conducted time-series analysis to forecast building energy consumption with R. Finished top 8% [GitHub]
Minnesota Water Quality, MinneMUDAC, Eden Prairie, MN October - November 2016
· Built GLM models to quantify relationships between Minnesota water quality and property values from 70GB water data
· Won 1st place (for highest level of business insights and most actionable recommendations) out of 15 undergraduate teams
and awarded total prize of $2,600
Daniel F. Lütolf-Carroll
+1(914)-837-8918 • daniel.lutolf@u.northwestern.edu • github.com/dlutolf
Education
Northwestern University (Evanston, IL) September 2017 – December 2018
Robert R. McCormick School of Engineering and Applied Science
Master of Science in Analytics, Candidate in the Class of 2018
• Relevant Courses: Predictive Analytics, Deep Learning, Optimization & Heuristics, Data Visualization, Data Mining, A/B Testing, Text
Analytics, Big Data Analytics, Java & Python Programming, Social Network Analytics, Reinforcement Learning
Stanford University (Palo Alto, CA) June – July 2014
Stanford Graduate School of Business
Certificate from the Summer Institute for General Management
• Relevant Courses: Strategy, Accounting, Finance, Statistics, Economics, Operations, and Organizational Behavior
Iona College (New Rochelle, NY) September 2010 – June 2014
Bachelor of Science, Major in Chemistry and Minor in Mathematics
• Patrick J. Martin Scholar (Top Scholarship Award at Iona College) and Dean’s Honor List (2010, 2011, 2013, 2014)
• Supported faculty research on bilayer surfactants by developing Java programs for image processing of surfactant crystallization.

Skills
• Programming: Java, R, Python (keras, xgboost), C++, SQL, HTML/CSS, JavaScript, PHP, VB, Objective-C, Lisp, Fortran, Perl, Lua
• Software: Gaussian (Monte Carlo), MATLAB, AWS, Google Cloud Platform and APIs, Apache Web Server, Git, Minitab
• Simulations: Computational modeling of chemical processes using ab initio, semi-empirical, or experimental methods.
• Development: iPhone iOS and Android mobile apps, ETL with distributed databases (Hadoop, Hive, HBase, MapReduce, Spark)

Projects
Improved Advertisement: Activity Classification in Videos using Deep Learning
• Built a real-time application to classify human activity from 101 classes with 90%+ accuracy using Tensorflow and Keras.
• Model: Convolutional Neural Net (CNN) leveraging transfer learning of InceptionV3 and LSTMs integrating temporal flow of frames.
Bot Detection Algorithm
• Practicum project with ecommerce platform Shopify – Data pulled directly from production Kafka streams
• Designed and developed a predictive model using clickstream analytics to classify bot traffic at the application level
Modeling Customer Purchases and Labor Data Analysis
• Built a predictive regression and classification model using Python/R (xgboost) to identify target customers for marketing based on
activity and RFM (recency, frequency, and monetary) metrics
• Developed a visualization dashboard and predictive model using client’s labor data (6 million+)

Work Experience
Molex (Lisle, IL)
Data Science Intern June 2018 – August 2018
• Designed, built and executed an ensemble model of Deep Learning and Computer Vision to digitize 40 years of document archives.
• Developed production code in Java and Scala to leverage Google APIs Optical Character Recognition and applied ETL using Spark.
• Assisted in the design of a predictive model using transducer vibration data to determine corrosion levels in industrial pipes.
Netspan AG (Liestal, Switzerland)
Project Manager August 2015 – May 2017
• Managed a remote mobile phone App development team based in India. Responsible for the design of product specs for a social media
event hub customer. Supervised app testing and implemented quality controls. Negotiated development team’s compensation.
IT Consultant July 2008 – August 2009
• System administrator for Windows and Linux servers. Part of a team that programmed customer applications and websites.
Cenciarini & Co Merchant Bank (Milan, Italy)
Summer Intern June – July 2013
• Identified a need for and programmed a customized Visual Basic (VB) spreadsheet system to efficiently analyze a large dataset of an
Italian client to improve its accounts receivable collections and pinpoint problematic invoices.
• Programmed a VB system using GPS technology to optimize transport delivery times between client warehouses & local pharmacies.

Additional Information
• American Chemical Society, Award for Excellence in Undergraduate Inorganic Chemistry at Iona College (2014)
• Elected to Gamma Sigma Epsilon, the National Chemistry Honor Society (2014)
• Trilingual in English, German, and Italian – Lived in Spain, Argentina, Mexico, Italy, Switzerland; traveled extensively in Asia
SPENCER MOON
linkedin.com/in/moonspencer | spencer.s.moon@gmail.com | (214) 228-2372

PROJECTS League of Legends Dec 2017 – Present


• Define anomalous behaviors in professional eSports gameplay, generate models for predicting
events in real-time, and visualize these behaviors for coaches, producers and players
TransUnion Sep 2017 – Jun 2018
• Detected synthetic fraud accounts by building graph structures with personal and credit data
and feeding them as inputs for neural networks

PROFESSIONAL Atlassian New York, NY


Data Science Intern, Trello Jun 2018 – Sep 2018
• Supplemented A/B testing of feature limitation by modeling change in product usage after
implementation, estimating lift in premium plan conversion rate, and determining workarounds
• Productionized data pipelines for summary tables in Trello database to eliminate potential
errors from complex joins and reduce query execution time by 45%
• Designed visual dashboard for mobile metrics including signups, monthly active users, app
rating, and Net Promoter Score
Phreesia New York, NY
Analyst, Provider Insights Mar 2017 – Aug 2017
Select Engagement Experiences:
• Closed enterprise client deal worth $1.1M in recurring revenues by presenting product value
and return on investment
o Projected annual financial impact using product utilization rates and payment collection
rates in early adopters to show additional value that can be delivered through expansion
o Determined increase in appointments through patient usage of clinical assessments
• Estimated patient tendency of paying balances and submitting demographic information
through product modalities in order to highlight gaps in user interaction to engineering teams
o Created automated dashboards to provide organizational performance, end user data, and
market trends for sales department pursuing customer leads
FTI Consulting New York, NY
Consultant, Health Solutions Aug 2014 – Feb 2017
Select Engagement Experiences:
• Built web-based dashboard that summarizes individual patient history to improve information
transfer between care settings for 300-bed nonprofit children’s hospital
o Mined millions of health records using SQL to parse out specific hospital events and created
timeline visualization in QlikView, allowing physicians to interact with queried data
• Served as lead analyst to conduct comprehensive assessment of care processes and quality
outcomes for 375-bed hospital system to capture clinical and financial opportunities
o Analyzed clinical variation of costs incurred by physicians when performing various
procedures and found $4.4M in opportunities for internally standardizing care
o Identified $6.2M in potential savings by comparing client’s average length of stay across all
service lines to external benchmark consisting of peer provider organizations

COMMUNITY PACPI, Nonprofit focused on eliminating fatalities from pediatric AIDS, Pro-Bono Consultant
MicroMentor, Mentoring program for small business owners, Mentor

EDUCATION Northwestern University, Master of Science Evanston, IL


Concentration: Analytics Sep 2017 – Present
• Cumulative GPA: 3.90/4.00
• Activities & Awards: Enova Data Smackdown Competition – 3rd Place
Northwestern University, Bachelor of Science, Cum Laude Evanston, IL
Concentration: Economics & Learning and Organizational Change Sep 2010 – Jun 2014
• Cumulative GPA: 3.75/4.00
• Activities & Awards: Lending for Evanston and Northwestern Development · Students
Consulting for Nonprofit Organizations · Arthur Siehrs Scholarship · Weinberg Research Grant

SKILLS Python · R · SQL · Java · Spark · AWS · Mode · Tableau · Javascript · Microsoft Office · Korean
INTERESTS Microfinance · English Premier League · Streetwear · USA’s Mr. Robot and HBO’s Silicon Valley
Kehan Pan
773-273-0336 | kehanpan2018@u.northwestern.edu | Github: https://github.com/pankh13

EDUCATION
Northwestern University, Evanston, IL Expected Dec 2018
Master of Science in Analytics GPA: 3.9 /4.0
Coursework: Predictive Analytics, Data Mining, Data Visualization, Data Warehousing, MapReduce & Hadoop, Deep Learning
Tsinghua University, Beijing, China Sep 2013 - Jul 2017
Bachelor of Engineering in Industrial Engineering GPA: 3.9 /4.0
Award: National Scholarship (top 0.2% in China), China Merited Undergraduate Student (top 5000 in China)
Coursework: Data Structure & Algorithm, Operations Research, Database Management Systems, Modelling and Simulation
Georgia Institute of Technology (Exchange), Industrial & Systems Engineering, Atlanta, GA Aug 2015 - Dec 2015
SKILLS
Programming: Python, R, Java, C, HTML, JavaScript (d3), SQL, Bash, Scala
Software & Tools: Tableau, MATLAB, LATEX, Gurobi, Airflow, Docker, AWS, Spark, Redis, Excel, Linux OS
EXPERIENCE
Data Intelligence Intern | Balyasny Asset Management, Chicago, IL Jun 2018 - Aug 2018
• Devised web scraper infrastructure and internal Python package for data group based on AWS Auto Scaling Group, Selenium,
Splash and Scrapy; distributed computation, making scrapers 50 times more efficient, completely anonymous and untraceable
• Designed scrapers for eBay, MercadoLibre, and Carvana, collecting 4 million pages of product information per day, capturing
data of ~80% of total revenue; modeled data in time series and delivered to Excel function; helped PM generate millions in returns
• Built data ETL and data QA platforms full-stack to analyze, visualize and monitor data with Redis, Plotly Dash & Flask;
scheduled via Airflow; deployed with Gunicorn, Docker & Nomad; saved 80% of cloud resources; delivered reports 3 times faster
Data Analyst Intern | Lenovo, Beijing, China Jun 2016 - Dec 2016
• Developed a statistical methodology for processing call center transcripts data; increased efficiency by 5 times (adopted by
international branches); added 25 Q&As to customer support knowledge base, reducing the number of related calls by 10%
• Created a model with Python and Java based on word embedding and Neural Network to analyze customer feedback sentiment;
increased accuracy by 90% through optimization; used time-series analysis to model and monitor sentiment trend
• Designed centroid-based clustering algorithm with Python to discover new customer pain points and their importance
PROJECTS
Landmark Picture Recognition Deep Learning Bot, Northwestern University Apr 2018 - Jun 2018
• Presided over the construction of a deep learning model to classify pictures of landmarks based on Xception and convolutional
neural network with Keras and TensorFlow; optimized with transfer learning and stepwise training
• Reached 93% top-1 accuracy on test set among 100 landmarks, outperforming human recognition (at 60%)
Day Camp Pricing Model, Chicago Park District (CPD) & Northwestern University Sep 2017 - Jun 2018
• Collected customer socioeconomic data via Google API and geospatial mapping; visualized customer and park data as an
interactive dashboard with Tableau
• Reengineered the pricing model with machine learning & integer programming approach and integrated into an Excel app
Recommendation System Design, ShopRunner & Northwestern University Jan 2018 - Mar 2018
• Design lead of a hybrid recommendation algorithm based on customer data and product description; addressed usage of text
mining and collaborative filtering in cold start recommendation system problem
• Conducted Design of Experiments to find optimal strategy, reaching recommendation recall rate of 82% on test set
Movie Recommendation Website Based on Natural Language Processing, Northwestern University Dec 2017 - Mar 2018
• Scraped 8 million movie ratings, reviews and posters from Amazon and stored in RDS PostgreSQL database and S3
• Trained doc2vec (word2vec) model with Python; predicted movie genres and similar movies (70% consistent with IMDB)
• Built a website full-stack with Flask, HTML, JavaScript, Bootstrap and AWS to provide visualization and interaction
LEADERSHIP EXPERIENCE & INTERESTS
President, Volunteer and Public Welfare Association | Tsinghua University, Beijing, China Jul 2016 - Jun 2017
President, Students’ Union of Department | Tsinghua University, Beijing, China Jun 2016 - May 2017
Interests: freediving (semi-professional), swimming (university amateur group champion), taekwondo (1st dan), ballroom dance
MICHAEL PAULEEN
503.781.8393 mobile • mpauleen@u.northwestern.edu

Work Experience
Airbnb San Francisco, CA
Data Scientist Intern Summer 2018
• Joined new Lux team to develop host and guest growth strategies for newly acquired LuxuryRetreats rentals business.
• Defined key metrics and designed A/B experiment to analyze overall business impact of Lux launch on Airbnb.
• Conducted analysis of visitor conversions at searching, booking, and payment stages; presented findings to leadership to
direct product development.
• Built model to identify cross-listed properties on Airbnb and LuxuryRetreats and delivered to leaders to drive host
onboarding process design.
Enodo Chicago, Illinois
Data Scientist Intern Fall 2017 – Winter 2018
• Designed new spatial clustering product to generate comparable rental sub-markets for analysis of rental values.
• Developed novel Bayesian models to analyze drivers of multifamily real-estate values in markets nationwide.
• Synthesized domain knowledge, market data and building features to quantify the value of amenities in each market.
Allstate Chicago, Illinois
Data Scientist Intern Summer 2017
• Used deep learning and transfer learning to segment cars in images to prototype automation of claims estimates.
• Built Python tool to facilitate labeling of over 450k Allstate claimant submitted images for training and validation.
• Achieved 93% pixel-wise detection accuracy for instance-aware semantic segmentation of damaged vehicles.
• Established end-to-end training and validation strategy to fine-tune model and assess performance for business goals.
Data Engineer Intern Summer 2016
• Created and maintained new schema and data pipeline to enable sales and loss modeling at first contact for dataset of
90MM quote records.
• Developed new metrics to analyze agent adoption of new marketing technologies to evaluate impact on conversion rate.
Education
Northwestern University Evanston, Illinois
M.S. in Analytics
• GPA: 3.90/4.00 December 2018
• Courses: Predictive Analytics, Analytics for Big Data, Data Visualization, Deep Learning
B.S. in Industrial Engineering, Magna Cum Laude June 2017
• GPA: 3.84/4.00
• Courses: Stats for Data Mining, Optimization Methods in Data Science, Machine Learning
• 2017 ICSA All-Academic Sailing Team, 2017 MCSA All-Conference Sailing Team
Projects
WE Energies – Employee Productivity Analysis Fall 2017 – Spring 2018
• Developed explanatory models in R to correlate employee productivity, overtime and attrition in customer care centers.
• Visualized trends in call volumes and customer wait times to highlight opportunities to increase caller satisfaction.
ML Strategies for Forex Trading • 40 hours Spring 2017
• Developed ensemble and boosted methods to predict short-term price movements in EUR-USD Forex market and designed
a profitable trading strategy and complete back-testing simulation framework.
Skills
• Technologies: Python, R, PySpark, H2O, Hadoop, SQL, BigQuery, C++, AMPL, Caffe, MXNet, Tableau, d3
• Foreign Language: French and Spanish (ILR Level 4 – Full professional proficiency)
Leadership and Interests
• Northwestern University Sailing Team (Vice-Commodore 2x, Regatta Chair), rock climbing, credit card churning
Christian John Rozolis
ChristianRozolis2017@u.northwestern.edu
Cell: (815) 451-9675

Northwestern University, McCormick School of Engineering and Applied Science
Bachelor of Science, Industrial Engineering
Minor in Psychology, Sep. 2013-Jun. 2017
GPA: 3.701 | Honors: Tau Beta Pi
Master of Science, Analytics
Sep. 2017-Dec. 2018 (expected)
GPA: 3.877 | Honors: INFORMS IAAA

Work Experience
Northwestern University, Department of Industrial Engineering
Graduate Research Assistant, June 2017- Present
• Use historic runner data and visualization in ongoing humanitarian logistics project for the
Chicago and Houston Marathons
• Train machine learning models on historic runner data to simulate runner speed and locations for
marathon situational awareness planning
• INFORMS Analytics Society Innovative Applications in Analytics Award (IAAA) Winner
ABC Supply Company, Inc.
Data Science Intern, June – September 2018
• Aggregated and combined various data sources from the business to create a visualization tool to
aid users in understanding population and geographic data all in one tool (d3js, Flask)
• Selected and engineered relevant features from delivery, population, sales, and homebuilder data to
develop a predictive model for valuable market identification and branch success
HP Inc. Advanced Technology and Platform Solutions
Operations Planning Industrial Engineering Intern, June – August 2016
• Performed current-state manufacturing start process analysis and implemented solutions aligning
inter-department communication and metric reports
• Designed and implemented linear programming model used to reduce resource requirements and
operating costs by optimizing cycle-time planning process
• Created line management course curriculum and facilitation guide used to train technicians and
supervisors in the theory of constraints and best practices
Walt Disney Parks and Resorts Industrial Engineering Co-Op
Walt Disney World Transportation, May-August 2015
• Internally consulted on projects, responsible for planning, data acquisition, and analysis
• Led client meetings to coordinate project tasks, goal-setting, and implement recommended
solutions matching agreed upon measures of success
• Developed internal tools using SQL and Excel that improved information flow and analytics of
available data for current and future projects
Walt Disney World Facilities and Operations Services (FOS), January–May 2015
• Extracted and analyzed maintenance workforce data using SAS to minimize future headcount
needs due to park operational changes
• Optimized attractions maintenance schedules using historical trends to reduce unnecessary costs

Leadership Experience
Ultimate Chicago, Youth U19 Chicago League Volunteer Coach, June 2018-August 2018
Northwestern University, New Student and Family Programs (NSFP)
Board of Directors, Director of Staff Training and Design, November 2015-December 2016
• Collaborated with staff training team to develop and teach a course involving social justice
education, campus inclusion, and effective dialogue skills for 200+ student leaders
• Interviewed and evaluated candidates for peer leadership, board member, and professional roles in
the office
Peer Adviser and First Year Experience Co-Instructor, Spring 2014-Spring 2016
• Provided academic and campus life guidance to new students throughout their first year
• Led biweekly sessions to assist with course, major, and minor selections and facilitated dialogues
about emotional, mental, and physical health as a college student

Languages
• Python • Java
• R • C++
• Spark (RDD, DFs) • d3, JS, CSS
• SQL (U-SQL, MySQL, PostgreSQL, SQL Server)

Tools/Skills
• Tableau • Hadoop DFS
• Azure Data Lake • MapReduce
• Azure Web Apps • Git
• Amazon Web Services • Flask Apps
• Hive • SPSS
• SAS (SQL, Data Step) • MATLAB
• Linear Programming • Simio
• ArcGIS • @Risk
• Microsoft Access, Excel, Project • Webscraping

Volunteering
Northwestern University, NSFP
Orientation Assistant, September 2017
Northwestern University, Class Gift Committee
Executive Board Member, January-June 2017
Cary-Grove High School Choir Department
Audition Assistant, Fall 2013-Winter 2016
Cary-Grove High School Tennis Camp
Instructor/Coach, Summer 2013, 2014
Disney’s Ultimate EnginEARing Exploration (DUEE)
Industrial Engineering Representative, Summer 2015

Interests
• Discovering Music • Public Speaking
• Data Visualization • Teaching
• Tennis, Volleyball, Ultimate Frisbee • Freehand Sketching
Jingwei (Will) Song

willsong@u.northwestern.edu | (310) 910-4205


EDUCATION
Northwestern University, Evanston, IL December 2018 (Expected)
Master of Science in Analytics (MSiA) GPA: 3.8
• Relevant Courses: Predictive Analytics, Database System, Data Mining, Java & Python Programming,
Analytics for Big Data, Deep Learning, Data Visualization, Text Analytics (upcoming)

University of California Los Angeles, Los Angeles, CA June 2017


Bachelor of Science in Statistics GPA: 3.9
• Relevant Courses: Intro to Probability, Mathematical Statistics, Linear Models, Computational Statistics with R,
Markov Chain Monte Carlo Methods, Intro to Programming in C++

SKILLS
Programming Languages: R, Python, Java, SAS, SQL, HTML, C++
Big Data Analytics: Hadoop, Spark, AWS, D3.js, Tableau

PROFESSIONAL EXPERIENCE
TransUnion, Chicago, IL June 2018 – August 2018
Insurance Analytics Intern
• Conducted independent research on the relationship between daily events and public interest in the insurance
industry by applying XGBoost and Random Forest models
• Utilized NLP techniques, such as sentiment analysis, n-grams and TF-IDF, to perform feature engineering using news
and trends data scraped from major media websites and Google API in R
• Reproduced key functionalities of a costly commercial modeling software for insurance pricing into an interactive R
Shiny app to improve global communication within TransUnion’s insurance realm
• Developed documented R functions that evaluate policyholder risks from existing SAS codes
Doodod Technology, Beijing, China July 2017 – August 2017
Data Scientist Intern
• Built web scraping templates with Python to extract millions of data points every day from China’s 10+
mainstream video websites and ticketing platforms for further analysis on China’s film market
• Screened and gathered data from major social media platforms like Weibo with Doodod’s customized tools and
further cleaned them in MySQL
• Analyzed aggregated datasets from Weibo in a used-car market analysis to boost companies’ brand exposure on
social media platforms by identifying relevant key opinion leaders

PROJECT EXPERIENCE
Yelp Data Visualization Project, Northwestern University April 2018 – June 2018
• Built an interactive dashboard in D3.js to display local catering sites in Las Vegas in detail using graphics such as
geo map, heat map, bubble chart and line chart
• Developed two interfaces – one with consumer insights and competitive landscape and the other with rating
patterns – for local businesses and potential customers respectively
Used Car Price Prediction Web App Development Project, Northwestern University February 2018 – March 2018
• Trained a linear regression model to predict future used car prices based on Kaggle’s used car transaction data
from the European sales market
• Built a user-friendly Flask web app in Python to interact with the model and deployed the app with EC2 instance
on Amazon Web Services
Valued Customers Predictive Analytics Project, Northwestern University November 2017 – December 2017
• Created a two-step model combining multiple linear regression and logistic regression models to identify
customers who were likely to respond to a mailed catalog with high dollar purchase
• Improved model performance and prediction results by conducting data cleansing, variable transformation and
model validation

INTERESTS AND ACTIVITIES


Languages: English, Chinese, French
Extracurricular activity: Chinese Student A Cappella Club at UCLA

CHRISTA SPIETH
MASTER OF SCIENCE IN ANALYTICS STUDENT

CONTACT
Christa Spieth
715 316 1767
christa.spieth@gmail.com
linkedin.com/in/christa-spieth

EDUCATION
Present Master of Science in Analytics
Northwestern University
Graduating December 2018
2013-2017 Mechanical Engineering & Mathematical Studies
Minor in Computing
Andrews University
Graduated May 2017
Magna Cum Laude

SKILLS
R
Python
SQL
Tableau & MicroStrategy
Java
Jenkins

RELEVANT EXPERIENCE
Summer 2018 Enova International
Fraud Analytics Consultant
Created a SQL function to describe the lifecycle of loan applications –
information that would entail long, complex queries across analytics
teams. Designed updateable MicroStrategy visualizations for
management’s product reports. Organized data study to identify a
suitable vendor for account takeover predictions. Migrated and
streamlined fraud alerts for operations within Jenkins.

Spring 2018 Synchrony Financial
Data Science Consultant
Developed a chatbot proof of concept in Python, capable of common
interactions in an inbound customer service call. With a predictive
model-based chatbot that identifies customer intent, recurrent issues
could easily be tracked and collected as structured data to produce
business insights to show where improvements could be made.

Summer 2016 Texas Tech University
Knowledge Representation REU Researcher
Proposed standardized method of converting clinical practice
guidelines into declarative programming-based algorithms. Developed
and implemented thyroid nodule algorithms while collaborating with
PhD and medical school students, physicians, and professors.

2015-2016 Andrews University
Mathematics Student Researcher
Collaborated with biology researcher on the subject of Antillean
manatees. Utilized MATLAB to find a deterministic mathematical model
for weight as a function of standard morphometric measurements.
Presented research at university research conference open to public
and at Michigan Academy of Science, Arts, and Letters.

HONORS
Enova Data Smackdown, First Place
ABC Supply, Second Place
President’s Full Tuition Scholarship
National Merit Finalist
Undergraduate Research Scholarship
Phi Kappa Phi Honor Society
Pi Mu Epsilon Honor Society
Engineering Excellence
Who’s Who Among Students

RELATED COURSEWORK
Analytics for Big Data
Analytics Value Chain
Business Communication
Data Management for BI
Data Mining
Data Visualization
Databases and Information Retrieval
Deep Learning
Predictive Analytics
Penny (Mengyu) Sun
765.491.0208 | mengyusun2018@u.northwestern.edu

EDUCATION
Northwestern University Evanston, IL
Master of Science in Analytics Sep 2017- Dec 2018
Relevant Courses: Predictive Analytics, Data Mining, Big Data, Text Analytics, Deep Learning, Data Visualization, Machine
Learning Model Deployment, Databases and Data Warehouse, Social Network Analysis
Imperial College London London, UK
Master of Finance Aug 2012 – Aug 2013
Peking University Beijing, China
Bachelor of Economics Freshman Scholarship, Li & Fung Scholarship Sep 2008 – Jul 2012

TECHNICAL SKILLS:
Languages/Tools: Python, R, Java, Spark, Hive, SQL, Git, Bash, AWS, Azure, Flask, Tableau, JavaScript (D3, Leaflet)
Python Libraries: Pandas, Scikit-learn, TensorFlow, Keras, PySpark, Seaborn, NLTK, CRON

WORK EXPERIENCE
OPEX Analytics Evanston, IL
Data Science Intern Jun 2018- Aug 2018
• Designed and developed inventory risk assessment tools for 8 regional Supply Planning teams of the world’s largest
consumer goods company, which are expected to improve workflow operational efficiency by 80%
• Automated daily data scraping process from 2 multi-dimensional data sources, created interactive geospatial
dashboard to visualize network inventory supply and actionable insights utilizing optimization algorithm
• Performed root cause analysis for current multi-stop routing solution and improved on cost saving prediction model
by introducing billing, fuel and carrier acceptance data
Synchrony Financial Practicum Chicago, IL
Data Consultant Intern Oct 2017- Jun 2018
• Created a chatbot prototype in Python to handle customer service tasks utilizing natural language processing (stop
words, stemming, word2vec) and Naïve Bayes machine learning model on credit card customers’ e-chat data
• Presented at Synchrony Monthly Townhall and was highly praised by the Chief Information Office
KPMG, LLP London, UK
Assistant Manager Sep 2013 – Jan 2017
• ACA qualified accountant (ICAEW) specialized in financial services companies, recipient of KPMG Encore rewards
• Managed multiple engagements and coordinated integrated audit efforts for medium-size, multi-location teams

PROJECTS & AWARDS


Flower Species Detection Web App Apr 2018 - Jun 2018
• Developed a flower image classification web app which implements Xception model and convolutional neural network
• Deployed front-end web infrastructure with Flask hosted on AWS Elastic Beanstalk and RDS

Venmo Transactions Unsupervised Learning Project June 2018


• Processed 7M+ transaction comments in PySpark and identified the most popular topics using TF-IDF with a Latent
Dirichlet Allocation model and a Gaussian Mixture Model; drew meaningful insights from both models
Rubikloud Case Competition (2nd/39 teams with cash prizes) May 2018
• Identified promotion strategies for retailers by engineering novel features, clustering customers, computing segment
transition matrix and customer lifetime value, and predicting segment improvement with random forest model

ShopRunner Recommender Project Jan 2018 – Mar 2018


• Developed NLP-based collaborative filtering brand recommendation engine to engage users and retailer network
YIWEI (PHYLLIS) SUN
Evanston, IL | (617) 838-6926 | yiweisun2018@u.northwestern.edu
EDUCATION
Northwestern University Evanston, IL
M.S. in Analytics (MSiA) 09/2017 – 12/2018
• Overall GPA: 3.85/4.00
• Coursework: Big Data, Data Mining & Machine Learning, Predictive Analytics, Deep Learning, Database System, Text
Analytics, Business Intelligence, Time Series, Value Chain (A/B Testing), Data Visualization, and Bayesian Data Analysis
Boston University Boston, MA
B.A. and M.A. in Statistics with Minor in Computer Science 09/2012 – 01/2017
• Overall GPA: 3.68/4.00, Master GPA: 3.88/4.00, cum laude
• Leadership: Teaching Assistant in Statistics I & II; Research Assistant; Treasurer, Mathematical Association of America

SKILLS
• Programming: R, Python, SQL, Spark, Hadoop/MapReduce, Hive, Java, JavaScript, HTML/CSS, SAS
• Software: Unix, Git, AWS, Tableau, Salesforce, Bloomberg Terminal, Google Analytics, Qlikview, InfoSource
WORK EXPERIENCE
Chicago Mercantile Exchange (CME Group) Chicago, IL
Business Intelligence Intern 06/2018 – Present
• Create decision rules in SQL through volume trend analysis by K-means clustering and Random Forest in R to classify
client trading accounts to improve efficiency of customer segmentation for sales and marketing teams
• Develop Quarterly Liquidity Deck Report for all offices in APAC, EMEA and North America for customer reference
• Visualize average daily volume and revenue for Ad-hoc analysis to generate business opportunities in a timely manner
• Improve pricing and sales data quality in Hadoop and Oracle Database and design test for new liquidity tool
• Perform correlation analysis for futures returns across top performing futures in R on price data through Bloomberg API
Chicago Park District (CPD) Chicago, IL
Data Science Consultant (Industry Practicum) 09/2017 – 06/2018
• Visualized enrollee, discount, and park information datasets (510K+ rows) as an interactive dashboard in Tableau and
channeled external demographic datasets in Python to identify key day camps’ trends at park level
• Updated CPD’s summer camp prices for over 600 parks by developing Random Forest, Gradient Boosting and Decision
Tree to improve participation rates of the camps while minimizing the additional overhead required to process the discounts
Scitics Inc. Acton, MA
Data Analytics Intern 12/2016 – 05/2017
• Conducted exploratory data analysis and variable selection on membership and national event data (30K+ rows) for
American Marketing Association (AMA)
• Predicted customer renewal probability using logistic regression and validated model through concordance statistic (AUC)
to help retention team develop marketing plan

RESEARCH PROJECTS
Google Landmark Recognition 04/2018 – 06/2018
• Produced bootstrap images to ensure 1000 training and 100 testing images for top 100 most frequent landmarks
• Predicted landmark labels with 93% accuracy through training the Xception model with fine tuning and transfer learning by
adding two fully connected layers
Toxic Comment Classification Web App 01/2018 – 04/2018
• Conducted sentiment analysis to classify toxic comments from Wikipedia by a logistic classification model with 84.3%
accuracy and deployed the interactive Flask Web App on AWS
ShopRunner Repurchase Prediction Analysis 01/2018 – 04/2018
• Developed marketing strategies to improve retention rate with network effect analysis on the top 15 retailers’ data in R
PUBLICATION
F. Fang, Y. Sun and K. Spiliopoulos, “The Effect of Heterogeneity on Flocking Behavior and Systemic Risk”, Statistics and
Risk Modelling, Vol. 34, No. 3-4, (2017), pp. 141-155
• Investigated the default activities for banking agents based on their mean-reversion rates and volatilities in heterogeneous
mean-field interacting coupled diffusions using Monte Carlo Simulation to help stabilize the financial system
SAURABH TRIPATHI +1(312)-273-7157
saurabhtripathi2018@u.northwestern.edu

EDUCATION Master of Science - Analytics • Northwestern University


Evanston, Illinois • Dec 2018
Bachelor of Technology - Mechanical Engineering • Indian Institute of
Technology (IIT)
Varanasi, India • May 2012

SKILLS Languages: Python, R, Scala, Bash, C#, JavaScript, jQuery, HTML5, CSS3, T-SQL, git-bash
Libraries: Tensorflow, Keras, SciKit-Learn, NLTK, Gensim, Scrapy, Statsmodels, NumPy, PySpark, SciPy,
Pandas, matplotlib, seaborn, D3, Caret, randomForest, rpart, nnet
Data Science Coursework: Predictive Analytics, Data Mining, Deep Learning, Text Analytics,
Analytics of Big Data, Data Visualization, Analytical Consulting.
Databases: SQL Server 2008 R2, PostgreSQL, Hadoop, Spark, Hive

WORK HISTORY Data Science Intern • GoDaddy


Tempe, AZ • June 2018 to Current
Built a highly configurable and intuitive end-to-end machine learning pipeline,
capable of running and tuning various supervised learning/deep learning
algorithms on the fly. The pipeline served as a one-click solution for
benchmarking and identifying an apt algorithm for a dataset.
Built a classification model to predict intent of customer support call.
Data Science Consultant • Synchrony Financial
Chicago, IL • January 2018 to June 2018
Built a chatbot capable of basic interactions that are commonplace in an
inbound customer service call in the banking industry
Full Stack Developer • Seven Lakes Technologies
Bangalore, India • December 2015 to July 2017
Developed an intelligent dynamic routing solution which directs the field
personnel to the highest priority assets to maximize production efficiency of
the oil fields under them.
Led a team of UI developers to overhaul the front-end architecture to
optimize CPU usage and reduce memory footprint of the organization’s data
visualization product.
Technology Analyst • Infosys Technologies
Mysore, India • August 2012 to December 2015
Developed a solution for analyzing risk and value associated with any
opportunity (oil well) based on conventional analysis techniques.
Created and implemented database schema and architecture for multiple
projects.

ACADEMIC PROJECTS Built a deep learning model to predict the genre of a movie based on its poster.
Built a recommendation engine for ranking vendors for an e-commerce platform.
Built a predictive model to capture customer purchase response to a catalog
mailing.

ACCOMPLISHMENTS Awarded Most Valuable Employee at Infosys Technologies for two consecutive
years in 2014, 2015.
Ziwen (Vincent) Wang
ziwenwang2018@u.northwestern.edu | (425) 208-1748 | Los Angeles, CA 90034
LinkedIn: https://www.linkedin.com/in/ziwen-wang/ | GitHub: https://github.com/vincent9514 | Tableau: https://public.tableau.com/profile/vincent.wang1896#!/

EDUCATION
Northwestern University, Master of Science in Analytics, GPA: 3.93/4.00 Evanston, IL, Dec 2018
• Coursework: Predictive Analytics I&II, Deep Learning, Data Mining, Analytics for Big Data, Data Visualization, A/B
Testing, Text Analytics, Databases & Information Retrieval

The Hong Kong Polytechnic University, BEng in Industrial and Systems Engineering, GPA: 3.78/4.00 Hong Kong, May 2017
• Coursework: Operations Research, Object-oriented Programming, Modeling and Simulation, Calculus I-III
TECHNICAL SKILLS & LANGUAGES
• Programming: Python, R, SQL, Java, JavaScript (D3.js, Node.js), HTML/CSS, C++, Bash
• Software: Tableau, PySpark, MapReduce, Hadoop, Hive, AWS (EC2, Beanstalk, EMR, S3), Git, Tensorflow, Dialogflow
• Certifications: Associate Certified Analytics Professional (INFORMS), Lean Six Sigma Green Belt (IISE)
• Languages: Fluent in English, Mandarin, and Cantonese
PROFESSIONAL EXPERIENCE
KPMG US LLP Chicago, IL
Data Science Intern, Artificial Intelligence Jun 2018 – Aug 2018
• Launched a Chatbot automated development pipeline with functions including query variations generation (Tensorflow),
real-time query simplification (OpenNMT), and testing automation (Node.js), improving Chatbot intent detection accuracy by 32%
• Trained sentential paraphrase generation and simplification models using LSTM seq2seq on 70 M+ paraphrase pairs
• Developed a production-ready HR Chatbot with 200+ intents using NLP and Machine Learning deployed on Google Dialogflow
• Maintained an internal signals repository and scoring engine on AWS, providing external data for modeling purposes and
offering solutions including customer retention and demand planning for Fortune 5 telecom company on a subscription basis
BP North America Chicago, IL
Data Science Practicum Consultant Sep 2017 – May 2018
• Conducted customer segmentation analysis (GMM, K-Means, PCA) on 10M+ transaction records, accounting for purchasing
behavior, seasonality, demography, and geography data in Python; created end-user Tableau dashboards for marketing teams
• Built a dynamic customer lifetime value (CLV) model to track customer migration and optimize targeted marketing decisions
Digital Creativity Lab Chicago, IL
Research Scientist Nov 2017 – May 2018
• Created a production-ready recommender system applying unsupervised learning (K-means, GMM, Archetypal Analysis) in
Python on 16M+ matches’ data from Destiny II, a massively multi-player online game (MMOG) by Bungie Studio
• Constructed player profiles and team representations based on equipment preference, player playstyle, and character preference
• Implemented KNN to recommend teams and players with similar playstyles but higher performance or faster improvement rate
• Submitted paper to AIIDE 2018: Artificial Intelligence and Interactive Digital Entertainment Conference
Audi, Innovation Research Beijing, China
Data Analytics Intern May 2017 – Aug 2017
• Delivered Price-Sensitivity Meter models on 60K+ rows of customer feedback to determine the optimal price ranges
• Conducted Latent Factor Analysis and Principal Component Analysis (PCA) to identify the drivers of product premium
PROJECTS
Big Data Analytics | Venmo Text Mining and Network Analysis Jan 2018 – May 2018
• Extracted and classified emoji from 7M+ Venmo transactions using PySpark and RegEx to analyze their popularity
• Conducted Venmo’s network analysis by exploring in-degrees and out-degrees, as well as reciprocal relationships using RDD
Advanced Data Visualization | Chicago Botanic Garden Jan 2018 – May 2018
• Created a visualization dashboard using D3.js and Tableau to map the network of 300+ endangered plant species in the Midwest
Deep Learning | Object Detection and Segmentation for Autonomous Driving Jan 2018 – Apr 2018
• Implemented Mask R-CNN model to segment road objects in images (93G) provided by CVPR Autonomous Driving conference
• Generated evaluation scripts based on pixel-wise IOU (Intersection over Union) and achieved 50%+ accuracy for testing dataset
Full-Stack Web App Development | Flask Framework and AWS Deployment Jan 2018 – Apr 2018
• Built a web app with Flask, HTML, and JS deployed on AWS Beanstalk and EC2 to predict FIFA soccer player transfer value
• Evaluated 11 different supervised learning models including Lasso, Ridge, GAM, Random Forest, Gradient Boosting, Neural
Network, XGB, etc. on 18K soccer players records with 40+ attributes stored in Amazon RDS PostgreSQL database
LOGAN WILSON
(760)-450-4934 / loganwilson@u.northwestern.edu / github.com/lwilson18

EDUCATION
Northwestern University, Evanston, IL Dec. 2018 (Expected)
Master of Science in Analytics – GPA: 3.88
Relevant Coursework: Big Data, Predictive Analytics, Data Mining, Data Visualization, Social Network
Analysis, Text Analytics, Database Management, Deep Learning, Data Warehousing
Washington and Lee University, Lexington, VA May 2017
Bachelor of Science in Mathematics and Engineering with Minor in Music – GPA: 3.72 (Cum Laude)

TECHNICAL SKILLS
• Languages: Python, R, SQL
• Libraries: Spark, NLTK, Pandas, Scikit-learn, Flask, H2O, D3, spaCy
• Tools: Amazon Web Services, Hadoop, BigQuery, Excel

RELEVANT EXPERIENCE
BuzzFeed, New York, NY June 2018 – Aug. 2018
Data Science Intern
• Developed algorithm to identify trending topics in the news by clustering and labeling content
through natural language processing and ranking according to pageview trends
• Collaborated with stakeholders to build product solution connecting data pipelines from internal data
sources to implement algorithm in real-time as a RESTful API with caching
• Pitched tool to editors to explain underlying algorithm and integrate into news curation workflow to
influence content seen by hundreds of thousands of users daily
DJ Random Forest – Song Recommendations through Machine Learning Nov. 2017 – Feb. 2018
Independent Research Project - http://djrandomforest.us-east-1.elasticbeanstalk.com/
• Aggregated audio features for 80,000 songs and over 8,000 artists with Spotify Web API
• Tuned and cross-validated random forest model for predicting song preferences from user ratings
• Developed Flask web application hosted on AWS implementing model to make song
recommendations in real-time and visualize personalized audio profiles
• Built and maintained database of song ratings to analyze trends in musical preferences

OTHER EXPERIENCE
Zurich North America, Schaumburg, IL Sept. 2017 – June 2018
Analytics Consultant
• Evaluated accuracy, execution, and business impact of workers compensation pricing tool
• Leveraged deep learning in H2O to improve pricing tool accuracy by 10%
• Profiled and segmented over 90,000 companies by risk level to identify high-value customers
Twine Analytics, San Diego, CA Aug. 2017 – Sept. 2017
Data Engineer Intern
• Developed analytics platform for aggregating and processing biopharmaceutical data
• Implemented scalable Spark modules to perform ETL on data from over 200,000 clinical trials
MyEyeDr., Vienna, VA June 2017 – Aug. 2017
Marketing Analytics Intern
• Designed data-driven pricing strategy to streamline customer purchases of eyewear products
• Supported CRM initiatives through analysis of customer spending patterns

AWARDS AND ACTIVITIES


• Johnson Scholar – Four-year full merit-based scholarship to Washington and Lee University
• Analytics Council – Industry Chair – Student organization for planning analytics networking events
• Musicians on Call – Volunteer Musician – Nonprofit bringing live music to healthcare facilities
• Keynotes – Assistant Music Director – Northwestern University’s graduate student a cappella group
Hao Xiao
847-644-2536 | haoxiao2018@u.northwestern.edu
EDUCATION
Northwestern University, Evanston, IL Sep 2017 - Dec 2018 (Expected)
 Master of Science in Analytics (GPA 3.90/4.00)
 Core courses: Predictive Analytics, Data Mining, Deep Learning, Big Data, Text Analytics, A/B Testing
Peking University, Beijing, China Sep 2012 - Jul 2017
 BS in Urban Planning, B.A. in Economics
 Awards: 2013 Merit Award (top 20%), 2014 TC Scholarship (top 10%), 2015 CASC Scholarship (top 5%)
SKILLS
 Python, R, SQL, Java, C++, JavaScript/HTML/D3, Hadoop/MapReduce, Spark, Hive, Tableau, Git, AWS
WORK EXPERIENCE
TransUnion Chicago, IL
Data Science Intern Jun 2018 - Sep 2018
 Designed and conducted research on big data GLM algorithm implementation in R (Spark, H2O, etc.) and proposed
feasible solutions to help Insurance Analytics team transfer modeling platform from Emblem to R
 Developed an internal R package and an interactive R Shiny App to assist in insurance modeling and visualization
 Built GLM and GBM to predict auto insurance risk; elevated the performance of the previous model by 1.1%
 Further improved prediction performance by 4.6% by combining GBM and Recurrent Neural Network
Fraud Analytic Graduate Consultant Nov 2017 - Jun 2018
 Created graphs of identity sharing between customers and built xgboost models to detect synthetic identity fraud
 Automated time-consuming graph feature creation process by developing a graph-kernel based Convolutional Neural
Network model; elevated the performance of the original productionized model by 25%
ShopRunner Chicago, IL
Data Science Graduate Consultant Jan 2018 - Jun 2018
 Evaluated retailers’ network value by developing metrics with Principal Component Analysis and segmenting on
network features using Gaussian Mixture Model
 Provided key insights and possible actions on defined segments to facilitate cross-sell and network growth
 Developed an interactive dashboard incorporating KPI and segmentation analysis in D3 for ShopRunner's internal
use; deployed sample application on Amazon Web Services [GitHub link]
China Sustainable Transportation Center Beijing, China
Spatial Data Science Intern Jul 2016 - Aug 2016
 Developed an end-to-end road congestion data analysis pipeline including open-source data collection and
processing, metric calculation and visualization; utilized Python, JavaScript, and ArcGIS
PROJECTS
Venmo Transaction Comments Classification May 2018 - Jun 2018
Big Data course project Evanston, IL
 Identified emoji popularity patterns over 7M+ Venmo transaction comments utilizing RDD and Spark data frame
 Developed metrics of comments to help comment classification and information extraction
Video Activity Classification May 2018 - Jun 2018
Deep Learning course project Evanston, IL
 Developed deep learning models (CNN + LSTM, 3D CNN) to classify 101 human activities in videos
 Created a video processing pipeline adding dynamically-changing activity class tags to videos in Python
Customer Lifetime Value Analysis Apr 2018 - May 2018
Runner-up of 2018 Analytics By Design Competition Toronto, Canada
 Identified customer value increase opportunities by engineering novel purchase pattern features, segmenting
customers and building revenue prediction models on each segment
Wenjing (Karen) Yang
wenjingyang2018@u.northwestern.edu | 404-632-3558

EDUCATION
Northwestern University, Evanston, IL Expected December 2018
M.S. in Analytics (MSiA), 3.7/4.0
Relevant Coursework: Predictive Analytics, Data Mining, Deep Learning, Text Analytics, Java & Python Programming, Data
Visualization, Big Data Analytics, Data Warehousing, Databases, Recommender Systems
Emory University, Atlanta, GA May 2017
B.S. in Applied Mathematics, double major in Economics, 3.9/4.0
Honors: High Honors in Economics, Dean’s list (6 semesters), Phi Beta Kappa Honor Society

SKILLS
Programming: Python (pandas, sklearn, seaborn, nordypy), R, Java, SQL, JavaScript (D3.js), HTML/CSS
Tools & Systems: Hadoop, MapReduce, Spark, Hive, AWS (EC2, S3, Redshift, EMR), Git, Tableau, Bash, Ubuntu, Stata

EXPERIENCE
Nordstrom, Seattle, WA June 2018 - August 2018
Data Science Intern | Grab-and-Run (GnR) Exploratory Analysis and Predictive Modeling
• Led exploratory data analysis and machine learning methodologies of the GnR project, and organized weekly meetings
with Nordstrom loss prevention department
• Manipulated 10M+ data records stored on SQL server and AWS Redshift database using Nordypy in EC2 Ubuntu system
• Investigated time-series and geographical patterns of GnR occurrences from 2010 to 2018 with Python and SQL queries
• Experimented with machine learning algorithms to predict expected money loss of future GnR incidents, and to study occurrence
triggers to provide strategic advice on store organizations and employee deployments
Synchrony Financial, Chicago, IL October 2017 - May 2018
Data Science Practicum Consultant | Chatbot Development
• Developed a chatbot prototype capable of common interactions in an inbound customer service call by applying Naïve-
Bayes classification model on simulated dataset using Python (Phase I)
• Boosted model performance in terms of classification rate from 85.3% to 94.5% by incorporating eChat data with data
manipulations of upsampling and downsampling, and text analytics methods of stopwords and stemming (Phase II)
ShopRunner, Chicago, IL February 2018 - June 2018
Graduate Student Analytics Consultant | Network Segmentation and Visualization
• Constructed retailer segmentation model with network features using PCA and Gaussian Mixture models (GMM) in R
• Profiled 140+ active retailers in ShopRunner's network to facilitate cross-sell and improve network completeness
• Designed a user interactive dashboard for visualizing dynamics of network clusters and changes of retailer performances
with D3.js and HTML/CSS for internal use as strategic insights

PROJECTS
Venmo Transaction Big Data Project, Northwestern University, Evanston, IL April 2018 - May 2018
• Classified emoji popularity patterns in various time frames using RDD and Spark data frame
• Analyzed Venmo network by visualizing in-degrees, out-degrees and reciprocal relationships on 7M+ transactions using
SparkSQL and PySpark
Movie Recommender System Project, Northwestern University, Evanston, IL January 2018 - March 2018
• Implemented a content-based recommender system with text analytics by NLTK library in Python
• Deployed the model as web application on AWS Beanstalk using Flask library to bridge Python and HTML
Tong Yin
(310) 745-4851 · tongyin2018@u.northwestern.edu · www.linkedin.com/in/tongyin10

EDUCATION
Northwestern University | Evanston, IL
M.S. in Analytics Expected December 2018
•Current GPA 3.90/4.00
•Coursework: Predictive Analytics, Machine Learning, Deep Learning, Big Data Analytics, Database Design
& Information Retrieval, A/B Testing, Data Mining, Data Visualization, Optimization, Text Analytics, NLP
University of California, Los Angeles | Los Angeles, CA
B.S. in Financial Actuarial Mathematics with Minor in French September 2013 – June 2017
•Cumulative GPA 3.92/4.00
•Honors: Summa Cum Laude, Phi Beta Kappa Honor Society, Dean’s Honors List
•Actuarial Exams Passed: Probability (01/2016); Financial Mathematics (08/2016)

TECHNICAL SKILLS
Programming Techniques: SQL, R, Python, Hive, Hadoop, Spark, JavaScript (D3.js), C++, Java
Tools: Tableau, AWS (RDS, EC2, Elastic Beanstalk), Git, Flask, Adobe Analytics, Omniture, Excel, PowerPoint

PROFESSIONAL EXPERIENCE
Expedia Group|Chicago, IL
Product Analytics Intern June 2018 – August 2018
•Assessed the incremental value and self-selection bias of Favorite/Shortlist feature in predicting customer
quality using Hive and adapting Random Forest Classifier and Logistic Regression with Python scikit-learn
• Constructed data pipelines from 40M+ data points and examined flight customers' cross-device
shopping patterns through the segmentation of various shopping attributes using Hive and Python

Principal Financial Group|Practicum


Student Analytics Consultant October 2017 – June 2018
•Evaluated and simulated the current Random Forest forecast model on factor returns on a financial risk
factor dataset with 200+ feature variables and 100K+ data points with R
•Researched and implemented Regime Switching model to enhance factor-based investing strategy

California Department of Insurance|Rate Regulation Branch, Los Angeles, CA


Student Analytics Assistant June 2016 – August 2016
•Predicted marketing channels for insurance companies by building an Ensemble model with
Bootstrap Aggregation using R
•Visualized mismatches and helped the branch target 20+ companies out of compliance with regulations

PROJECT WORK
Multivariate Time Series Prediction of User Behavior with Amplero September 2018 – Present
• Capture the longitudinal behavior of mobile phone users along various dimensions by applying state-space
models and optimize marketing decisions with Python

Customer Segmentation & Visualization Project with ShopRunner January 2018 – June 2018
•Segmented and analyzed 140+ retailers from 5M+ transaction data in ShopRunner’s network using Principal
Component Analysis (PCA) and Gaussian Mixture clustering model (GMM) with R
•Developed an interactive KPI dashboard with D3.js that enabled ShopRunner to visualize its retailer network

H1B Petition Status Prediction Web Application January 2018 – April 2018
• Engineered features from raw case certification data and forecasted H1B petition case status with Python
• Designed and deployed a web application by incorporating Flask, HTML, and AWS components
Ethel Shiqi Zhang
shiqizhang2018@u.northwestern.edu | +1 (213) 550-6743
https://github.com/0ethel0zhang

TECHNICAL SKILLS
• Languages: SQL, Python, R, Java, JavaScript, Spark, Hadoop, Tensorflow, Sklearn, MDX, GraphDB
• Software: Tableau, Spotfire, PostgreSQL, Git, Google Analytics, AWS, Microsoft Office Suite, Matlab
EDUCATION
Northwestern University Evanston, IL
Master of Science in Analytics, School of Engineering Sep 2017 – Dec 2018
• Relevant Coursework: Predictive Analytics, Python and Java Programming, Database and SQL, Data
Mining, A/B Testing, Data Visualization, Deep Learning, Text Analytics, Optimization
University of Southern California Los Angeles, CA
Bachelor of Science in Business Administration, Concentration in Finance Aug 2011 – May 2015
• GPA: 3.88, GMAT: 760 (99th percentile), Marshall Honor Student (1% of student body)
PROFESSIONAL EXPERIENCE
Lazard Frères & Co., LLC New York, NY
Data Science Intern Jun 2018 – Present
• Built GMM clustering and comparables-finding models to enable bankers to perform benchmarking
analysis; the models are at production-level and integrated into a company-wide data analysis tool
• Explored the relationship between financial metrics and stock price using Boosted Tree (AdaBoost) and
Random Forest models in Python to help C-suite executives make sound business decisions
• Prepared a data pipeline to clean, combine, and transform billions of rows of transaction data in Spark
Shell Game Venture Los Angeles, CA
Co-founder and Data Science Lead Jul 2016 – Jul 2017
• Co-founded a business venture that optimizes the return of select equity and alternative investments
• Utilized Excel, R, Python and SQL to create a local web application and optimization models that
automate the data warehousing and performance evaluation processes for select portfolio in a team of 5
Ernst & Young, LLP (EY) Irvine, CA
Project Management Consultant Jul 2015 – Jul 2016
• Increased internal sales win percentage by 20% through big data analytics utilizing Spotfire and Tableau
• Expedited the procurement and the implementation of a software through benchmarking analysis
• Developed a project management plan for the client’s 70+ breakthrough initiatives by setting up a
PMO equipped with project management and visualization tools such as reporting dashboards
• Led the incubation of EY Presents and organized 20 members to practice public speaking monthly
Morgan Stanley Beverly Hills, CA
Global Wealth Management Intern Aug 2013 – Jun 2014
• Piloted a marketing project that was projected to increase the team’s asset value by 1%
• Managed relationships with 30 plus high-net-worth clients daily through phone calls and emails
DATA SCIENCE PROJECTS
Chicago Botanic Garden (Joint Project with IBM Analytics) Jan 2018 – Jun 2018
• Identify clusters for client’s 1 million donors in Python to analyze gifting patterns and upselling trends
• Visualize the clusters on Tableau and present upselling recommendations to the EVP along with IBM
Chicago Park District (CPD) – Day Camp Price Modeling Sep 2017 – Jun 2018
• Reengineered the pricing model for Day Care program using machine learning methods such as neural
network, boosted tree, K-nearest neighbors, and factor analysis in R and built an interactive dashboard
Music Recommender Web Application Development Project Jan 2018 – Apr 2018
• Built a recommender system using Flask with communications to Spotify API and dynamically read
user inputs to present the recommendations in a web application hosted by AWS and RDS
SKILLS, AWARDS & INTERESTS
• Languages: fluent in Mandarin Chinese (written and spoken), conversational in French
• Awards: Winner of ’18 NYC Product Tournament, EY Bravo awards for excellent professional services
• Interests: Music, piano, traveling, psychology, board games, yoga, and track (high-school track team)
NumPy, SciPy, pandas, scikit-learn, dplyr, ggplot2, statistics, machine learning, data extraction, data cleaning, data analysis, large datasets, forecasting algorithms, Time-Series Regression, Logistic Regression, Factor Analysis, Excel, PowerPoint, market research
