
MATTHEUS LIM JEN SERN

Tel: +44 7778226740 | Email: mattheus.lim@bayes.city.ac.uk


LinkedIn: https://www.linkedin.com/in/mattheus-lim/ | Website: https://mattlim96.github.io/portfolio/

EDUCATION
Bayes Business School (formerly Cass), London, UK 2021 - 2022
MSc Business Analytics
• Applied Research Project with Rolls-Royce to optimise existing speech-assistive software for MND patients utilising
Stanza NLP pipelines and pre-trained dialogue models from the Facebook (Meta) ParlAI Python framework.
• Ranked #1 in Network Analytics Project on “Regulatory Impact on Ethereum Ecosystem”; selected out of 60 students to
contribute to a research paper about “Role of Regulation on Cryptocurrency Markets”.
• Machine Learning, Deep Learning, Data Visualisation, Applied NLP, Data Management Systems (Python, R, SQL).
University College London (UCL), London, UK 2015 - 2018
BSc Statistics (2:1)
• Achieved 1st Class Honours for dissertation – “Improving the Estimation of the Weight of Drugs Impregnated in Clothing”.

PROFESSIONAL EXPERIENCE
GetHarley (Aesthetic Dermatology Digital Platform), London, UK Nov 2021 - Present
Senior Data Analyst (Part-Time)
• Appointed as Interim Manager of data team to on-board, supervise and coach 2 part-time Data Analysts.
• Developed and deployed Python and SQL scripts to Heroku through GitHub for monthly reports.
• Built data pipelines from Stripe, Calendly, Acuity and Google Calendar APIs to Google Sheets for performance tracking.
• Queried data from PostgreSQL database for ad-hoc reports and data analyses using SQL, Python and Google Sheets.
Shopee (Leading SEA E-Commerce Platform), KL, Malaysia May 2019 - Aug 2021
Data Scientist / Data Analyst (Full Time)
• Awarded highest rating in final performance review (1 of 100 employees); promoted to Senior Associate within 1.6 years.
• Mentored and guided 3 interns from the Shopee Apprentice Program (3 months).
• Extracted keyword trends (2 to 3 n-grams) from products added to cart using Scikit-learn for Hot Products curation.
• Optimised product negotiation process through Google API, preventing price-tagging errors costing c.£15k from recurring.
• Conducted product and voucher countering initiatives by scraping competitors’ data using Python and Google Apps Script.
• Executed special initiatives by Category Managers involving project design, monitoring, project management (managed a
team of ~200 employees), data analysis and performance reporting utilising R and Google Sheets.

ADDITIONAL INFORMATION
Data Science Projects:
Wind Turbines health prediction (RNN and CNN Time Series Prediction in Python – TensorFlow)
• Devised and compared LSTM, GRU and simple sequence-to-vector RNNs using validation accuracy, followed by tuning.
• Implemented 2D-CNN by reshaping time series data; significantly improved training speed and raised accuracy by ~6%.
• Assessed final RNN and CNN via test set accuracy and confusion matrix to select model with minimised monetary cost.
Insurance claim fraud detection with Neural Networks (Autoencoder Anomaly Detection in Python – TensorFlow)
• Created synthetic data via SMOTE oversampling, combined with under-sampling, to deal with imbalanced data prior to training.
• Built FCNN with KerasTuner; chose optimal threshold using custom cost function to balance FN and FP classifications.
• Developed and tuned Autoencoder with non-fraud only data; doubled sensitivity (TPR) and retained > 80% accuracy.
Machine Learning R Shiny app (Decision Tree and Random Forest in R)
• Designed an interactive web app to train and predict a League of Legends match winner using data from the first 10 mins.
• Devised and tuned Decision Tree and Random Forest models for classification using Caret and Rpart libraries in R.
• Visualised Decision Tree and Random Forest Variable Importance to identify key features of a winning team.
• Evaluated performance of classification models through overall accuracy, precision and recall metrics.
Customer segmentation and RFM (K-Means Clustering in R)
• Applied Independent Component Analysis to reduce dimension of customer data from Kaggle.
• Utilised recency, frequency, and monetary metrics to elucidate customer segments generated from k-means clustering.
• Produced 4 segments with differing characteristics and proposed marketing strategies to target each segment.
Effects of Covid-19 on UK business foundation patterns (EDA and Visualisation in Python)
• Analysed Companies House data of businesses and sectors within a 12-month period before and after UK lockdown.
• Plotted time series and pie-of-pie charts of incorporated UK firms with Matplotlib to explore the general trend.
• Ranked sectors in London with a diverging stacked bar chart to elucidate top-performing sectors after lockdown.
Influence of Elon Musk’s tweets on Dogecoin (Web scraping, EDA, and Visualisation in Python)
• Scraped Elon Musk’s tweets with Selenium and fetched Dogecoin trading data from Binance API via CCXT Python library.
• Tested for significant differences in price change, trading volume, volatility, and max price change of Dogecoin 15 mins before
and after Elon Musk’s Dogecoin tweets using t-tests from the Statsmodels Python library to capture market reaction.
Professional Qualifications: Blockchain Certified Associate Developer
IT Skills: Python (Pandas, NumPy, Matplotlib, Scikit-learn, TensorFlow, PyTorch, NLTK, Selenium, BeautifulSoup), R, SQL.
Languages: English (Native), Mandarin (Conversational), Malay (Conversational), Cantonese (Beginner), Spanish (Beginner).
Achievements: Joined Malaysian MENSA Society with 168 score on the Raven’s Advanced Progressive Matrices Scale.
Interests: Web3, technology, programming, dancing, singing and DJ-ing.
