Rahul Sonti

Data Analyst
Santa Clara, CA

Overview

3 Certifications
6 years of post-secondary education
1 year of professional experience

Work History

Graduate Research Assistant

George Mason University, VA
08.2020 - 12.2020
  • Consolidated Transparency Report data for multiple tiers of organizations, including Twitter, Google, and Facebook, with tiers based on each organization's scale.
  • Developed Tableau visualizations and reports that surfaced the data insights needed for classification and prediction models.

Machine Learning Engineering Intern

Token Metrics Inc, Virginia
05.2020 - 08.2020
  • Generated feature importances using grid search cross-validation on an XGBoost regressor, trained on the organization’s data from 2012-2020, to identify the attributes affecting the opening and closing prices of cryptocurrencies.
  • Leveraged Long Short-Term Memory (LSTM) neural networks to predict the closing prices of 42 cryptocurrencies, including Bitcoin and Ethereum, over the following 3 months with an accuracy of 88.3%.
  • Created interactive visualizations of the attributes affecting closing prices using Tableau, Matplotlib, and Seaborn.

Research Intern

Keshav Memorial Institute of Technology, Hyderabad
05.2018 - 08.2018
  • Built a remote-controlled vehicle running on a DragonBoard 410C microcontroller to capture images at remote locations.
  • Utilized Azure IoT Hub as part of the process.
  • Developed a Windows Store application with a user interface built in C# and XAML for operating the vehicle.
  • Used core IoT libraries in the Windows Store application to transmit messages.

Education

Master of Science - Data Analytics Engineering

George Mason University, Fairfax, VA
01.2019 - 12.2020
  • Graduated with a 3.74 GPA

Bachelor of Technology - Computer Science

JNTU, Hyderabad
08.2014 - 05.2018
  • Graduated with a 3.8 GPA

Programming Languages:

  • Python, R, Java, C#, C

Databases:

  • SQL, PostgreSQL, MongoDB

Visualization:

  • Tableau, Power BI

Open Source Libraries:

  • NLTK, Keras, TensorFlow, scikit-learn, TextBlob, Tweepy, OpenCV

Models Implemented:

  • BERT, LSTM, ANN, Random Forests, XGBoost, AdaBoost, Decision Trees, Perceptron

Certification

Tableau Certified Analyst

Coursework:

  • Machine Learning (Python & Statistics)
  • Analytics Big Data to Info (Python, R, SQL, MongoDB, Hadoop Hive)
  • Analytics/Decision Analysis
  • Principles Data Management / Mining (SQL, MongoDB)
  • Decision and Risk Analysis
  • Visualization for Analytics
  • Metadata Analytics Big Data
  • Info: Represent, Process, Viz
  • Deep Learning

Projects

Sentiment Analysis comparison on two keywords using Twitter API

Nov 2020 - Dec 2020

  • Used the Tweepy library to acquire 1000 tweets for each of two input keywords through the Twitter API.
  • Leveraged the TextBlob NLP library to extract the required information from the tweet text and compute a sentiment score for each tweet.
  • Computed the average sentiment score for each keyword's tweets and ranked them to determine the more popular of the two keywords.

Social Media Disinformation Network

Aug 2020 – Dec 2020

  • Acquired a large dataset (80 GB) from the Twitter Transparency Report and the Twitter Archives, and merged the two using SQL operations so that the final dataset contained no duplicate records.
  • Extracted multiple datasets in unsupported file formats and converted them with the glob and csv packages into a format compatible with the Bidirectional Encoder Representations from Transformers (BERT) classification model.
  • Implemented the BERT pretrained classifier API to develop a disinformation classification model with an accuracy of 96%.

Prediction of Yards Thrown in an NFL Game

Jan 2020 – May 2020

  • Used the BeautifulSoup4 library to scrape data from the NFL Stats website.
  • Used NLTK and regular expressions to parse and tokenize the data and extract only the relevant information.
  • Developed an XGBoost regressor model to predict the number of yards thrown by a quarterback with an accuracy of 82%.
  • Generated average sentiment scores for each of the keywords and ranked them to compare the popularity.

Timeline

Graduate Research Assistant - George Mason University
08.2020 - 12.2020

Tableau Certified Analyst
06.2020
IBM Python for Data Science
06.2020
IBM Machine Learning with Python
06.2020
Machine Learning Engineering Intern - Token Metrics Inc
05.2020 - 08.2020
George Mason University - Master of Science, Data Analytics Engineering
01.2019 - 12.2020
Research Intern - Keshav Memorial Institute of Technology, Hyderabad
05.2018 - 08.2018
JNTU - Bachelor of Technology, Computer Science
08.2014 - 05.2018