Abhishek Amberkar

Jersey City, NJ

Summary

Results-driven graduate with an MS in Machine Learning and over 3 years of experience as a Data Engineer and Data Analyst. Demonstrated expertise in designing, developing, and optimizing data pipelines and warehouses, as well as conducting in-depth data analysis to solve complex business challenges. Proficient in utilizing Apache Spark, Airflow, AWS, SQL, Python, and Tableau to deliver scalable solutions and drive actionable insights. Committed to meeting established timelines and consistently delivering high-quality results.

Overview

7 years of professional experience

Work History

Data Analyst

CodersData, LLC
01.2024 - Current

Data Analysis, Reporting & Machine Learning

  • Performed exploratory data analysis (EDA) using SQL, Excel, and Python, delivering actionable insights that informed business decisions and improved operational efficiency
  • Developed interactive dashboards using Power BI, providing real-time insights on key performance indicators (KPIs), helping stakeholders monitor performance and make data-driven decisions
  • Applied machine learning techniques to identify trends and patterns, improving business forecasting and decision-making processes

Data Pipelines & Automation

  • Built and optimized data pipelines using Azure Data Factory and Azure Data Lake, automating data extraction and transformation processes, and improving reporting efficiency by 25%
  • Automated reporting workflows using Python scripts and Excel macros, streamlining data processing and reducing manual errors, ensuring timely delivery of reports and insights

SQL Optimization & Data Preparation

  • Optimized SQL queries for data extraction and analysis, leveraging techniques such as joins, indexing, and query optimization to handle large datasets efficiently, reducing query execution times by 30%
  • Preprocessed and cleaned data for machine learning models and reporting, ensuring data accuracy, consistency, and reliability in analyses and predictions

Collaboration & Stakeholder Communication

  • Collaborated with cross-functional teams, including marketing, finance, and operations, to gather requirements and deliver data solutions that align with business objectives
  • Presented complex data insights to non-technical stakeholders through clear reports and visualizations, enabling informed decision-making and enhancing business outcomes

Graduate Teaching and Research Assistant

Stevens Institute of Technology
01.2022 - 05.2023

Course Assistance & Student Mentorship

  • Assisted in creating course content, graded 100+ assignments, and led study sessions for 50+ students in Big Data & Data Mining courses, answering questions and reinforcing core concepts and methodologies

Machine Learning Research & Data Analysis

  • Conducted a comparative study on startup success, leveraging acquisition data and machine learning models (MLP Neural Network, Decision Tree, Gradient Boosting Trees, Random Forest) with Apache Spark, achieving 90% accuracy and high AUC scores
  • Contributed to a neural machine translation project using NLP techniques, helping implement a sequence-to-sequence model with word embeddings and attention mechanisms to enhance the accuracy and fluency of translations

Data Engineer

Larsen & Toubro Infotech Limited
04.2018 - 07.2021

ETL Development & Data Pipeline Optimization

  • Developed homegrown, scalable, distributed ETL pipelines using PySpark, Apache Spark (Java/Scala), AWS, Hive, HDFS, and Airflow to replace existing ETL processes, reducing processing time by 35% for major financial clients
  • Managed and integrated 150M+ records from various sources, including Hive, Oracle, and flat file formats (CSV, JSON, Avro, XML, Parquet)
  • Enhanced data pipeline efficiency by optimizing Spark jobs and SQL queries through partitioning, indexing, caching, and optimized join strategies, leading to faster query execution on large-scale financial datasets

Automation & Workflow Management

  • Automated workflows using Apache Airflow and Python scripts, enhancing job scheduling, monitoring, and logging for financial transactions, increasing operational efficiency by 20% and reducing manual intervention
  • Improved data processing by addressing bottlenecks and employing advanced data validation and caching techniques, ensuring continuous data integrity, reliability, and performance

Data Modeling & Machine Learning

  • Designed and implemented scalable data models and schemas for efficient data storage and retrieval, improving the performance, scalability, and efficiency of financial data architecture
  • Leveraged machine learning techniques to analyze financial data, including customer transaction data, leading to improvements in customer retention and reduced churn through predictive modeling

CI/CD, Testing & Deployment

  • Led CI/CD pipeline deployments using Jenkins, Bitbucket, and SonarQube, ensuring reliable releases across SIT, UAT, and production environments, maintaining code quality for financial applications
  • Conducted thorough unit, integration, and performance testing using JUnit and ScalaTest, ensuring robustness and reliability of data pipelines handling sensitive financial data

Collaboration & Stakeholder Engagement

  • Collaborated with cross-functional teams within Agile SDLCs, working with financial stakeholders to translate business requirements into data engineering solutions, ensuring timely and accurate delivery of projects
  • Developed interactive dashboards and reports using Tableau, delivering actionable insights and financial performance metrics to stakeholders, enabling data-driven decision-making

Leadership & Mentorship

  • Led proof-of-concept (POC) projects to explore and implement new ETL technologies using AWS services (S3, Lambda, Redshift, Glue, EMR), improving data processing speed and scalability for complex financial data
  • Mentored junior developers through code reviews, promoting clean coding practices and ensuring high-quality contributions to data engineering projects

Data Governance & Security

  • Ensured data quality, integrity, and security through rigorous validation, reconciliation, and encryption protocols, safeguarding sensitive financial data and ensuring compliance with industry standards and regulatory requirements

Education

Master of Science - Machine Learning

Stevens Institute of Technology
Hoboken, NJ
05.2023

Bachelor of Engineering - Electronics Engineering

University of Mumbai
India
08.2017

Skills

Languages & ML Frameworks – Python, R, Java, Scala, Pandas, NumPy, Matplotlib, Seaborn, NLTK, Scikit-learn, Keras, TensorFlow, PyTorch, SpaCy  

Big Data & Data Warehousing – Apache Spark, Hadoop, Kafka, Hive, Cassandra, Airflow, Snowflake, dbt, PostgreSQL, MySQL, MongoDB, Oracle

Visualization & CI/CD – Tableau, Power BI, Arcadia Data, Git/GitHub, Bitbucket, SVN, Datadog, Docker, Kubernetes, Jenkins, Jira, Linear

Cloud Services – Azure (Databricks, Data Factory, Data Lake), AWS (EMR, S3, Redshift, Glue, Lambda), GCP (Compute Engine, BigQuery, Dataflow)

Publications

Smart Glove for Sign Language - Presented at National Level Conference on Frontiers in Engineering and Technology (NSCFET) 2017

Languages

English
Full Professional
Hindi
Native or Bilingual
Marathi
Native or Bilingual

Leadership & Achievements

  • 1st Runner-Up in the LTI - IBM Data Hackathon 2022 for IT Support Ticket Optimization.
  • Delivered an 'Introduction to Machine Learning' online session for IEEE RAIT in 2023.
  • Volunteered at IEEE RAIT tech fest ELIXIR 2016.

Timeline

Data Analyst

CodersData, LLC
01.2024 - Current

Graduate Teaching and Research Assistant

Stevens Institute of Technology
01.2022 - 05.2023

Data Engineer

Larsen & Toubro Infotech Limited
04.2018 - 07.2021

Master of Science - Machine Learning

Stevens Institute of Technology

Bachelor of Engineering - Electronics Engineering

University of Mumbai