Meghna Choudhury

New York, NY

Summary

Results-driven data engineering professional with a solid foundation in designing and maintaining scalable data systems. Experienced in developing efficient ETL processes and ensuring data accuracy to support impactful business insights. Known for strong collaborative skills and the ability to adapt to dynamic project requirements while delivering reliable, timely solutions.

Overview

7 years of professional experience
1 Certification

Work History

Data Engineer (SDE II)

JPMorgan Chase & Co
United Kingdom
12.2022 - 08.2025
  • Gained hands-on expertise in data ingestion, quality, enrichment, mining, and governance during an initial 3-month training period, contributing to scalable and reliable data management practices
  • Developed comprehensive Big Data pipelines for Operational Risk, achieving 100% reconciliation from OLTP sources to HDFS with daily integrity checks via Hive and Impala
  • Delivered data orchestration with 100% accuracy using Apache Spark, supporting critical business operations and decision-making
  • Briefly explored Sapiens UI to strengthen familiarity with metric handling processes
  • Contributed to a global-scale risk and control data migration, analysing architectures and implementing efficient pipelines to a Data Lakehouse, taking it from MVP through to production
  • Maintained 100% compliant enrichment and seamless loading across AWS cloud and on-prem platforms via Databricks
  • Proposed and validated a time-efficient multi-workflow pipeline job leveraging Lakehouse features, projected to improve efficiency following extensive proof-of-concept research
  • Refined an advanced metric handling framework through optimised SQL queries, cutting report time by 6 hours while maintaining accuracy

Software Developer

Capgemini Technology Services Pvt Ltd
India
09.2018 - 08.2021
  • Delivered an enhancement project in an Agile environment for enterprise content management within the insurance sector (recognized as Star Employee for 2 consecutive years)
  • Collaborated with 3 cross-functional teams to optimize business applications (Claims, Personal Lines, and Underwriting), improving performance, usability, and reliability
  • Achieved a 90% reduction in UI issues within 6 months, enhancing user experience and system stability

Education

MS - Data Science

Columbia University
New York, NY, USA
12-2026

PG Diploma - Robotics and Artificial Intelligence

University of Glasgow
Glasgow, UK
08-2022

B.TECH - Electronics and Communication Engineering

SRM University
Chennai, IND
05-2018

Skills

  • Programming Languages: Java (Experienced), Python (Skilful), R (EDA & Visualization), SQL (Experienced)
  • Data Engineering: Big data ETL/ELT, data ingestion, transformation, enrichment, reconciliation, data quality/governance, and lakehouse migration; Spark/PySpark, Spark SQL, Hive, HDFS, Databricks (Lakeflow/Auto Loader/Jobs, notebooks, workflows, Unity Catalog, Delta Sharing), AWS/on-prem pipeline orchestration, and SQL/query optimization; strong foundations in schema design/evolution, Protobuf/gRPC, OLAP/OLTP, caching/indexing/Bloom filters, partitioning/replication, ACID/2PL/2PC/consensus, and performance tuning
  • Artificial Intelligence: Machine learning, deep learning, and generative AI, including regression/classification, clustering and dimensionality reduction, neural networks and transformers, prompt engineering, embeddings, RAG, vector search, ANN/HNSW, LangChain, tool-calling agents, and model evaluation
  • Data Science & Analytics: Statistical analysis, algorithm analysis, hypothesis testing, probability, regression and classification, feature engineering, exploratory data analysis (EDA), clustering, model evaluation, A/B testing, data visualization, and predictive modeling

Extra-Curricular Activities

  • Music: Piano, Bass guitar
  • Drawing and Painting: Charcoal art, digital painting, watercolor painting
  • Founded and led the IT & Technology Club in high school, promoting innovation and peer learning through events and workshops
  • Represented peers as part of the Directorate of Student Affairs, helping deliver the cultural and technical festival
  • Organized events and cultural initiatives as part of the Fun Committee at my first workplace

Certification

Oracle University: Oracle Certified Associate Java SE8 Programmer

Languages

English
Native or Bilingual
Hindi
Native or Bilingual
Bengali
Native or Bilingual
Japanese
Elementary

Timeline

Data Engineer (SDE II)

JPMorgan Chase & Co
12.2022 - 08.2025

Software Developer

Capgemini Technology Services Pvt Ltd
09.2018 - 08.2021

PG Diploma - Robotics and Artificial Intelligence

University of Glasgow

MS - Data Science

Columbia University

B.TECH - Electronics and Communication Engineering

SRM University

Projects

1. FoodVantage: Multimodal AI Grocery Health App for diet awareness -2026
Built a Streamlit-based grocery health app using Gemini vision/agent workflows, DuckDB, Plotly, and Open Food Facts for real-time item scanning, metabolic scoring, allergy detection, nutrition retrieval, personalized insights, and AI meal planning.

2. International Student AI Consultant for Education and Career Pathways (In Progress) -2026
Designing a FastAPI-based multi-agent RAG system for program matching, visa guidance, and career analysis for international students, with retrieval over official datasets, vector indexing of visa/regulatory PDFs, evaluation/logging, and a dashboard prototype.

3. Data Analysis and Visualization of Cosmic Deep Space -2025
Conducted EDA and data wrangling in R using tidyverse and ggplot2 on the JWST UNCOVER DR3 Abell 2744 photometric catalog (~74K sources), quantifying missingness and completeness across NIRCam filters and building hexbin and dot-density visualizations to analyze brightness-redshift trends and spatial non-uniformity.

4. FXCurrencies Model -2025
Developed a Python-based USD/CHF FX trading framework on ~6 years of Bloomberg OHLC data, combining optimized exponential-smoothing crossover strategies (grid-searched α/β, confirmation/gap filters, momentum exits) with ML-based probabilistic signal models including KNN, Random Forest, XGBoost, and neural networks; evaluated performance using time-series splits, trading simulations, and risk-return metrics such as Sharpe, CAGR, and max drawdown.

5. Machine Learning Clustering for Colorectal Cancer Tissue Patches -2022
Performed unsupervised model selection on 5,000 colorectal cancer tissue patches across 9 tissue types, benchmarking K-Means, GMM, Hierarchical Clustering, and Louvain over multiple feature spaces; compared deep embeddings from ResNet50, InceptionV3, VGG16, and PathologyGAN with PCA/UMAP reduction and evaluated cluster quality using Silhouette Score and V-measure.

6. Baxter Robot Playing Chess -2022
Simulated a Baxter robot to detect, pick, and move chess pieces, modeling human-like spatial reasoning and move execution with 90% accuracy.

7. Navigation Hand Device for the Visually Impaired -2021
Built a real-time LiDAR-based navigation aid in embedded C++, integrating servo control and STL-based spatial mapping for obstacle-aware guidance.

8. Humanoid Kid-Size Virtual Soccer Simulations -2021
Developed a virtual soccer team in ROS using behavior-based decision algorithms for attacking, defending, and goalkeeping, achieving 90% accurate gameplay in simulation.