Arjun Soni

Atlanta, GA

Summary

Senior Data Engineer with expertise in developing and maintaining data architecture. Over 7 years of experience designing and managing advanced data pipelines on cloud platforms such as GCP and Azure. Proficient in SQL, Python, and ETL processes, with a focus on automating tasks and improving data quality. Proven track record of leading teams to build data solutions that improve system efficiency and support strategic decision-making.

Overview

8 years of professional experience

Work History

Senior Data Engineer (GCP)

Tata Consultancy Services Ltd (Client: Macy's Inc)
Atlanta, GA
05.2022 - Current
  • Designed and maintained high-performance databases for analytics and reporting needs.
  • Developed and optimized ETL processes, achieving a 42% reduction in processing time.
  • Automated legacy pipelines, enhancing reliability by 29%.
  • Created ETL scripts to centralize and transform data from diverse sources.
  • Deployed machine learning models in production for real-time predictions.
  • Collaborated with cross-functional teams to gather requirements and deliver tailored solutions.
  • Utilized SQL, Python, and R for advanced analytics on structured and unstructured data.
  • Led Customer Data Lake activities to design scalable ETL pipelines.
  • Wrote, debugged, and implemented complex queries to manage relational and NoSQL databases (e.g., DynamoDB, MongoDB).
  • Built Pub/Sub and Kafka streaming pipelines to ingest real-time data into GCP BigQuery.
  • Collaborated with cross-functional teams to ensure seamless integration of data solutions into existing systems.

Data Engineer / Data Analyst

Client: Macy’s Inc
Atlanta, GA
02.2021 - 04.2022
  • Designed and implemented scalable data pipelines for efficient ingestion and processing using Python and SQL.
  • Developed ETL processes to integrate data from multiple sources, ensuring accuracy and reliability.
  • Collaborated with cross-functional teams to gather requirements and support data-driven initiatives.
  • Conducted thorough data analysis with SQL and Python to generate actionable insights.
  • Streamlined data collection across departments, enhancing overall operational efficiency.
  • Led design and maintenance of enterprise data architecture solutions for optimal performance.
  • Deployed machine learning models for predictive analytics using Spark and TensorFlow.
  • Managed version control and deployment of data applications with Git and Jenkins.

Graduate Assistant (Financial Aid & Scholarship)

Northern Illinois University
DeKalb, IL
01.2019 - 12.2020
  • Automated data processes using Python to increase efficiency in financial aid operations.
  • Transformed unstructured data for storage in Oracle Database, improving data accessibility.
  • Supported data collection and analysis for various studies, contributing to research quality.
  • Executed SQL queries to extract critical information, aiding decision-making.
  • Streamlined data management practices, significantly boosting data accuracy.

Senior Data Science Analyst

Byjus Think And Learn
Hyderabad, Telangana
05.2017 - 08.2018
  • Analyzed large datasets to identify trends and support decision-making processes.
  • Developed predictive models using statistical techniques for data-driven insights.
  • Presented findings and recommendations to stakeholders in clear, visual formats.
  • Utilized SQL and Python for data extraction, transformation, and analysis tasks.
  • Conducted data quality assessments to ensure accuracy and reliability of information.
  • Developed and implemented predictive models to identify customer segments for targeted marketing campaigns.
  • Performed statistical analyses on structured and unstructured datasets with Python and R programming languages.
  • Integrated various data sources into centralized repositories for easier access and improved scalability.
  • Automated manual tasks, such as cleaning raw datasets, using SQL and Python.
  • Prepared well-structured presentations with clear visualizations that explain complicated concepts in an easy-to-understand manner.
  • Designed experiments to test hypotheses about user behavior and develop insights into product usage patterns.

Education

Master of Science - Operations and Management Information Systems

Northern Illinois University
DeKalb, IL
12.2020

Bachelor of Science - Computer Science

GITAM University
Hyderabad, TS
04.2017

Skills

  • Database design and ETL development using Airflow, Dataflow, and Dataproc
  • Data warehousing, data migration, normalization, and integration
  • Machine learning, predictive modeling, and descriptive modeling
  • Real-time data processing using Pub/Sub, Kafka
  • Cross-functional collaboration
  • Version control and CI/CD tools: Git, Jenkins, GitLab
  • Statistical analysis and modeling
  • API development and data pipeline design
  • Python programming and scripting
  • Advanced SQL and query optimization
  • Databases: relational and NoSQL, including BigQuery, MongoDB, and CosmosDB
  • Interpersonal and communication skills
  • Data visualization tools: Tableau, Power BI, Looker
  • Project management: Agile methodologies and Scrum ceremonies
  • Cloud platforms: Google Cloud Platform, Azure, and AWS
  • AI tools: Gemini Code Assist, Vertex AI

Timeline

Senior Data Engineer (GCP)

Tata Consultancy Services Ltd (Client: Macy's Inc)
05.2022 - Current

Data Engineer / Data Analyst

Client: Macy’s Inc
02.2021 - 04.2022

Graduate Assistant (Financial Aid & Scholarship)

Northern Illinois University
01.2019 - 12.2020

Senior Data Science Analyst

Byjus Think And Learn
05.2017 - 08.2018

Master of Science - Operations and Management Information Systems

Northern Illinois University

Bachelor of Science - Computer Science

GITAM University