Moses G.

Lead Data Scientist & Cloud Engineer | Cloud Architecture, Data Engineering, Machine Learning Engineering

Austin, TX

Summary

Passionate Lead Data Scientist & Cloud Engineer with 10+ years architecting and leading scalable, high-performance data platforms that turn complex data into strategic business assets.

My expertise centers on AWS, where I've designed end-to-end solutions using Amazon S3 for robust data lakes, AWS Glue for serverless ETL/ELT, Amazon EMR (PySpark/Spark) for big data processing, Amazon Redshift and Spectrum for analytics warehousing, Amazon Kinesis for real-time streaming, AWS Lambda & Step Functions for event-driven orchestration, Amazon Athena for query-on-lake, and AWS Lake Formation for governance and security. I've led migrations to AWS-native architectures, optimized petabyte-scale pipelines (10–15TB+ daily), reduced costs by 35–45% through auto-scaling, Spot Instances, and serverless shifts, and achieved 99.9% uptime with sub-second latency for mission-critical analytics.

I also bring strong Azure capabilities, including Azure Data Factory for pipeline orchestration, Azure Databricks for Spark-based transformations and Delta Lake, Azure Synapse Analytics for unified warehousing and big data processing, and ADLS Gen2 for storage—enabling hybrid/multi-cloud strategies and seamless integrations when needed.

Beyond cloud platforms, I bring data science and engineering leadership: mentoring teams of 8–12 engineers on best practices, enforcing data governance and quality frameworks, collaborating cross-functionally with data science, analytics, and product teams, and delivering resilient batch/streaming pipelines that support ML models, BI dashboards, and real-time decision-making. Tools like Python (Boto3/PySpark/Pandas), SQL, Apache Spark/Kafka, Airflow (MWAA), Terraform/CloudFormation, and Docker are my daily drivers for building reliable, cost-efficient systems.

What drives me is solving tough data challenges at scale—whether slashing processing times by 50–70%, enabling self-service analytics for hundreds of users, or aligning infrastructure with revenue-generating outcomes. I've delivered millions in business value through optimized, governed platforms.

Open to connecting on multi-cloud data architecture, AWS/Azure migrations, team leadership in data engineering, or opportunities to build next-gen scalable solutions. Let's discuss how we can drive impact together! 🚀

Cloud Data Engineering & Distributed Systems

• Pipeline orchestration and ETL across Databricks, Airflow, Glue, and Data Factory

• Distributed data processing (Spark, Databricks)

• Lakehouse and data warehouse architectures (Delta Lake, BigQuery, Redshift, etc.)

Work History

Lead Data Scientist & Cloud Engineer

AT&T

Remote

12.2022 - Current

• Act as a technical liaison between customers, service engineering teams, and leadership to design and deploy AWS solutions.

• Led architecture of enterprise data lake on Amazon S3 with AWS Glue Crawlers and Data Catalog, enabling scalable ingestion and discovery for petabyte datasets.
• Designed and implemented high-throughput ETL/ELT pipelines using AWS Glue and PySpark, processing 10TB+ daily and reducing runtime by 45%.
• Architected real-time streaming solutions with Amazon Kinesis Data Streams/Firehose to S3/Redshift, supporting sub-second analytics for mission-critical apps.
• Optimized Amazon Redshift clusters and Spectrum queries on S3, improving BI query performance by 3x and saving $400K+ annually in compute/storage.
• Built serverless data pipelines using AWS Lambda triggered by S3 events, integrated with Glue jobs for automated transformations and cost efficiency.
• Led cross-functional teams in adopting AWS Step Functions for workflow orchestration, enhancing pipeline reliability and reducing manual interventions by 70%.
• Implemented data governance frameworks with AWS Lake Formation and IAM policies, ensuring compliance and secure access across 500+ users.
• Mentored 7+ junior engineers on AWS best practices (Glue, EMR, Lambda), resulting in 30% improved team productivity and faster delivery.
• Engineered cost-optimization strategies across AWS services (Glue DPUs, EMR scaling, Athena partitioning), cutting monthly data platform expenses by 40%.
• Developed event-driven architectures with AWS Lambda and EventBridge, automating data quality checks and notifications via SNS.
• Integrated Amazon Athena for ad-hoc querying on S3 data lakes, enabling self-service analytics and reducing dependency on heavy warehousing.
• Built and implemented data infrastructure, ingesting and transforming data via ETL/ELT for large-scale apps.

• Led development of machine learning models to enhance predictive analytics and data-driven decision-making.
• Directed cross-functional teams in implementing data strategies, improving operational efficiency across departments.

Senior Data Engineer

AT&T

Remote

11.2019 - 12.2022

• Designed and implemented large-scale distributed ETL pipelines across AWS and Databricks environments to support enterprise analytics.
• Led modernization from on-prem data platforms to cloud-based architectures, improving performance and operational efficiency by about 30%.
• Architected enterprise data models and analytics solutions supporting self-service reporting for 200+ users.
• Led a cross-functional team to design and implement a new CI/CD pipeline, reducing deployment time by 30% and presenting the results to senior leadership.
• Built and operated production MLOps platforms using SageMaker, Docker, MLflow, and CI/CD pipelines to standardize model delivery.
• Served as senior technical advisor, translating business requirements into scalable AWS solutions for data and analytics initiatives.
• Engineered a scalable AWS data pipeline using S3 and Lambda, ensuring GDPR compliance for a machine learning workflow.
• Designed and secured multi-account AWS environments using VPC Peering and Transit Gateway.
• Managed and mentored data engineers and analysts, improving delivery quality and architectural consistency.

Senior Data Scientist & Engineer

Adidas

Remote

10.2018 - 11.2019

• Enhanced data analysis and visualization using SQL, Python, R, SAS, and ArcGIS, applying regression analysis, web analytics, and predictive modeling and building visualizations in R Shiny, Power BI, and Tableau, reducing data analysis time by 50% and increasing data insights by 20%.
• Developed predictive analytics and statistical models to optimize program performance, operational efficiency, and financial forecasting for clients in healthcare, government, and technology.
• Automated reporting and pipeline workflows using Python, SQL, and Azure Data Factory, reducing manual processing time by up to 40%.
• Partnered with business stakeholders to translate requirements into scalable BI dashboards and data pipelines supporting enterprise KPIs and strategic decision-making.
• Established early MLOps best practices, including experiment tracking, dataset versioning, and reproducible model training pipelines.

Data Research Analyst

New York City Department of Education

New York, NY

09.2014 - 10.2018

• Conducted in-depth longitudinal analysis of student performance using SQL, Excel and R, driving a 20% improvement in test scores by identifying key performance gaps.
• Designed and implemented data-driven curriculum optimization experiments, reducing inefficiencies by 25% and enhancing learning outcomes.
• Translated complex data insights into actionable recommendations, presenting findings to senior stakeholders and aligning strategies with educational objectives.
• Partnered with school administrators and educators to develop interactive dashboards, streamlining student progress tracking and informing data-driven instructional decisions.
• Designed and implemented training sessions for staff on effective data utilization practices.

Education

Post-Graduate Certificate - Cloud Computing

The University of Texas at Austin

05.2025

Master of Science - Data Science

City University of New York

06.2018

Bachelor of Science - Statistics

University of Ilorin

06.2011

Skills

Cloud Platforms: Azure (AKS, ARM), AWS (EC2, EKS), GCP
DevOps Tools: JIRA, Jenkins, Slack, Azure DevOps
Build Tools: Ant, Maven, MSBuild
SCMs: SVN, Git, GitHub, Bitbucket, GitLab, Azure Git
IaC Tools: Terraform, CloudFormation
Containers/Orchestration: Docker, Kubernetes
Application/Web Servers: Tomcat, WebLogic 9.x/10.x/12c, Apache 2.x/1.3.x, JBoss 7.1
Operating Systems: Ubuntu 18.04, Red Hat Linux, Windows, HP-UX, Solaris 10
Programming & Scripting Languages: Ruby, Python, UNIX shell scripting (Ksh, Bash), Git Bash
Web Technologies: HTML5, CSS3, JavaScript, JSON
Frameworks and Libraries: Angular, Flask, RESTful APIs, React
Database Technologies: Oracle, SQL Server, MySQL, PostgreSQL, S3, RDS, DynamoDB
Methodologies: Agile, Scrum
Networking/Security Tools: IAM, ELB, PuTTY, VMware

Certification

• AWS Certified Solutions Architect – Professional

• AWS Certified Solutions Architect – Associate

• Certified Power BI Associate, Microsoft

• Certified Database Fundamentals (T-SQL), Microsoft

Awards

Recognized for outstanding leadership in mentoring junior data scientists and driving cross-functional collaboration at the City of New York.

Accomplishments

• Developed a predictive analytics pipeline using Python and Scikit-learn to identify high-risk properties for health and safety violations, based on historical inspection, complaint, and maintenance data.
• Integrated multi-source data using SQL and PySpark in Databricks, ensuring clean, scalable datasets for machine learning model training.
• Designed and published Power BI dashboards for operational managers and field inspectors to visualize risk levels across buildings, zones, and violation types.

• Built a forecasting model using R and AWS SageMaker to predict peak inspection periods and optimize inspector scheduling, improving field coverage and reducing overtime costs by 20%.
• Automated data extraction and cleansing using SQL scripts, enhancing the timeliness of inspection reports.
• Conducted clustering analysis to group buildings based on historical violations, population vulnerability, and inspection history to develop proactive inspection routes.

• Designed and implemented a financial forecasting system using predictive modeling (Random Forest, Linear Regression) to simulate various budget scenarios.
• Used Azure Machine Learning to deploy models and monitor performance in real time.

Affiliations

American Statistical Association
American Society for Quality

Work Availability

Days: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday

Times: morning, afternoon, evening

Languages

English

Native or Bilingual

Timeline

Lead Data Scientist & Cloud Engineer

AT&T

12.2022 - Current

Senior Data Engineer

AT&T

11.2019 - 12.2022

Senior Data Scientist & Engineer

Adidas

10.2018 - 11.2019

Data Research Analyst

New York City Department of Education

09.2014 - 10.2018

Post-Graduate Certificate - Cloud Computing

The University of Texas at Austin

Master of Science - Data Science

City University of New York

Bachelor of Science - Statistics

University of Ilorin
