ABID SHAIK

Dallas, TX

Summary

Results-driven Data Engineer with a solid background in architecting and implementing data solutions. Proficient in AWS cloud services, including Amazon Redshift, EMR, and Glue, for optimizing streaming and batch data processing workflows. Experienced with generative AI tools such as Amazon Q and AWS Bedrock. Skilled in database management, ETL processes, and advanced data modeling, with expertise in designing, developing, and maintaining highly scalable, secure, and reliable data structures. Accustomed to working closely with system architects, software architects, and design analysts to translate business and industry requirements into comprehensive data models. Proficient in developing database architectural strategies at the modeling, design, and implementation stages. Committed to ensuring data accuracy, driving operational efficiency, and enhancing overall data infrastructure.

Overview

14 years of professional experience
1 Certification

Work History

Sr. Data Engineer

Amazon.com
12.2020 - Current
  • Trained LLMs in AWS Bedrock to build an AI assistant that integrated inputs from internal wiki documents and data from Redshift and DynamoDB, providing a self-service mechanism for customers to answer analytical and operational questions efficiently
  • Led efforts to create a unified set of recruiting metrics, driving consistency in reporting and enabling better decision-making across teams by collaborating with senior stakeholders
  • Played a pivotal role in migrating a critical data reporting system to a modern microservices architecture, addressing challenges, resolving blockers, and ensuring smooth transitions to new platforms
  • Led a cross-functional team to create an innovative case management tool that significantly improved user experience, earning leadership recognition for creativity and efficiency
  • Spearheaded process improvements, including developing runbooks and checklists, which resulted in a 30% reduction in operational tickets and improved overall service reliability
  • Designed and implemented a custom maintenance utility for a large-scale data platform, reducing system costs by 50% and improving platform stability
  • Architected and implemented data ingestion pipelines that enabled near real-time analytics, improving reporting capabilities for key business teams
  • Provided leadership in technical reviews, offering guidance on system architecture, data pipelines, and performance improvements, ensuring solutions were scalable and cost-effective
  • Developed an automation framework that reduced time-to-market for new data pipelines by 70%, increasing team efficiency and responsiveness to business needs
  • Designed and implemented security protocols that improved the ability to respond to incidents quickly, ensuring data protection and compliance with organizational standards
  • Led the integration of large-scale data systems, enabling advanced tracking and analytics, supporting business growth, and improving reporting accuracy.

Data Engineer

CGI Group
06.2015 - 12.2020
  • Designed and implemented front-end and back-end systems on AWS in line with organizational compliance and security policies
  • Designed and deployed scalable, highly available, and fault-tolerant systems on AWS
  • Designed AWS CloudFormation templates to create custom-sized VPCs, subnets, and NAT, ensuring successful deployment of web applications and database services across environments
  • Implemented detailed monitoring and a notification system for a cloud environment using Amazon CloudWatch and Simple Notification Service (SNS)
  • Created workflows for AWS Data Pipeline jobs, defining activities, schedules, and parameters, and managed Amazon Redshift clusters
  • Developed a plan to migrate an on-premises application to AWS, accounting for time, cost, security, and availability, including provisioning EC2 instances, configuring Auto Scaling, creating S3 buckets with lifecycle rules, enabling data encryption, provisioning EFS, configuring IAM users and roles, setting up CloudWatch and SNS, creating a custom VPC with subnets, establishing a VPN connection to the US data center, and building CloudWatch dashboards for monitoring
  • Imported data from AWS S3 into Spark RDD and performed transformations and actions on RDDs
  • Acted as a liaison between business and information systems teams in migrating healthcare public plans data to Health Rules Payor (HRP)
  • Designed and developed a data repository in Oracle via ETL loads to accommodate the business process re-design of a pharmaceutical commercial sector
  • Designed the HR migration system to transfer the HR related data from Workday to ADP
  • Designed and developed ETL batch processes to transform large volumes of data and migrate documents from one system to another
  • Designed the architecture of an enterprise data warehouse to create a central data repository that integrates the state's various public transportation (MBTA) systems
  • Developed ETL jobs in Informatica to load data into the data warehouse, and built interactive dashboards in OBIEE to provide analytic capabilities for business management
  • Designed and implemented scalable infrastructure and platform for large amounts of data ingestion, aggregation, integration, and analytics in Apache Spark
  • Researched and developed intricate machine learning algorithms with Spark ML to predict the ridership of various subway stations at different times
  • Developed a scoring model using Informatica Data Quality to automate the de-duplication process of passenger vehicle licenses.

ETL Analyst

TECH MAHINDRA
02.2013 - 07.2014
  • Built a decision support model for insurance policies, which reduced the cost of insurance claims by 35%
  • Published customized interactive reports and dashboards using Tableau Server
  • Cleansed the raw transaction data from different Operational Data sources to eliminate data integrity issues
  • Supervised and managed a 7-person team in the daily operations of the project
  • Responsible for effective and timely delivery of all the project deliverables
  • Played a vital role as a Scrum Master, managing the team responsibilities in the transition phase from waterfall to agile methodology.

Systems Engineer

INFOSYS Ltd
10.2010 - 10.2012
  • Managed current and historical financial transactions of a multinational bank using Informatica and Oracle DB
  • Designed and developed various OBIEE Interactive Dashboards and reports with drilldowns, guided navigation, filters, and prompts
  • Developed Unix shell scripts to schedule and execute Informatica jobs
  • Tuned the existing Informatica mappings by checking for bottlenecks and reduced the execution time by 70%
  • Automated one of the production support processes by developing a tool that reduced the manual effort by 65%
  • Coached 17 junior developers on the concepts of data warehousing and Informatica.

Education

Master of Science in Management Information Systems

University at Buffalo
Buffalo, NY

Bachelor of Technology in Computer Science

Vellore Institute of Technology University

Skills

  • Cloud Platform: AWS
  • Programming Languages: Python, SQL, Java, Scala
  • GenAI Tools: Amazon Q, AWS Bedrock
  • Data Processing Frameworks: Apache Airflow (MWAA), Apache Spark, Hadoop
  • Database Management: Relational databases (e.g., MySQL, PostgreSQL), NoSQL databases (e.g., MongoDB, Cassandra)
  • Data Modelling: ERwin
  • ETL: Informatica PowerCenter, SSIS
  • Scripting and Automation: Bash, PowerShell
  • Data Quality and Debugging: Jupyter Notebook

Certification

  • AWS Cloud Practitioner
  • AWS Solutions Architect (Associate)
  • AWS Data Engineer (Associate)

Awards and Activities

  • Received STAR awards for three consecutive quarters at Amazon for consistent excellence and mentorship
  • Promoted to a Senior Data Scientist role within a year at CGI, in recognition of performance and productive client relationships
  • Earned the Pat on Back award at Tech Mahindra for exceptional leadership as a Scrum Master
