Shiva Myakala

Irving, TX

Summary


  • Data engineer with over 6 years of experience managing large-scale data infrastructure, designing robust pipelines, and implementing ETL processes. Skilled in multiple programming languages, database technologies, and cloud platforms, with a strong record of collaboration, delivering solutions and optimizing data processes for organizational growth.
  • Strong experience working with databases such as Teradata; proficient in writing complex SQL and PL/SQL to create tables, views, indexes, stored procedures, and functions.
  • Designed and developed Oracle PL/SQL and shell scripts to perform data import/export, data conversion, and data cleansing.
  • Ability to write the Business Requirements Document (BRD), gather and develop UAT Test Plans, administer the Traceability Matrix, and assist with post-implementation tasks.
  • Experience with Microsoft Azure services such as Azure Data Factory (ADF), Azure Synapse Analytics, Databricks, Cognitive Search, Power Platform, and Blob Storage.
  • Used Azure Data Factory extensively for ingesting data from disparate source systems.
  • Developed intuitive visualizations and interactive dashboards in both the Power BI and Tableau environments.
  • Worked with Docker-based containers to run Airflow.
  • Working knowledge of Agile and Waterfall methodologies; defined user stories, drove the Agile board in JIRA during project execution, and participated in sprint demos and retrospectives.
  • Developed Spark applications for performing large-scale transformations and denormalization of relational datasets.
  • Built data pipelines, data architectures, and data sets. Integrated large, disconnected datasets using ETL / ELT Tools in on-premises as well as cloud environments.
  • Wrote Python code for gathering data from Snowflake on AWS, data preprocessing, feature extraction, feature engineering, modeling, model evaluation, and deployment.
  • Worked extensively with Spark using Scala and PySpark on clusters for computational analytics; installed Spark on top of Hadoop and built advanced analytical applications with it.
  • Expertise in AWS resources such as EC2, S3, EBS, VPC, ELB, SNS, RDS, IAM, Route 53, Auto Scaling, CloudFormation, CloudWatch, Athena, and Security Groups.
  • Used dynamic SQL to extract information from unstructured JSON data and store it in relational tables for analysis (a minimal sketch follows this list).
  • Developed database models, views, and APIs using Python for interactive web-based solutions.
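To make the dynamic-SQL bullet above concrete, here is a minimal Python sketch of the pattern; the table, columns, and sample payload are invented for illustration, and SQLite stands in for the production database.

```python
import json
import sqlite3  # stand-in engine; the production work used Oracle / SQL Server

# Hypothetical sample payload; real data arrived from upstream systems.
raw = '{"order_id": 1001, "customer": "acme", "total": 250.75}'
record = json.loads(raw)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, total REAL)")

# Build the INSERT dynamically from the JSON keys; values are bound as
# parameters so only identifiers, never data, are interpolated.
cols = ", ".join(record)
placeholders = ", ".join("?" for _ in record)
conn.execute(f"INSERT INTO orders ({cols}) VALUES ({placeholders})",
             tuple(record.values()))

print(conn.execute("SELECT * FROM orders").fetchall())
```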

Overview

6 years of professional experience

Work History

AWS Data Engineer

BlueCross BlueShield of Michigan
08.2022 - Current
  • Configured Spark Streaming to ingest ongoing data from Kafka and persist the stream to HDFS (sketched after this list)
  • Working on enhancing data pipeline performance, both in speed and in data quality, to achieve 100% referential integrity
  • Proficient in SQL and have experience in developing Spark programs using Scala
  • Configured batch intervals, split intervals, and window intervals in Spark Streaming for optimal data processing
  • Expertise in AWS security architecture with IAM, KMS, Cognito, API Gateway, CloudTrail, CloudWatch, Security Groups, NACL, and Route 53
  • Proficient in AWS services such as S3, Glue, Redshift, and EMR, and Azure services such as Azure Data Factory, Azure Data Lake Storage, and Azure Synapse Analytics
  • Implemented a real-time data streaming pipeline using Athena, AWS Kinesis, Airflow, and DynamoDB, and deployed AWS Lambda code from Amazon S3 buckets
  • Created PL/SQL objects (Procedures, Functions, Packages)
  • Used AWS services such as AWS EMR, AWS Lambda, Amazon Redshift, AWS Glue, AWS CFT, IAM, KMS, API Gateway
  • Deployed automation on AWS with CloudFormation, CloudWatch, Lambda, and IAM, including custom user and group creation
  • Implemented data quality checks with Spark Streaming, effectively handling clean and faulty data with flags.
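A minimal sketch of the Kafka-to-HDFS stream described above, written with Spark Structured Streaming; the broker address, topic, paths, and trigger interval are placeholders rather than values from the actual project, and the job assumes the matching spark-sql-kafka package is available on the cluster.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Read the ongoing event stream from Kafka (placeholder broker and topic).
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "claims-events")
          .load()
          .selectExpr("CAST(value AS STRING) AS payload"))

# Persist the stream to HDFS; the checkpoint location makes the query
# recoverable, and the trigger sets the batch interval.
query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/claims/raw")
         .option("checkpointLocation", "hdfs:///checkpoints/claims")
         .trigger(processingTime="30 seconds")
         .start())

query.awaitTermination()
```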

Azure Data Engineer

IBM India Pvt Ltd
04.2019 - 07.2021
  • Used Azure Data Factory as an orchestration tool for integrating data from upstream to downstream systems
  • Used Azure Databricks to organize data into notebooks, making it easy to visualize through dashboards
  • Analyzed data flow from sources to targets to produce the corresponding design architecture in the Azure environment
  • Integrated existing APIs into Azure API Management to gain security, usage plans, throttling, analytics, monitoring, and alerts
  • Hands-on experience with the complete Software Development Life Cycle (SDLC) using Agile and hybrid methodologies
  • Implemented a one-time migration of multistate-level data from SQL Server to Snowflake using Python and SnowSQL (a simplified sketch follows this list)
  • Worked with complex SQL, stored procedures, triggers, and packages in large databases across various servers
  • Built visualizations and reports in Tableau using Snowflake data
  • Created high-level technical and application design documents per requirements, delivering clear, complete, and well-communicated designs
  • Built database models, views, and APIs using Python for interactive web-based solutions
  • Created dashboards in Tableau Desktop and published them onto Tableau Server
  • Worked on a POC for migrating data to GCP, configuring the Dataproc, Cloud Storage, and BigQuery services.
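A simplified Python sketch of the one-time SQL Server-to-Snowflake load mentioned above; the connection details, table, columns, and batch size are placeholders. In practice a bulk migration would stage files and use COPY INTO (or SnowSQL scripts, as on the original project) rather than row-by-row inserts.

```python
import pyodbc
import snowflake.connector

# Placeholder connections; real credentials would come from managed config.
src = pyodbc.connect("DSN=sqlserver_dsn")
dst = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="LOAD_WH", database="STAGE", schema="PUBLIC",
)

rows = src.cursor().execute(
    "SELECT state, member_id, enrolled_on FROM dbo.members")
cur = dst.cursor()

# Copy in batches to bound memory; 10,000 rows is an arbitrary choice.
batch = rows.fetchmany(10_000)
while batch:
    cur.executemany(
        "INSERT INTO members (state, member_id, enrolled_on) "
        "VALUES (%s, %s, %s)",
        [tuple(r) for r in batch],
    )
    batch = rows.fetchmany(10_000)
```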

Data Engineer

Infotech Solutions
05.2018 - 03.2019
  • Developed Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns
  • Managed data loading into an OLAP application and performed aggregations for analysis
  • Wrote SQL scripts to meet business requirements
  • Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats (see the sketch after this list)
  • Reviewed code developed by the team and validated test results
  • Worked with the Scrum team to deliver agreed user stories on time every sprint
  • Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
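A compact PySpark sketch of the multi-format extract-transform-aggregate pattern from this role; the paths and column names are invented for illustration, and the union assumes Spark 3.1+ for allowMissingColumns.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("usage-patterns").getOrCreate()

# Extract from two file formats (placeholder paths).
csv_events = spark.read.option("header", True).csv("/data/events_csv/")
json_events = spark.read.json("/data/events_json/")

# Align the schemas by column name and combine the sources.
events = csv_events.unionByName(json_events, allowMissingColumns=True)

# Aggregate into daily usage-pattern metrics per customer.
daily_usage = (events
               .withColumn("day", F.to_date("event_ts"))
               .groupBy("customer_id", "day")
               .agg(F.count("*").alias("events"),
                    F.countDistinct("feature").alias("features_used")))

daily_usage.write.mode("overwrite").parquet("/data/usage_daily/")
```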

Education

Master's in Applied Computer Science

Northwest Missouri State University
12.2022

Computer Science and Engineering

Jawaharlal Nehru Technological University
05.2018

Skills

ETL/Data Warehouse Tools: Informatica PowerCenter, Talend, Pentaho, SSIS, DataStage
Querying Languages: SQL, NoSQL, PostgreSQL, MySQL, Spark SQL, Sqoop 1.4.4
Programming Languages: Python, Scala, Hibernate, JDBC, JSON, HTML, CSS, SQL, R, Shell Scripting
Databases: Oracle, SQL Server, MySQL, Cassandra, Teradata, PostgreSQL, MS Access, Snowflake, NoSQL (HBase, MongoDB)
Visualization: Tableau, Power BI
OLAP/Reporting: SQL Server Analysis Services and Reporting Services.
Integration Tools: Git, Gerrit, Jenkins, Maven
Methodologies: Agile, Scrum, Waterfall, UML
