Sakhena Meghana Kuthada

Summary

Experienced Data Engineer with a decade of expertise in designing and optimizing data pipelines across AWS, GCP, and Azure.

  • Implemented large-scale data processing with Hadoop (HDFS, MapReduce), Spark, and Databricks, improving the speed and reliability of analytics workflows
  • Built efficient ETL frameworks with AWS Glue, Talend, and DataStage to streamline data ingestion and transformation
  • Architected scalable solutions on AWS EC2 with Autoscaling, Azure Synapse, and GCP BigQuery to support high-volume data and complex queries
  • Managed real-time data streaming with Kafka, Pub/Sub, and Azure Event Hubs to deliver low-latency analytics and rapid insights
  • Optimized database performance in Snowflake, PostgreSQL, MySQL, and DB2, improving query execution and resource utilization
  • Delivered actionable insights through interactive dashboards built with Tableau, Power BI, and SSRS
  • Configured robust security measures, including IAM policies and VPC configuration, to safeguard data against breaches
  • Automated workflows with Python, shell scripting, and Unix tooling to reduce manual intervention and accelerate deployments
  • Applied Agile methodologies and microservices principles to coordinate cross-functional teams and drive continuous integration for data-driven projects

Proactive, goal-oriented professional with strong time-management and problem-solving skills; reliable and adaptable, quick to learn and apply new skills, and committed to driving team success and organizational growth.

Overview

10 years of professional experience

Work History

Data Engineer

AbbVie
01.2023 - Current
  • Developed scalable data pipelines using AWS Data Pipeline and AWS Glue, ensuring efficient data flow and transformation for analytics platforms
  • Implemented and managed big data solutions on Hadoop ecosystem, utilizing HDFS and MapReduce to process large datasets efficiently
  • Designed and executed queries on Snowflake, optimizing data retrieval and handling for real-time decision-making
  • Configured AWS EC2 instances with Autoscaling, improving system responsiveness and cost-efficiency during varying load conditions
  • Employed Elasticsearch for advanced data searches, enhancing the speed and accuracy of data retrieval across multiple data sources
  • Managed data storage solutions using AWS S3 and DynamoDB, ensuring data availability, durability, and security
  • Utilized Databricks and Spark to perform complex data analytics and ETL processes, reducing processing time by 40%
  • Automated data workflows with Talend, integrating various data sources and maintaining a consistent and reliable data environment
  • Optimized data processing with Apache Hive and Impala, streamlining query execution and reducing latency
  • Developed robust data models in PostgreSQL, supporting complex business queries and analytical reports
  • Managed continuous integration pipelines using Jenkins, ensuring smooth deployment of data applications across production environments
  • Scripted advanced data transformation tasks using UNIX Shell Scripting, enhancing automation and minimizing manual data handling
  • Designed and maintained MongoDB and Cassandra databases, optimizing performance for high-volume data handling and real-time access
  • Developed data ingestion frameworks using Sqoop and Pig, facilitating efficient data transfer between databases and Hadoop
  • Implemented security and compliance measures using Hibernate and Spring frameworks, ensuring data integrity and protection
  • Monitored and optimized AWS Redshift performance, delivering high-performance data warehousing solutions to meet business intelligence needs
  • Environment: AWS EC2, Autoscaling, Scala, Elasticsearch, Snowflake, DynamoDB, UNIX Shell Scripting, AWS S3, AWS Glue, Hadoop (HDFS, MapReduce), Databricks, Spark, Talend, Impala, Hive, PostgreSQL, Jenkins, NiFi, MongoDB, Cassandra, Python, Pig, Sqoop, Hibernate, Spring, Oozie, AWS Redshift, AWS Data Pipeline

Data Engineer

UBS
07.2020 - 12.2022
  • Configured Cloud Storage solutions to securely store and manage large datasets, enhancing data integrity and access speed
  • Developed ETL pipelines in Matillion, transforming complex financial datasets into structured formats for analytics and decision-making
  • Designed and implemented data warehouses using GCP BigQuery, optimizing financial data analysis and reporting processes
  • Managed robust Cloud SQL databases, ensuring high availability and security for transactional processing in financial services
  • Utilized Data Studio to create dynamic dashboards, providing real-time financial insights to stakeholders and enhancing decision-making
  • Maintained Data Catalogs to ensure data governance, cataloging data across platforms to enhance discoverability and compliance
  • Implemented VPN Google-Client configurations, securing data transfers between cloud environments and on-premise networks
  • Architected Pub/Sub messaging systems, facilitating real-time data flow and integration across diverse financial applications
  • Developed and deployed SSIS packages, automating data integration and workflow processes for efficiency
  • Designed OLAP cubes using SSAS, supporting complex analytical queries and improving report performance
  • Automated reporting processes using SSRS, delivering timely financial reports and custom dashboards to meet business needs
  • Managed data transformations and quality control using DataStage and QualityStage, ensuring high-quality data in data warehouses
  • Administered databases across MySQL, MS-SQL, Oracle, and DB2, optimizing structures for performance and scalability
  • Executed federated queries across multiple databases, simplifying data access and integration without data replication
  • Enhanced data security by configuring IAM roles and policies, safeguarding sensitive financial data in compliance with industry regulations
  • Scripted automation using Python and shell scripts, streamlining data operations and reducing manual intervention in cloud environments
  • Enhanced data quality by performing thorough cleaning, validation, and transformation tasks
  • Environment: GCP, GCP BigQuery, Cloud SQL, Cloud Storage, Matillion, Data Studio, Data Catalog, VPN Google-Client, Pub/Sub, SSIS, SSAS, SSRS, DataStage, QualityStage, MySQL, MS-SQL, Oracle, DB2, Federated Queries, IAM Security, Snowflake, GCP Databricks, Service Data Transfer, Python, shell scripts, VPC Configuration

Data Engineer

Empower Retirement
04.2018 - 06.2020
  • Automated data integration processes with Azure Data Factory, streamlining data flows and improving the efficiency of ETL operations
  • Configured and maintained Azure Databricks environments, optimizing big data processing and machine learning workflows
  • Implemented version control for data projects using the Azure Databricks GitHub integration, ensuring consistency and collaboration in development
  • Managed real-time data messaging using Azure Service Bus, facilitating seamless data exchange across distributed systems
  • Administered Azure SQL databases, optimizing performance and scalability to support high-volume transactional data
  • Developed scalable data architectures using Azure Synapse, enhancing data warehousing capabilities and supporting advanced analytics
  • Designed and deployed business intelligence solutions using Power BI and Tableau, delivering insightful dashboards and reports to stakeholders
  • Integrated Salesforce data (SFDC) with enterprise data platforms, enhancing customer data analysis and supporting CRM initiatives
  • Developed complex SQL and T-SQL scripts for data manipulation and querying, ensuring data integrity and accuracy
  • Utilized Apache Kafka for building real-time data streaming applications, enhancing data availability and decision-making processes
  • Engineered data solutions on Hadoop, leveraging MapReduce and Hive for efficient data processing and storage
  • Optimized data storage and querying using Teradata, enhancing performance for complex analytical queries
  • Programmed automation scripts in Python, improving operational efficiency and reducing manual intervention
  • Configured Azure Event Hubs for event-driven architectures, enabling real-time analytics and system responsiveness
  • Maintained SQL Server 2017 databases, implementing best practices in database management and security
  • Conducted Unix shell scripting, automating system tasks and improving the robustness of data operations
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
  • Environment: Azure Synapse, Azure Data Factory, Azure Databricks, GitHub, Azure Service Bus, Azure SQL, SQL Server 2017, Tableau, Power BI, SFDC, SQL, T-SQL, Apache Kafka, Azure, Python, Unix, Hadoop, Hive, MapReduce, Teradata, Azure Event Hubs

Hadoop Developer

Netenrich Technologies Pvt. Ltd.
06.2015 - 12.2017
  • Implemented real-time data processing pipelines using Apache Kafka and Spark, enabling instant data analysis and supporting business decision-making
  • Designed and maintained HBase databases, optimizing storage and retrieval of NoSQL data, significantly improving system performance and scalability
  • Utilized Hive and Pig for complex data transformations, facilitating efficient querying and data manipulation for analytics purposes
  • Developed scalable big data solutions using Hadoop ecosystems, employing MapReduce jobs to process large datasets, enhancing data handling capabilities
  • Integrated JDBC with Hadoop to connect and interact with relational databases, streamlining data migration and synchronization processes
  • Authored robust ETL scripts in Pig, automating data cleansing and preparation tasks, reducing manual efforts by 30%
  • Configured and managed AWS cloud services for deploying and scaling Hadoop applications, ensuring robust disaster recovery and high availability
  • Developed Microservices in Java 1.7, aligning with Agile methodologies to support continuous integration and deployment practices
  • Optimized data storage and processing using JSON formats for interchanging data within the Hadoop ecosystem, enhancing system interoperability
  • Implemented Agile development practices, leading sprints and promoting iterative development and testing, which increased project delivery efficiency by 25%
  • Enhanced data processing speed by implementing Hadoop-based solutions for large-scale enterprise projects
  • Environment: Agile, HBase, JSON, Spark, Kafka, JDBC, Hive, Pig, Hadoop, AWS, Microservices, Java 1.7, MapReduce

Education

Bachelor of Science

Aditya College Of Engineering & Technology
Kakinada
05.2015

Skills

  • Amazon Web Services (AWS): EC2, Autoscaling, S3, Glue, Redshift, Data Pipeline, DynamoDB, Elasticsearch
  • Google Cloud Platform (GCP): BigQuery, Cloud SQL, Cloud Storage, Matillion, Data Studio, Data Catalog, VPN Google-Client, Pub/Sub, GCP Databricks, Service Data Transfer
  • Microsoft Azure: Synapse, Data Factory, Databricks, Service Bus, Azure SQL, SQL Server 2017, Event Hubs
  • Database Management: PostgreSQL, MySQL, MS-SQL, Oracle, DB2, Cassandra, MongoDB, Teradata
  • Data Integration/ETL: Talend, SSIS, SSAS, SSRS, DataStage, QualityStage, Sqoop, Oozie, ETL development
  • Big Data Technologies: Hadoop (HDFS, MapReduce), Spark, Databricks, Impala, Hive, Pig
  • Programming Languages: Scala, Python, Java 1.7, SQL, T-SQL, Shell Scripts, UNIX
  • Data Visualization: Tableau, Power BI, SFDC
  • Security & Networking: IAM Security, VPC Configuration
  • Software Development: Hibernate, Spring, Microservices
  • Streaming & Messaging: Apache Kafka, GCP Pub/Sub
  • Data Formats & Tools: JSON, JDBC, HBase
  • Project Management: Agile
