
MOUNIKA K

Frisco, TX

Summary

Data engineering professional with over 8 years of experience across a variety of data platforms, with hands-on experience in big data engineering and data analytics. Practical database engineer with in-depth knowledge of data manipulation techniques and computer programming, paired with expertise in integrating and implementing new software packages and products into existing systems. Offers an 8-year background managing development, design, and delivery of database solutions. Tech-savvy and independent professional with outstanding communication and organizational abilities.

Overview

9 years of professional experience

Work History

Sr. Data Engineer

Ecolab
St Paul, MN
06.2022 - Current
  • Responsible for provisioning key AWS Cloud services and configuring them for scalability, flexibility, and cost optimization
  • Create VPCs, private and public subnets, and NAT gateways across a multi-region, multi-zone infrastructure to support worldwide operations
  • Manage Amazon Web Services (AWS) infrastructure with orchestration tools such as CloudFormation templates (CFT), Terraform, and Jenkins pipelines
  • Create Terraform scripts to automate deployment of EC2 instances, S3, EFS, EBS, IAM roles, snapshots, and the Jenkins server
  • Build cloud data stores in S3 with logical layers for raw, curated, and transformed data management
  • Create data ingestion modules using AWS Glue to load data into the various S3 layers, with reporting through Athena and QuickSight
  • Create and manage bucket policies and lifecycle rules for S3 storage per organizational and compliance guidelines
  • Create parameters and SSM documents using AWS Systems Manager
  • Established CI/CD tools such as Jenkins and Git Bucket for code repository, build, and deployment of the Python code base
  • Build Glue jobs for technical data cleansing such as deduplication, NULL value imputation, and removal of redundant columns
  • Build Glue jobs for standard data transformations (date, string, and math operations) and business transformations required by business users
  • Used the Kinesis family (Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics) to collect, process, and analyze streaming data
  • Create Athena data sources on S3 buckets for ad hoc querying and business dashboarding with QuickSight and Tableau reporting tools
  • Copy fact/dimension and aggregate output from S3 to Redshift for historical data analysis using Tableau and QuickSight
  • Use Lambda functions and Step Functions to trigger Glue jobs and orchestrate the data pipeline (a minimal sketch follows this list).
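
A minimal sketch of that Lambda-to-Glue trigger pattern, assuming boto3 in the Lambda runtime; the job name and argument key are hypothetical placeholders, not values from this role:

import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # Start the (hypothetical) curation Glue job; Step Functions or an S3 event
    # notification would normally supply the partition date in the payload.
    run = glue.start_job_run(
        JobName="curated-layer-etl",
        Arguments={"--partition_date": event.get("partition_date", "")},
    )
    return {"JobRunId": run["JobRunId"]}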

Sr. Data Engineer

Ditech
Fort Washington, PA
10.2020 - 05.2022
  • Designed and set up an enterprise data lake to support various use cases, including storing, processing, analytics, and reporting of voluminous, rapidly changing data, using various AWS services
  • Used various AWS services including S3, EC2, AWS Glue, Athena, Redshift, EMR, SNS, SQS, DMS, and Kinesis
  • Extracted data from multiple source systems (S3, Redshift, RDS) and created multiple tables/databases in the Glue Catalog by creating Glue crawlers
  • Created AWS Glue crawlers for crawling the source data in S3 and RDS
  • Created multiple Glue ETL jobs in Glue Studio, processed the data with different transformations, and loaded it into S3, Redshift, and RDS
  • Created multiple recipes in Glue DataBrew and used them in various Glue ETL jobs
  • Designed and developed ETL processes in AWS Glue to migrate data from external sources such as S3 (Parquet/text files) into AWS Redshift
  • Used the AWS Glue Catalog with crawlers to get the data from S3 and performed SQL query operations using AWS Athena
  • Wrote PySpark jobs in AWS Glue to merge data from multiple tables and utilized crawlers to populate the AWS Glue Data Catalog with metadata table definitions (illustrated in the sketch after this list)
  • Used AWS Glue for transformations and AWS Lambda to automate the process
  • Used AWS EMR to transform and move large amounts of data into and out of AWS S3
  • Created monitors, alarms, notifications, and logs for Lambda functions and Glue jobs using CloudWatch
  • Performed end-to-end architecture and implementation assessment of various AWS services such as Amazon EMR, Redshift, and S3
  • Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB
  • Used Athena extensively to run queries on data processed by the Glue ETL jobs, then used QuickSight to generate reports for business intelligence
  • Used DMS to migrate tables from homogeneous and heterogeneous databases from on-premises to the AWS Cloud
  • Created Kinesis Data Streams, Kinesis Data Firehose, and Kinesis Data Analytics to capture and process streaming data, with output to S3, DynamoDB, and Redshift for storage and analysis
  • Created Lambda functions to run the AWS Glue jobs based on S3 events.
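
An illustrative AWS Glue PySpark job for the merge pattern referenced above: read two catalog tables registered by crawlers, join them, and land the result in S3. Database, table, key, and path names are hypothetical.

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the source tables that the crawlers registered in the Glue Data Catalog.
orders = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders").toDF()
customers = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="customers").toDF()

# Merge on the shared key and write the curated output as Parquet to S3.
merged = orders.join(customers, on="customer_id", how="left")
merged.write.mode("overwrite").parquet("s3://example-curated-bucket/orders_enriched/")

job.commit()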

Data Engineer

Fifth Third Bank
Evansville, IN
07.2018 - 09.2020
  • Built S3 buckets, managed their bucket policies, and used S3 Glacier for storage and backup on AWS (see the lifecycle sketch after this list)
  • Designed, built, and coordinated an automated build & release CI/CD process using GitLab, Jenkins and Puppet on hybrid IT infrastructure
  • Involved in designing and developing Amazon EC2, Amazon S3, Amazon RDS, Amazon Elastic Load Balancing, Amazon SWF, Amazon SQS, and other services of the AWS infrastructure
  • Running build jobs and integration tests on Jenkins Master/Slave configuration
  • Conduct systems design, feasibility and cost studies and recommend cost-effective cloud solutions such as Amazon Web Services (AWS)
  • Involved in maintaining the reliability, availability, and performance of Amazon Elastic Compute Cloud (Amazon EC2) instances
  • Managed Servers on the Amazon Web Services (AWS) platform instances using Puppet configuration management
  • Integrated services like GitHub, AWS CodePipeline, Jenkins, and AWS Elastic Beanstalk to create a deployment pipeline
  • Involved in the complete SDLC: designing, coding, testing, debugging, and production support
  • Coordinate/assist developers with establishing and applying appropriate branching, labeling/naming conventions using Git
  • Used Kubernetes to deploy, scale, load balance, and manage Docker containers
  • Worked on JIRA for defect/issue logging and tracking and documented work in Confluence
  • Performed branching, merging, and release activities in the version control tool Git
  • Used GitHub as version control to store source code and implemented Git for branching and merging operations.
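
A rough sketch of the S3-to-Glacier lifecycle setup mentioned above, using boto3; the bucket name, prefix, and retention periods are hypothetical.

import boto3

s3 = boto3.client("s3")

# Transition objects under the backups/ prefix to Glacier after 30 days and
# expire them after a year (illustrative retention values only).
s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-backups",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)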

Data Engineer

Amigos Software Solutions
Hyderabad, India
01.2017 - 04.2018
  • Responsible for building scalable distributed data solutions using Hadoop
  • Demonstrated a strong comprehension of project scope, data extraction, design of dependent and profile variables, logic and design of data cleaning, exploratory data analysis and statistical methods
  • Used Spark Streaming APIs to perform the necessary transformations for building the common learner data model, which gets data from Kafka in near real time and persists it to Hive
  • Developed Spark scripts by using Python as per the requirements
  • Developed a real-time data pipeline using Spark to ingest customer events/activity data into Hive and Cassandra from Kafka (sketched after this list)
  • Performed Spark jobs optimization and performance tuning to improve running time and resources
  • Worked on reading and writing multiple data formats such as JSON, Avro, Parquet, and ORC on HDFS using PySpark
  • Designed, developed, and maintained data integration in Hadoop and RDBMS environments with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis
  • Involved in recovery of Hadoop clusters and worked on a cluster of 310 nodes
  • Worked on creating Hive tables, loading, and analyzing data using Hive queries
  • Experience providing application support for Jenkins
  • Developed a data pipeline with AWS to extract data from weblogs and store it in HDFS
  • Used HiveQL to analyze the partitioned and bucketed data and compute various metrics for reporting
  • Used reporting tools like Tableau and Power BI to connect with Hive for generating daily reports of data.
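
An illustrative Spark Structured Streaming pipeline for the Kafka-to-Hive ingestion described above (Spark 3.1+ assumed for writeStream.toTable); broker addresses, topic, schema, and table names are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = (SparkSession.builder
         .appName("customer-events-stream")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical event schema for the Kafka message payload.
event_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_time", TimestampType()),
])

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "customer-events")
          .load()
          .select(from_json(col("value").cast("string"), event_schema).alias("e"))
          .select("e.*"))

# Append each micro-batch to the Hive table in near real time.
query = (events.writeStream
         .outputMode("append")
         .option("checkpointLocation", "/tmp/checkpoints/customer-events")
         .toTable("analytics.customer_events"))
query.awaitTermination()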

Software Associate

Careator Technologies Pvt Ltd
Hyderabad, India
08.2014 - 12.2016
  • Designed and Developed data integration/engineering workflows on big data technologies and platforms - Hadoop, Spark, MapReduce, Hive, HBase
  • Involved in gathering business requirements, logical modeling, physical database design, data sourcing and data transformation, data loading, SQL, and performance tuning
  • Used SSIS to populate data from various data sources, creating packages for different data loading operations for applications
  • Transformed and analyzed the data using PySpark and Hive based on ETL mappings
  • Developed Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats (see the sketch after this list)
  • Developed and executed a migration strategy to move Data Warehouse from an Oracle platform to AWS Redshift
  • Performed RDBMS performance tuning and optimization, addressing issues arising from inefficient queries, indexing, and other factors when managing large volumes of data
  • Created Data visualizations using Databricks' integrated visualization tools and third-party tool Power BI
  • Implemented and delivered MSBI platform solutions to develop and deploy ETL, data analytics, reporting, and scorecards/dashboards on SQL Server using SSIS and SSRS, alongside AWS and Azure analytics services
  • Extensively worked with SSIS tool suite, designed and created mapping using various SSIS transformations like OLEDB command, Conditional Split, Lookup, Aggregator, Multicast, and Derived Column
  • Scheduled and executed SSIS packages using SQL Server Agent and developed automated daily, weekly, and monthly system maintenance tasks such as database backup, database integrity verification, indexing, and statistics updates
  • Worked extensively on SQL, PL/SQL, Scala, and UNIX shell scripting
  • Expertise in creating PL/SQL procedures, functions, triggers, and cursors
  • Developing under scrum methodology and in a CI/CD environment using Jenkins
  • Designed and documented the entire Architecture of Power BI POC
  • Utilized Unix Shell Scripts for adding the header to the flat file targets
  • Performed deep analysis of SQL execution plans and recommended hints, restructuring, indexes, or materialized views for better performance
  • Deployed EC2 instances for Oracle databases.
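
A rough sketch of the Spark SQL extraction-and-aggregation pattern used in Databricks above: load two file formats, register them as temporary views, and aggregate with SQL. Paths, schemas, and column names are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("databricks-style-aggregation").getOrCreate()

# Source data in different file formats, exposed to Spark SQL as views.
spark.read.parquet("/mnt/raw/transactions/").createOrReplaceTempView("transactions")
spark.read.json("/mnt/raw/customers/").createOrReplaceTempView("customers")

# Spark SQL handles the join and aggregation across formats.
monthly_totals = spark.sql("""
    SELECT c.region,
           date_trunc('month', t.txn_date) AS month,
           SUM(t.amount)                   AS total_amount
    FROM transactions t
    JOIN customers c ON t.customer_id = c.customer_id
    GROUP BY c.region, date_trunc('month', t.txn_date)
""")
monthly_totals.write.mode("overwrite").parquet("/mnt/curated/monthly_totals/")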

Education

Bachelor of Arts

Osmania University
04.2014

Skills

  • PyCharm
  • Eclipse
  • Visual Studio
  • SQL*Plus
  • SQL Developer
  • TOAD
  • SQL Navigator
  • Query Analyzer
  • SQL Server Management Studio
  • SQL Assistant
  • Postman
  • Windows 7/8/XP/2008/2012
  • Ubuntu Linux
  • MacOS
  • Django REST framework
  • MVC
  • Hortonworks
  • Oracle
  • MySQL
  • SQL Server
  • MongoDB
  • Cassandra
  • DynamoDB
  • PostgreSQL
  • Teradata
  • Cosmos
  • Python
  • PySpark
  • Scala
  • Java
  • C
  • Shell script
  • Perl script
  • SVN
  • Git
  • GitHub
  • Hadoop
  • MapReduce
  • HDFS
  • Sqoop
  • PIG
  • Hive
  • HBase
  • Oozie
  • Flume
  • NiFi
  • Kafka
  • Zookeeper
  • Yarn
  • Apache Spark
  • Mahout
  • Spark MLlib
  • AWS
  • Microsoft Azure

Timeline

Sr. Data Engineer

Ecolab
06.2022 - Current

Sr. Data Engineer

Ditech
10.2020 - 05.2022

Data Engineer

Fifth Third Bank
07.2018 - 09.2020

Data Engineer

Amigos Software Solutions
01.2017 - 04.2018

Software Associate

Careator Technologies Pvt Ltd
08.2014 - 12.2016

Bachelor of Arts

Osmania University