Deekshith Mulakala

Denton, TX

Summary

  • Demonstrating solid proficiency in AWS services such as Amazon EC2, S3, EMR, Amazon RDS, VPC, Amazon Elastic Load Balancing, IAM, Auto Scaling, CloudFront, CloudWatch, and Lambda, effectively utilizing them to trigger various resources.
  • Building data pipelines using Azure Data Factory, Azure Databricks, and loading data into Azure Data Lake, Azure SQL Database, and Azure SQL Data Warehouse while efficiently managing and granting database access.
  • Exhibiting substantial experience with Azure services including HDInsight, Stream Analytics, Active Directory, Blob Storage, Cosmos DB, and Storage Explorer.
  • Proficient in utilizing Python frameworks like Flask and libraries such as Pandas, NumPy, Matplotlib, Natural Language Processing, Scikit-Learn, and Seaborn for data processing, analysis, and visualization.
  • Proficient in scripting with Python (PySpark), Scala, and Spark SQL for development and aggregation of data from various file formats, including XML, JSON, CSV, and Parquet.
  • Extensive experience in data analysis using HiveQL, Hive ACID tables, Pig Latin queries, and custom MapReduce programs, with attention to performance.
  • Profound knowledge spanning all phases of Data Acquisition, Data Warehousing (requirements gathering, design, development, implementation, testing, and documentation), Data Modeling (Star and Snowflake schemas for fact and dimension tables), Data Processing, and Data Transformations (Mapping, Cleansing, Monitoring, Debugging, Performance Tuning and Troubleshooting Hadoop clusters).
  • Hands-on experience with Ad-hoc queries, Indexing, Replication, Load balancing, and Aggregation in MongoDB.
  • Expertise in creating Kubernetes clusters using CloudFormation templates and PowerShell scripting to automate deployment in a cloud environment.
  • Sound knowledge in developing highly scalable and resilient RESTful APIs, ETL solutions, and third-party integrations for Enterprise Site platforms using Informatica.
  • Proficient use of bug tracking and ticketing systems such as Jira and Remedy, with version control managed through Git and SVN.
  • Regular collaboration with the business, production support, and engineering teams to deeply analyze data, facilitate effective decision-making, and support analytics platforms.

Overview

5 years of professional experience
1 Certification

Work History

Data Engineer

Munich Re - America
08.2023 - Current
  • Used Azure services (HDInsight, Databricks, Data Lake, Blob, Data Factory, Synapse, SQL DB, SQL DWH) to manage healthcare datasets exceeding 10 TB, optimizing cloud data management and reducing storage costs while maintaining 99.9% data uptime for databases over 50 TB.
  • Integrated Azure Synapse Analytics with enterprise data warehousing and Big Data analytics, enabling data-driven insights, improving healthcare resource allocation by 25%, and enhancing data governance using Informatica’s metadata management and lineage tracking.
  • Developed Power BI dashboards and re-engineered SQL scripts with PySpark, improving system performance by 25%, accelerating data ingestion by 70%, and delivering real-time insights for efficient healthcare operations.
  • Implemented CI/CD pipelines in Azure Cloud, reducing deployment time by 50% and achieving 99% system uptime, ensuring uninterrupted services and streamlined healthcare workflows.

Data Engineer

David's Bridal
01.2023 - 07.2023
  • Designed and implemented 10 scalable, automated data pipelines using technologies like Amazon Redshift, S3, Glue, and Databricks Delta Lake, efficiently processing millions of data points and supporting terabytes of data analytics with Spark.
  • Improved database and data warehouse performance by 32% through query tuning, indexing, and optimized AWS services (RDS, Redshift, and Athena), enhancing data accessibility and processing efficiency.
  • Developed Python and PySpark programs for ETL operations, extracting data from S3, transforming it with Spark SQL and Scala, and loading it into SQL Server and Hive tables, enabling data-driven insights and streamlined workflows.
  • Automated CI/CD pipelines using AWS CodePipeline, Jenkins, and CodeDeploy, ensuring efficient and reliable deployments while enhancing metadata management with PySpark scripts and data cataloging.

Data Engineer

JMS IT
06.2020 - 12.2021
  • Designed and developed business requirements based on agile methodologies, creating Talend jobs in Big Data environments, leveraging Spark's in-memory capabilities for processing large datasets on S3 Data Lake, and conducting ETL operations using Python, Spark SQL, S3, and Redshift to extract valuable customer insights.
  • Performed architecture assessments and implementations of AWS services like EMR, Redshift, and S3, transforming and migrating large datasets with AWS Glue dynamic frames and cataloging data for efficient access and analysis.
  • Integrated Power BI with AWS services for comprehensive analytics, crafted and modified SQL procedures and functions, and developed Python scripts to handle diverse data formats from XML, CSV, and Excel sources, ensuring data quality and governance.
  • Configured IAM policies, roles, and users for secure AWS administration, and automated CI/CD pipelines using Jenkins, Terraform, and AWS, enabling seamless data transformation and movement across AWS data stores and databases.

Data Analyst

Micron
01.2020 - 06.2020
  • Conducted data analysis using Python, R, and SQL to generate actionable insights, performed statistical testing and regression analyses, and created detailed reports to translate findings into stakeholder-friendly insights.
  • Developed and maintained Tableau dashboards to enhance data-driven decision-making and collaborated with management and customers to meet end-user requirements through critical thinking and technical innovation.
  • Automated system processes using PowerShell scripts, improving efficiency and reducing manual tasks, while supporting production systems and working with relational databases using SQL.
  • Participated in Agile development cycles, contributing to sprint planning and retrospectives, and aligned technical solutions with business strategies for optimal outcomes.

Education

Bachelor's - Computer Science, Cyber Security

Vel Tech University
Chennai, India
05.2020

Master of Science - Cyber Security

University of North Texas
Denton, TX
05.2023

Skills

    Communication, Problem Solving, Leadership, Analytical Skills, Client Management, Teamwork, Ethical Conduct

    Programming/Scripting: Python, SQL, C#

    Data Engineering: ETL, Data modelling, Data warehouse architecture

    AWS Cloud Services: Redshift, S3, AWS Glue, EMR, Lambda, Kinesis, Firehose

    Databases: MySQL, PostgreSQL, non-relational databases, RDBMS

    Data Visualization: Power BI, Tableau, Excel, Looker, MS Office Suite, VS Code

Certification

  • Machine Learning & AI using Python and R, 2024
  • Cisco Certified Basic Fundamentals of Data, 2021
  • AWS Certified Solutions Architect Associate, 2024
  • Microsoft Certified Azure Developer Associate, 2024

Projects

Data Backup and Recovery Solution
  • Implement a backup solution using S3 for storing backups of non-relational databases (e.g., DynamoDB or document stores).
  • Technologies: S3, Lambda for automating backup processes, IAM for permissions, Hadoop, PySpark for big data.

Serverless Web Application
  • Build a web application using AWS Lambda and API Gateway to serve data from a non-relational database like DynamoDB or a graph database like Neptune.
  • Technologies: Lambda, API Gateway, IAM roles, DynamoDB or Neptune.
