
Charitha Kandula

USA

Summary

Data Engineer with 3+ years of experience in designing and optimizing data pipelines on AWS and Azure. Expert in ETL processes, big data (Hadoop, Spark, Kafka), and programming in Python, SQL, and Scala. Proven success in cloud migrations, infrastructure automation (Terraform), workflow orchestration (Airflow), data visualization (Tableau, Power BI), financial analysis, machine learning, and IoT. MS in Computer Science with multiple academic excellence awards. Motivated to tackle new challenges.

Overview

4 years of professional experience
1 Certification

Work History

Data Engineer (Contract)

Wells Fargo
San Francisco, CA
02.2024 - 08.2024
  • Migrated an existing on-premises application to AWS
  • Used AWS services such as EC2 and S3 for processing and storing small data sets; maintained the Hadoop cluster on AWS EMR
  • Wrote Pig and Hive scripts with UDFs in MapReduce and Python to perform ETL on AWS cloud services
  • Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them
  • Converted Hive/SQL queries into Spark transformations using Spark SQL, Python, and Scala
  • Wrote Python applications on Apache Spark to parse and convert TXT and XLS files
  • Wrote Terraform scripts to automate AWS services including ELB, CloudFront distributions, RDS, EC2, database security groups, Route 53, VPC, subnets, security groups, and S3 buckets; converted existing AWS infrastructure to AWS Lambda deployed via Terraform and AWS CloudFormation
  • Cleaned data and ensured data quality, consistency, and integrity using Pandas and NumPy
  • Created several types of data visualizations using Python and Tableau
  • Designed and developed Tableau visualizations, including dashboards built with calculations, parameters, calculated fields, groups, sets, and hierarchies
  • Developed multiple POCs in Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata
  • Created Python scripts to read CSV and JSON files from S3 buckets and load them into AWS S3 and DynamoDB
  • Developed MapReduce jobs in Python for data cleaning and data processing
  • Connected to MySQL databases through the Spark driver
  • Designed and implemented real-time and batch workflows
  • Worked on implementing an Audit, Balance, and Control framework with Airflow as the backend
  • Authored Airflow DAGs for use cases involving Spark, Python, Java, etc.
  • Imported data from Oracle into HDFS using Sqoop
  • Developed analytical components using Scala, Spark, and Spark Streaming
  • Environment: Python, Scala, SQL, AWS, S3, EC2, EMR, Lambda, RDS, Hadoop, Spark, Hive, Pig, Sqoop, MySQL, Tableau, Oracle, Airflow, Teradata, Java, J2EE.

Graduate Teaching Assistant - Data Science

Northern Arizona University
AZ, USA
01.2023 - 12.2023
  • Assisted in preparing and organizing course materials, including lecture slides and lab exercises, with a focus on data science, machine learning, and artificial intelligence
  • Helped students with course-related questions and technical issues in data preprocessing, feature engineering, and machine learning model development during office hours and tutoring sessions
  • Graded assignments, exams, and projects, providing constructive feedback on topics such as supervised and unsupervised learning algorithms, model tuning, and AI model evaluation metrics
  • Supervised and supported students during practical lab sessions, ensuring proper use of data science tools, machine learning frameworks (e.g., TensorFlow, PyTorch), and statistical analysis methods
  • Managed online course content and maintained communication channels between students and the instructor, including troubleshooting technical problems with data science platforms and tools
  • Handled administrative tasks such as tracking attendance and performance metrics, and assisted with the management of course-related databases and data pipelines
  • Supported the instructor in delivering lectures and presentations, including setting up and demonstrating advanced AI models, neural networks, and machine learning algorithms.

Data Engineer

CGI
Hyderabad, India
07.2020 - 08.2022
  • Developed and maintained Python scripts to automate data processing tasks, including data cleaning, transformation, and integration, writing efficient, reusable code and debugging issues to ensure reliable data pipelines
  • Optimized the SQL Server database structure to facilitate quicker access to information, addressing customer-reported incidents
  • Worked with JSON, CSV, Sequential, and Text file formats
  • Achieved 90% service precision by deploying and managing services with Azure Kubernetes Service (AKS)
  • Imported data from Microsoft SQL Server to Azure Data Lake Gen2 utilizing tools in Azure Data Factory
  • Created workflows and mappings using Informatica ETL and worked with transformations such as lookup, source qualifier, update strategy, router, sequence generator, aggregator, rank, stored procedure, filter, joiner, and sorter
  • Utilized Azure Monitor to track and analyze system performance metrics, identifying bottlenecks and optimizing query execution plans for improved performance
  • Integrated Azure Databricks with Azure Synapse Analytics for efficient query processing and analytics on Databricks-managed datasets
  • Wrote Python scripts to automate the generation of other scripts
  • Performed data curation using Azure Databricks
  • Experienced in Reporting Services, Power BI (dashboard reports), Crystal Reports, and SSRS using MS SQL Server, as well as MDX technology in the supporting Analysis Services
  • Developed Power BI reports and dashboards from multiple data sources using data blending
  • Applied statistical methods to analyze data sets and draw meaningful conclusions
  • Authored optimized queries on databases to retrieve and verify information related to support cases
  • Environment: Python, SQL, JavaScript, Azure, Data Factory, Data Lake, Databricks, Synapse Analytics, Microsoft Excel, SQL Server, ETL, Informatica, Power BI.

Education

Master's Degree - Computer Science

Northern Arizona University
Flagstaff, AZ
12.2023

Bachelor's Degree - Computer Science

Vel Tech University
India
05.2022

Skills

Programming Languages: Python 3.7/2.7, C, SQL

Database Tools: Oracle, MS SQL Server, MySQL, PL/SQL, Teradata

Reporting Tools: Power BI, Tableau

Web Programming: HTML, CSS

Cloud Technologies: AWS (S3, EC2, EMR, Lambda, RDS), Azure (Data Factory, Data Lake, Databricks, Synapse Analytics)

Data Formats: CSV, JSON, TXT, XML

Operating systems: Windows, Mac, Linux, Unix

Technologies/Tools/IDEs: PyCharm, Visual Studio, Jupyter Notebook, Eclipse, DBeaver

Big Data Technologies: Hadoop, Spark, Hive, Kafka, MapReduce

Certification

  • Python - Expert Level
  • MATLAB
  • SQL for Data Science
  • Microsoft Power BI
  • Microsoft Azure
