
Priyanka Kosuri

Pine Brook, NJ

Summary

  • Around 5 years of experience in systems analysis, design, and development in the fields of Java, Data Warehousing, Hadoop Ecosystem, AWS Cloud Data Engineering, Data Visualization, Reporting, and Data Quality Solutions.
  • Good experience in Amazon Web Services such as S3, IAM, EC2, EMR, Kinesis, VPC, DynamoDB, Redshift, Amazon RDS, Lambda, Athena, Glue, DMS, QuickSight, Elastic Load Balancing, Auto Scaling, CloudWatch, SNS, SQS, and other services of the AWS family.
  • Hands-on experience in data analytics services such as Athena, Glue, Data Catalog, and QuickSight.
  • Hands-on expertise with AWS databases such as RDS (Aurora), Redshift, DynamoDB, and ElastiCache (Memcached & Redis).
  • Experience in developing Hadoop-based applications using HDFS, MapReduce, Spark, Hive, Sqoop, HBase, and Oozie.
  • Hands-on experience in architecting legacy data migration projects from on-premises to the AWS Cloud.
  • Wrote AWS Lambda functions in Python that invoke scripts to perform various transformations and analytics on large data sets in EMR clusters.
  • Hands-on experience with tools like Hive for data analysis, Sqoop for data ingestion, and Oozie for scheduling.
  • Experience in scheduling and configuring Oozie, including writing Oozie workflows and coordinators.
  • Worked on different file formats like JSON, XML, CSV, ORC, and Parquet.
  • Experience in processing both structured and semi-structured data in these file formats.
  • Good knowledge of Kafka and Flume.
  • Experience in Java and Java EE (J2EE) technologies; proficient in Core Java, Servlets, JSP, EJB, JDBC, XML, Spring, Struts, Hibernate, and RESTful web services.
  • Proven knowledge of standards-compliant, cross-browser compatible HTML, CSS, JavaScript, and Ajax.
  • Good experience with different SDLC models, including Waterfall, V-Model, and Agile.
  • Demonstrated proficiency in Microsoft Office suite (Excel, Word, PowerPoint) to create comprehensive reports, presentations, and documentation for internal and external stakeholders.
  • Collaborated with cross-functional teams to streamline data collection processes, improving efficiency by 20%.
  • Employed SAP ERP system to manage inventory and streamline procurement processes, enhancing operational efficiency.
  • Maintained a high level of customer service orientation by promptly addressing client inquiries and concerns, leading to a 95% satisfaction rate.
  • Presented findings and recommendations to senior management through clear and concise communication, leveraging strong presentation skills.
  • Actively sought out information and remained up-to-date with industry trends, enhancing analytical and conceptual thinking abilities.
  • Thrived in a fast-paced environment by prioritizing tasks effectively and delivering high-quality results under tight deadlines.
  • Demonstrated organizational awareness and commitment by adapting to evolving business needs and fostering a collaborative work environment.

Overview

5 years of professional experience

Work History

AWS Data Engineer

Comcast
Philadelphia, PA
01.2023 - 01.2024
  • Designed and set up an enterprise data lake to support various use cases, including storing, processing, analytics, and reporting of voluminous, rapidly changing data, using various AWS services.
  • Used various AWS services including S3, EC2, AWS Glue, Athena, Redshift, EMR, SNS, SQS, DMS, and Kinesis.
  • Extracted data from multiple source systems (S3, Redshift, RDS) and created multiple tables/databases in the Glue Data Catalog by creating Glue crawlers.
  • Created AWS Glue crawlers for crawling the source data in S3 and RDS.
  • Created multiple Glue ETL jobs in Glue Studio, processed the data using different transformations, and loaded it into S3, Redshift, and RDS.
  • Created multiple recipes in Glue DataBrew and used them in various Glue ETL jobs.
  • Designed and developed ETL processes in AWS Glue to migrate data from external sources like S3 (Parquet/text files) into AWS Redshift.
  • Used the AWS Glue Catalog with crawlers to get the data from S3 and performed SQL query operations using AWS Athena.
  • Wrote PySpark jobs in AWS Glue to merge data from multiple tables and utilized crawlers to populate the AWS Glue Data Catalog with metadata table definitions.
  • Used AWS Glue for transformations and AWS Lambda to automate the process.
  • Created monitors, alarms, notifications, and logs for Lambda functions and Glue jobs using CloudWatch.
  • Performed end-to-end architecture and implementation assessment of various AWS services like Amazon EMR, Redshift, and S3.
  • Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB.
  • Created Kinesis Data Streams, Kinesis Data Firehose, and Kinesis Data Analytics to capture and process the streaming data, then output into S3, DynamoDB, and Redshift for storage and analysis.
  • Created Lambda functions to run AWS Glue jobs based on S3 events (see the sketch below).
  • Performed unit testing of all the mappings developed in the ETL layer before delivering them to the production environment.

Environment: AWS Glue, S3, IAM, EC2, RDS, Redshift, Lambda, Boto3, DynamoDB, Apache Spark, Kinesis, Athena, Hive, Sqoop, Python, ETL
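
A minimal sketch of the S3-event-to-Glue trigger pattern described above, assuming a hypothetical Glue job name and an execution role with glue:StartJobRun permission; the job name and arguments are illustrative only, not the actual jobs from this role.

  import boto3

  glue = boto3.client("glue")

  # Hypothetical job name for illustration; not a real job from this role.
  GLUE_JOB_NAME = "example-curation-job"

  def lambda_handler(event, context):
      """Invoked by an S3 ObjectCreated notification; starts the Glue ETL job
      and passes the new object's location as job arguments."""
      record = event["Records"][0]
      bucket = record["s3"]["bucket"]["name"]
      key = record["s3"]["object"]["key"]
      run = glue.start_job_run(
          JobName=GLUE_JOB_NAME,
          Arguments={"--source_bucket": bucket, "--source_key": key},
      )
      return {"JobRunId": run["JobRunId"]}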

AWS Data Engineer

Bluejestic
Tampa, FL
05.2022 - 12.2022
  • Responsible for provisioning key AWS Cloud services and configuring them for scalability, flexibility, and cost optimization.
  • Created VPCs, private and public subnets, and NAT gateways in a multi-region, multi-zone infrastructure landscape to manage worldwide operations.
  • Managed Amazon Web Services (AWS) infrastructure with orchestration tools such as CloudFormation templates (CFT), Terraform, and Jenkins pipelines.
  • Created Terraform scripts to automate deployment of EC2 instances, S3, EFS, EBS, IAM roles, snapshots, and a Jenkins server.
  • Built cloud data stores in S3 with logical layers for raw, curated, and transformed data management.
  • Created data ingestion modules using AWS Glue for loading data into the various S3 layers, with reporting using Athena and QuickSight.
  • Created and managed bucket policies and lifecycle rules for S3 storage per organizational and compliance guidelines.
  • Created parameters and SSM documents using AWS Systems Manager.
  • Established CI/CD tools such as Jenkins and Git Bucket for code repository, build, and deployment of the Python code base.
  • Built Glue jobs for technical data cleansing such as deduplication, NULL-value imputation, and removal of redundant columns, as well as Glue jobs for standard data transformations (date/string and math operations) and business transformations required by business users (see the cleansing sketch below).
  • Used the Kinesis family (Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics) to collect, process, and analyze streaming data.
  • Created Athena data sources on S3 buckets for ad hoc querying and business dashboarding using QuickSight and Tableau reporting tools.
  • Copied fact/dimension and aggregate output from S3 to Redshift for historical data analysis using Tableau and QuickSight.
  • Used Lambda functions and Step Functions to trigger Glue jobs and orchestrate the data pipeline.
  • Used the PyCharm IDE for Python/PySpark development and Git for version control and repository management.
Environment: AWS - EC2, VPC, S3, EBS, ELB, CloudWatch, CloudFormation, ASG, Lambda, AWS CLI, Git, Glue, Athena, QuickSight, Python, PySpark, Shell scripting, Jenkins
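
A minimal sketch of the kind of Glue cleansing job described above (deduplication, NULL-value imputation, redundant column removal), written as a plain PySpark script; the bucket, table, and column names are hypothetical, and the real jobs used the Glue job framework and S3 layers rather than this standalone form.

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("cleansing-sketch").getOrCreate()

  # Hypothetical raw-layer path and columns, for illustration only.
  raw = spark.read.parquet("s3://example-raw-bucket/orders/")

  cleansed = (
      raw.dropDuplicates(["order_id"])                     # deduplication
         .fillna({"quantity": 0, "status": "UNKNOWN"})     # NULL-value imputation
         .drop("legacy_flag")                              # redundant column removal
         .withColumn("order_date", F.to_date("order_ts"))  # standard date transformation
  )

  # Write to a hypothetical curated layer.
  cleansed.write.mode("overwrite").parquet("s3://example-curated-bucket/orders/")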

Hadoop - AWS Data Engineer

LSInextGen
Piscataway, NJ
05.2019 - 04.2022
  • Participated in requirements gathering and was actively involved in developing the requirements into technical specifications.
  • Used Spring XD for data ingestion into HDFS.
  • Involved in development of MapReduce jobs using various APIs like Mapper, Reducer, RecordReader, InputFormat, etc.
  • Extensively used HDFS for storing the data.
  • Worked on Hive, creating external and internal tables, and performed analysis on the data.
  • Used HiveQL for analyzing and validating the data.
  • Created Hive load queries for loading the data from HDFS.
  • Used Sqoop to export data from Hive to Netezza and to import data from Netezza into Hive.
  • Used Informatica to load the data into the final table, using bulk load for this process.
  • Created Sqoop jobs for importing and exporting data from/to Netezza.
  • Used Oozie for scheduling this entire process.
  • Worked on AWS POC for transferring data from local file system to S3.
  • Hands-on experience in creating EMR clusters and developing Glue jobs.
  • Wrote Oozie workflows and job.properties files for managing the Oozie jobs, and configured all MapReduce, Hive, and Sqoop jobs in Oozie workflows.
  • Scheduled the Oozie jobs using coordinators, writing workflow.xml, job.properties, and coordinator.xml files for scheduling.
  • Created Kinesis streams for live streaming of the data.
  • Created mappings in Informatica and loaded the data into the target tables.
  • Wrote Oozie classes for moving and deleting files.
  • Configured the JARs in the Oozie workflows.
  • Validated the Hadoop jobs (MapReduce, Oozie) using the CLI and handled the jobs in Hue as well.
  • Deployed these Hadoop applications into the development, stage, and production environments.
  • Extensively used Spark, creating RDDs and Hive SQL queries for aggregating the data (see the sketch below).

Environment: Hadoop, MapReduce, Java, Spark, Hive, Sqoop, Oozie, HDFS, Netezza, AWS, EMR, Glue, S3, Informatica 9.1, DB2, Oracle 11g, SQL, Windows XP, Agile Scrum, MRUnit, Mockito, Apache Log4j, Spring XD, Subversion
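
A minimal sketch of the Spark/Hive aggregation pattern mentioned above, shown with Spark SQL rather than the RDD API for brevity; the database, table, and column names are hypothetical and assume a Hive-enabled Spark session on the cluster.

  from pyspark.sql import SparkSession

  # Hive-enabled session; assumes the cluster's Hive metastore is configured.
  spark = (
      SparkSession.builder
      .appName("aggregation-sketch")
      .enableHiveSupport()
      .getOrCreate()
  )

  # Hypothetical Hive table; real table names from this project are not shown.
  daily_totals = spark.sql("""
      SELECT load_date, source_system,
             COUNT(*)    AS record_count,
             SUM(amount) AS total_amount
      FROM   staging_db.transactions
      GROUP  BY load_date, source_system
  """)

  daily_totals.write.mode("overwrite").saveAsTable("analytics_db.daily_transaction_totals")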

Education

Master of Science in Data Science -

Saint Peter's University
02.2024

Bachelor of Engineering in Information Technology -

Bharat Institute of Engineering and Technology
05.2019

Skills

Programming Languages: Java 14/15/16, Python
Hadoop/Big Data: HDP, HDFS, Sqoop, Hive, Pig, HBase, MapReduce, Spark, Oozie
AWS Cloud Technologies: IAM, S3, EC2, VPC, EMR, Glue, DynamoDB, RDS, Redshift, CloudWatch, CloudTrail, CloudFormation, Kinesis, Lambda, Athena, EBS, DMS, Elasticsearch, SQS, SNS, KMS, QuickSight, ELB, Auto Scaling
Java/J2EE Technologies: XML, XSL, XSLT, EJB 2.0/3.0, Struts 1.x/2, Spring 2.5, Hibernate 3.2, Ajax
Scripting Languages: JavaScript, Python, Shell scripting
Web Servers: Apache Tomcat 4.1/5.0
Databases: Oracle (PL/SQL, SQL), DB2, Netezza
Tools: CVS, CodeCommit, GitHub, Apache Log4j, TOAD, ANT, Maven, JUnit, Mockito, REST HTTP Client, JMeter, Cucumber, Jenkins, Aginity
ETL Tools: Informatica, DataStage
IDEs: Eclipse, IBM RAD 7.5

  • Proficiency in Microsoft Office (Excel, Word, PowerPoint)
  • Advanced Excel skills, including pivot tables, VLOOKUP, and macros
  • SAP ERP hands-on experience
  • Excellent communication skills
  • Analytical and conceptual thinking
  • Information seeking
  • Customer service orientation
  • Presentation skills
  • Collaboration
  • Organizational awareness & commitment
  • Ability to deliver in a fast-paced environment

Timeline

AWS Data Engineer

Comcast
01.2023 - 01.2024

AWS Data Engineer

Bluejestic
05.2022 - 12.2022

Hadoop - AWS Data Engineer

LSInextGen
05.2019 - 04.2022

Master of Science in Data Science -

Saint Peter's University

Bachelor of Engineering in Information Technology -

Bharat Institute of Engineering and Technology