Loksai Pulivarthi

Plano, TX

Summary

Seasoned technology professional with over 9.5 years of experience developing sophisticated software solutions across the technology stack. Brings a diverse skill set spanning multiple languages, frameworks, and design patterns, with a commitment to continuous learning and staying abreast of emerging technologies. Proficient in cloud technologies such as Azure and AWS, as well as Snowflake, Python, Spark, shell scripting, and Big Data/Hadoop platforms. Skilled in end-to-end implementation and seamless integration across multiple platforms. Recognized as a collaborative team player known for quick adaptation, self-directed learning, and effective communication, with meticulous attention to detail and a proactive approach to enhancing business processes. Eager to leverage expertise in technology to drive innovation and business growth.

Overview

12 years of professional experience

Work History

Sr. Data Engineer

Mastercard
St. Louis, MO
11.2023 - Current
  • Developed Python scripts to generate synthetic data for Merchant (MAOD) datasets on-premise.
  • Collaborated on a Proof of Concept (POC) project focused on utilizing Jenkins to implement CI/CD pipelines for efficient deployment of code and jobs onto the Databricks platform.
  • Migrated Bitbucket repositories, Jenkins, and JFrog Artifactory applications to Azure in collaboration with the platform and cloud builders teams. This project was initiated after on-premise support for these tools was discontinued.
  • Acquired knowledge of Databricks Asset Bundles and deployed sample jobs in the Databricks environment.
  • Designed and developed a NiFi pipeline for a geo-spatial use case, facilitating the data movement from on-premise to AWS S3 while also integrating logging to Splunk for comprehensive monitoring and analysis.
  • Collaborated with the UDAP team to review and approve the design. Engaged in testing, pipeline promotion across environments, and documentation efforts.
  • Familiarized myself with the existing NiFi PaaS framework, an automated system for dynamically creating pipelines from YAML configuration files. Implemented new functionality within the framework to facilitate data movement from on-premise to S3, ensuring adherence to framework-level standards.
  • Became acquainted with the procedures followed by the teams I collaborated with, creating work orders and handling ad-hoc tasks in alignment with their processes.

Sr. Data Engineer

MetLife
Plano, TX
03.2023 - 11.2023
  • Designed and developed Spark-based ETL pipelines for on-premise data processing using Python.
  • Implemented data governance and security measures within the on-premise Spark environment.
  • Collaborated with teams to gather requirements and ensure compliance with data governance standards.
  • Extensively involved in architecting and deploying scalable Spark clusters on AWS cloud infrastructure using services like Amazon EMR and AWS Glue.
  • Developed and optimized Spark-based ETL pipelines for data migration from on-premise to AWS cloud using Python.
  • Utilized Apache Airflow for orchestrating, scheduling, and managing data migration workflows.
  • Ensured data integrity and security during the migration process, adhering to compliance standards.
  • Continued development and optimization of Spark-based ETL pipelines in AWS cloud environment.
  • Optimized Spark job performance and resource utilization to meet business SLAs in the cloud.
  • Implemented data lake and data warehouse solutions on AWS cloud leveraging services like Amazon S3 and Amazon Redshift.
  • Automated deployment and monitoring of Spark jobs on AWS cloud infrastructure through collaboration with DevOps and cloud infrastructure teams.

Sr. Data Engineer

AT&T
Plano, TX
03.2020 - 03.2023
  • Contributed to the FraudML team, taking charge of developing, maintaining, and deploying pipelines responsible for supplying features to ML models.
  • Collaborated with architects and developers to design, develop, and deploy Hadoop-based Data Lake pipelines. These pipelines handled data extraction, transformation, loading, and facilitated subsequent data visualization and analytics.
  • Worked extensively with large datasets such as Scamp, Telegence, and Mobile application data.
  • Developed Kafka consumers to ingest data from topics and populate corresponding Hive tables.
  • Proficient in designing and optimizing Hive tables with partitions and bucketing for enhanced performance. Skilled in configuring parameters and developing Hive UDFs tailored to business needs.
  • Extensive expertise in exploring and visualizing large datasets.
  • Created PySpark scripts to perform various transformations and aggregations on datasets.
  • Expertise in Azure Data Factory (ADF), designing multiple pipelines and activities for full and incremental data loads into Azure Delta Lake Store and Azure SQL Data Warehouse.
  • Conducted ETL processes using Azure Data Factory, T-SQL, and Spark SQL to transfer data from source systems to Azure Data Storage services.
  • Worked on optimizing on-premises processes for smooth migration to Azure Databricks and established scheduling protocols.
  • Configured and maintained personal and high-concurrency team clusters in Azure Databricks.
  • Managed data ingestion into Azure services like Azure Delta Lake, Azure Storage, and Azure SQL, facilitating subsequent processing within Azure Databricks.
  • Expertise in data loading and retrieval from the Redis database to ensure compliance with model transaction SLAs.
  • Deployed ML models and conducted end-to-end pipeline testing.
  • Utilized Grafana to monitor Kafka message queue dashboards effectively.
  • Expertise in troubleshooting and performance optimization of Spark applications to enhance overall processing time and increase error tolerance for the pipeline.
  • Proficient in crafting Linux shell scripts to automate diverse processes. Skilled in leveraging SED and AWK for cutting, cleaning, transforming, aggregating, and analyzing data in flat files.

Sr. Data Engineer

Aetna Inc
Chicago, IL
01.2016 - 02.2020
  • Implemented robust data ingestion pipelines on AWS to collect, validate, and cleanse healthcare data from various sources, including electronic health records (EHR) and medical devices.
  • Designed and implemented data models optimized for healthcare analytics on AWS, including dimensional modeling for clinical and operational data analysis.
  • Optimized data processing and query performance on AWS data services like Amazon Redshift, Amazon Athena, and Amazon EMR to meet the performance requirements of healthcare analytics applications.
  • Utilized Spark and Python on AWS for data processing and analytics tasks, including machine learning model development and deployment.
  • Led the migration of on-premises data and applications to AWS cloud platform, ensuring seamless transition and minimal disruption to operations.
  • Developed ETL processes on AWS to transform raw healthcare data into structured formats suitable for analysis and reporting, while maintaining data quality and integrity.
  • Designed and created ETL jobs using Talend to efficiently load large volumes of data from flat files into the Hadoop ecosystem and relational databases.
  • Implemented error handling and reporting functionalities within the developed Talend jobs.
  • Collaborated extensively with the admin team to facilitate job deployments and schedule tasks effectively.
  • Developed Linux shell scripts on AWS for automating data processing tasks and system administration, improving efficiency and reliability.

Big Data Developer

McDonald's
Oak Brook, IL
08.2015 - 10.2015
  • Extensively involved in designing and maintaining Hive tables and queries, ensuring efficient data storage and retrieval mechanisms.
  • Worked in data ingestion projects using Sqoop and Kafka, successfully bringing diverse data sources into the Hadoop ecosystem.
  • Implemented robust solutions for integrating relational databases with Hadoop, leveraging Sqoop and custom Python scripts effectively.
  • Managed YARN resources proficiently to optimize performance and resource allocation for Spark and MapReduce jobs.
  • Developed and optimized Spark-based ETL pipelines to process and analyze large datasets efficiently.
  • Utilized Python and SQL within Spark environments to manipulate and transform data as per business requirements.
  • Designed and implemented end-to-end ETL processes using Spark, Python, and SQL, contributing to data transformation and loading tasks.
  • Integrated Kafka messaging system seamlessly with Big Data platforms, enabling near-real-time data streaming and processing capabilities.
  • Demonstrated proficiency in Linux shell scripting for automating data processing tasks and system administration within Big Data environments.
  • Conducted comprehensive data analysis and generated insightful reports using SQL queries and visualization tools, contributing to data-driven decision-making processes.

Microsoft BI Developer

Sriven Infosys PVT LTD
Hyderabad, India
08.2012 - 07.2013
  • Developed and managed intricate SQL queries and scripts for extraction, transformation, and loading operations.
  • Skilled in utilizing the Linux environment to efficiently process data and automate tasks.
  • Proficient in utilizing Source Analyzer, Target Designer, Mapplet Designer, and Mapping Designer to efficiently develop ETL solutions in Informatica PowerCenter.
  • Designed and implemented SSIS packages for data integration and workflow automation, including scheduling tasks for timely execution.
  • Created and customized SSRS reports for business stakeholders to visualize and analyze data effectively.
  • Utilized SQL Server Management Studio (SSMS) for database development, administration, and optimization tasks.
  • Developed and optimized database objects such as procedures, functions, packages, triggers, indexes, and views for improved performance and efficiency.
  • Collaborated with testing and QA teams to ensure data quality and integrity across BI solutions.
  • Provided BI solution support, troubleshooting, performance tuning, and optimization.

Education

Master of Science - Information Technology

Valparaiso University
Valparaiso, IN
05.2015

Bachelor of Science -

Vignan University
Vadlamudi, India
05.2013

Skills

  • Python
  • SQL/Hive
  • Spark
  • Microsoft Azure
  • Amazon Web Services
  • Snowflake
  • Databricks
  • Linux
  • Data Pipelines
  • Power BI
  • Product Design & Development
  • Apache NiFi
