Sainath Baludu

Summary

Experienced, results-oriented, resourceful Senior Data Engineer with over 14 years of diverse experience in the Information Technology field, including the development and implementation of applications in Big Data and cloud environments across storage, querying, and processing. Proficient in Python, Scala, and SQL. Skilled at optimizing the performance and cost of data pipelines, with experience in data security and compliance in the AWS environment. Collaborative team player with excellent problem-solving and communication skills.

Overview

16 years of professional experience

Work History

Senior Data Engineer

Nielsen Media
06.2018 - Current
  • Led migration to cloud-based data integration, significantly improving annual revenues in 2020
  • Managed 3 cross-functional teams working in close collaboration with analytics, engineering, and stakeholders
  • Implemented data encryption and access controls to ensure data security and compliance with company policies
  • Optimized performance and cost of data pipelines through continuous monitoring and tuning
  • Oversaw a team of 3 data engineers and collaborated with company management to recommend changes based on data history and tests
  • Defined, built, and executed ETLs to resolve navigation issues, preventing financial loss
  • Developed real-time data pipelines to provide insights for debugging system integrations, improving operating efficiency.
  • Created portable data pipelines using AWS Step Functions and SQS to provide a standard for machine learning
  • Directed day-to-day operations of data-dependent systems
  • Designed, developed, deployed, and maintained data services for 20+ pipelines
  • Worked with the Analytics team to architect and build a Data Lake using AWS services such as EMR and S3
  • Developed generic Spark frameworks to help Quantitative Science teams onboard multi-format datasets from various sources
  • Provided technical mentorship to junior team members, performed code and design reviews, and enforced coding standards and best practices.
  • Leveraged AWS Glue for ETL (Extract, Transform, Load) tasks, automating data ingestion from various sources and enhancing data quality.
  • Orchestrated complex data pipelines using AWS Glue and Apache Spark, significantly improving data processing efficiency.
  • Designed and developed serverless data processing solutions using AWS Lambda, reducing operational costs
  • Collaborated with cross-functional teams to implement event-driven architectures with AWS Lambda for real-time data processing.
  • Implemented data lake solutions using AWS Lake Formation, enabling centralized data storage and governance.
  • Set up data lake access controls and permissions using AWS Lake Formation, ensuring data security and compliance.
  • Ingested continuous data from various microservices using Confluent Kafka Connect
  • Developed Python code for tasks, dependencies, SLA watchers, and time sensors for each job, supporting workflow management and automation with Airflow
  • Deployed and maintained various production applications using Docker and Kubernetes
  • Deployed data pipelines with a CI/CD process using Jenkins and GitLab runners.
  • Managed and maintained AWS infrastructure using Terraform and AWS SAM, ensuring scalability and cost-efficiency.
  • Implemented infrastructure-as-code practices, resulting in reduced provisioning time and improved consistency.
  • Developed Terraform modules for provisioning AWS resources, including EC2 instances, RDS databases, and VPC configurations.
  • Consumed near real-time data using Spark Streaming with Kafka as the data pipeline system
  • Handled JSON datasets and wrote custom Python functions to parse JSON data using Spark
  • Created, debugged, scheduled, and monitored Airflow jobs for ETL batch processing.
  • Contributed to internal activities for overall process improvements, efficiencies and innovation.
  • Monitored incoming data analytics requests, executed analytics and efficiently distributed results to support strategies.
  • Addressed ad hoc analytics requests and facilitated data acquisitions to support internal projects, special projects and investigations.
  • Tested and validated models for accuracy of predictions in outcomes of interest.

Data Engineer

Gracenote, A Nielsen Company
01.2015 - 05.2018
  • Designed and implemented data lake using Hadoop and Hive on AWS
  • Developed custom data processing solutions using Python and Spark
  • Created and maintained data pipelines to ingest, process, and store data from a variety of sources
  • Collaborated with data analysts to design and implement data warehouse.
  • Ensured data security and compliance through implementation of access controls and data governance policies.
  • Collaborated with team on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Employed data cleansing methods, significantly enhancing data quality
  • Performed large-scale data conversions, transferring data into standardized formats for integration into microservices.
  • Authored specifications for data processing tools and technologies.
  • Generated detailed studies on potential third-party data handling solutions, verifying compliance with internal needs and stakeholder requirements.

Principal BI Consultant

KPI Partners
01.2008 - 12.2014
  • Worked with the Business Owner to design dashboard layout and page flow. Interpreted the customer's ultimate goals and translated them into reports, metrics, and solutions to meet those goals.
  • Drove detailed OBI design and development. Employed an OBI prototype to flesh out designs and provide accurate, visual feedback to the customer.
  • Translated and converted user requirements into OBIEE designs and packages for development.
  • Worked with members of business and technology teams to ensure on-schedule, high-quality delivery of projects.
  • Implemented BI Server metadata layers, including physical, logical, and presentation layers.
  • Administered and maintained the BI environment for high performance and availability, with robust security and usage monitoring.
  • Provided training and support to users for ad-hoc analysis as needed.
  • Developed technical architecture diagrams and configuration guides for Production support.
  • Designed and developed a security strategy at both the data and UI levels.
  • Performed a health check on the existing OBIEE application to verify that code followed Oracle's best practices and to confirm whether the existing OBIEE metadata file (RPD) could serve as the base solution for Exalytics and Endeca Information Discovery POCs.
  • Met with customer stakeholders and sponsors to establish proof-of-concept scope.
  • Migrated legacy dashboards, analyses, other BI objects, and the BI Repository to the latest Exalytics OBIEE versions.
  • Obtained data extracts, depersonalized and suitable for off-site use in our secure datacenter.
  • Enhanced legacy dashboards to take advantage of Exalytics visualizations, go-less prompts, and other optimizations.
  • Packaged the POC system and migrated it to our Exalytics server.
  • Ran the migrated system on the Exalytics server "as is," with no performance optimizations, to establish baseline performance.
  • Added the in-memory results cache and BI Server/Presentation Server in-memory working areas, then benchmarked.
  • Used Summary Advisor to create in-memory aggregates in the TimesTen for Exalytics database, then benchmarked.
  • Used Index Advisor to create TimesTen-recommended indexes for better performance.
  • Applied other optimizations suited to the dashboard and dataset, and produced final benchmark figures.
  • Presented the Exalytics-powered dashboard to the client and made it available for remote use from the client site.
  • Integrated EID Studio into the OBIEE dashboard, enabling business users to perform information discovery.

Education

Bachelor of Science - Computer Science Engineering

JNTU
Hyderabad, India

Skills

  • Programming languages: Python, Scala, SQL
  • Big data technologies: Hadoop, Spark, Hive
  • Cloud computing: AWS (S3, EMR, Glue, Step Functions, SQS, Airflow)
  • Infrastructure as code: Terraform, AWS SAM
  • Orchestration Frameworks: Docker, Kubernetes
  • Data modeling: Data Lakes, Data Warehouses
  • Data security: Encryption, Access controls, Governance
  • Agile development methodologies
  • Business Intelligence Data Modeling
  • Critical Thinking
  • Team Leadership
  • Data Integration
