Bala Suresh Abbina

Charleston, IL

Summary

Accomplished Data Engineer with over 6 years of experience specializing in Big Data ecosystems and enterprise application development. Demonstrated expertise in Hadoop, Spark, and large-scale data processing. Proven track record in data engineering, data architecture, and ETL processes with a strong emphasis on data modeling and data warehousing. Proficient in SQL and experienced with major data analytics platforms. Skilled in AWS and GCP cloud platforms with hands-on experience in developing scalable data pipelines. Adept at collaborating with cross-functional teams to deliver high-quality data-driven solutions. Committed to continuous learning and staying updated with advancements in data engineering technologies.

Overview

6 years of professional experience

Work History

Data Engineer

TikTok
2024.01 - 2024.06
  • TikTok is a social media platform for creating, sharing and discovering short videos
  • Developed and optimized Spark scripts for data encryption and processing using hashing algorithms
  • Created Hive and HBase tables and used Hive queries in Spark SQL for data analysis
  • Implemented ETL processes using Talend and Spark for data ingestion and transformation
  • Automated data workflows using Apache NiFi and Control-M
  • Developed Python-based APIs for revenue analysis and data migration projects to Snowflake
  • Utilized AWS services including S3, Redshift, and Lambda for data processing and storage solutions
  • Monitored and maintained data pipeline performance, ensuring high availability
  • Collaborated with data scientists to implement machine learning models
  • Ensured data security and compliance with company policies
  • Conducted data quality assessments and implemented improvement measures
  • Provided technical support and troubleshooting for data-related issues
  • Conducted extensive troubleshooting to identify root causes of issues and implement effective resolutions in a timely manner
  • Managed cloud-based infrastructure to ensure optimal performance, security, and cost-efficiency of the company's data platform
  • Collaborated with data scientists to develop machine learning models by providing the necessary data infrastructure and preprocessing tools

Data Engineer Intern

Bayview Asset Management
2023.04 - 2023.09
  • Bayview is an investment management firm focused on investments in mortgage and consumer credit, including whole loans, asset backed securities, mortgage servicing rights, and other credit-related assets
  • Implemented Kafka/Spark streaming pipelines for real-time data ingestion
  • Utilized Apache Airflow for scheduling and monitoring data workflows
  • Developed Spark applications using Python for data extraction and transformation
  • Utilized GCP services including BigQuery, Dataproc, and Pub/Sub for data analytics
  • Built ETL pipelines and deployed applications in cloud environments using Docker and Kubernetes
  • Conducted data validation and reconciliation processes to ensure data quality
  • Implemented data warehousing solutions to support business analytics
  • Optimized data storage and retrieval processes for performance efficiency
  • Collaborated with business stakeholders to define data requirements
  • Developed automated reporting solutions to provide real-time insights
  • Participated in code reviews and provided constructive feedback to peers
  • Boosted performance of machine learning models by preprocessing large volumes of raw data for feature extraction and selection
  • Developed custom scripts for data cleansing, ensuring consistency and accuracy across various datasets
  • Provided reliable and secure access to sensitive information by enforcing strict authorization policies in line with company guidelines

Data Engineer

NTT Data
2019.10 - 2022.05
  • NTT DATA, part of NTT Group, is a global IT and business services provider headquartered in Tokyo
  • The company helps clients transform through consulting, industry solutions, business process services, IT modernization, and managed services
  • Executed big data analytics initiatives using Hadoop, Spark, and AWS
  • Developed Spark scripts for data aggregation and transformation
  • Automated data ingestion processes using Python and Apache Airflow
  • Migrated data to Snowflake and optimized ETL workflows for better performance
  • Implemented data validation and reconciliation processes to ensure data quality
  • Designed and developed data pipelines using AWS Glue for data transformation and loading
  • Managed and monitored data infrastructure to ensure high availability
  • Conducted performance tuning and optimization of data processes
  • Provided technical guidance and support to junior team members
  • Collaborated with data analysts to develop actionable insights
  • Implemented data governance policies to ensure data integrity and compliance
  • Conducted extensive troubleshooting to identify root causes of issues and implement effective resolutions in a timely manner
  • Collaborated with system architects, design analysts, and others to understand business and industry requirements
  • Developed and delivered business information solutions

Data Analyst

ADP Inc
2018.07 - 2019.09
  • ADP is a global provider of human capital management solutions
  • Migrated workflows from development to production environments
  • Performed data analysis and profiling, working with data transformation and quality rules
  • Utilized Kubernetes and Docker for managing containerized applications
  • Ingested real-time data using Flume, Kafka, and Spark Streaming
  • Developed ETL processes using SSIS for data extraction, transformation, and loading
  • Created and maintained data models to support business reporting
  • Implemented data quality checks and validation processes
  • Developed dashboards and visualizations using Tableau and Power BI
  • Collaborated with business stakeholders to gather and analyze requirements
  • Provided training and support to end-users on data tools and best practices
  • Documented data processes and created user guides for reference
  • Produced monthly reports using advanced Excel spreadsheet functions
  • Maintained up-to-date knowledge of industry trends and advancements in data analytics, enhancing the adaptability of solutions provided
  • Generated standard and custom reports to provide insights into business performance

Education

Master of Science - Computer Technology

Eastern Illinois University
Charleston, IL
12.2023

Bachelor of Science - Computer Science

Jawaharlal Nehru Technological University
India
05.2018

Skills

  • Technical Skills
  • Hadoop Ecosystem: HDFS, HBase, MapReduce, Spark, Kafka
  • Programming Languages: Python, SQL
  • ETL Tools: Talend, Informatica, Apache NiFi, AWS Glue
  • Data Orchestration: Apache Airflow
  • Databases: SQL Server, MySQL, Oracle, MongoDB, Cassandra
  • Cloud Platforms: AWS (S3, EC2, Redshift, Lambda, Glue), GCP (BigQuery, Dataflow)
  • Data Warehousing: Snowflake, Amazon Redshift
  • Data Lakes: Amazon S3, Google Cloud Storage
  • Data Modeling: SQL, Data Modeling, ETL
  • Big Data Technologies: Hive, Pig, Spark Streaming
  • CI/CD Tools: Jenkins, Git
  • Reporting Tools: Tableau, Power BI, Microsoft Excel
  • Operating Systems: Linux, Ubuntu, CentOS
  • Data Security
  • XML Web Services
