Results-driven Data Engineer with 9+ years of experience in Big Data technologies, specializing in building scalable data pipelines and large-scale data transformations. Expertise in Hadoop, Sqoop, Hive, Spark, AWS, Azure, SQL, and Python. Strong background in distributed computing, cloud platforms, and real-time data processing. Passionate about optimizing performance and driving business insights through data engineering solutions.
Overview
9 years of professional experience
Work History
Hadoop Engineer
Bank of America
Plano, TX
07.2023 - Current
Developed data lake applications for processing structured and unstructured data for analytics
Designed and implemented ETL pipelines using PySpark and SQL for data extraction, transformation, and loading (illustrated in the sketch after this role's highlights)
Optimized Spark applications for high-performance data processing
Integrated AWS services such as S3, EMR, Glue, Redshift, and Lambda for cloud-based data solutions
Worked with Kafka Streaming for real-time data ingestion
Performed data analysis using Hive and Spark SQL on Parquet tables
Developed a security framework for fine-grained access control on AWS S3 using Lambda
Implemented CI/CD pipelines for container-based applications using Azure Kubernetes Service (AKS)
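The ETL and Spark bullets above describe a common PySpark pattern. The sketch below is a minimal, hypothetical illustration of that pattern (read raw data, clean it, write partitioned Parquet); the S3 paths and column names are placeholders, not details from the actual pipelines.

```python
# Minimal PySpark ETL sketch: read raw CSV, clean, write partitioned Parquet.
# All paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

raw = spark.read.option("header", True).csv("s3://raw-bucket/transactions/")

cleaned = (
    raw.dropDuplicates(["txn_id"])
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount").isNotNull())
)

# Partitioning by date keeps downstream Hive/Spark SQL scans pruned
# to only the partitions a query actually needs.
cleaned.write.mode("overwrite").partitionBy("txn_date").parquet(
    "s3://curated-bucket/transactions/"
)
```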
Hadoop Engineer
Bank of America
Plano, TX
05.2022 - 03.2023
Built and maintained big data pipelines leveraging Hadoop, Spark, and Hive
Imported and transformed large datasets using Sqoop, Hive, and PySpark
Utilized Snowflake and SQL for data warehousing and analytics
Set up workflow orchestration using Apache Airflow and Oozie (see the DAG sketch after this list)
Conducted performance tuning of long-running Spark applications
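The orchestration bullet above maps to a standard Airflow pattern. The DAG below is a minimal Airflow 2.x sketch; the DAG id, schedule, and shell commands are illustrative placeholders, not details from the role.

```python
# Minimal Airflow 2.x DAG sketch: a daily ingest step feeding a Spark job.
# dag_id, schedule, and bash commands are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_ingest_sketch",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="sqoop_import",
        bash_command="echo 'sqoop import ...'",  # placeholder for real import
    )
    transform = BashOperator(
        task_id="spark_transform",
        bash_command="echo 'spark-submit transform.py'",  # placeholder
    )
    # Run the Spark transform only after the import succeeds.
    ingest >> transform
```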
Software Engineer
JPMorgan Chase - Cognizant
TX
06.2016 - 01.2020
Designed and implemented end-to-end data pipelines on Hortonworks Hadoop cluster
Optimized data storage and processing using Avro, Parquet, and ORC formats
Created ETL workflows using Spark, Hive, and Kafka
Developed Monte Carlo simulations using Pandas (Python) to generate portfolio risk assessments (see the sketch after this list)
Automated deployment and cluster monitoring using Cloudera Manager, Nagios, and Ansible
Designed Kafka custom encoders to enhance real-time data streaming efficiency
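As a hedged illustration of the Monte Carlo bullet above: the sketch below uses NumPy and pandas to simulate independent, normally distributed asset returns and read off a 95% value-at-risk. The asset statistics, weights, and horizon are synthetic placeholders, not figures from the engagement.

```python
# Minimal Monte Carlo portfolio-risk sketch with NumPy/pandas.
# Asset means, volatilities, and weights are synthetic placeholders.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

mean = np.array([0.0004, 0.0002])   # assumed daily mean returns, 2 assets
std = np.array([0.010, 0.015])      # assumed daily volatilities
weights = np.array([0.6, 0.4])      # portfolio weights

n_sims, horizon = 10_000, 10        # 10-day risk horizon

# Draw independent normal daily returns: shape (sims, days, assets).
sims = rng.normal(mean, std, size=(n_sims, horizon, 2))

# Portfolio return per day, then cumulative return over the horizon.
portfolio = (sims @ weights).sum(axis=1)

var_95 = -np.percentile(portfolio, 5)  # 95% VaR as a positive loss figure
print(pd.Series(portfolio).describe())
print(f"95% 10-day VaR: {var_95:.4f}")
```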
Education
Master’s degree - Information Technology Management
Webster University
12.2024
Skills
Hadoop
MapReduce
YARN
Hive
Pig
HBase
Kafka
Oozie
Spark
RDD
DataFrame
Dataset API
Spark SQL
Spark Streaming
Python
PySpark
Pandas
NumPy
Matplotlib
Seaborn
Java
Scala
Shell Scripting
AWS
S3
EMR
Glue
Redshift
Lambda
Kinesis
Athena
Azure
Data Lake
Data Factory
Databricks
HDInsight
DevOps
PostgreSQL
MySQL
Netezza
Cassandra
MongoDB
Snowflake
Teradata
Apache Airflow
GitLab CI/CD
Autosys
Tableau
Splunk
BO Reports
IntelliJ
PyCharm
Ambari
JIRA
Bitbucket
Agile
Scrum
Waterfall
Additional Information
Strong understanding of Data Warehousing, OLTP, OLAP, and Dimensional Modeling
Experience with NoSQL databases such as MongoDB, Cassandra, and HBase
Excellent problem-solving skills with a proactive approach to adopting new technologies
Strong communication and teamwork skills with experience in Agile/Scrum methodologies