Guranjit Parmar

Bellerose, NY

Summary

Seasoned Big Data Engineer with experience in financial technology and retail. Well-versed in big data technologies, including Hadoop, Spark, Python, and ETL tools. Proven capabilities in data migration, visualization, warehousing, and database management.

Overview

4 years of professional experience

Work History

Data Engineer/Spark Dev

Molina Healthcare
03.2021 - Current
  • Used cloud-based storage services such as Amazon S3 for object storage, and data warehousing solutions such as Amazon Redshift and Azure Synapse Analytics, for efficient storage and querying of large datasets
  • Developed ETL pipelines to process and analyze claims data from various sources, ensuring accurate and timely information for both clients and internal teams
  • Worked with Apache Spark for large-scale data processing, Apache Kafka for real-time data streaming, Apache Airflow for workflow orchestration, SQL for data transformation, and Python for scripting
  • Used Amazon Redshift for data warehousing and SQL to query and manipulate data
  • Helped create logical data models to capture the structure of data elements and the relationships between them
  • Modified Python scripts for data validation checks and quality assurance processes to ensure the accuracy, consistency, and integrity of the data
  • Used Apache Airflow with the dev team to create and manage complex ETL workflows, schedule data pipelines, and keep data processing tasks running smoothly
  • Processed large volumes of data with Apache Spark, handling both batch and real-time workloads

Data Engineer

Capgemini
12.2019 - 02.2021
  • Worked with AWS to store raw and processed data
  • Created and managed S3 buckets to store structured, semi-structured, and unstructured data which included log files, CSV files, and Parquet files
  • Used Apache Kafka to ingest data from various sources, including databases and APIs
  • Cleaned and transformed raw data with Spark and Hadoop into formats better suited to the team's data analysis and reporting
  • Built and managed ETL pipelines to migrate data from source systems to their target destinations
  • Monitored job scheduling and handled job failures with AutoSys
  • Managed code in GitHub and stayed up to date with the team's latest changes and commits
  • Created and maintained database schemas using SQL
  • Queried data in SSMS daily to refine and structure the data pulled for batch jobs

Education

Some College (No Degree) - Finance

Baruch College of The City University of New York
New York, NY

Certificate - Data Engineering

TechScope
New York, NY
06.2019

High School Diploma

Bronx High School of Science
Bronx, NY
06.2017

Skills

  • Extensive knowledge of and hands-on experience with big data tools and technologies (Spark, PySpark, Hadoop, SQL)
  • Combining data from various sources, such as databases, APIs, and files, into a unified format
  • Designing and creating data models that reflect the structure and relationships within the data
  • Extracting data from source systems, transforming it into a suitable format, and loading it into target databases or data warehouses
  • Storing and managing large volumes of data in a centralized repository optimized for querying and analysis
  • Ensuring data accuracy, consistency, and reliability through validation, cleansing, and error handling
  • Writing and optimizing SQL queries to retrieve and manipulate data
  • Handling and processing large-scale data using frameworks like Hadoop and Spark
  • Using cloud-based technologies such as AWS and Azure for data storage, processing, and analytics
  • Managing code and scripts using tools like Git to track changes and collaborate effectively

Hobbies

Basketball, fitness, leisure reading, education, film, meditation

Languages

English
Native or Bilingual
Hindi
Native or Bilingual
Punjabi
Native or Bilingual
