Experienced data professional with a proven track record in architecting and optimizing robust data ecosystems. Proficient in diverse programming languages, big data technologies, and cloud platforms. Skilled in ETL pipeline development, real-time analytics, and data modeling. Adept at driving data-driven insights through advanced analysis and visualization techniques. Collaborative team player with expertise in enhancing user experiences through dynamic web applications. Strong commitment to data security, compliance, and ethical data handling.
Overview
5
years of professional experience
Work History
Data Engineer
Kanap Systems LLC
03.2024 - Current
Spearheaded the development and deployment of comprehensive data pipelines, ensuring efficient data flow and processing from ingestion to consumption.
Enhanced data processing efficiency by implementing best practices in data pipeline design, resulting in a significant reduction in processing time.
Successfully migrated fraud and risk models from Hortonworks to AWS, optimizing performance and scalability.
Extracted datasets from AWS and performed data transformations.
Conducted rigorous input and output data quality checks, ensuring that all data loaded into databases met the required standards of accuracy and reliability, thus supporting high-quality model outputs.
Ensured data pipelines adhered to security and compliance standards by implementing IAM roles and policies that align with organizational and regulatory requirements.
Developed and managed Airflow DAGs to automate and schedule complex data workflows, ensuring reliable and timely execution of data processing tasks.
Designed and implemented a calendar widget for configuring EMR settings within Airflow, streamlining workflow management and improving user experience.
Data Engineer
Tekcypher Solutions
06.2020 - 07.2021
Designed and optimized data warehousing solutions, incorporating Hadoop, Hive, and Snowflake to manage extensive datasets with peak efficiency.
Ensured robust data modeling and integrity in alignment with industry best practices.
Improved ETL processes through Apache Spark, Kafka, and PySpark, supporting the data flow from various sources to streamline analytics and reporting.
Advanced automation and optimization of data workflows, leveraging Apache Airflow for efficient scheduling, monitoring, and failover mechanisms.
Emphasized creating flexible cloud architectures that fulfill dynamic data processing needs.
Applied SQL, Python (Pandas, NumPy), and Spark MLlib to extract actionable insights from raw datasets, enabling data-driven decision-making.
Software Developer
Rhyive software
01.2020 - 04.2020
Crafted intricate data models for efficient storage and retrieval, leveraging SQL and NoSQL databases, enhancing analysis capabilities.
Employed Kafka and Spark Streaming for real-time insights from data streams, using Python and Scala for effective processing.
Used AWS and Azure to manage and process data at scale, with cloud services such as S3 and Azure Data Factory.
Incorporated ML models into analytics workflows using Python and R, supporting predictive and prescriptive analysis.
Maintained data quality and compliance, implementing policies with tools such as Collibra and Alation for efficient data stewardship.
Safeguarded ethical data handling, protecting privacy and trust through anonymization techniques and compliance with data regulations.
Established and executed automated pipelines with Airflow, using Python and SQL for seamless data extraction and transformation.
Developed impactful visuals using Power BI and Tableau, incorporating data storytelling to convey insights to diverse audiences.
Collaborated on data security, employing cryptography and network protocols for data protection and threat detection.
Education
Master's - Computer Science
Villanova University
Villanova, PA
12.2023
B.Tech. - Computer Science and Engineering
Andhra University
India
05.2020
Skills
Programming Languages: Python, SQL, R, Java
Big Data: Spark, Kafka, Hadoop, Hive, Oozie, MapReduce, HDFS