Master’s graduate in Computer Science with a strong academic foundation in data engineering, big data technologies, and cloud platforms. Skilled in designing and implementing ETL pipelines, building real-time and batch data workflows, and managing data lakes using AWS (Glue, Lambda, EMR, S3, Athena) and Apache Spark. Proficient in Python, SQL, and Azure Data Factory, with experience developing data models, ensuring data quality, and applying data governance best practices. Completed multiple academic and internship projects involving real-time streaming pipelines, AWS-based data lake architecture, and large-scale batch processing. Adept at creating interactive dashboards in Power BI and Tableau to support data-driven decision-making. Seeking an entry-level Data Engineer role to apply technical expertise, problem-solving skills, and a passion for building scalable, high-performance data solutions.
Technical Skills:
Programming Languages: Python, SQL
Cloud & Data Platforms: AWS (Glue, Lambda, EMR, S3, IAM, Athena, Step Functions, API Gateway), Azure
Big Data Technologies: Apache Spark (Core, SQL, DataFrame, MLlib), Hadoop (HDFS, MapReduce, Pig, Hive, HBase, YARN)
ETL & Data Integration: AWS Glue, AWS Step Functions, API Gateway, Airflow, Azure Data Factory, SQL
Business Intelligence & Reporting: Power BI (DAX, RLS, Calculated Columns, KPIs), Tableau (LOD, Row-Level Security, Calculated Fields, Parameters)
Data Modeling & Governance: IAM Policies, Role-Based Access Control (RBAC), AWS Lake Formation, Tableau Server Security
CI/CD & Version Control: Git, GitHub Actions, Jenkins, Git Tags
Data Quality & Validation: SQL Data Integrity Checks, Python Validation Scripts, Cross-table Reconciliation