Proven Data Engineer with a history of driving excellence on both sides of a data engineering team. Committed to contributing to effective data engineering and excited to keep strengthening my career and expanding my skill set.
Experienced Data Engineer and Cloud Infrastructure Specialist with over a decade of expertise in designing and implementing scalable data solutions and cloud architecture. Proficient in leveraging cloud platforms to optimize data processing and storage, driving efficiency and innovation. Adept at collaborating with cross-functional teams to deliver robust and reliable data infrastructure.
Overview
15 years of professional experience
Work History
Senior Consultant (Data Engineer)
HCL Singapore PTE LTD
11.2023 - Current
Led the migration of data pipelines from on-premises to Azure/AWS
Designed and implemented a real-time data processing system using Apache Kafka and Spark, improving data processing speed by 50%
Architected and deployed scalable cloud infrastructure using Kubernetes, enhancing system reliability and scalability
Collaborated with data scientists and analysts to develop data models that enhanced business decision-making processes
Maintained data pipeline uptime of 99.9% while managing data from multiple sources
Implemented and supported Azure/AWS cloud environments across various projects, including solutioning
Implemented and supported the enterprise Hadoop environment, including design, capacity planning, cluster setup, performance tuning, monitoring, infrastructure planning, and scaling up/out
Developed and maintained cloud infrastructure on AWS and Azure, ensuring high availability and security
Implemented infrastructure as code (IaC) using Terraform and Ansible, reducing deployment times by 40%
Managed containerized applications using Docker and Kubernetes, improving deployment efficiency and scalability
Worked closely with development teams to optimize cloud resource utilization and reduce costs
Conducted regular security audits and implemented best practices to safeguard cloud environments
Worked on Ericsson Expert Analytics (EEA) network software and the Hadoop ecosystem environment
Architected Hadoop clusters, translating functional and technical requirements into detailed architecture and design
Worked exclusively on the Cloudera distribution of Hadoop
Designed Hadoop cluster environments suited to application and data consumption
Implemented automation using scripts
Designed and implemented replication and backups for mission-critical/tier-1 applications
Recommended and implemented in-depth tuning for infrastructure and applications
Strong understanding of HDFS and its replication methods
Installed and configured a fully distributed, multi-node Hadoop cluster with a large number of nodes
Installed and configured Cloudera Manager to simplify management of the existing Hadoop cluster
Set up high availability for a major production cluster and designed automatic failover control using ZooKeeper
Configured Flume for efficient collection, aggregation, and transformation of large volumes of log data from various sources into HDFS
Collected and aggregated large amounts of streaming data into HDFS using Flume, defining channel selectors to multiplex data into different sinks
Implemented Kerberos authentication for the existing cluster
Troubleshot production-level issues in the cluster and its functionality
PROFESSIONAL-1 (Big Data Engineer)
DXC TECHNOLOGY PVT LTD
12.2015 - 06.2019
Researched and recommended innovative and, where possible, automated approaches for system administration tasks
Identified approaches that leveraged existing resources, provided economies of scale, and simplified remote/global support
Demonstrated a deep understanding of Hadoop design principles, cluster connectivity, security, and the factors that affect distributed system performance
Managed the Hadoop environment with Kerberos authentication
As a Hadoop administrator, supported the creation of Hadoop deployment POCs and played an important role in Hadoop deployment decisions
Managed large-scale Cloudera Hadoop cluster environments, handling all Hadoop environment builds, including design, capacity planning, cluster setup, performance tuning, and ongoing monitoring
Led troubleshooting on Hadoop technologies including HDFS, MapReduce, YARN, Hive, Pig, HBase, Sqoop, and Spark
Worked with core production support personnel in IT and Engineering to automate deployment and operation of the infrastructure
Worked with, and as a member of, the IT O&M group as required to refine production capabilities: testing, kernel issues, compatibility, and deployment of new versions of custom software
Identified hardware and software technical problems, storage issues, and related system malfunctions
Created metrics and measures of utilization and performance
Performed capacity planning and implemented new/upgraded hardware and software releases, including for storage infrastructure
Assisted in monitoring the Linux community and reported important changes/enhancements to the team
LINUX ADMIN
COLLABERA TECHNOLOGY PVT LTD
09.2014 - 11.2015
Installed, administered, and supported Linux operating systems and hardware in an enterprise environment
Applied system administration and programming skills such as storage capacity management and performance tuning
Performed Red Hat Linux server kernel patching in the production environment
Managed Oracle disks with EMC PowerPath/multipathing and coordinated with application/DB teams
Troubleshot performance issues, memory utilization, filesystem slowness, NIC bonding, and huge pages configuration on database servers, along with iptables configuration and SELinux management
Monitored cluster health and performed tuning activities
Performed capacity planning and expansion activities, working across infrastructure and other enterprise services teams
Performed cluster maintenance with patching/upgrades/migration, user provisioning, automation of routine tasks, and re-processing of failed jobs
Administered and maintained the Linux-based environment, resolving all server-related issues and client queries
LINUX ADMIN
RENOVISION AUTOMATION SERVICE PVT LTD
08.2010 - 03.2013
Managed all Linux, VMware, and EPrints servers
Administered and maintained the Linux-based environment, resolving all server-related issues and client queries
Installed Linux servers and handled various Linux distributions such as Red Hat, Fedora, and Ubuntu
Installed and configured NFS server and clients, NIS/LDAP server and clients, Samba server and clients, FTP server and clients, web server, DNS, MySQL server, and backups
Installed and maintained a RADIUS server with LDAP server authentication
Installed Perl, Python, and all required software
Installed multipath, which lets a server communicate with storage over multiple physical connections between its host bus adapters and the storage
Installed and monitored scientific applications such as conifer, eigenstate, and R
Handled SGI Management Center, user management, and adding or removing patches on servers
Installed VMware Server, created new virtual machines, troubleshot VMware server issues through the vSphere Client, and performed and managed P2V and V2V conversions
Created SAN storage LUNs, mounted them over Fibre Channel on HBAs, and assigned them to groups with permissions
Maintained and monitored SAN (Storage Area Network) systems, including HP EVA 4400 and MSA2000i
Education
MBA - Business Analytics
BITS Pilani University
01.2024
BCA
SIKKIM MANIPAL UNIVERSITY
01.2011
12TH
UP BOARD ALLAHABAD
01.2007
10TH
UP BOARD ALLAHABAD
01.2004
Skills
Python Development Skills
SQL Data Analysis
Hadoop Proficiency
MySQL Database Management
PostgreSQL
Proficient in MongoDB
Azure ETL Solutions
Proficient in Google Cloud Dataflow
Proficient in Azure Databricks
Amazon Web Services
Microsoft Azure Expertise
Google Cloud Platform Expertise
Infrastructure Automation
Kubernetes Orchestration
Ansible Automation
Data Lake Implementation
Pandas Data Manipulation
Proficient in NumPy
ETL Process Integration
Hive Management
Data Warehouse Management
HBase Administration
CDH
Data Management with HDFS
DevOps Practices
Red Hat Linux Expertise
Global Certifications
Certified Azure Solution Architect
Certified Azure AI Associate
Certified Azure Security Technologies
Certified Azure DevOps Engineer Expert
Certified Azure Data Engineering Associate
AWS Certified Solutions Architect - Associate
Google Cloud Certified Associate Cloud Engineer
Red Hat Certified System Administrator (RHCSA), certification ID: 100-116-375
Microsoft Certified Professional (MCP): Installing & Configuring Microsoft Windows XP Service Pack 2 (070-270)