
Paramesh

Minneapolis, USA

Summary

A seasoned Sr Hadoop Engineer with a proven track record at Prime Therapeutics, I excel in deploying and managing Hadoop clusters, enhancing data security and governance with Apache Ranger and Atlas. Skilled in Ansible and Python, my expertise in automating and optimizing data operations significantly boosts system efficiency. Demonstrating strong problem-solving abilities and a commitment to continuous improvement, I consistently deliver high-quality solutions that meet and exceed employer expectations.

Overview

15 years of professional experience
1 Certification

Work History

Sr Hadoop Engineer

Prime Therapeutics
Minneapolis, USA
07.2019 - Current
  • Installed and configured Hadoop clusters with the Cloudera distribution of Hadoop, versions CDH 5.X to CDH 6.X
  • Upgraded clusters from CDH 6.X to CDP 7.1.6
  • Migrated existing clusters from CDH 6.X to CDP 7.1.6 without impacting existing data
  • Experience configuring, installing, and managing Apache Hadoop on both Cloudera and Hortonworks distributions
  • Extensive experience installing new servers and rebuilding existing servers
  • Implemented and managed Apache Ranger policies within Cloudera Data Platform for fine-grained access control
  • Maintained Ansible as the configuration management tool to apply Hadoop config changes across clusters, revert configurations to previous versions, and replace misconfigured components
  • Built Hadoop server pre-checks and pre-installs using DevOps tools such as Ansible
  • Monitored platform health, introduced and implemented a structured, granular reporting and tooling framework, and generated performance reports and KPIs to sustain improvements
  • Integrated services such as Ranger, Atlas, and Zeppelin with Active Directory
  • Created Ranger policies for HDFS, Hive, Atlas, and Kafka services
  • Installed and configured Apache Atlas for metadata tagging to support future attribute-based access control with Ranger, data lineage auditing, and linking business taxonomies to metadata for organizing and visualizing data
  • Enabled HDFS encryption using the CLI and Ranger
  • Configured and managed CDP Replication Manager to synchronize data seamlessly across clusters, enabling real-time or scheduled replication for business-critical datasets
  • Created policies and scheduled jobs in replication manager to replicate the HDFS/HIVE/HBase data between prod clusters
  • Integrated Apache Atlas classifications seamlessly with Apache Ranger tags, establishing a unified framework for data governance and access control
  • Designed and implemented fine-grained access control policies using Cloudera Ranger for Hadoop components such as HDFS, Hive, and HBase
  • Managed and enforced authorization policies to control user access to data
  • Developed and managed security policies through the Cloudera Ranger administration interface
  • Customized policies based on business requirements and compliance standards
  • Created and maintained documentation for configurations, best practices, and troubleshooting guides related to Cloudera Ranger and Apache Atlas
  • Designed, implemented, and maintained complex data workflows and pipelines using Apache Airflow to orchestrate tasks across various components within the Cloudera CDP ecosystem (a representative DAG sketch follows this list)
  • Demonstrated expertise in integrating Apache Airflow with Cloudera components such as Hadoop Distributed File System (HDFS), Hive, Spark, and Impala for seamless data processing and analytics
  • Experienced across various platforms and applications; coordinated with related support and implementation teams, assisted in customizing applications to the client's business requirements, and worked on incidents, problems, and change requests
  • Utilized Airflow's parameterization features to make workflows configurable, allowing for easy adaptation to different environments and datasets within the Cloudera ecosystem
  • Integrated Apache Airflow with Cloudera CDP APIs for automation and management of Cloudera services, enhancing the overall efficiency of data operations
  • Improved system performance by conducting prep and stress tests to fine-tune services
  • Wrote scripts to back up NameNode metadata, MySQL databases, and configurations with retention periods as part of the disaster recovery process
  • Recognized and adopted best practices in data processing, reporting, and analysis with respect to data integrity, test design, validation, and documentation
  • Collaborated with application teams to install operating system and Hadoop updates, patches, and version upgrades when required
  • Served as point of contact for vendor escalations
  • Deployed and managed Kafka clusters on AWS MSK, configuring Kafka brokers, topics, partitions, and data retention policies based on application needs and ensuring seamless data flow
  • Used Terraform to automate the deployment of Kafka clusters on AWS MSK, creating reusable infrastructure-as-code templates to improve deployment efficiency and accuracy
  • Built data ingestion pipelines using Kafka Connect to integrate with external data sources such as Amazon S3 and HDFS, enabling continuous data streaming for downstream analytics applications
  • Monitored Kafka clusters using Prometheus and Grafana, identified performance bottlenecks, and fine-tuned Kafka configurations (batch size, compression type, broker settings) to improve throughput and latency
  • Conducted root cause analysis and troubleshooting of Kafka performance issues, coordinating with engineering and support teams to ensure timely resolution
  • Created comprehensive documentation on Kafka architecture, cluster management, troubleshooting steps, and best practices
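
Below is a minimal sketch, in Python, of the kind of Airflow DAG referenced above for orchestrating HDFS, Hive, and Spark steps on CDP; the DAG name, paths, JDBC URL, and job scripts are hypothetical placeholders, not the production configuration.

    # Minimal Airflow DAG sketch (assumes Airflow 2.x): ingest to HDFS, refresh Hive, run a Spark job.
    # All paths, endpoints, and names below are hypothetical placeholders.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {
        "owner": "hadoop-platform",
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
    }

    with DAG(
        dag_id="cdp_daily_claims_refresh",        # hypothetical pipeline name
        schedule_interval="@daily",
        start_date=datetime(2023, 1, 1),
        catchup=False,
        default_args=default_args,
    ) as dag:
        # Land raw files into HDFS (placeholder paths).
        ingest = BashOperator(
            task_id="ingest_to_hdfs",
            bash_command="hdfs dfs -put -f /data/staging/claims/*.csv /warehouse/raw/claims/",
        )

        # Refresh a Hive table via beeline (placeholder JDBC URL and HQL script).
        hive_refresh = BashOperator(
            task_id="hive_refresh",
            bash_command=(
                "beeline -u 'jdbc:hive2://hive-host:10000/default' "
                "-f /opt/etl/hql/refresh_claims.hql"
            ),
        )

        # Run a Spark aggregation on YARN (placeholder application file).
        spark_agg = BashOperator(
            task_id="spark_aggregate",
            bash_command="spark-submit --master yarn --deploy-mode cluster /opt/etl/jobs/aggregate_claims.py",
        )

        ingest >> hive_refresh >> spark_agg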

Hadoop Administrator

Visa Inc
Austin, USA
09.2017 - 07.2019
  • Involved in Performance tuning at source, target, mappings, sessions, and system levels
  • Troubleshot and resolved Hadoop cluster-related system problems
  • Implemented the Fair Scheduler on the ResourceManager to share cluster resources among users' MapReduce jobs
  • Set up Sqoop connections for exporting and importing data between DB2 and HDFS (a sample import invocation is sketched after this list)
  • Good understanding of Partitioning concepts and different file formats supported in Hive
  • Involved in cluster capacity planning, deployment, and implementing POCs
  • Integrated HDFS with Active Directory and Cluster security with Kerberos
  • Provided Support for production clusters
  • Performed cluster upgrades / migrations
  • Involved in upgrading the cluster from CDH 5.11 to 5.13
  • Built non-prod clusters using Chef and actively participated in building production clusters
  • Performed various Maintenances in all Visa Hadoop clusters
  • Worked as SME for Production cluster
  • Worked with the UNIX team on remediating Qualys findings
  • Good understanding of fair/capacity schedulers
  • Additional responsibilities included interacting with the offshore team daily, communicating requirements, delegating tasks to offshore/onsite team members, and reviewing their deliverables
  • Experience using ServiceNow as a ticketing tool
  • Environment: Amazon Web Services, Amazon MSK, AWS EMR, HDFS, Map Reduce, YARN, Hive, HBase, Impala, Sqoop, Kafka, Spark and Kubernetes
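
Below is a minimal sketch, in Python, of the kind of Sqoop import referenced above for pulling a DB2 table into HDFS; the JDBC URL, credentials file, table, and target directory are hypothetical placeholders.

    # Minimal sketch: build and run a Sqoop import from DB2 into HDFS.
    # Connection details, table, and target directory are hypothetical placeholders.
    import subprocess

    sqoop_import = [
        "sqoop", "import",
        "--connect", "jdbc:db2://db2-host:50000/TXNDB",      # placeholder DB2 endpoint
        "--username", "etl_user",
        "--password-file", "/user/etl_user/.db2.password",   # password kept in HDFS, not on the CLI
        "--table", "CARD_TRANSACTIONS",                       # placeholder source table
        "--target-dir", "/warehouse/raw/card_transactions",
        "--num-mappers", "4",
    ]

    result = subprocess.run(sqoop_import, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface Sqoop's error output so the failure is visible in job logs.
        raise RuntimeError("Sqoop import failed:\n" + result.stderr)
    print("Sqoop import completed")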

Application Mgmt Sr. Advisor

Dell Inc.
Austin, USA
04.2016 - 06.2017
  • Experience in monitoring and managing the health of the data nodes
  • Performed capacity-planning analysis, monitored and controlled disk space usage on systems
  • Monitored system activities and fine-tuned system parameters and configurations to optimize job performance and ensure security of systems
  • Expertise in Hadoop cluster management, such as adding and removing nodes without affecting running jobs or data
  • In-depth knowledge of Hadoop architecture and components such as HDFS, YARN, MapReduce, ResourceManager, NodeManager, ApplicationMaster, and containers
  • Experience with configuration of Hadoop ecosystem components: Hadoop HDFS, MapReduce, Hive, Impala, Kafka, Storm, Spark, Sqoop, Oozie, HBase, Zookeeper, and Flume
  • Involved in creating AD groups, Hive databases, and the corresponding HDFS paths for each database
  • Created roles and granted access to users on an as-needed basis
  • Kept a close watch on the cluster through Cloudera Manager and Ambari
  • Worked closely with developers to help troubleshoot issues
  • Allocated HDFS space quotas in lab environments for POCs
  • Involved in preparing the process documentation of user and admin guide for Dell Data Reservoir
  • Implemented commissioning/decommissioning of nodes on the existing cluster
  • Experienced in configuring dynamic resource pooling
  • Monitored DataNode health checks and fixed servers with bad hard drives
  • Managed and reviewed Hadoop log files
  • Used Chef as the configuration management tool
  • Created Chef recipes to automate infrastructure and deployment processes
  • Managed nodes, run lists, roles, environments, data bags, cookbooks, and recipes in Chef
  • Set up automated 24x7x365 monitoring and escalation infrastructure for Hadoop cluster using Cloudera Manager and Ambari
  • Experienced in using Cloudera Manager and Ambari as end-to-end tools to manage Hadoop operations
  • Experience configuring Kerberos security, integrating with Active Directory, and managing Knox and Ranger configurations
  • Good knowledge of open-source configuration management tools such as Ansible
  • Experienced with Apache Kafka concepts such as topics, producers, consumers, and brokers
  • Experienced with Apache Storm, including Nimbus, Supervisors, and topologies
  • Implemented shell scripts to pull data from Hive to the local file system (a sample extraction is sketched after this list)
  • Adjusted the number of map and reduce slots per project requirements
  • Set up alerts to monitor cluster health so the team could take necessary action
  • Extensively involved in Cluster Capacity planning, Hardware planning, Installation, troubleshooting and Performance Tuning of the Hadoop Cluster
  • Worked on resolving production issues, documenting root cause analysis, and updating tickets using HP Service Manager
  • Environment: Hortonworks 2.X, HDFS, MapReduce, YARN, Hive, HBASE, Zookeeper, Kafka, Storm, Spark
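
Below is a minimal sketch, in Python, of the kind of Hive-to-local extraction referenced above, run through beeline; the HiveServer2 URL, query, and output path are hypothetical placeholders (the original scripts were written in shell).

    # Minimal sketch: export a Hive query result to the local file system via beeline.
    # The JDBC URL, query, and output path are hypothetical placeholders.
    import subprocess

    HIVE_JDBC_URL = "jdbc:hive2://hive-host:10000/default"   # placeholder HiveServer2 endpoint
    QUERY = "SELECT * FROM sales.daily_summary WHERE ds = '2016-01-15'"
    OUTPUT_FILE = "/tmp/daily_summary_2016-01-15.csv"

    cmd = [
        "beeline",
        "-u", HIVE_JDBC_URL,
        "-e", QUERY,
        "--outputformat=csv2",   # comma-separated output without beeline's table borders
        "--silent=true",
    ]

    with open(OUTPUT_FILE, "w") as out:
        # Stream beeline's stdout straight into the local file.
        subprocess.run(cmd, stdout=out, check=True, text=True)

    print("Wrote Hive extract to " + OUTPUT_FILE)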

Hadoop Administrator

Dell Inc.
Austin, USA
01.2015 - 03.2016
  • Experience in monitoring and managing the health of the data nodes
  • Monitored system activities and fine-tuned system parameters and configurations to optimize job performance and ensure security of systems
  • Expertise in Hadoop cluster management, such as adding and removing nodes without affecting running jobs or data
  • In-depth knowledge of Hadoop architecture and components such as HDFS, YARN, MapReduce, ResourceManager, NodeManager, ApplicationMaster, and containers
  • Experience with configuration of Hadoop ecosystem components: Hadoop HDFS, MapReduce, Hive, Impala, Kafka, Storm, Sqoop, Oozie, HBase, Zookeeper, and Flume
  • Kept a close watch on the cluster through Cloudera Manager and Ambari
  • Worked closely with developers to help troubleshoot issues
  • Allocated HDFS space quotas in lab environments for POCs
  • Implemented commissioning/decommissioning of nodes on the existing cluster
  • Experienced in configuring dynamic resource pooling
  • Monitored DataNode health checks and fixed servers with bad hard drives
  • Manage and review Hadoop Log files
  • Experienced in using Cloudera Manager and Ambari as end-to-end tools to manage Hadoop operations
  • Experience configuring Kerberos security, integrating with Active Directory, and managing Knox and Ranger configurations
  • Implemented shell scripts to pull data from Hive to the local file system
  • Adjusted the number of map and reduce slots per project requirements
  • Set up alerts to monitor cluster health so the team could take necessary action
  • Environment: Hortonworks 2.X, HDFS, MapReduce, YARN, Hive, HBASE, Zookeeper, Kafka, Storm, Spark

Associate Consultant

HSBC GLTM
Kuala Lumpur, Malaysia
07.2013 - 09.2014
  • Designed and coded per requirements
  • Reviewed coded programs
  • Coordinated unit and system testing phases
  • Prepared test scripts
  • Reviewed unit and integration test cases
  • Implemented exception handling using custom exceptions
  • Addressed critical issues and fixed bugs; also involved in code reviews and design discussions
  • Prepared estimates for minor improvements and enhancement work
  • Environment: COBOL, JCL, DB2, REXX, Java Multithreading, JSP, VSAM, CICS, QMF, Expediter, SPUFI, FILE-AID, DB2 utilities

Software Engineer

Polaris Software Labs
Chennai, India
08.2011 - 06.2013
  • Client: Morgan Stanley Smith Barney
  • Involved in designing use cases
  • Involved in design and implementation
  • Involved in writing SQL queries
  • Analysis of the business functionality of the system
  • Designed and coded per requirements
  • Coordinated unit and system testing phases
  • Reviewed unit and integration test cases
  • Member of the Defect Prevention Group
  • Designed and coded per client requirements
  • UTP preparation, unit testing, and system testing
  • Analyzed and coordinated with plant users to resolve problem tickets
  • Solved business-related issues within the system
  • Coordinated offshore work with onsite leads and SDMs
  • Communicated with Visteon business users and end users
  • Environment: COBOL, JCL, DB2, REXX, Java Multithreading, JSP, VSAM, CICS, QMF, Expediter, SPUFI, FILE-AID, Easytrieve TSO/ISPF, DB2 utilities

Systems Engineer

Polaris software labs
Hyderabad, India
02.2010 - 08.2011
  • Designed and coded per requirements
  • Reviewed coded programs
  • Coordinated unit and system testing phases
  • Prepared test scripts
  • Reviewed unit and integration test cases
  • Implemented exception handling using custom exceptions
  • Addressed critical issues and fixed bugs; also involved in code reviews and design discussions
  • Prepared estimates for minor improvements and enhancement work
  • Environment: COBOL, JCL, DB2, VSAM, CICS, QMF, Expediter, SPUFI, FILE-AID, DB2 utilities

Education

Master of Computer Science

Anna University
12.2009

Skills

  • Apache Kafka
  • AWS MSK
  • AWS EMR
  • AWS CloudWatch
  • SNS
  • CloudTrail
  • HDFS
  • YARN
  • Hive
  • Sqoop
  • HBase
  • Zookeeper
  • Storm
  • Spark
  • Solr
  • Impala
  • Hue
  • Atlas
  • Superset
  • Ambari
  • Cloudera Manager
  • DB2
  • MySQL
  • VSAM
  • AWS CLI
  • Terraform
  • Ansible
  • Python
  • Shell Scripting
  • Kerberos
  • Sentry
  • Knox
  • Ranger
  • Kafka Connect
  • Schema Registry
  • Docker
  • Kubernetes
  • Git

Certification

HDP Certified Administrator (HDPCA), http://bcert.me/sdskbprr

Timeline

Sr Hadoop Engineer

Prime Therapeutics
07.2019 - Current

Hadoop Administrator

Visa Inc
09.2017 - 07.2019

Application Mgmt Sr. Advisor

Dell Inc.
04.2016 - 06.2017

Hadoop Administrator

Dell Inc.
01.2015 - 03.2016

Associate Consultant

HSBC GLTM
07.2013 - 09.2014

Software Engineer

Polaris Software Labs
08.2011 - 06.2013

Systems Engineer

Polaris software labs
02.2010 - 08.2011

Master of Computer Science

Anna University