• Managing and setting up resources on the Microsoft Azure platform, including configuring Azure Virtual Networks (VNets), subnets, and Azure network settings, establishing VNet peering, handling DNS configurations, implementing security policies, and creating private endpoints.
Examine and evaluate the application and database framework to engineer an optimized and resilient solution on the Microsoft Azure Cloud platform. Implement cost-efficient strategies to manage and reduce expenditures, in accordance with client specifications.
Architect, configure, and automate multi-DC and hybrid Cassandra clusters. Engage in advanced NoSQL distributed cluster governance, encompassing the integration and decommissioning of nodes, strategic capacity planning, performance calibration, cluster surveillance, and intricate troubleshooting.
Play a pivotal role in formulating the technology strategy by assessing and integrating new technology solutions. Aggregate and analyze performance metrics to present to executive leadership for strategic validation and approval.
Upgraded Apache Cassandra from 3.11 to 4.1.4 with zero application downtime.
Assess the data model in detail to identify and correct data distribution imbalances within each data center of the Cassandra cluster that may degrade application performance.
Architected and deployed hybrid production and non-production environments that integrate cloud and on-premises infrastructure, and scaled their capacity to meet the demands of all applications.
• Orchestrated the migration of Oracle databases from on-premises systems to Azure Database for PostgreSQL (Platform as a Service), using the ora2pg and pgloader tools for data extraction, transformation, and ingestion.
Deployed Dynatrace agents across all Cassandra virtual machines to aggregate monitoring metrics and established tailored alerting mechanisms to capture precise performance data, enhancing troubleshooting capabilities and enabling accurate issue diagnosis.
Formulating and deploying advanced disaster recovery and high availability architectures for heterogeneous database technologies, including Cassandra, Azure Cosmos DB, PostgreSQL, Couchbase, and Oracle.
Engage in and contribute to ongoing enhancement initiatives to refine performance and conduct proactive maintenance, ensuring sustained application and database uptime. Oversee monitoring operations and systematically document incidents, configuration changes, and remediation strategies.
Convene with the application team to gain insights into application and database architectures, providing strategic recommendations on industry best practices. Participate in review sessions for deliverables associated with releases and project milestones.
Collaborate closely with the release management team to transition changes into the production environment and conduct comprehensive sanity testing.
Implemented the segmentation of a large Cassandra cluster spanning terabytes of data and facilitated the migration to a new Cassandra cluster to ensure optimal data distribution across the nodes.
Optimize performance and run stress tests to proactively diagnose and address issues. Administer and manage clusters using tools such as OpsCenter, DevCenter, and nodetool.
Deployed and configured Prometheus for metric collection, and integrated Grafana for the graphical user interface to visualize database metrics. Established critical alerting mechanisms to ensure timely issue detection.
Provisioned and scaled Presto by adding worker nodes, then tuned performance to efficiently extract large datasets in the terabyte range.
• Proficient with infrastructure automation tools such as Ansible, having streamlined and automated numerous operations tasks and workflows with them. Ensure that all Astra alerts and target remediations are addressed to maintain compliance with the company's security standards.
Work closely with Compliance and Site Reliability Engineering to ensure the managed Azure infrastructure is performant, healthy, patched, secure, and all SLOs and SLAs are met.
Leverage my technical proficiency to guide and mentor team members, formalize best practices in documentation, and lead knowledge transfer sessions on new initiatives and modifications.
Linux, Solaris, AIX, Windows, Oracle, SQL Server, PostgreSQL, Couchbase, Cassandra, MongoDB, RMAN, OEM, TOAD, Data Pump, Veritas NetBackup, PuTTY, Veritas Cluster Manager, Presto/Zeppelin, pgAdmin, nodetool
MCA, Osmania University, Hyderabad, India, 2008
Telecom Company, 02/01/14 to Present
Managing 100+ enterprise-level Oracle 19c database systems, achieving 99.99% availability at an acceptable level of performance.
Installed and configured Oracle GoldenGate unidirectional table-level and schema-level replication between homogeneous Oracle-to-Oracle RDBMS environments.
Proactively managing standby replication methods such as streaming replication and hot standby for PostgreSQL disaster recovery.
Responsible for backup, recovery, and upgrades of all PostgreSQL databases.
Supported Cassandra production clusters, including adding and removing nodes, installation, upgrades, performance tuning, cluster monitoring, and troubleshooting.
Installed and configured a multi-node, fully distributed Cassandra cluster.
Installed, configured, and administered Apache Cassandra 3.x.
Decommissioned and commissioned nodes on running clusters.
Managed Cassandra clusters using DataStax OpsCenter, DevCenter, and nodetool utility commands.
Experience in setting up the required replication factors for keyspaces in Cassandra.
Exposure to application monitoring tools such as Splunk, Nagios, OpsCenter, Prometheus, and Grafana.
Troubleshot read/write latency and timeout issues in Cassandra.
Tuning Oracle databases with OEM Grid Control, Explain Plan, TKPROF, AWR, ADDM, ASH, and SQL Tuning Advisor (sqltrpt) to find and tune the top wait events and the top SQL statements by elapsed time, CPU, and I/O; creating indexes, SQL profiles, and SPM baselines to improve query performance.
BMS (Bristol-Myers Squibb), 04/01/12 to 01/01/14
Set up GoldenGate and used it to migrate databases with the initial-load expdp method; created definition files and enabled OGG monitoring on servers.
Used conflict detection and resolution techniques, configured GoldenGate parameter files, and troubleshot issues in v11 and 12c GoldenGate unidirectional, bidirectional, and peer-to-peer replication environments.
Data refresh from production to development and test databases.
Database restore and recovery using RMAN and manual methods.
Applying CPU/PSU patches on the database.
Managing OCR and voting disks and performing restore and recovery operations.
Administering Oracle Clusterware and the RAC database instances using the CRSCTL, SRVCTL, OCRCONFIG, and CLUVFY utilities.
Troubleshooting RAC instances and listeners.
BP, 05/01/09 to 02/01/12
Performed day-to-day database administration tasks such as checking tablespace usage, alert logs, and trace files, and monitoring disk usage, table/index analyze jobs, and database backup logs.
Worked with Oracle Enterprise Manager (OEM) 12c/13c to configure databases and their components, and generated AWR, ASH, and ADDM reports to analyze database performance and identify bottlenecks.
Data refresh from production to development and test databases.
Database restore and recovery using RMAN and manual methods.
Cloning databases with various strategies such as cold, hot, and RMAN.
Involved in production database upgrades from 9i to 10g.
RMAN backup setup for newly cloned instances.
Applying CPU/PSU patches on the database.
Managing Data Guard and ASM databases.