Solutions Architect with extensive experience in designing and managing large-scale data infrastructures for enterprise organizations.
Proven expertise in database design, data modeling, and ETL processes, resulting in enhanced system performance, optimized data processing, and highly available and redundant frameworks.
Skilled in implementing data security measures and ensuring compliance with HIPAA, GDPR, and other data security regulations. Successful in driving business intelligence initiatives that align with organizational objectives.
Specialized in database and data warehouse migrations, both on-premises and to the cloud.
AWS Certified Solutions Architect – Professional, with additional AWS Specialty certifications in Machine Learning and Big Data Analytics.
PMP and Lean Six Sigma Green Belt certified.
Proficient in data lake architecture, ingesting, cataloging, processing, and securing data using various technologies in AWS.
Expert in engaging clients and business analysts to develop key performance indicators, leveraging the transformation capabilities of AWS, GCP, and Azure.
Experienced in communicating architectures to clients using visualization tools such as Draw.io, ER/Studio, DataGrip, and Visio.
Skilled at selecting BI and data warehouse solutions that fit each client's requirements.
Specialized in disaster recovery frameworks and cost-optimized solutions.
Overview
15 years of professional experience
1 Certification
Work History
AWS Data Architect
Metropolitan Police Department
Washington, United States
10.2023 - Current
Led the transition from SQL Server to AWS RDS, focusing on optimizing system efficiency and operational costs, resulting in estimated annual labor savings of $475,000. Utilized Snowflake for enhanced cloud data warehousing, improving data processing and storage capabilities.
Employed AWS Database Migration Service for the efficient ETL transition from SQL Server to AWS RDS. Ensured seamless integration with SQL Server's advanced features, complementing them with Snowflake's scalable data storage solutions.
Enhanced MPD's evidence analysis technology on AWS, leveraging AWS Glue for serverless ETL data integration. Used AWS Lambda for automating tasks in SQL Server, integrating these processes with DBT for efficient data modeling and transformations.
Improved stakeholder engagement and reporting capabilities by migrating to Amazon QuickSight and SQL Server SSRS, achieving significant annual labor savings. This involved integrating DBT to optimize data extraction and transformation processes.
Conducted a comprehensive optimization of AWS resources and SQL Server instances for ETL, employing AWS Cost Explorer for resource analysis. Implemented Snowflake for cost-effective data storage, complementing SQL Server's Data Compression and Resource Governor features.
Established AWS S3 for robust data storage solutions and implemented data lake strategies using AWS Lake Formation. Enhanced querying capabilities by leveraging Snowflake, integrated with Power BI for dynamic data visualization.
Carried out a proof of concept using Amazon Redshift Spectrum to handle open table formats such as Apache Iceberg, integrating these capabilities with DBT for more efficient data handling and analysis, supplemented by Power BI for visualization.
Streamlined the existing data pipeline in Azure, employing Amazon EMR and AWS Glue for effective ETL operations. Integrated these solutions with DBT for enhanced data processing and reporting, using Power BI for analytical insights.
Developed and maintained a CI/CD pipeline for data analytics using AWS CodePipeline and AWS CodeBuild. This included the integration of SQL Server tools with Snowflake, ensuring continuous improvement in ETL efficiency and data analytics performance.
Focused on cloud database solutions, specifically Amazon Redshift, leveraging its compatibility with Snowflake for advanced data warehousing. Emphasized the use of DBT for data transformation, enhancing SQL Server's ETL performance.
Architected an enterprise data model, integrating data from various sources. Used AWS Schema Conversion Tool for alignment with SQL Server and DBT, ensuring consistent performance and efficient data processing.
Built efficient data pipelines across multiple cloud platforms, utilizing AWS Data Pipeline and Amazon Kinesis for real-time data processing. Incorporated Snowflake for its robust data warehousing capabilities, enhancing overall data management efficiency.
Implemented a comprehensive data backup and recovery strategy with AWS Backup. Incorporated SQL Server's robust backup features, ensuring data security and minimal downtime in ETL processes, supplemented by the reliability of Snowflake for data storage.
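As a minimal illustration of the on-demand side of such a backup strategy, a boto3 sketch of starting an AWS Backup job (the vault name, resource ARN, and role ARN below are hypothetical placeholders, not actual MPD resources):

```python
import boto3

backup = boto3.client("backup")

# Start an on-demand backup of an RDS instance into a dedicated vault;
# all names and ARNs here are illustrative placeholders.
backup.start_backup_job(
    BackupVaultName="example-rds-vault",
    ResourceArn="arn:aws:rds:us-east-1:123456789012:db:example-db",
    IamRoleArn="arn:aws:iam::123456789012:role/ExampleBackupRole",
    StartWindowMinutes=60,
    CompleteWindowMinutes=360,
    Lifecycle={"DeleteAfterDays": 35},  # age out recovery points after 35 days
)
```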
Software Architect, R&D Department
AIR-Worldwide
Boston, United States
08.2019 - 10.2023
Modeled data for multiple internal and external clients across database and data warehouse platforms.
Migrated on-prem Postgres databases to Amazon Aurora PostgreSQL Serverless on AWS RDS, applying net present value (NPV) analysis to evaluate long-term cost benefits. This transition led to a 25% reduction in operational costs, resulting in annual labor savings of approximately $350,000.
Used Python to automate dataflows for OLAP and OLTP applications within engineered data lake and warehouse solutions, enhancing operations in cloud and on-premises environments. Integrated Power BI for ETL data analysis, QuickSight for dynamic visualization, and DataDog for comprehensive system monitoring.
Wrote Python scripts for the migration, development, and integration of applications and databases on cloud platforms, focusing on high availability and disaster recovery. Leveraged Power BI for in-depth analytics, QuickSight for real-time dashboard creation, and DataDog for ongoing system monitoring.
Evaluated multiple databases, including Postgres, Postgres Serverless, Oracle, MySQL, and MariaDB, for a range of internal and external client applications.
Integrated databases with MuleSoft using Database Connectors in the MuleSoft Anypoint Platform to interact with relational and NoSQL databases, exposing these interactions as RESTful APIs via MuleSoft flows.
Designed Python-based engines to manipulate and analyze data for research on Massively Parallel Processing (MPP) data-marts across various distributed file systems. Used Power BI and QuickSight for data visualization and analysis, with DataDog for tracking ETL system performance.
Developed and maintained DBT models for transforming and structuring data in data warehouses, designing and coding models that turn raw data into analysis-ready formats, with complex SQL queries, tests, and documentation to ensure data quality and reliability.
Integrated DBT with CI/CD pipelines for automated deployment of data models, configuring Continuous Integration and Continuous Deployment (CI/CD) pipelines to automate the testing and deployment of DBT models and ensure seamless integration of data transformations into the overall data pipeline.
Managed BI applications using Apache Superset, applying resource leveling strategies for efficient data transfers between Azure Blob Storage and AWS S3. This led to $230,000 annual labor savings.
Carried out Salesforce integrations with MuleSoft and Postgres RDS.
Orchestrated fault-tolerant data workflows using Azure DevOps and Apache Airflow, with Python scripts for automation and efficiency. Incorporated Power BI and QuickSight for data analytics and reporting, and DataDog for monitoring pipeline health.
Executed POCs for data partitioning and high availability clustered databases, leveraging Python with AWS Glue, Azure Data Factory, and Kafka for real-time data handling. Analyzed and visualized data using Power BI and QuickSight, with DataDog for real-time monitoring and alerting.
Developed core logic for geospatial operations within databases using Python, enhancing performance with technologies like Pgbouncer and Azure Connection Pooling. Utilized Power BI and QuickSight for advanced geospatial data visualization, monitored with DataDog.
Designed comprehensive data flow models and established POCs, including DBT models for data transformation, utilizing Python for scripting. Analyzed effectiveness using Power BI and QuickSight, with DataDog for performance monitoring and logging.
Established BI web applications using Apache Superset, managing data transfers between Azure Blob Storage and AWS S3 with Python's data manipulation capabilities (a sketch follows this list). Integrated Power BI and QuickSight for enhanced data visualization, and employed DataDog for tracking ETL data transfer processes.
Led a team of software engineers to deliver a major software project on time and within budget.
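A minimal Python sketch of one such Azure Blob Storage to AWS S3 transfer step (the connection string, container, bucket, and key names are hypothetical):

```python
import boto3
from azure.storage.blob import BlobServiceClient

# Placeholder connection string and object names.
blob_service = BlobServiceClient.from_connection_string("<azure-storage-connection-string>")
blob = blob_service.get_blob_client(container="superset-exports", blob="daily/metrics.parquet")

# Stream the blob into memory, then land it in S3.
payload = blob.download_blob().readall()
boto3.client("s3").put_object(
    Bucket="example-analytics-landing",
    Key="superset/daily/metrics.parquet",
    Body=payload,
)
```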
CIO/ Data Architect
Verizon
Irving, United States
03.2019 - 08.2019
Led the architectural design of a governance solution for Verizon's 50,000-node Hadoop cluster, focusing on secure and restricted data access. Integrated Apache Iceberg for advanced data partitioning and snapshot management.
Managed comprehensive data governance processes, optimizing structures within the data lake. Implemented Python scripts for automated data management, and explored the use of Apache Hudi for efficient data lake management and time travel queries.
Conducted a comparative analysis of Collibra vs. IBM Unified Governance platforms through proof of concept (POC) assessments. Utilized Python for in-depth data analysis and leveraged Apache Iceberg in the POC to assess performance improvements.
Oversaw massive data ingestion processes, handling approximately 160 terabytes daily. Employed Apache Hudi to streamline data updates and deletions, enhancing the efficiency of data processing.
Embraced real-time data ingestion using streaming services such as Kafka and AWS Kinesis, with Python scripts for process automation. Explored the integration of Apache Iceberg to manage large-scale streaming data more effectively.
Integrated data ingestion across various cloud platforms, maintaining high bandwidths. Used Apache Hudi for its efficient data storage and retrieval capabilities, ensuring optimized performance across different cloud environments.
Implemented NiFi clusters for data ingestion tasks and designed efficient Directed Acyclic Graphs (DAGs). Investigated the use of Apache Iceberg to improve data organization and query performance within NiFi workflows.
Achieved process optimization through automation and incremental data processing in workflow scheduling and management with Apache Airflow and NiFi. The automation and improved scheduling resulted in a 15% reduction in operational costs, equating to annual savings of $180,000.
Utilized Apache Airflow DAGs for workflow scheduling and management, integrating with NiFi for data processing. Python was instrumental in automating these workflows, and Apache Hudi was used for its incremental data processing features.
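A minimal Apache Airflow sketch of the scheduling pattern described above (the DAG, task, and function names are illustrative; the real ingestion logic invoked NiFi flows and Hudi jobs):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def incremental_load(**context):
    # Placeholder for the incremental ingestion step; in practice this
    # triggered NiFi flows and Apache Hudi upserts on the Hadoop cluster.
    pass


with DAG(
    dag_id="incremental_ingestion",  # illustrative name
    start_date=datetime(2019, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:
    PythonOperator(task_id="incremental_load", python_callable=incremental_load)
```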
Sr. Data Engineer III
Comcast
Philadelphia, United States
06.2018 - 03.2019
Migrated a 60 TB legacy SQL Server database to Azure Cloud using BCP, Azure ExpressRoute, and Azure Database Migration Service.
Conducted data migration between Azure SQL Databases across regions and from on-prem legacy databases.
Set up and maintained Azure SQL Databases in diverse regions, emphasizing data normalization.
Initiated Azure SQL Database instances, overseeing necessary data migration.
Designed and maintained access patterns for Azure Cosmos DB, supporting access from Azure Functions.
Established Azure SQL Data Warehouse and cataloged metadata in Azure Data Catalog.
Managed Azure Cosmos DB tables using Terraform, allocating Blob Storage containers for varied environments (see the sketch after this list).
Employed Azure Data Factory for transfers among SQL databases, Cosmos DB, and Azure Blob Storage.
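The Cosmos DB resources above were provisioned through Terraform; as a rough equivalent, a Python sketch using the azure-cosmos SDK (the endpoint, key, and database/container names are hypothetical):

```python
from azure.cosmos import CosmosClient, PartitionKey

# Hypothetical account endpoint and key.
client = CosmosClient(
    "https://example-account.documents.azure.com:443/",
    credential="<account-key>",
)

# Create (or reuse) a database and a container whose partition key
# reflects the designed access pattern.
db = client.create_database_if_not_exists(id="telemetry")
container = db.create_container_if_not_exists(
    id="events",
    partition_key=PartitionKey(path="/deviceId"),
    offer_throughput=400,
)
```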
Sr. Data Engineer
MasterCard
O'Fallon, United States
12.2017 - 06.2018
Oversaw MySQL databases using Amazon RDS, ensuring efficient management and maintenance.
Managed a variety of AWS resources through the AWS Management Console for streamlined operations.
Handled data storage and management in Amazon S3, providing secure and scalable solutions.
Implemented robust data warehousing solutions using Amazon Redshift, enhancing data analysis capabilities.
Leveraged AWS Data Pipeline for efficient ETL (Extract, Transform, Load) tasks, optimizing data workflow processes.
Orchestrated and managed microservices architecture using Amazon ECS (Elastic Container Service) for improved scalability and reliability.
Monitored AWS cloud resources and applications effectively using AWS CloudWatch, ensuring optimal performance and availability.
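A boto3 sketch of the kind of CloudWatch alarm used in this monitoring (the alarm name, instance identifier, and SNS topic ARN are hypothetical placeholders):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when average RDS CPU stays above 80% for three 5-minute periods;
# all identifiers below are illustrative placeholders.
cloudwatch.put_metric_alarm(
    AlarmName="example-rds-high-cpu",
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "example-mysql"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=3,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:example-ops-alerts"],
)
```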
MySQL Developer
Affinion Group
Trumbull, United States
03.2016 - 12.2017
Conducted audits of user authentication logs on the database server to ensure secure access and compliance.
Identified and shared long-running or low-performing SQL queries with developers for optimization and performance improvement (illustrated in the sketch after this list).
Performed regular reviews of log usage and engaged in consistent database monitoring for operational efficiency.
Ensured the reliability of database backups, including exports and hot backups with email notifications, to maintain data integrity.
Created and developed SQL Server Reports using SSRS, focusing on generating detailed drill-down and drill-through reports.
Managed free space and host availability on backup and archive directories, maintaining sufficient storage resources.
Upgraded MySQL from version 5.5 to 5.7, which included migrating from MyISAM to InnoDB storage engines for enhanced database performance.
Monitored and managed database size, performed analysis of database tables and indexes, and conducted index rebuilding when necessary to reduce fragmentation.
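A minimal Python sketch of the slow-query check referenced above, reading MySQL's processlist (the host, credentials, and 60-second threshold are assumptions):

```python
import os

import mysql.connector

# Placeholder connection details; the password comes from the environment.
conn = mysql.connector.connect(
    host="db.example.com",
    user="monitor",
    password=os.environ["DB_PASSWORD"],
)
cur = conn.cursor()

# Flag statements that have been running for more than 60 seconds.
cur.execute(
    "SELECT id, user, db, time, info "
    "FROM information_schema.processlist "
    "WHERE command <> 'Sleep' AND time > 60 "
    "ORDER BY time DESC"
)
for pid, user, db, secs, query in cur.fetchall():
    print(f"[{secs}s] {user} on {db} (#{pid}): {query}")

conn.close()
```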
MySQL Admin/Developer
Instrumentation Laboratory
Boston, United States
12.2014 - 07.2015
Managed MySQL database installation, configuration, system administration, provisioning, and troubleshooting, ensuring system integrity and availability for high-traffic web applications.
Conducted daily performance tuning and capacity planning using MySQL Enterprise Monitor to prevent issues and maintain optimal database performance.
Played a key role in designing, developing, testing, and deploying complex enterprise applications, both in the database and frontend, using development models like SCRUM.
Developed Unix/Linux-specific scripts to automate database administration functions and streamline processes.
Handled capacity planning and performance tuning of databases, including upgrades, creating partitions, and setting up replication and monitoring for MySQL databases.
Maintained materialized views for efficient replication and performed thorough reviews and analysis of technical designs for the product.
Created stored procedures and user-defined functions for CRUD operations and business logic integration with front-end applications (a sketch follows this list).
Provided custom reports using SQL and Excel to management, analyzing data trends and patterns, and ensuring accurate data testing and validation between source and destination for ETL loads.
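A minimal sketch of calling such a stored procedure from Python with mysql-connector (the procedure name, arguments, and OUT parameter are hypothetical):

```python
import mysql.connector

# Placeholder connection details.
conn = mysql.connector.connect(
    host="localhost", user="app", password="<password>", database="sales"
)
cur = conn.cursor()

# Call a hypothetical procedure that inserts an order and returns the new
# order id through its fourth (OUT) parameter.
result = cur.callproc("create_order", (1042, "SKU-778", 3, 0))
print("new order id:", result[3])

conn.commit()
cur.close()
conn.close()
```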
Database Administrator
Cognizant
Hyderabad, India
12.2013 - 11.2014
Managed database concurrency by effectively setting locks for queries, resolving issues related to table locking, concurrent inserts, and external locking.
Handled the creation of user accounts and assignment of appropriate privileges, ensuring robust database security.
Configured RMAN (Recovery Manager) and implemented best practices for database backup and recovery to safeguard data.
Assisted in disaster recovery procedures, utilizing Oracle Data Guard for recovery in Oracle 11gR2 environments.
Collaborated with application developers to modify database structures as needed, supporting various application development activities.
Developed and maintained standards and guidelines for the database to ensure uniform practices across the project.
Coordinated with UNIX administrators for efficient space management on various servers, optimizing database performance.
Conducted code reviews for submissions from the application team to ensure compliance with established standards and best practices.
Business Data Manager
RC Hyderabad
Hyderabad, India
11.2012 - 10.2013
Developed and executed a strategic data plan aligning with RC Hyderabad's business objectives, focusing on managing customer and IT support databases.
Established robust data governance policies and processes, ensuring data accuracy and security throughout the organization.
Integrated external data sources to augment the organization's data ecosystem, enhancing overall data capabilities and insights.
Collaborated with analysts to create comprehensive reports and dashboards, providing valuable insights for decision-making and strategy development.
Implemented stringent data security measures, ensuring adherence to privacy regulations and promoting a data-driven culture within the organization.
Data Quality Analyst
GMR Aero Technic
Hyderabad, India
09.2010 - 10.2012
Utilized Informatica Data Quality and IBM InfoSphere QualityStage for conducting detailed data profiling and analysis at GMR Aero Technic.
Defined and enforced strict data quality rules and standards, ensuring high data integrity across the organization.
Employed Talend, an open-source ETL tool, for effective data cleansing and transformation, improving overall data usability.
Monitored and maintained data quality metrics, addressing any arising issues with solutions like Trillium Software for enhanced data reliability.
Conducted thorough investigations and root cause analysis of data quality problems using tools such as Informatica Data Quality.
Collaborated with various stakeholders to address data quality challenges, utilizing platforms like Microsoft SharePoint for effective communication and documentation.
Property and Evidence Technician / Major Narcotics Branch / VCSD at Metropolitan Police Department