Certified data professional with 16 years of experience in designing, implementing, and optimizing data solutions. Seeking to leverage my expertise to drive data-driven insights and support business growth in a challenging and dynamic environment.
Overview
16 years of professional experience
1 Certification
Work History
Senior Data Engineer
SureCo Healthcare
Santa Ana, CA
10.2021 - 03.2024
Worked closely with business stakeholders and senior management across departments to translate reporting/data requirements into technical specifications for member management, enrollments, plans, benefits, claims, operations, finance, HRIS, and telehealth services
Designed and implemented scalable and efficient data solutions on the Google Cloud data platform, SQL Server, PostgreSQL, and DynamoDB to support SureCo's business use cases, including application databases, data warehousing, data lakes, real-time analytics, and reporting initiatives
Designed and implemented real-time data pipelines on Google Cloud Platform to stream data from SQL Server, PostgreSQL, and DynamoDB to Google BigQuery using Cloud Pub/Sub, Cloud Dataflow, and BigQuery, reducing data processing latency by 30%
SQL Server: Streamed change data capture (CDC) events from SQL Server, using the PyODBC library to establish connections to SQL Server from within a Cloud Function
PostgreSQL: Implemented a change data capture (CDC) solution using Apache Kafka and Debezium to capture database changes
DynamoDB: Enabled DynamoDB Streams and created an AWS Lambda function that reacts to DynamoDB events and publishes them to Pub/Sub topics
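A minimal sketch of the DynamoDB-Streams-to-Pub/Sub relay described above, assuming GCP credentials are already provisioned to the Lambda; the project, topic, and field names are placeholders:

```python
# Hypothetical Lambda handler relaying DynamoDB Stream records to Cloud Pub/Sub.
# Assumes GCP service-account credentials are available to the function
# (e.g., via an environment variable or a secrets manager).
import json
import os

from google.cloud import pubsub_v1

# Project and topic names are illustrative placeholders.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(os.environ["GCP_PROJECT"], os.environ["PUBSUB_TOPIC"])


def handler(event, context):
    """Publish each DynamoDB Stream record as a JSON message to Pub/Sub."""
    futures = []
    for record in event.get("Records", []):
        payload = {
            "event_name": record["eventName"],           # INSERT / MODIFY / REMOVE
            "keys": record["dynamodb"].get("Keys", {}),
            "new_image": record["dynamodb"].get("NewImage", {}),
            "old_image": record["dynamodb"].get("OldImage", {}),
        }
        futures.append(publisher.publish(topic_path, data=json.dumps(payload).encode("utf-8")))

    # Block until Pub/Sub acknowledges every message before the Lambda returns.
    for future in futures:
        future.result()

    return {"published": len(futures)}
```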
Implemented an MDM solution in SQL Server and later migrated it to BigQuery for member management; integrated data from disparate sources such as member portals, healthcare providers, CRM systems, third-party data providers, and telehealth service systems
Migrated UHSM claims data from SQL Server to Google BigQuery using the BigQuery Migration Service and integrated it with SureCo's data warehouse to enable claims analysis
Created data pipelines to extract and feed data to and from internal and external systems using Google Cloud services, AWS services, Fivetran, DBT, scripts (SQL, Python, Bash, gsutil tool), Spark, and Kafka
Used Cloud Functions to implement serverless functions for specific tasks such as sending email notifications, data streaming, data processing, file delivery, and data validations
Developed custom data processing logic using Apache Beam to handle complex event processing and enrichment in real-time streaming applications
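A hedged sketch of the kind of custom Apache Beam enrichment step referenced above; the topics, schema fields, and enrichment rule are illustrative assumptions, not the actual pipeline logic:

```python
# Minimal Apache Beam streaming enrichment sketch; topic names and fields are placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


class EnrichMemberEvent(beam.DoFn):
    """Attach a derived plan tier to each raw member event."""

    def process(self, element):
        event = json.loads(element.decode("utf-8"))
        premium = float(event.get("monthly_premium", 0))
        event["plan_tier"] = "high" if premium >= 500 else "standard"
        # Pub/Sub sink expects bytes.
        yield json.dumps(event).encode("utf-8")


def run():
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/member-events")
            | "Enrich" >> beam.ParDo(EnrichMemberEvent())
            | "WriteEnriched" >> beam.io.WriteToPubSub(topic="projects/my-project/topics/member-events-enriched")
        )


if __name__ == "__main__":
    run()
```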
Utilized BigQuery for data warehousing and analytics, optimizing query performance and reducing costs through partitioning, clustering, materialized views, authorized views, and external tables, ensuring efficient execution of analytical workloads
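For illustration, a minimal Python-client sketch of the partitioning and clustering approach mentioned above; the project, dataset, table, and column names are placeholders:

```python
# Create a date-partitioned, clustered BigQuery table (placeholder names).
from google.cloud import bigquery

client = bigquery.Client()

table = bigquery.Table(
    "my-project.analytics.claims",
    schema=[
        bigquery.SchemaField("claim_id", "STRING"),
        bigquery.SchemaField("member_id", "STRING"),
        bigquery.SchemaField("service_date", "DATE"),
        bigquery.SchemaField("billed_amount", "NUMERIC"),
    ],
)
# Partition by service_date so analytical queries prune to the dates they actually scan.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY, field="service_date"
)
# Cluster by member_id to co-locate rows that are commonly filtered together.
table.clustering_fields = ["member_id"]

client.create_table(table, exists_ok=True)
```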
Heavily worked on T-SQL/BigQuery/PGSQL scripts, stored procedures (SQL/Java), triggers, views/materialized views, cursors, CTEs, functions, and performance issues for transactional and data warehouse systems (SQL Server, BigQuery, PostgreSQL)
Generated multiple enterprise reports, interactive dashboards, and visualizations in Tableau using BigQuery/SQL Server for both operational and analytical purposes to analyze enrollment trends, sales/product KPIs, member engagement, HRIS paycheck deductions, cost containment, reimbursement rates and claims, member utilization patterns, trends in wellness program and telehealth service utilization, risk scores, and predictive model outputs
Created data lifecycle policies and versioning strategies to manage data retention and archival in Google Cloud Storage for documents and images related to enrollment forms, policy documents, client enrollment configurations, medical records, and plan information; created external tables in BigQuery and used the BigQuery Storage Write API to reference data stored in Cloud Storage for analysis and reporting
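A brief sketch, under assumed bucket, dataset, and retention values, of pairing a Cloud Storage lifecycle rule with a BigQuery external table as described above:

```python
# Retention rule on a Cloud Storage bucket plus an external BigQuery table
# over the same files; bucket name, URI, and age are placeholders.
from google.cloud import bigquery, storage

# Lifecycle rule: delete objects older than ~7 years (2555 days).
storage_client = storage.Client()
bucket = storage_client.get_bucket("enrollment-documents")
bucket.add_lifecycle_delete_rule(age=2555)
bucket.patch()

# External table so BigQuery can query the files in place.
bq_client = bigquery.Client()
external_config = bigquery.ExternalConfig("PARQUET")
external_config.source_uris = ["gs://enrollment-documents/plan_info/*.parquet"]

table = bigquery.Table("my-project.analytics.plan_info_ext")
table.external_data_configuration = external_config
bq_client.create_table(table, exists_ok=True)
```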
Developed and maintained SQL Server, AWS PostgreSQL, and DynamoDB database solutions for SureCo's enrollment platform and plans data, implementing best practices for database design, schema optimization, and query tuning to enhance overall system performance and efficiency
Led the adoption and implementation of DBT as a data transformation tool, integrated with BigQuery, and created scalable and repeatable data models
Integrated ThoughtSpot with BigQuery to conduct data exploration and analysis on UHSM data and enabled a search-driven analytics interface to uncover actionable insights such as claims analysis, provider performance tracking, and member engagement by creating clusters, worksheets, pinboards, and visualizations
Created data pipelines to process and drop Member/claims data files to/from vendor SFTP/cloud storage location
Utilized CI/CD pipelines using Google Cloud Build, Jenkins, and Sqitch with GitHub/Bitbucket integration for development and deployment activities
Created DAGs in Airflow on Google Cloud Composer to orchestrate ETL processes and data workflows
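An illustrative Cloud Composer (Airflow) DAG skeleton of the ETL orchestration mentioned above; the DAG id, schedule, and task callables are placeholders:

```python
# Skeleton Airflow DAG wiring an extract -> load flow (placeholder tasks).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_members(**context):
    """Pull the daily member extract from the source system (placeholder)."""
    pass


def load_to_bigquery(**context):
    """Load the staged files into BigQuery (placeholder)."""
    pass


with DAG(
    dag_id="member_daily_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="0 6 * * *",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_members", python_callable=extract_members)
    load = PythonOperator(task_id="load_to_bigquery", python_callable=load_to_bigquery)

    extract >> load
```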
Managed Kubernetes clusters on Google Kubernetes Engine (GKE), ensuring high availability and scalability of containerized applications
Established standards, best practices, and frameworks and incorporated them into engineering solutions; implemented data governance, data masking policies (PII, PHI, HIPAA), data security, data profiling, data encryption, and data loss prevention (DLP) policies around the BI platform
Designed and documented data models and data mappings for both transactional and analytical systems
Worked in Agile & Kanban development methodologies.
Technical Lead, Data Engineering
Legalzoom.com Inc.
Glendale, CA
- 10.2021
Led the Business Intelligence and Data Engineering team, taking responsibility for overall technical solutions and working across departments and with senior management to translate requirements into technical specifications
Led the architecture and migration efforts to transition the on-premises data warehouse and legacy systems to Snowflake Data Cloud/Google BigQuery, leveraging best practices for data modeling, schema design, and ELT/ETL processes across the data warehouse, data lake, data marts, and analytical databases, resulting in a 55% reduction in infrastructure costs and a 65% improvement in query performance
Implemented LegalZoom's customer-centric data lake/data warehouse from inception through implementation and ongoing support
Implemented ETL/ELT solutions for Snowflake/SQL Server/Redshift/BigQuery environments to extract and feed data to and from various systems including enterprise domains, Salesforce, Marketing Cloud, FileNet, Google Analytics, Tealium, Questionnaire, Subscriptions, Secretary of State, NPS, Revstream, and Attorney profiles/schedules, using SQL, SSIS, Fivetran, Informatica, Matillion, SnowSQL, Snowpipe, AWS Glue/Lambda/SNS topics, Google Dataflow/BigQuery Transfer Service, gsutil tool, Denodo, Python, Spark, Kafka, Kinesis, PySpark, Snowpark, and external functions
Designed and developed modular and scalable data models using DBT with DAGs in Airflow to enable flexible and efficient data transformations and orchestration
Extensively worked on SQL scripts, stored procedures, triggers, views, materialized views, cursors, CTEs, functions, Python, JSON, and XML scripts and on performance issues with transactional and data warehouse systems (Snowflake, SQL Server, BigQuery)
Optimized query performance and data processing efficiency in Snowflake, BigQuery, and SQL Server through query tuning and schema optimization
Designed and implemented real-time streaming data processing pipelines using Apache Spark Streaming and Structured Streaming in Databricks, and via Snowpipe into Snowflake, to ingest, transform, and analyze high-volume data streams from Event Hubs and Kafka, providing real-time insights into sales trends and order processing performance and forecasting future sales projections
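A minimal Structured Streaming sketch of the Kafka-to-Databricks ingestion pattern described above; the broker, topic, schema, and storage paths are assumptions:

```python
# Read order events from Kafka and land them as Delta files (placeholder names/paths).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("order-stream").getOrCreate()

order_schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("order_ts", TimestampType()),
])

orders = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    # Kafka values arrive as bytes; parse the JSON payload into columns.
    .select(from_json(col("value").cast("string"), order_schema).alias("o"))
    .select("o.*")
)

(
    orders.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders")
    .outputMode("append")
    .start("/mnt/delta/orders")
)
```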
Designed and maintained data pipelines using Google Dataflow and populated data from Google Analytics/Snowflake/SQL Server into Google BigQuery to support attribution and customer interaction insights and reporting
Created data pipelines to process and drop third-party vendor Leads files into vendor SFTP
Worked on ML models for advanced analytics using AWS cloud services to analyze customer feedback, predict ETA averages, detect anomalies in LZ systems for demand forecasting and market penetration analysis, and predict customer LTV values
Utilized CI/CD pipelines with Jenkins for development and deployment activities
Generated multiple enterprise reports, interactive dashboards, and visualizations using Tableau, SSRS, and ThoughtSpot for both operational and analytical purposes, creating dynamic and engaging visualizations that facilitate data-driven decision making, such as sales/product/web KPIs, subscription momentum, bill-through trends, ETA, bookings, revenue reporting, LTV/LTR, market share analysis, customer segmentation insights, NPS scoring, compliance monitoring, competitor benchmarking, and entity registration
Developed SSAS cubes to pre-process order data and built a BOOKINGS dashboard in SSRS, later migrated to Tableau
Set up ThoughtSpot and optimized the semantic layer in Snowflake so business users could create, consume, and operationalize data-driven insights in ThoughtSpot
Heavily worked on data analysis for solution design, helping data analysts, data scientists, and business users answer business questions
Established standards, best practices, and frameworks and incorporated them into engineering solutions; implemented data governance, data masking policies (PII, GDPR, CCPA), data security, data profiling, and data quality around the BI platform
Worked in Agile & Waterfall development methodologies
Served as a data technologist and evangelist for the organization, providing training on new technologies and tools to team members and senior staff.
Senior BI Developer/DBA
RealEC Technologies
Houston, TX
05.2009 - 01.2010
Led a team of 3 developers and communicated and coordinated status updates to management
Involved in RealEC data warehouse development from inception through implementation and ongoing support
Designed ETL Packages to bring data from existing OLTP databases over to the new data warehouse by performing different kinds of transformations using SSIS
Worked on Dimensional Data Modeling using Star and snowflake schemas for Fact and Dimension tables
Wrote stored procedures, triggers, user-defined functions, views, and cursors for report use; performed extensive SQL performance monitoring and tuning of reporting data
Designed, developed, and deployed new reports and enhancements to existing Reports, for various applications as assigned.
BI/SQL Developer
Camelot Integrated Solutions
Houston, TX
01.2008 - 12.2008
Collected business requirements from users and translated them into technical specs and design docs for development
Identified various data sources, formulated data dictionaries, and designed and developed data models (physical/logical) based on the given specs and requirements
Created Stored Procedures, Triggers, Indexes, User defined Functions, Constraints on various database objects to obtain the required results
Transferred data from various data sources/business systems, including MS Excel, MS Access, and flat files, to SQL Server using SSIS, applying features such as data conversion
Created derived columns from existing columns to meet the given requirements
Involved in dimensional modeling (Star & Snowflake) and in creating Data Source Views (DSVs) and SSAS cubes using the identified fact and dimension tables
Designed, developed, and deployed new SSRS reports and enhancements to existing Reports, for various applications as assigned.
Senior SQL Reports Developer
Steppingstones (Aliteck Consulting)
Houston, TX
01.2010
Involved in integrating the ET web application with SQL Server Reporting Services 2005/2008
Performed solid SQL Server coding, debugging, and performance tuning; wrote complex queries using complex joins and CTEs
Transformed complex business logic into database design and maintained it using SQL tools such as stored procedures, user-defined functions, views, and T-SQL scripting; implemented an extra level of security by creating views and stored procedures to query confidential data
Created and coordinated complex stored procedures to serve as datasets for report design and to generate ad hoc reports using SSRS.
BI Developer
SIEMENS IT Solutions & Services
Mason, OH
11.2008
Migrated/Converted older MS SQL databases to MS SQL Server 2008
Created SSIS packages to transfer data from Oracle and SAP to SQL server using different SSIS components and used configuration files and variables for production deployment
Created SSAS cubes using the respective fact and dimension tables, performed processing, and deployed the cubes to the SQL Server Analysis Services database
Developed reports from SQL Server and cubes using the SQL Server Analysis Services database
Developed stored procedures (T-SQL/PL-SQL), functions, views, and triggers
Used ProClarity tool to analyze the data in the cube by using the different features like Chart View, Decomposition Tree, Performance Maps
Senior Business Intelligence Database Developer
IBM LBPS
Beaverton, OR
Worked closely with application users, business and senior analysts, architects, and other developers to analyze, design and develop analytics/EDW database related components for the IBM LBPS HAMP Fulfillment Program, data flow and data content/quality issues
Involved in developing EDW (Enterprise Data Warehouse), ABIP (Analytics applications) databases and provided estimates of effort required to design and develop solutions
Extensively worked on stored procedures, triggers, views, cursors, CTEs, T-SQL scripts, schemas, permissions, and performance issues with client/server database design
Created SSIS packages to read/extract data from Excel sheets, SQL Server, and other data sources and merge it into SQL Server and vice versa; involved in scheduling and monitoring.
Education
Master of Science -
McNeese State University
01.2007
Bachelor of Technology -
JNT University
01.2005
Skills
Business Intelligence/Data Architecture and Development
Dimensional Modelling
Snowflake
DBT
MSBI Stack (SQL Server 2019/2017/2014/2012/2008/2005, SSIS, SSAS, SSRS, Power BI)
Google Cloud (Cloud Storage, Pub/Sub, Cloud Functions, Dataflow, Dataproc, Bitbucket, Cloud SQL, BigQuery, BigQuery Data Transfer Service, Composer, Looker)
AZURE (Blob Storage, Data Factory, Synapse)
Databricks
Hadoop
SQL (T-SQL, SNOWSQL, PLSQL, PGSQL, BTEQ)
MDX
Python
PySpark
JAVA
PowerShell
JSON
API Integration
Pig
Hive
Spark
GitHub
Bitbucket
Jenkins
Sqitch
Docker
Embarcadero
Erwin
Visio
Coda
Denodo
Alteryx
Salesforce
Marketing Cloud
JIRA
Confluence
Python Programming
Performance Tuning
Data Modeling
Continuous integration
Data Warehousing
SQL and Databases
Business Intelligence
Data Visualization
Certification
Google Cloud Professional Data Engineer
Snowflake Web UI Certification
Neo4J Certified Professional
Denodo: Data Virtualization Architect
Querying Microsoft SQL Server 2012 (Exam ID 461)
Administering Microsoft SQL Server 2012 Databases (Exam ID 462)
Implementing a Data Warehouse with Microsoft SQL Server 2012 (Exam ID 463)
Microsoft SQL Server 2005 Business Intelligence – Implementation and Maintenance (Exam ID 445)
Microsoft SQL Server 2005 – Implementation and Maintenance (Exam ID 431)