NIKHITHA M

Herndon, VA

Summary

Skilled Data Engineer with over 9 years of experience designing and implementing large-scale data solutions in cloud environments, specializing in Azure and leveraging Python for automation and advanced data processing. Proven expertise in building and optimizing end-to-end ETL pipelines, cloud migrations, and data integration. Strong knowledge of machine learning models and data-driven insights, with a keen interest in exploring Generative AI. Adept at collaborating with cross-functional teams to solve complex data challenges, streamline workflows, and drive innovation through cloud technologies and cutting-edge data engineering techniques.

Overview

10 years of professional experience
1 Certification

Work History

Data Engineer

NATIONAL INSTITUTES OF HEALTH / NIAID
Rockville, MD
12.2016 - Current
  • Spearheaded migration of on-premises SQL Server and Oracle databases to Azure SQL Database and Azure Data Lake using Azure Data Factory and Databricks, enhancing data accessibility and scalability across multiple environments.
  • Developed highly efficient ETL pipelines using Azure Data Factory, integrating with Databricks for distributed data processing, reducing data transformation times by 25%.
  • Automated data transformation and validation using Python (Pandas, NumPy), ensuring high data quality and seamless integration into Azure SQL and Data Lake environments.
  • Built and managed scalable data models in Azure SQL and Data Lake, leveraging PolyBase to query across on-premises and cloud-based data, resulting in optimized data retrieval and storage.
  • Designed and implemented interactive Power BI dashboards, integrating advanced Python visualizations to enhance data insights for NIH stakeholders, significantly improving decision-making processes.
  • Created automation workflows using Power Automate and Logic Apps for data ingestion from external sources, reducing manual intervention and improving efficiency.
  • Developed and optimized machine learning models using TensorFlow and Scikit-learn, collaborating with data scientists to integrate predictive analytics into Power BI for actionable insights.
  • Used Azure Blob Storage and SQL Data Warehouse to build scalable storage solutions for managing large datasets, ensuring high availability and real-time access to critical data.
  • Automated various database operations with PowerShell and Python scripts, including scheduling data loads, monitoring performance, and running data integrity checks, resulting in reduced operational overhead.
  • Implemented continuous integration and deployment (CI/CD) pipelines using Azure DevOps, Git, and Jenkins to automate code deployments, testing, and collaboration across development teams.
  • Improved the performance of real-time data processing by configuring Azure Synapse Analytics, enabling complex query handling across massive datasets with minimal latency.

Environment: Azure Data Factory, Databricks, Azure SQL, Azure Data Lake, PolyBase, Power BI, Python (Pandas, NumPy, TensorFlow), PySpark, Power Automate, Logic Apps, SQL Server, Azure DevOps, Azure Blob Storage, Azure Synapse Analytics, Git, Jenkins, PowerShell.

Database Developer/Analyst

GREAT AMERICAN INSURANCE GROUP
Cincinnati, OH
10.2016 - 12.2016
  • Developed custom Python scripts for automating data extraction, transformation, and loading (ETL) processes in Azure Data Factory, improving data transfer efficiency.
  • Built end-to-end data pipelines in Azure using Python and Databricks, performing data cleansing and preparation for downstream analysis and reporting.
  • Automated large-scale data migration from on-premises Oracle databases to Azure SQL using Python and Azure Data Lake, ensuring data accuracy and timely delivery.
  • Utilized advanced Python data processing libraries (Pandas, NumPy) to handle complex data transformation tasks, significantly reducing manual intervention in the ETL process.
  • Designed interactive reports and dashboards in Power BI, integrating custom Python visualizations to enhance data analysis and visualization capabilities.
  • Collaborated with business users to implement tailored Python solutions for analyzing complex datasets, providing insights for decision-making and strategic planning.

Environment: Python (Pandas, NumPy), Azure Data Factory, Azure Data Lake, SQL Server, Oracle, Databricks, Power BI

SQL BI Developer

BLUE CROSS BLUE SHIELD
Naperville, IL
08.2014 - 09.2016
  • Developed dashboards and visualizations to help business users analyze data and to provide data insights to management, focusing on Microsoft products such as SQL Server Reporting Services (SSRS) and Power BI.
  • Performed root cause analysis of database and SQL performance problems and recommended solutions for both production and release environments.
  • Created SSIS packages that loaded data daily from different sources to build and maintain a centralized data warehouse.
  • Wrote parameterized queries for tabular SSRS reports, formatting report layouts, using global variables, expressions, and functions, sorting data, defining data sources and datasets, and calculating subtotals and grand totals.
  • Implemented Windows PowerShell scripts to monitor the event logs of critical Windows servers in real time and filter for specific errors, surfacing errors from across the Windows infrastructure as they occurred.
  • Utilized T-SQL to pull loan data from various databases and created various dashboards, scorecards, and reports to support loan business decision-making.
  • Generated ad-hoc reports, sub-reports, drill-down reports, drill-through reports, and parameterized reports using SSRS and Tableau to make data visible to data analysts and business users.

Environment: SQL Server, SSRS, SSIS, Power BI, Oracle, TFS, Visual Studio, SQL Profiler, MS Excel, Query Analyzer, PowerShell

Education

MASTER'S IN COMPUTER SCIENCE

University of Houston Clear Lake
05.2014

Skills

Programming Languages & Libraries:

  • Python (Pandas, NumPy, TensorFlow, Matplotlib, Seaborn), PowerShell, T-SQL, Java, HTML, CSS

Data Engineering & ETL:

  • Azure Data Factory, Databricks, PolyBase, Data Pipelines, Data Modeling, Data Transformation, Azure DevOps, SQL Server Integration Services (SSIS), SQL Data Warehouse, ETL Automation

Cloud Platforms & Tools:

  • Microsoft Azure (ADF, Databricks, Azure Data Lake, Azure SQL, Azure Blob Storage), Azure DevOps, Azure Functions, Logic Apps

Data Science & Machine Learning:

  • Python, TensorFlow, Scikit-learn, Feature Engineering, Data Cleansing, Exploratory Data Analysis (EDA), Data Mining, Machine Learning Models

Data Visualization & Reporting:

  • Power BI, Tableau, QlikView, SSRS, Excel, Crystal Reports, Advanced Python Visualizations

Version Control & Automation:

  • Git, TFS, SVN, Azure DevOps, Continuous Integration/Continuous Deployment (CI/CD), Jenkins

Other Tools:

  • Jupyter Notebook, Microsoft Graph, SharePoint, MATLAB, Weka, Apache Spark

Certification

Microsoft Azure Data Engineer Associate

Timeline

Data Engineer

NATIONAL INSTITUTES OF HEALTH / NIAID
12.2016 - Current

Database Developer/Analyst

GREAT AMERICAN INSURANCE GROUP
10.2016 - 12.2016

SQL BI Developer

BLUE CROSS BLUE SHIELD
08.2014 - 09.2016

MASTER'S IN COMPUTER SCIENCE

University of Houston Clear Lake