
Fariha Baloch

Erie, USA

Summary

Experienced in complex, operations-focused projects: identifying the key questions behind data challenges, formulating hypotheses, and presenting creative solutions to audiences of all backgrounds. Proficient in clustering, classification, and regression modeling and in statistical analysis using Python, SQL, pandas, and AWS.

Overview

16 years of professional experience

Work History

Data Engineer

Workday
Boulder, Colorado
11.2023 - Current
  • Architected and developed an ETL data pipeline using AWS Glue, Airbyte, AWS RDS, and Tableau that consolidated data from several disjoint sources, enabling teams to build resiliency metrics on one unified platform.
  • Researched the ETL tools available on the market and chose to host Airbyte OSS on an AWS EKS cluster rather than buy licenses for the cloud version, saving the organization a significant amount of money. Learned the tool and wrote source connectors for internal data sources that became part of the unified data pipeline.
  • Wrote infrastructure-as-code in Terraform and ArgoCD, including using Terraform's Helm provider to deploy Fluent Bit for collecting logs and metrics from the EKS cluster. Codifying the infrastructure saves time if the current environment goes offline or duplicate configurations are needed for disaster recovery or other purposes.
  • Worked with internal teams to build Developer Experience metrics in Grafana that show a team's strengths and weaknesses across SDLC phases, including SLO/SLI metrics that help forecast issues in the production pipeline. By tracking SLO violations in its services, one team recently identified an issue hours before the customer reported it and was ready with the root cause.

Data Analyst

Workday
Boulder, Colorado
01.2022 - 11.2023
  • Applied topic modeling with LDA and BERTopic to a set of interview transcripts to reduce the time needed to extract summaries, saving several hours in the data-analysis portion of the project.
  • Collected, analyzed, and visualized data from several teams in the organization using Tableau to identify investment opportunities and support critical, data-driven decisions. The results were used by different organizations to plan future product goals while keeping developers' interests at the forefront.

Data Scientist

Planetary Care
12.2020 - 12.2021
  • Extracted Reddit posts using the PushShift Reddit API to build a sentiment-analysis tool that takes a key phrase, finds sentiment in matching posts, and generates a scatter plot of sentiment intensity over a user-defined time period. The tool was developed with Dash using Plotly graphs and hosted on AWS Elastic Beanstalk so non-technical teams could interact with it easily.
  • Created a tool using the Gensim library that extracts keywords, key phrases, and a summary from any PDF document or website. The code is used to identify and promote relevant literature on regenerative agriculture; it saved the research team time by surfacing only the articles relevant to a customer's need.

Data Scientist

FinGoal (Techstars/MetLife '20)
08.2020 - 12.2021
  • To enrich credit card transaction data and understand a user's preferred qualities in a restaurant, scraped the attributes published on restaurants' Yelp pages using the requests and BeautifulSoup libraries, saving time in the data-collection process compared with manual entry of the fields.
  • The data was used to give customers highly personalized advice based on their favored restaurant attributes.
  • Performed clustering of merchants with similar attributes in credit card transaction data. Summaries from each merchant's Wikipedia page were extracted using the Wikipedia API and transformed into vectors using TF-IDF and Doc2vec models, giving the model extra signal for predicting customer preferences and producing better clustering results.

Product Validation Lead

Intel Corporation
Folsom, CA
01.2017 - 01.2018
  • Led a PCI-E-based SSD project for the validation team, ensuring validation activities were carried out according to the organization's established guidelines.
  • Strategized the influx of work for several validation sub-teams and streamlined the outflow of results to several of Intel's top customers.
  • Presented project progress and product quality indicators to management, keeping the program release on time and mitigating risks as soon as they arose.

QA Lead and Scrum Master

NetApp
Wichita, Kansas
09.2007 - 01.2017
  • Served as Scrum Master for multiple technical projects.
  • Leveraged Agile principles to keep teams on track and help them become self-organized.
  • Helped build new metrics that reflected product health from data collected across all sub-teams, aggregating the results into one visual chart.

Education

Career Track Certification in Data Science

Springboard
10.2020

Certification in Data Analysis

Cornell University
04.2018

PhD in Electrical Engineering

Wichita State University
05.2014

Skills

  • AWS Glue
  • AWS EKS
  • Python
  • SQL
  • ML
  • ETL
  • Terraform
  • Data Analysis
  • Requirements Gathering
  • Key Performance Indicators

Kaggle

www.kaggle.com/fariha23

Personal Information

Title: Data Expert | Scrum Master
