
Salma Mohammed Ali

Seattle, WA

Summary

Data Scientist/Data Analyst with 3+ years of expertise in data analysis, machine learning, and data mining, experienced across the full Systems Development Life Cycle (SDLC) and Agile (Scrum) software development, including requirements gathering, analysis, design, and implementation. Skilled in statistical programming languages such as R and Python and in Big Data technologies including Hadoop, Spark, and Hive. Involved in all project phases, from data acquisition through testing, validation, and data visualization. Capable of data parsing, manipulation, and preparation using a variety of methods and packages, with strong experience in text analytics and in building visualizations and dashboards with tools such as Tableau. Hands-on experience implementing a range of machine learning algorithms. Solid industry knowledge and problem-solving skills, with the ability to work both independently and within a team. Proficient in transforming business requirements into analytical models and developing scalable data solutions, including designing online applications and architecting data warehouse/business intelligence applications. Skilled in data analytics, reporting, and extraction from databases such as Oracle, SQL Server, and DB2.

Overview

3
years of professional experience

Work History

Data Analyst

XavemCloud
03.2023 - Current
  • Produced monthly reports using advanced Excel spreadsheet functions.
  • Identified, analyzed and interpreted trends or patterns in complex data sets.
  • Created various Excel documents to pull metrics data and present stakeholders with concise recommendations on where needed resources would be best placed.
  • Utilized data visualization tools to effectively communicate business insights.
  • Conducted extensive exploratory data analysis on vast datasets comprising customer demographics, transaction history, and behavioral patterns.
  • Utilized statistical techniques and machine learning algorithms to segment customers based on their purchasing behavior, preferences, and lifetime value.
  • Collaborated closely with Marketing and Sales teams to identify the most influential factors contributing to customer churn.
  • Developed a comprehensive churn prediction model that accurately identified customers at risk of churning, enabling proactive intervention strategies.
  • Created a visually appealing and user-friendly dashboard that provided real-time insights into customer segments, churn rates, and actionable recommendations for the Sales team.
  • Presented findings and recommendations to the executive team, demonstrating the potential impact of targeted retention initiatives on customer satisfaction and revenue growth.

Data Analyst

Dr Reddy’s Laboratories
12.2019 - 12.2020
  • Involved in analysis, design and documenting business requirements and data specifications
  • Supported data warehousing extraction programs, end-user reports and queries
  • Implemented Data Exploration to analyze patterns and to select features using Python SciPy
  • Built Factor Analysis and Cluster Analysis models using Python SciPy to classify customers into different target groups
  • Supported MapReduce Programs running on the cluster
  • Evaluated business requirements and prepared detailed specifications following project guidelines for program development
  • Participated in Data Acquisition with the Data Engineer team to extract historical and real-time data using Hadoop MapReduce and HDFS
  • Communicated and presented default customer profiles, along with reports built using Python and Tableau, analytical results, and strategic implications to senior management for strategic decision making
  • Developed Python scripts to automate the customer query addressing system, reducing the time needed to resolve customer queries
  • Performed Data Enrichment jobs to handle missing values, normalize data, and select features
  • Developed multiple MapReduce jobs in Python for data cleaning and preprocessing
  • Parsed JSON-formatted Twitter data and uploaded it to the database
  • Developed Hive queries for analysis and exported the result set from Hive to MySQL using Sqoop after processing the data
  • Created HBase tables to store data in various formats coming from different portfolios
  • Worked on improving performance of existing Pig and Hive Queries
  • Created reports and dashboards using Tableau 9.x to explain and communicate data insights, significant features, model scores, and the performance of the new recommendation system to both technical and business teams
  • Utilized SQL, Excel, and several marketing/web analytics tools to complete business and marketing analysis and assessment
  • Created metadata and a data dictionary for future data use/refresh for the same client
  • Ran SQL scripts and created indexes and stored procedures for data analysis
  • Applied data lineage methodology for data mapping and maintaining data quality
  • Maintained PL/SQL objects like packages, triggers, procedures etc
  • Participated in all phases of data mining; data collection, data cleaning, developing models, validation, visualization and performed Gap analysis
  • Designed and deployed reports with Drill Down, Drill Through and Drop down menu option and Parameterized and Linked reports using Tableau
  • Implemented point-of-view security on Tableau dashboards to facilitate visibility across various levels of the organization
  • Developed Python programs to read data from various Teradata tables, combine it into single CSV files, and update the content in database tables
  • Created, activated, and programmed in Anaconda environments
  • Worked on predictive analytics use cases using Python
  • Cleaned and processed third-party spending data into maneuverable deliverables in the required format using Excel macros and Python libraries such as NumPy and Matplotlib
  • Used pandas to structure data in time series and tabular formats for data manipulation and retrieval
  • Used Spark and Spark SQL for data integration and manipulation; worked on a POC to create a Docker image on Azure to run the model
  • Environment: Hadoop 3.0, MapReduce, Hive 3.0, Agile, HBase 1.2, NoSQL, PySpark, Teradata, Plotly, Spark, PL/SQL, Python, Tableau, HQL, Machine Learning, Data Validations and Standardizations.

Data Analyst

Sonata Software Ltd
12.2017 - 11.2019
  • Participated in daily scrum meetings with the team to discuss challenges and the development of the project
  • Worked with Business Analysts on requirement gathering, business analysis, testing, and project coordination
  • Implemented Tableau for visualizations and views including scatter plots, box plots, heatmaps, tree maps, donut charts, highlight tables, word clouds, reference lines, etc
  • Prepared dashboards using calculated fields, parameters, calculations, groups, sets, and hierarchies in Tableau
  • Interacted professionally with a diverse group of professionals in the organization, including managers and executives
  • Standardized Tableau for shared service deployment
  • Assisted users hands-on in creating and modifying worksheets and data visualization dashboards
  • Analyzed source data and gathered requirements from the business users
  • Worked on understanding and creating ETL Design documents and specifications
  • Prepared technical specifications to develop ETL mappings to load data into various tables confirming to the business rules
  • Extensively used ETL (Power Center) to load data from source systems like Flat Files and Excel Files into staging tables and then into the target Oracle database
  • Worked with various Transformations like Joiner, Expression, Lookup, Aggregate, Filter, Update Strategy, Stored Procedure, Router, and Normalizer
  • Prepared technical specification to load data into various tables in Data Marts
  • Responsible for Unit Testing and Integration testing of mappings and workflows
  • Responsible for validating target data after applying different Informatica Transformations on the source data
  • Extracted data for reporting from multiple data sources like SQL Server, SQL Server Analysis Services, Azure SQL Database, Azure SQL Data Warehouse, Salesforce, etc
  • Extracted data from data warehouse server by developing complex SQL statements using stored-procedures and common table expressions (CTEs) to support report building
  • Extensive use of DAX (Data Analysis Expressions) functions for the Reports and for the Tabular Models
  • Created PowerPivot models and reports, published reports to SharePoint, and deployed models to a SQL Server SSAS instance
  • Assisted in creating SQL database maintenance logs and presenting any issues to the database architects
  • Worked on Power BI reports using multiple types of visualizations including line charts, doughnut charts, tables, matrix, KPI, scatter plots, box plots, etc
  • Utilized Power BI to create various analytical dashboards that help business users gain quick insights into the data
  • Used Spark and Spark SQL for data integration and manipulation; worked on a POC to create a Docker image on Azure to run the model
  • Environment: Hadoop 3.0, MapReduce, Hive 3.0, Agile, HBase 1.2, NoSQL, PySpark, Teradata, Plotly, Spark, PL/SQL, Python, Tableau, HQL, Machine Learning, Data Validations and Standardizations.

Education

Master's - Information Technology and Management

Illinois Institute of Technology

Bachelor of Engineering - Computer Science

PSG College of Technology

Skills

TECHNICAL SKILLS

  • Big Data Technologies: Apache Hadoop (HDFS/MapReduce), Pig, Hive, HBase, Sqoop, Flume, Oozie, Cassandra, Spark, Apache Mahout, Cloudera CDH4, Impala
  • Web Technologies: JSP, JDBC, HTML, JavaScript
  • Languages: Java, Scala (for Spark), Python, VBA, R, PySpark, SQL, Microsoft Excel, MATLAB
  • Machine Learning / Data Analysis / Statistics: Hidden Markov Model, Random Forest, Decision Tree, Support Vector Machine, Neural Network
  • Operating Systems: Windows, UNIX and Linux
  • Frameworks: Spring, Hibernate
  • Version Control: VSS (Visual Source Safe), CVS
  • Agile Methodology: Jira
  • Visualization Tools: Tableau, Power BI
  • Cloud: AWS, Azure, Google BigQuery
  • Compatibility Testing
  • Microsoft Visio
  • Project Management

Timeline

Data Analyst

XavemCloud
03.2023 - Current

Data Analyst

Dr Reddy’s Laboratories
12.2019 - 12.2020

Data Analyst

Sonata Software Ltd
12.2017 - 11.2019

Master's - Information Technology and Management

Illinois Institute of Technology

Bachelor of Engineering - Computer Science

PSG College of Technology