
AFREEN SIDDIQUI

Plymouth, MN

Summary

Self-motivated data scientist and machine learning engineer with 12 years of experience identifying patterns in data, working across multi-functional roles to identify and leverage areas for improvement in data systems. Brings a comprehensive understanding of warehouse operations and documentation needs.

Overview

  • 12 years of professional experience
  • 1 certification

Work History

Sr. Big Data Engineer and Data Scientist

Synchrony Financial
03.2019 - 09.2019
  • The project identified suspected fraud accounts in order to avoid financial losses
  • Led the effort to create the design document and solution per the business requirements
  • Worked alongside product managers to construct queries to identify potential fraud accounts
  • Imported data from source systems, performed Spark transformations and actions on the data, and stored the results in a data lake
  • Performed performance tuning on large datasets to make the code efficient
  • Involved in the entire development, testing, and production-readiness process with the team
  • Extracted, cleaned, and processed the data to normalize features and performed feature engineering on a big data platform using Spark and Hive
  • Applied pattern-detection code to identify different types of account patterns and generated fraud signals from the selected identifiers by applying machine learning algorithms
  • Implemented different types of clustering on the account data to identify fraud rings and all related accounts
  • Shared data insights with the business stakeholders.

Sr. Big Data Engineer

Optum Technologies
03.2019 - 09.2019
  • The project built the certified measures for HEDIS (Healthcare Effectiveness Data and Information Set) to perform star performance checks of members' health plans
  • Developed HEDIS measures for members based on the facts and measure criteria supplied in Excel format, using Python, Java, and Spark
  • Created the architectural design for the audit output
  • Implemented the audit output for measures using Spark and Java
  • Designed and implemented product features in collaboration with business and IT stakeholders, working closely with the architecture group to drive technical solutions.

Sr. Big Data Engineer

Optum Technologies
11.2018 - 03.2019
  • The project improved the user experience for the services provided by determining NPS (Net Promoter Score)
  • Provided end-to-end design and implementation and made the project production ready
  • Ingested survey results from various sources and formats into the Hadoop environment
  • Validated and cleaned the data and applied transformation rules
  • Transformed and joined the member data with the corresponding survey data and saved the result as a partitioned Parquet file in the Hadoop ecosystem
  • Implemented a validation framework using Spark SQL as part of data validation.

Sr. Big Data Engineer

Optum Technologies
01.2018 - 09.2018
  • Worked on a project called IHR
  • Involved in the data flow and data analysis: ingested data from upstream systems and derived risk and quality measures for a given member
  • Imported data from upstream into HDFS and HBase using Spark with Oozie workflows
  • Designed and implemented a framework to filter members using soft and hard logic
  • Gathered individuals' demographic, contact, and communication preference details and performed data profiling
  • Designed and implemented a Spark framework to process HL7 and various other file formats and push the results to RabbitMQ
  • Designed and developed a system-level audit feature
  • Extracted the data from HBase using Spark Java/Scala APIs and indexed it into Elasticsearch for presentation in Kibana dashboards.

Sr. Big Data Engineer

Optum Technologies
01.2016 - 01.2017

Data Management

  • This product made active insurers' data easily available to business users, agents, and advocates through REST APIs for viewing and updating individual and membership details
  • Gathered individuals' demographic, contact, and communication preference details and performed data profiling
  • Generated Java components from XSDs
  • Extracted the data from HBase using Spark Java and Scala APIs and Hive SQL
  • Mapped the extracted data to the generated Java components
  • Pushed it to the MarkLogic server
  • Led the end-to-end effort in the Hadoop environment
  • Moved the existing Talend jobs to MapReduce to improve performance.

Hadoop Developer

Target
08.2015 - 12.2015
  • The project gathered various optimized prices for an item from different systems
  • Applied analytics to determine the current price required for an item
  • Involved in designing and implementing the routes, processors, and aggregators for Apache Camel
  • Implemented the application with Spring Boot
  • Involved in ELK setup with Chef on an OpenStack cluster
  • Presented system data and metrics on a Kibana dashboard.

Hadoop Developer

Target
02.2015 - 07.2015
  • Determined the optimized price for an item by taking competitors' data and applying rules to it
  • Looked up item price thresholds and notified the pricing applications
  • Calculated the goal retail for an item; if the item was in clearance, on sale, or in a promotion plan, applied the promotion or sale price until the specified last date, otherwise applied the regular retail price
  • Captured item and price data from DB2 using a Storm spout
  • Applied business rules using Drools in the Storm bolt
  • Involved in designing and implementing the topology
  • Published the data to a Kafka queue
  • Analyzed the input data coming from various sources and designed the schema to ingest the data.

Hadoop Developer

Adaptive, Real World Analytics
03.2014 - 12.2014
  • This product focuses on data-driven marketing that helps leading retail organizations grow their businesses
  • It precisely measures buying behavior at the individual customer level and provides a detailed analysis of how marketing impacts revenue
  • Provided trends for different products with regard to customer behavior
  • Worked with data scientists to provide data per their requirements
  • Involved in migration of the workflows from Cloudera 4.x to 5.x
  • Designed and implemented the ingestion and storage of metadata from various data providers
  • Involved in preparing the data model
  • Analyzed the data by running Hive queries and Pig scripts to understand customer purchase behavior
  • Implemented and utilized Pig user-defined functions
  • Involved in reading, writing, and updating the Cassandra database
  • Utilized the Oozie workflow engine to run various Sqoop, Hive, and Pig jobs.

Hadoop Developer

HSBC, Risk
03.2013 - 03.2014
  • Worked on the Global Markets and Risk Technology & Operations team
  • Focused on identifying the risk involved in various portfolios such as credit cards, home loans, and corporate loans
  • Sourced the data from various applications spanning multiple lines of business, including the corporate investment group
  • Defined the VAP process based on the data to derive insights from it
  • Targets included various other applications, such as asset forecasting and risk measurement, through stress testing of the data
  • Gained an understanding of the business by going through the LLD documents prepared by data analysts
  • Identified and implemented standards for workflow building, deployment, and release to production
  • Implemented data integrity checks in the workflow process along with notifications about the flow status
  • Involved in production support.

Java Developer

HSBC, Mortgage Services
08.2011 - 05.2012
  • CML data warehouse systems were developed to meet the business reporting and analytical requirements of the Marketing CIM team
  • Data is extracted from various operational systems based on its function, such as Foreclosure
  • This application determines the delinquent customers who will be given a chance to avoid foreclosure by settling the loan, either by liquidating physical assets or paying off a partial loan amount. The customer fills in a FAP application, which is fed to the system for stage-wise processing across 22 stages numbered 01 to 99: the application starts as a new application, followed by document submission, document verification, and further steps until it is approved or declined based on business requirements
  • After analysis and processing, stage marts are created and sent to the Customer Information Management and Reporting team for analysis and better decision making.

Mainframe Developer

One HSBC, Credit and Retail Services
04.2010 - 08.2011
  • CRS DW provides business decision makers with the ability to analyze customers, product profitability, risk, and market trends over time through various data marts
  • Its solutions help HSBC Technological Services achieve tactical and strategic goals by generating the information and vital statistics needed to make the right decisions.

Mainframe Developer

ASSURANT
06.2007 - 03.2010
  • Worked on BEST, the claims processing system of FORTIS, a health insurance provider
  • Assurant Health is one of four key business segments of Assurant Inc
  • Together with Assurant Employee Benefits, Assurant Solutions, and Assurant Specialty Property, these business segments have partnered with clients who are leaders in their industries and have built leadership positions in a number of specialty market segments in the U.S. and selected international markets
  • BEST (Better Examiner System Technology) is a medical and dental claim processing system, which captures claims and adjudicates and pays medical and dental insurance claims
  • It receives claims from manual and online sources and attempts to adjudicate them without manual intervention, passing each approved claim to back-end systems that pay the claim
  • When the processor has finalized the claim, a record is added to the claim transaction file, which is processed in the nightly batch cycle

Education

MS - Computer Application

SMU University, Osmania University

Career path course in Codecademy for NLP Specialist

MS - Data Science

CMU
2023

Skills

  • Data Science: data analysis, data visualization, feature engineering, data mining, machine learning, and data modeling
  • Machine Learning: Python, PySpark, and Python libraries such as NumPy, SciPy, scikit-learn, Pandas, TensorFlow, Keras, and Matplotlib; in-depth understanding of supervised and unsupervised machine learning algorithms, recommendation systems, and their implementation
  • NLP: hands-on experience with NLTK, TextBlob, spaCy, and Gensim for text cleaning, tokenization, normalization, chunking, POS tagging, language parsing, and language quantification; strong knowledge of machine learning and deep learning techniques such as bag-of-words, TF-IDF, word embeddings, word2vec, and RNN models like LSTM for sentiment analysis, chatbots, and text generation
  • Hadoop ecosystem: Spark, Scala, Java, Kafka, Hive, and Oozie
  • Cloud: Azure, GCP
  • SQL: MySQL, Postgres, Oracle
  • Tools: shell scripting, Anaconda, IntelliJ, Jupyter Notebook, SVN, and GitHub

Accomplishments

  • Collaborated with teams of 6+ in the development of multiple projects.
  • Delivered results on many projects through analysis and by supporting the team.
  • Earned Bravo awards and recognition for completing tasks with accuracy and efficiency.

Certification

  • NLP Training - June 2023
