Machine Learning Engineer with a decade of experience in AI, machine learning, and data analytics, adept at integrating technologies such as Optical Character Recognition (OCR), Generative AI (GenAI), Large Language Models (LLMs), Convolutional Neural Networks (CNNs), and Deep Neural Networks (DNNs) into complex systems. Skilled in predictive analysis, data manipulation, data mining, and visualization, with a strong foundation in business intelligence. Proficient in training models on diverse cloud platforms for industries including finance, marketing, advertising, geospatial analysis, and IoT. Expertise in Natural Language Processing and Computer Vision, with a track record of building data ETL and machine learning pipelines and implementing end-to-end MLOps solutions. Recognized for delivering impactful solutions in dynamic and challenging environments.
Data Engineering and ETL:
Extensive experience as a data engineer building ETL pipelines.
Proficient with ETL tools across platforms: AWS, Azure, Databricks, Apache Spark, and Airflow.
Worked with various cloud platforms, with particular expertise in AWS and Azure.
Continuous Integration and Deployment:
Used CI/CD tooling on AWS and GitLab to integrate and deploy NLP solutions.
Wrote automation processes in Python, leveraging AWS Lambda for efficient and automated workflows.
Employed Docker for deployment on diverse platforms, including Linux, Windows, OSX, and AWS.
Big Data and Cloud Platforms:
Reviewed and implemented data ingestion and initial analysis processes using MongoDB, Node.js, and Hadoop.
Designed and deployed cost-effective infrastructure on AWS, optimizing functionality.
Scaled analytics solutions to Big Data using Hadoop, Spark/PySpark, and other relevant tools.
Experience with public cloud platforms such as Google Cloud, AWS, and Microsoft Azure.
Machine Learning and Analytics:
Applied machine learning techniques, including regression, classification, and deep neural networks.
Designed star and snowflake schemas for data warehouses, as well as ODS architectures.
Applied advanced statistical procedures to supervised and unsupervised learning problems.
Visualization and Presentation:
Designed clear, polished visualizations in Tableau for effective communication and presentation.
Published and presented dashboards and storylines on web and desktop platforms.
Database and SQL:
Worked with relational databases, including Teradata and Oracle, demonstrating advanced SQL programming skills.
NLP and Text Processing:
Applied various NLP methods for information extraction, topic modeling, parsing, and relationship extraction.
Developed, deployed, and maintained scalable production NLP models using NLTK, spaCy, and transformers for automated customer response.
DevOps and Data Platform:
Familiar with DevOps practices and tools for data platform development and deployment.
Proficient with Azure DevOps, GitHub, Azure Resource Manager, Azure CLI, PowerShell, and ARM templates for streamlined development processes.
Data Science Specialties: Natural Language Processing, Machine Learning, Internet of Things (IoT) analytics, Social Analytics, Predictive Maintenance, Stochastic Analytics
Analytic Skills: Bayesian Analysis, Inference, Models, Regression Analysis, Linear models, Multivariate analysis, Stochastic Gradient Descent, Sampling methods, Forecasting, Segmentation, Clustering, Naïve Bayes Classification, Sentiment Analysis, Predictive Analytics, Econometrics modelling
Analytic Tools: Classification and Regression Trees (CART), H2O, Docker, Support Vector Machine, Random Forest, Gradient Boosting Machine (GBM), TensorFlow, PCA, RNN, Linear and non-Linear Regression, Decision Tree, Stochastic Gradient Boosting, XGBoost, LightGBM
Visualization Analytic tools: Power BI, Tableau, R, Python, Matplotlib, Plotly, Seaborn
Analytic Languages and Scripts/ETL tools: R, Python, HiveQL, Spark, Spark MLlib, Spark SQL, Hadoop, Scala, Impala, MapReduce, Amazon S3, AWS Glue, Amazon EMR, Amazon Redshift, AWS Lambda, Azure Data Factory, Azure Data Lake Storage Gen2, Azure Synapse, Apache Airflow, DAGs, Databricks, Snowflake
Languages: Java, Python, R, Scala, C/C++, JavaScript, SQL, SAS
Python Packages: NumPy, pandas, scikit-learn, TensorFlow, Keras, SciPy, Matplotlib, Seaborn, Plotly, NLTK, Scrapy, Gensim, PyTorch
Version Control, CI/CD: GitHub, Git, SVN, GitLab, Gitbucket, Jenkins, Kubernetes
IDE: Jupyter Notebook, VS Code, IntelliJ IDEA, Spyder, Eclipse
Data Query: Azure, Google BigQuery, Amazon Redshift, Kinesis, EMR; HDFS, RDBMS, SQL, MongoDB, HBase, Cassandra, and other SQL and NoSQL databases, data warehouses, and data lakes
Deep Learning: Machine Perception, Data Mining, Machine Learning algorithms, Neural Networks, TensorFlow, Keras
Large Language Models: LangChain, OpenAI
Neural Networks and Deep Learning
Credential URL: https://www.coursera.org/account/accomplishments/records/K9ZG3Q5ZZMXB
Credential ID: K9ZG3Q5ZZMXB
Introduction to Generative AI
Credential URL: https://coursera.org/share/1653218909455f556a4048e57fc78b60
Credential ID: JH3QSTTZPFHD