SAHITHYA MANNARU

Jersey City, NJ

Summary

Data Engineer with over 5 years of experience developing and implementing large language model (LLM), NLP, and machine learning solutions. Proficient in Python, PySpark, and SQL, with extensive experience in AWS and Azure for scalable infrastructure. Proven expertise in data warehousing, ETL processes, and big data technologies such as Apache Hadoop and Spark. Skilled in using Airflow for scheduling and workflow management, and experienced in analytics and monitoring to support data-driven decision-making and business transformation.

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer

Synechron
New York, USA
10.2023 - Current
  • Utilized OpenAI and PandasAI for extracting plots and summaries from uploaded documents like PDFs and Word documents
  • Led the adaptation of the application by evaluating outputs from language models and tooling such as Falcon, Llama, Hugging Face, and LangChain, broadening the platform's capabilities for users
  • Spearheaded the integration of OpenAI models and open-source LLMs such as Llama 2 and Mistral into enterprise products, driving innovation in natural language processing capabilities and enhancing product offerings
  • Developed and deployed chat and IVR solutions for the banking sector, utilizing LLMs to process and understand customer queries, resulting in a 30% improvement in customer service response times
  • Implemented Lean Six Sigma methodologies to streamline AI model deployment processes, reducing errors by 15% and increasing deployment efficiency by 25%
  • Engineered and maintained high-performance applications using vector and graph databases such as Pinecone and Neo4j, optimizing data retrieval processes and supporting complex data relationships
  • Designed and implemented real-time data processing pipelines using PySpark to handle large volumes of financial transactions, ensuring low-latency and high-throughput data processing
  • Developed and deployed advanced BI solutions using NLP and machine learning, enabling USBank to gain deeper insights into customer behavior and market trends
  • Used the LangChain and Haystack frameworks with Airflow to design and implement agentic workflows, supporting more dynamic, responsive, and scalable GenAI applications
  • Utilized MongoDB and FAISS to optimize data retrieval processes, significantly enhancing the precision and speed of business intelligence reports for portfolio management
  • Integrated Hugging Face's Transformers library for advanced natural language processing tasks, such as sentiment analysis of market news and research reports
  • This integration gave portfolio managers deeper insight into market sentiment, enabling them to react swiftly to changing market conditions and investor sentiment
  • Adopted Azure AI for streamlined machine learning workflows, enabling predictive analytics and market trend analysis, which aided portfolio managers in making data-driven investment decisions
  • Used SERP API for automated web scraping and data extraction from financial websites and news portals
  • This automation minimized manual efforts and ensured the accuracy and timeliness of data collected for research reports

Data Engineer

General Motors
Detroit, USA
04.2022 - 04.2023
  • Leveraged Python and Unix shell scripting to efficiently process and manipulate large datasets, leading to a remarkable 20% improvement in data accuracy and a 15% reduction in cleaning time
  • Automated scripts significantly enhanced data processing efficiency
  • Conducted extensive data querying, optimization, and maintenance tasks on Oracle Exadata databases, resulting in a remarkable 70% improvement in data processing and retrieval speeds
  • Collaborated with database administrators and data engineers to optimize SQL queries for enhanced performance
  • Collaborated with cross-functional teams to successfully implement Agile methodologies, fostering efficient collaboration between developers, data analysts, and stakeholders
  • Agile adoption led to improved project execution, faster delivery, and continuous process improvement
  • Designed and maintained highly scalable data pipelines, enabling real-time processing of petabyte-scale data and delivering actionable business intelligence to stakeholders
  • Utilized Informatica to skillfully design and implement efficient ETL processes, reducing data integration time by 30% and improving data accuracy by 20%
  • Created data mappings, workflows, and transformations for smooth data flow and seamless integration
  • Developed PySpark-based real-time data pipelines to process and analyze vehicle telemetry data, enabling proactive maintenance and reducing downtime by 15%
  • Implemented and maintained shell scripts to automate routine data tasks such as data extraction and loading, reducing manual effort by 40% and improving data processing efficiency
  • Developed and optimized sophisticated data integration workflows using Informatica ETL, ensuring seamless data flow between various systems
  • Streamlined data integration, improved data reliability, and enhanced overall system performance

Data Engineer

Quantum Technologies Private Limited
Hyderabad, India
08.2018 - 05.2021
  • Executed over 250 test scripts using Python to validate critical software elements, ensuring the high performance of the application with an impressive accuracy rate of 99.8%
  • The comprehensive testing strategy fortified software functionality and enhanced reliability
  • Engineered SQL queries and database protocols to optimize data processing and retrieval, improving transparency by 10% and organizational efficiency by 39%
  • Collaborated closely with software engineers to integrate new business models into existing systems using Python, improving overall system functionality by 25% and reducing manual workload by 20%
  • Facilitated the development of 10 internal operating systems, significantly shortening feedback turnover time and elevating operational efficiency
  • The successful implementation resulted in improved productivity and system usability
  • Spearheaded the migration from on-premise servers to AWS cloud infrastructure (EC2, S3, RDS), catalyzing seamless scalability and cost optimization
  • This transformation harnessed the benefits of cloud computing, ensuring high availability
  • Leveraged Lean Six Sigma principles to optimize data integration workflows, achieving a 20% reduction in ETL processing time and improving data quality
  • Employed Informatica to expertly design, develop, and optimize data integration workflows, facilitating seamless data flow and integration across systems
  • This meticulous approach streamlined data accessibility and ensured timely information delivery

Education

Master of Science - Computer Science

Stevens Institute of Technology
Hoboken, NJ
05.2023

Skills & Expertise

Programming Languages : Python, Java, R, SQL, PL/SQL, NoSQL
Databases : MySQL, MongoDB, Oracle, PostgreSQL, Snowflake, Oracle Exadata
Cloud Technology : Amazon Web Services (AWS), Azure AI
Big Data : Apache Hadoop, HDFS, MapReduce, Hive, HBase, Spark (PySpark)
ML Frameworks : Flask, NumPy, Pandas, Scikit-learn, TensorFlow, Keras, Matplotlib, PyTorch, Seaborn
Web Development : HTML5, CSS3
Scheduling & CI/CD Tools : Airflow, GitLab, Jenkins, Kubernetes, Jira, Ansible
Data Warehouse : Prism, data mapping, Informatica ETL
Large Language Models : HuggingFace, OpenAI, Llama
Other Tools : Power BI, Tableau, Excel, AutoSys

Certification

AWS Certified Solutions Architect - Associate, Azure AI Engineer Associate, HackerRank Python Certification, IBM Python Certification
