Accomplished Staff Machine Learning Engineer at Personio Inc., specializing in optimization of generative AI models and leadership of cross-functional teams. Demonstrated success in improving model serving efficiency by 45% through advanced performance profiling and technical guidance, utilizing expertise in Python and MLOps to deliver impactful AI solutions.
Overview
26
years of professional experience
Work History
Staff Machine Learning Engineer
Personio
New York, NY
10.2024 - Current
AI Infrastructure Development: Designed and implemented agentic conversational AI systems, reducing inference latency by 40% through custom CUDA kernels and model optimization techniques
Technical Leadership: Lead a 5-person ML engineering team in architecting the AI infrastructure roadmap, establishing performance benchmarks, and driving adoption of hardware-accelerated ML solutions across product verticals
Systems Integration: Built MCP (Model Context Protocol) interfaces for seamless integration between AI services and backend systems, ensuring optimal resource utilization and scalability
Performance Optimization: Implemented GPU memory optimization strategies and distributed inference pipelines for real-time AI applications
Startup Founder
Algorithmical Corp
, MA
02.2023 - Current
High-Performance Trading Systems: Developed a fully automated day-trading platform with microsecond-latency requirements, implementing custom C++ algorithms optimized for real-time market data processing
ML Infrastructure: Built scalable recommendation engine using distributed computing frameworks, optimizing for both CPU and GPU acceleration
Systems Architecture: Designed end-to-end ML pipeline with hardware-aware model deployment, achieving 60% improvement in inference throughput
Staff Machine Learning Engineer
Warner Bros. Discovery, Inc.
02.2022 - 10.2024
ML Systems Optimization: Designed and optimized generative AI models and recommendation engines for MAX platform, implementing hardware-specific optimizations that improved model serving efficiency by 45%
Infrastructure Development: Built and deployed large-scale AI systems on AWS with custom GPU clusters, implementing MLOps practices for continuous model optimization and hardware resource management
Cross-functional Leadership: Collaborated with hardware engineering teams to optimize AI workloads for specific GPU architectures, translating business requirements into technical specifications for hardware-accelerated solutions
Performance Engineering: Implemented advanced profiling and optimization techniques for transformer models, achieving significant improvements in training and inference performance
Alexa AI Infrastructure: Led development of hardware-optimized generative AI systems for Alexa platform, implementing efficient memory management and compute optimization for edge devices
ML Framework Development: Built and optimized NLP models with transformer architectures, focusing on hardware acceleration and real-time performance requirements
Systems Integration: Deployed large language models on distributed AWS infrastructure with custom optimization for GPU utilization and cost efficiency
Cross-team Collaboration: Partnered with hardware teams to optimize AI workloads for Amazon's custom silicon, ensuring optimal performance across different device architectures
AI Systems Development: Developed and deployed AI-powered solutions including chatbots and recommendation systems, with focus on performance optimization for telecom infrastructure
Big Data Infrastructure: Built and optimized ML pipelines using Spark and Hadoop, implementing distributed computing solutions for large-scale data processing
MLOps Implementation: Led implementation of MLOps practices ensuring reliability and scalability of AI systems across multiple platforms and hardware configurations
Hardware Integration: Collaborated with network hardware teams to optimize AI workloads for telecom equipment, focusing on power efficiency and real-time processing requirements
High-Performance Systems: Extensive experience in C/C++ development for telecom infrastructure, including real-time signal processing, network protocol implementation, and embedded systems optimization
Hardware-Software Integration: Deep expertise in SW/HW co-design for telecom equipment, optimizing software for specific hardware architectures and performance constraints
Distributed Systems: Built scalable, fault-tolerant systems for telecom networks with focus on low-latency, high-throughput requirements
Performance Optimization: Advanced profiling and optimization techniques for resource-constrained environments, including memory management and computational efficiency
Various Companies | 15 Years Combined Experience
Education
Master of Science - Data Science
University of Michigan
Ann Arbor, MI
12.2025
Master of Science - Entrepreneurship
University of Massachusetts
12.2003
Master of Science - Software Engineering
Illinois Institute of Technology
Chicago, IL
09.1999
Bachelor of Science - Electrical & Electronics
National Institute of Technology
Surathkal, India
09.1997
Skills
ML Systems & Infrastructure: Distributed AI systems, performance profiling and optimization
AI Frameworks & Tools: PyTorch, TensorFlow, CUDA, OpenMP, MPI, Kubernetes, Docker, AWS, MLOps pipeline development
Programming & Systems: Advanced C/C++ (15 years), Python, CUDA programming, parallel computing, high-performance computing, embedded systems optimization
ML Domains: Generative AI, NLP, recommendation systems, ranking models, transformer architectures, model quantization and optimization
Leadership: Technical team leadership, cross-functional collaboration, mentorship, project management, AI strategy development
Technical Mentorship: Actively mentor junior engineers and research scientists, focusing on ML systems design, performance optimization, and career development
Cross-functional Leadership: Experience leading large-scale projects across multiple teams, driving technical decisions and ensuring alignment with business objectives
Quality Engineering: Established engineering best practices and code review processes that improved overall team productivity and system reliability