Shelly Guo

Sammamish, WA

Summary

Engineering leader with a proven track record at Microsoft, specializing in AI infrastructure and cloud architecture. Expert in scaling distributed systems and driving engineering excellence. Adept at cross-functional collaboration and team building, with a record of launching high-value Azure services, strengthening cybersecurity compliance, and significantly improving service reliability and performance.

Overview

30 years of professional experience

Work History

Partner Engineering Director

Microsoft Corporation
Redmond, WA
12.2023 - Current

Leading the Back Plane engineering team that powers the Infrastructure and Experiences of Azure OpenAI, Azure AI Foundry, Azure AI Services, and CoreAI Agent Platform. The team size exceeds 100 developers, and the portfolio consists of:

  • Asset Management Systems that provide the storage, distribution, and safeguarding of partner model assets (including OpenAI, Meta, Mistral, Cohere, and OSS model providers), customer data, agentic applications, evaluators, and evaluation results.
  • Resource Provider implementation that powers the control and monetization flows of Azure OpenAI, Azure AI Services, and the Agent Platform.
  • High-throughput, scalable middle-tier compute infrastructure that powers up to 300 microservices (including Azure OpenAI Application Layer) to deliver AI platform services and experiences to our customers.
  • Owner and primary driver of Microsoft AI offerings in sovereign and air-gapped clouds. Launched Azure OpenAI, Azure AI Services, and Azure AI Foundry in US Gov, US Sec, and US Nat.

Partner Group Engineering Manager

Microsoft Corporation
Redmond, WA
04.2021 - 12.2023
  • Built the AI Infrastructure Fundamentals team in Azure Machine Learning from the ground up, creating the Azure Singularity service, an internal model training and hosting infrastructure platform for Microsoft’s AI workloads.
  • Took the platform from incubation to a full-fledged Azure production service that became the hosting and scheduling platform for all Microsoft AI workloads, including Azure OpenAI, Bing Chat and Microsoft Copilot products.
  • Led the migration of the Microsoft AI Training platform from the legacy system to Singularity, moving 80K+ GPUs and delivering significant utilization and reliability gains for large training workloads.
  • Led the Secure Foundation Initiative and the Model Asset Protection program for AI Platform, focusing on cybersecurity architecture and IP protection for AI model providers.
  • Drove the implementation of security, compliance and governance controls across AI Platform and AI Services, ensuring they met enterprise standards for AI security and safety.

Principal Group Engineering Manager

Microsoft Corporation
Redmond, WA
02.2016 - 03.2021
  • Owner of the entire engineering pipeline spanning local developer experience, functional validation, pre-production validation, deployment automation, livesite management, compliance certification and customer support experience for Azure Cosmos DB, one of the fastest-growing Azure database services.
  • Designed, implemented and operationalized a release pipeline that is focused on ensuring service quality and deployment safety while increasing cadence and productivity.
  • Oversaw the service expansion from 20 clusters in 20 Azure regions to 100,000+ machines, 800+ clusters, 40+ regions and 4 different clouds with no major change-related incidents.
  • Led the initiative to create a comprehensive validation infrastructure system that forms one of the most effective QA pipelines for Azure Cosmos DB, a massively scaled, stateful service that powers many mission critical applications.
  • This validation service enabled Cosmos DB developers to rapidly iterate on the code base while keeping the quality of the code always at its highest level.
  • Guardian of service SLO Attainment for Azure Cosmos DB and owner of engineering KPIs for an organization of 250+ developers and growing.
  • Azure Cosmos DB has one of the most robust SLAs among all Azure services, a unique competitive strength that we are proud to offer.
  • Strong advocate for engineering agility and productivity while 100% committed to maintaining the highest service quality.
  • Established track record for improving engineering process through data and automation.
  • Owner of livesite management and root cause process.
  • Continuously refine on-call process to maintain service availability through implementation of better signals, automated corrective actions and auto analysis of root causes.
  • Led the compliance certification effort and achieved ISO, SOC and FedRAMP compliance for Azure Cosmos DB, accelerating the product to enterprise ready status, opening the door to a whole new set of high impact customers.
  • Led the transition of the service delivery platform from dedicated clusters to shared Azure Compute clusters, a move that resolved the ongoing capacity crisis and laid a solid foundation for better capacity management to accommodate rapid service growth while managing COGS responsibly.
  • Led the architectural design to replatform service delivery for future support of zone resiliency and private clouds.

Principal Engineering Manager

Microsoft Corporation
Redmond, WA
02.2015 - 02.2016
  • Led a team of 10 developers to redefine pre-release validation and developer experience, increasing release cadence and maximizing developer productivity for a large organization of 150+ developers.
  • Worked across several teams within and outside the division to evaluate and identify the best-in-class engineering system and developed a plan for transition to the next-gen system.
  • Architected a new “Test in Production” solution that builds core testing concepts directly into the product to remove any variations between test and production environments, making tests more reliable and test results a more accurate reflection of production quality.
  • Established a new data-driven approach to engineering velocity.
  • Focused on collecting engineering KPIs and using them to inform engineering investments, measure progress and drive accountability.
  • Maintained a set of internal infrastructure services that manage one of the largest test labs in the company, with more than 3K physical machines, 6K+ VMs and hundreds of workflows running daily.
  • Diagnosed issues and drove fixes to ensure SLA to partner teams.
  • A strong advocate for code quality, unit testing and engineering excellence.
  • Developed team-wide coding and testing standards and tools to enforce them.
  • Substantially improved team performance through recruitment, mentorship and performance management.

Senior Software Development Engineer & Lead

Microsoft Corporation
Redmond, WA
09.2009 - 08.2011
  • Led a team of 8 developers through multiple releases of Live Connect developer platform, a developer API with a set of cross-platform client libraries for Windows Live services including SkyDrive, Profile, Contacts and Mail.
  • It is still the most widely used public API for Live services today.
  • Designed and implemented the first OAuth 2.0 web authentication service for Microsoft Account and later facilitated the adoption of OAuth 2.0 as the official web authentication protocol for Microsoft Account.
  • Designed and implemented the Live Connect .NET client library for Windows and Windows Phone.
  • Designed and implemented a RESTful API service for Live Mesh and Windows Live, a precursor of the Live Connect developer API.

Software Design Engineer II & Senior Software Design Engineer

Microsoft Corporation
Redmond, WA
10.2003 - 02.2008
  • Implemented building blocks for Windows Workflow Foundation.
  • Delivered the object model, the designer and Visual Studio integration, the type system for the workflow designer, the compilation and serialization of workflows, etc.
  • Designed and implemented a page flow framework for ASP.NET using Windows Workflow Foundation.
  • Designed and implemented a fast in-app-domain WCF (Windows Communication Foundation) channel for .NET 4.0.

Software Design Engineer I & II

Microsoft Corporation
Redmond, WA
05.1998 - 10.2003
  • Designed and implemented a web service that provides a framework for building human workflow applications on top of BizTalk Server.
  • Designed and implemented a data source protocol for Office client applications, including Excel, FrontPage and Access, to access different data sources in a uniform manner.
  • Implemented a pluggable subscription manager agent to synchronize and replicate data for offline Access Data Pages.
  • Implemented the import/export functionalities in Access 2002 to allow data conversion between different data formats.

Microsoft Intern

05.1995 - 08.1997
  • Developed a C++ test framework for the Visual SourceSafe team to help automate the test-running process.
  • Built a data processing and visualization tool for the Paradise project, a parallel object-relational database system specializing in managing large amounts of geographic and scientific data.

Education

B.S. - Computer Science with Honors

University of Wisconsin – Madison
01.1998

Skills

  • AI infrastructure
  • Cloud architecture
  • Infrastructure scaling
  • Service-oriented architecture
  • Distributed systems
  • Data analysis
  • Cybersecurity compliance
  • DevOps practices
  • Livesite management
  • Engineering excellence
  • Key client relationships
  • Cross-functional collaboration
  • Recruitment and hiring
  • Team building
  • Vendor management
  • Problem solving
  • Effective communication
  • Mentorship skills
  • Conflict resolution
  • Web APIs
  • Workflow
  • Automation
  • Performance optimization
  • C#
  • .NET
