Expedia Group Technology — Innovation
Gateways, Guardrails, and GenAI Models
An enterprise guide for implementing secure, controlled access to Generative AI models.
As the exploration and implementation of diverse generative AI models take center stage, numerous tech companies are pushing boundaries to heighten customer experiences and streamline operational efficiency. At Expedia Group™️ (EG), we have re-strategized and broadened our unified Machine Learning platform, integrating these generative AI models to enable quick and seamless adaptation across various use-case scenarios.
Navigating uncharted waters: The risks of relying on third-party generative AI systems
Problems with existing solutions
Existing platforms for models, such as OpenAI's ChatGPT or Azure AI Service, fall short of offering sufficient control to effectively share a single account or subscription across multiple applications. This limitation manifests in several drawbacks:
- An increase in one application's usage, in either request rate or token volume, can trigger rate limiting on another application's traffic.
- Cost tracking per application is not possible.
Moving forward: The unique features we needed
As a business, we emphasized integrating the following critical features for seamless functioning:
- Strong data protection measures to ensure no sensitive information leaves our secure network.
- Robust content filters to prevent transmitting inappropriate, offensive, or discriminatory content in response to customer inquiries.
- A streamlined access pattern that facilitates smooth movement across different models or platforms via a single authentication method, instead of platform-specific keys.
Additionally, becoming entrenched in a single platform can significantly hinder a team's ability to transition away from that vendor's offerings. Such vendor lock-in can seriously curtail an organization's flexibility and control, so it's crucial for organizations to opt for platforms and services that not only bolster robust data security and privacy but also ensure flexibility and interoperability. Organizations also need a dependable solution that prevents both intentional and unintentional data leaks from within their trusted networks. A stark illustration of this challenge is the incident in which Samsung employees inadvertently shared confidential data with OpenAI's ChatGPT, which underscores the real-world implications of such security vulnerabilities.
To fill these gaps and deliver the features we needed, we kick-started the GenAI toolkit: the GenerativeAI Proxy and the EG-Guardrails Service.
Generative AI Proxy (GAP)
Generative AI Proxy (GAP) is an essential service within the EG ML Platform designed to serve as a centralized entry point for clients using Generative AI (GenAI) APIs.
The primary objectives of GenAI Proxy are:
- Streamline use-case management and access control by offering standardized onboarding and business-approval procedures.
- Ensure fair resource sharing and proactive cost control by enforcing fine-grained rate limits, preventing any one client from unduly burdening shared resource pools (a minimal sketch follows this list).
- Govern requests and responses by filtering all traffic through an extensible list of guardrails, reducing the risks associated with sensitive data leaks, inappropriate content, and other undesirable interactions.
- Provide cost transparency by reporting the cost of each API call to the EG Cloud Cost platform, which allows cost to be examined at the application, team, and department level and incorporated into cost forecasts.
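To make the rate-limiting objective concrete, here is a minimal sketch of per-application quota enforcement in Python. The `AppLimits` structure, the one-minute window, and the example quotas are illustrative assumptions, not GAP's actual implementation.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AppLimits:
    """Illustrative per-application quota: requests and tokens per minute."""
    requests_per_minute: int
    tokens_per_minute: int
    window_start: float = field(default_factory=time.monotonic)
    requests_used: int = 0
    tokens_used: int = 0

    def allow(self, tokens: int) -> bool:
        """Return True if this request fits the app's quota for the current window."""
        now = time.monotonic()
        if now - self.window_start >= 60:  # reset the one-minute window
            self.window_start, self.requests_used, self.tokens_used = now, 0, 0
        if (self.requests_used + 1 > self.requests_per_minute
                or self.tokens_used + tokens > self.tokens_per_minute):
            return False  # short-circuit (e.g. HTTP 429) before hitting the vendor API
        self.requests_used += 1
        self.tokens_used += tokens
        return True


# One quota object per onboarded application, so a spike in one client's traffic
# exhausts only its own budget, never the shared account's capacity for others.
quotas = {"trip-planner": AppLimits(60, 90_000), "support-bot": AppLimits(120, 40_000)}
```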
Deep dive into GAP
GenAI Proxy is our central entry point for all generative AI APIs, including third-party platforms such as OpenAI and self-hosted Large Language Models (LLMs).
Clients may be internal services in the Commerce stack, or Expedia employees performing interactive ad hoc/exploratory work. Clients may also use existing third-party libraries and command-line tools.
API passthrough
One of GAP's design goals is that existing tooling for GenAI (such as libraries and command-line clients) should be usable with GAP. Therefore, GAP aims to make as few changes as possible to any proxied API. There are only a few specific exceptions so far:
- GAP authentication may not be identical to the third-party API's, because GAP adds an abstraction layer of EG internal auth instead of requiring clients to use each platform's unique onboarding and auth approach.
- If GAP rejects a request due to an issue such as an authentication failure, an exceeded rate limit, or an internal error, the response may have a different format than the third-party API's. Generally, we aim to improve these inconsistencies.
In the diagram below, you can see that GenAI Proxy essentially exposes the underlying API. Therefore, client libraries such as openai-python and langchain, which can interface with the OpenAI API, can also interface with GenAI Proxy.
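For example, with the current openai-python client, pointing an application at the proxy can be as simple as overriding the base URL. The hostname and token below are placeholders, not real EG endpoints:

```python
from openai import OpenAI

# Hypothetical GAP endpoint and internal auth token; placeholders only.
client = OpenAI(
    base_url="https://gap.example.internal/openai/v1",
    api_key="EG_INTERNAL_TOKEN",  # GAP's own auth layer stands in for the vendor key
)

response = client.chat.completions.create(
    model="gpt-4",  # model names pass through to the downstream API unchanged
    messages=[{"role": "user", "content": "Summarize my itinerary options."}],
)
print(response.choices[0].message.content)
```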
To route requests properly, GAP has route definitions that know how to transform GAP requests into requests to the downstream API. The diagram below provides an outline and example of the URL mapping that takes place.
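Conceptually, a route definition might look like the following sketch; the field names and URLs are hypothetical, but they capture the prefix-to-downstream mapping the diagram describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteDefinition:
    """Maps a GAP path prefix onto a downstream GenAI API (illustrative only)."""
    gap_prefix: str       # the prefix clients call on the proxy
    downstream_base: str  # the vendor or self-hosted base URL


ROUTES = [
    RouteDefinition("/openai/", "https://api.openai.com/"),
    RouteDefinition("/azure-openai/", "https://example.openai.azure.com/"),
    RouteDefinition("/self-hosted-llm/", "https://llm.example.internal/"),
]


def resolve(gap_path: str) -> str:
    """Rewrite a GAP request path into the downstream URL it proxies to."""
    for route in ROUTES:
        if gap_path.startswith(route.gap_prefix):
            return route.downstream_base + gap_path[len(route.gap_prefix):]
    raise LookupError(f"No route registered for {gap_path}")
```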
Request and response sequence
Each request from a client passes through a specific sequence of filters, both before it is forwarded to the downstream API and after the response is received. The diagram below illustrates the sequence of these filters and the points at which they can short-circuit with their own response.
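As a rough sketch of this pattern (not GAP's actual code), each filter either passes the request along or short-circuits with its own response:

```python
from typing import Callable

# A filter returns None to pass the request on, or a response dict to short-circuit.
Filter = Callable[[dict], dict | None]


def run_chain(request: dict,
              pre_filters: list[Filter],
              call_api: Callable[[dict], dict],
              post_filters: list[Filter]) -> dict:
    """Run pre filters, forward the request, then run post filters (illustrative)."""
    for f in pre_filters:  # e.g. auth check, rate limiter, pre guardrails
        if (short_circuit := f(request)) is not None:
            return short_circuit  # e.g. a 401 or 429 without touching the vendor API
    response = call_api(request)  # forward to the downstream GenAI API
    for f in post_filters:  # e.g. post guardrails, cost reporting
        if (short_circuit := f(response)) is not None:
            return short_circuit
    return response
```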
EG-Guardrails service
EG-Guardrails, referred to simply as Guardrails, is an essential component of the Generative AI Proxy (GAP) system, serving as the guardian of security and compliance in the realm of generative AI. Its primary function is to ensure that every interaction involving generative AI models adheres to stringent security protocols and compliance standards.
Guardrails are designed to enforce strict security measures, safeguarding against potential vulnerabilities or risks associated with the use of AI models. This is especially crucial in industries and applications where data security and regulatory compliance are paramount.
Operating within the GAP framework, the Guardrails service scrutinizes the traffic flowing into and out of the Large Language Models (LLMs). Its primary function is to safeguard the LLM environment by ensuring strict adherence to safety protocols and guidelines.
Pre guardrails
Pre guardrails are preventative measures that run security and compliance checks before any AI request is processed. These checks target the request body, the data being dispatched to our LLMs. Serving as the first line of defense, pre guardrails ensure the secure and responsible stewardship of your data.
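A pre guardrail could be as simple as rejecting prompts that appear to contain sensitive data. The regular expression below is a deliberately naive placeholder for the real detection logic:

```python
import re

# Naive placeholder pattern; real PII detection is far more sophisticated.
CREDIT_CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def pre_guardrail_pii(request: dict) -> dict | None:
    """Block a request before it leaves the network if the prompt looks sensitive."""
    prompt = " ".join(m.get("content", "") for m in request.get("messages", []))
    if CREDIT_CARD.search(prompt):
        return {"status": 400,
                "error": "Request blocked: possible sensitive data in prompt"}
    return None  # pass the request on to the next filter
```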
Post guardrails
As the final layer of our security and compliance infrastructure, post guardrails confirm that LLM responses meet our standards for safety, quality, and compliance before they reach end users. Their duty is not merely to evaluate but, when necessary, to apply further checks and transformations to the response data, ensuring both security and strict adherence to our guidelines.
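Symmetrically, a post guardrail inspects, and may transform, the model's output before it reaches the caller. The blocklist here is purely illustrative:

```python
BLOCKLIST = {"offensive-term-1", "offensive-term-2"}  # illustrative placeholder


def post_guardrail_content(response: dict) -> dict | None:
    """Redact model output that violates content policy (sketch only)."""
    for choice in response.get("choices", []):
        text = choice.get("message", {}).get("content", "")
        if any(term in text.lower() for term in BLOCKLIST):
            # Transform rather than fail: replace the answer with a safe fallback.
            choice["message"]["content"] = "[response withheld by content guardrail]"
    return None  # returning None lets the (possibly transformed) response through
```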
Summary
As the tech sector widens its exploration of various generative AI models, companies are adopting these advanced technologies to enhance customer experiences and drive operational efficiency. Expedia Group™️ stands front and center in this transformative trend, having reimagined its unified Machine Learning platform with integrated generative AI models, enabling agility across varied use-case scenarios.
In the expanding landscape of machine learning tools, several platforms have emerged that offer functionality similar to MLflow Gateway. However, their capabilities remain nascent, focusing mostly on simplifying the deployment and oversight of large language models and on the centralized management of API keys. These tools frequently overlook essential enterprise-grade features that many organizations require, including cost monitoring, rate limiting, and built-in application-level authentication, all of which our platform offers out of the box.
Central to this blog is a discussion of why we built these capabilities in-house, spurred by the limitations of existing platforms such as OpenAI's ChatGPT or Azure AI Service, particularly around controlling account usage across different applications and tracking costs.
This blog post sheds light on the specific features required for seamless execution: strong data protection, robust content filters, and a streamlined authentication process. It also underscores the risks associated with vendor lock-in, reaffirming the necessity of flexibility and interoperability in technology adoption.
In a quest to bridge these gaps, we introduced the GenAI toolkit, encompassing the GenerativeAI Proxy and the EG-Guardrails Service. This toolkit addresses the key challenges around security, compliance, guardrails, and gateway implementation for generative AI models, demonstrating our commitment to secure and responsible AI innovation.
We will keep extending the GenAI toolkit and the capabilities of these services in step with technology trends. In particular, we plan to extend GAP to more external vendors beyond OpenAI and Azure.