Travel FAQs made smarter: AI-driven answers from real-world discussion
How we harnessed generative AI to uncover patterns in traveler conversation and delivered clear, concise, and insightful questions and answers for destinations all over the world.
Tripadvisor is the world’s largest travel guidance platform, which is a fact worth repeating to highlight just how many reviews and contributions (over ) we manage. Each year, over 30 million reviews are submitted, along with more than 7 million photos. This huge volume of shared experience data helps our travelers to make informed decisions when planning and experiencing their dream vacations.
Travelers share more than just reviews on Tripadvisor. Our travel forums are a hidden gem with impressive reach — roughly 945 topics and 6,500 new posts daily (almost 2.4 million annually). Unlike reviews, forum content follows a conversational format, where each topic typically begins with a question. These discussions often provide more detailed and nuanced information than reviews, especially when local experts recommend hidden gems or when regular forum members discuss where to find the best deli sandwich in New York City.
Last year, we wrote about how we used generative AI to summarize review content while preserving the voice of the traveler. We focused on capturing both positive and negative opinions without bias, while doing our best to ensure transparency and impartiality. Given the trust that travelers place in Tripadvisor, it made perfect sense to use this approach to generate reliable travel advice content from our extensive forums.
Introducing Travel Advice
With our new Travel Advice feature, we’ve distilled the wealth of insights from our rich traveler forums (using English-language content only) down to the most frequently asked questions on popular travel topics. At the same time, we’ve scoured forum discussions to find the most helpful answers, carefully summarizing them to capture the genuine voice and intent of the original traveler. We present this content to the traveler in an approachable, easy-to-read format, while ensuring full transparency about its sources.
To improve accessibility, we’ve added a Travel Advice section at the bottom of landing pages for major geographical locations (or “geos” as we call cities and towns internally). Each question appears in its own expandable dropdown. Travelers can simply click the down arrow on any question to expand it and view the complete answer.
Each answer is structured into multiple paragraphs with highlighted section titles, ensuring readability and easy navigation. We prioritized clear, concise phrasing to enhance comprehension. You’ll notice a consistent summarization tone throughout, along with a message at the bottom of each section indicating that the summary was generated by AI.
To acknowledge the community members who shared this valuable information, we’ve included a “See related posts” link at the bottom of each Travel Advice section. Clicking this link opens a compact dialog showing card-like previews of related forum posts, with clear attribution to the original contributors. Each card contains a direct link to the full forum post, allowing travelers to access more detailed information while recognizing and rewarding the community members who created this content.
Our goal with this approach was to provide the necessary transparency, demonstrating that our AI-generated summaries remain true to the original forum user’s voice in answering the question.
Why forum content?
Tripadvisor.com/Forums hosts forums dedicated to geographical areas, such as cities or towns, where users can ask questions, receive responses, and engage with other travelers all over the world in multiple languages. In addition to geo-based forums, we also host themed forums, such as Road Trips and Honeymoons and Romance, that cater to specific travel interests. To ensure forum content remains accurate and timely, we rely on our Destination Expert program, where knowledgeable experts, independent from Tripadvisor, share their insights and answer questions. Together with our forum members, these experts help make our forums a rich source of expert-driven and community-curated travel knowledge.
A recent internal survey highlighted that forum content is the second most helpful source of information on our Tourism pages, trailing only the Explore by Interest category. This finding was further validated by recent tests, which demonstrated that well-structured FAQ content has helped increase SEO traffic to the forums.
Before diving in, it’s worth noting that Tripadvisor already features FAQ content for some of the most popular destinations. While useful and well-visited, these FAQs consist of static content that was created by Tripadvisor employees to address common traveler questions. In most cases, these FAQs are limited to a list of common points of interest, like hotels or restaurants for a given geo, but they lack the depth and nuance of real traveler discussion. While these FAQs have been effective, scaling and maintaining them manually has proven to be a challenge.
It’s clear that travelers benefit from the detailed insights found in our forums. Recognizing this, we saw an opportunity to apply generative AI to better surface and organize this valuable community-contributed content.
Overcoming the challenges
Machine learning is an incredibly powerful tool for sorting and structuring vast amounts of information. However, before we could begin, we had to address several key challenges, such as:
- How to identify the right set of questions?
- How to pair them with accurate and relevant answers?
- How to maintain accuracy and ensure that AI-summarized responses align with the original source?
Uncovering patterns in traveler discussions
We started by analyzing forum content to understand how travelers engage with discussions. It quickly became clear that, regardless of the destination, most travelers tend to ask similar types of questions.
For example, when researching New York City, we noticed many forum topics where travelers asked similar questions about visiting museums — such as when to visit and how to get there. The same was also true for other destinations like Paris and Berlin. By studying these trends, we realized that the most valuable content often comes from the forum thread titles and the first post in each discussion. Titles are written with concise, clear language, while the first post tends to contain more comprehensive detail about the specific question or information being sought.
This simple insight gave us a clear starting point.
Grouping similar questions
Once we had identified the most relevant content, the next challenge was to cluster similar questions to determine the most frequently asked topics for the most effective FAQ content.
We used the bge-base-en-v1.5 embedding model to encode forum titles and first posts as vector representations. By clustering these vector representations, we identified the most common question topics across different destinations.
Given the vast differences in forum sizes across destinations, we needed an algorithm that could scale dynamically. For example, a city like Paris, France, has a massive forum that could easily generate thousands of clusters, whereas a city like Madison, Wisconsin, with a moderate-sized forum, might only generate a dozen or more clusters.
To compare different clustering algorithms, we created a small dataset including a handful of destinations and ran each algorithm to determine which produced the most logical clusters.
We started by testing the K-means algorithm, but it required us to pre-specify the number of clusters before running the operation. This was not feasible, given the unpredictability of data volumes. We also tried the HDBSCAN algorithm, but it struggled to produce enough meaningful clusters for our needs. Ultimately, we realized that we needed to create a custom algorithm that we could tailor to meet the needs of our dataset.
The examples below show typical clusters that were generated by our algorithm:
The first cluster focuses on advice for first-time travelers to New York City, the second on transportation from JFK to the city center, and the third on winter and Christmas suggestions for the city.
At a very high level, the algorithm works as follows:
- It computes embedding similarities using a Cartesian join, storing these similarities in descending order.
- It then uses a greedy approach to group the most similar forum threads. For example, imagine that two threads, A and B, are very similar and would logically belong to the same cluster. If a third thread, C, is not similar to either, then it will form its own cluster and wait for similar threads to join it in the future.
- With each new cluster, it dynamically decides whether a new thread should join an existing cluster or form a new one.
Pseudo code of the algorithm:
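    import numpy as np
    # Assumption: cosine_similarity is scikit-learn's pairwise implementation.
    from sklearn.metrics.pairwise import cosine_similarity

    def embedding_clustering(
        unclustered_embeddings,
        sim_threshold,
        min_samples,
    ):
        # Pairwise cosine similarity between all thread embeddings (n x n).
        sim = cosine_similarity(unclustered_embeddings, unclustered_embeddings)
        # For each thread, the other threads sorted from most to least similar.
        descending_indices = np.argsort(-sim, axis=1)
        # Visit threads with the most neighbors above the threshold first.
        n_above_threshold = np.sum(sim >= sim_threshold, axis=1)
        ordered_indices = np.argsort(-n_above_threshold)
        clustered = set()  # indices already assigned to a cluster
        clusters = []
        for i in ordered_indices:
            if i not in clustered:
                indices_ranked = [
                    j for j in descending_indices[i]
                    if sim[i, j] >= sim_threshold and j not in clustered
                ]
                # Keep the group only if it has enough members.
                if len(indices_ranked) >= min_samples:
                    clustered.update(indices_ranked)
                    clusters.append(indices_ranked)
        return clusters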
While developing our own greedy clustering algorithm was a risk, it became clear early on that it suited our specific needs and could be scaled easily.
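To make the greedy behavior concrete, here is a minimal, dependency-free sketch of the same idea applied to toy data. The 2-D vectors, the `greedy_clusters` name, and the threshold values are illustrative assumptions, not values from the production system.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def greedy_clusters(vectors, sim_threshold, min_samples):
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    # Number of neighbors (including itself) above the threshold, per item.
    density = [sum(s >= sim_threshold for s in row) for row in sim]
    # Seed clusters from the densest items first.
    order = sorted(range(n), key=lambda i: -density[i])
    clustered, clusters = set(), []
    for i in order:
        if i in clustered:
            continue
        # Unclustered neighbors of i, most similar first.
        members = [
            j for j in sorted(range(n), key=lambda j: -sim[i][j])
            if sim[i][j] >= sim_threshold and j not in clustered
        ]
        if len(members) >= min_samples:
            clustered.update(members)
            clusters.append(members)
    return clusters

# Toy "embeddings": two tight groups and one item sitting between them.
toy = [
    [1.0, 0.0], [0.98, 0.05], [0.95, 0.1],  # group along the x axis
    [0.0, 1.0], [0.05, 0.99],               # group along the y axis
    [0.7, 0.7],                             # similar to neither group
]
clusters = greedy_clusters(toy, sim_threshold=0.95, min_samples=2)
print(clusters)
```

With these toy inputs, the first three items end up in one cluster and the next two in another, while the last item stays unclustered because it has no neighbor above the threshold. Note that nothing here fixes the number of clusters in advance; it falls out of the threshold and the data.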
Recreating questions
Once we organized the most commonly asked questions into logical clusters, we needed to recreate a single representative question in clear, easy-to-read language.
To achieve this, we prompted GPT-4 Turbo to first generate a single, well-structured question for each cluster and then assign a higher-level category that encapsulates the cluster’s theme. The purpose of this broader categorization was to ensure applicability across many geos. For example, questions related to museums could be grouped under a more general category like Cultural and Recreational Experiences.
As we reviewed the generated questions and categories, we recognized the need to establish clear criteria for what makes an effective question. A question that is too general would apply to all geos but would offer little contextual value, while one that is too specific would limit its relevance to a small subset of travelers. Finding the proper balance was a challenge that generative AI alone couldn’t solve; we addressed this through a human-in-the-loop review process that guided the model toward the right level of specificity.
We soon developed the following five essential criteria that a question must meet to be considered high quality:
- Balanced specificity: A question must strike a balance between abstraction and specificity. While not too precise, it should reflect the destination’s unique characteristics.
- Timeless relevance: A question should not focus on real-time events, such as “Where is Taylor Swift’s July 2024 concert in London?”
- Focused topic: Ideally, a question should focus on a single topic. For example, “What are the best accommodations and dining options in London?” should be two separate questions: “What are the best accommodations in London?” and “What are the best dining options in London?”.
- Uniqueness: Every question must be unique in content.
- Concise wording: A question should be between 5 and 15 words long for clarity.
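Some of these criteria lend themselves to simple automated pre-checks before human review. The sketch below is one possible way to flag candidates, not a description of the actual pipeline: the `check_question` helper and both regex/keyword heuristics are hypothetical, and only the 5 to 15 word range comes from the criteria above.

```python
import re

def check_question(question, min_words=5, max_words=15):
    """Return the criteria a question appears to violate (crude heuristics)."""
    problems = []
    n_words = len(question.split())
    if not min_words <= n_words <= max_words:
        problems.append("concise wording")
    # Hypothetical heuristic: an explicit year suggests a time-bound event.
    if re.search(r"\b(19|20)\d{2}\b", question):
        problems.append("timeless relevance")
    # Hypothetical heuristic: " and " may be joining two separate topics.
    if " and " in question.lower():
        problems.append("focused topic")
    return problems

print(check_question("Where is Taylor Swift's July 2024 concert in London?"))
print(check_question("What are the best accommodations and dining options in London?"))
print(check_question("What are the best dining options in London?"))
```

Heuristics like these produce false positives (a question about a single topic can legitimately contain "and"), so they are better suited to flagging candidates for review than to rejecting them outright.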
Guided by these criteria, we iteratively refined the initial category assignments and ultimately consolidated them into a list of 34 higher-level categories, ensuring consistent applicability across geos.
Overall, our three-stage process for generating questions was as follows:
- The model extracts the topics in each cluster and drafts an initial question.
- The questions are refined for consistency and reliability via another model call.
- Similar questions are eliminated using an embedding-based comparison.
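The final deduplication stage can be sketched as a single pass that keeps a question only if it is not too close to one already kept. This is a minimal illustration under stated assumptions: the toy 3-D vectors stand in for real question embeddings, and the `drop_near_duplicates` name and the 0.9 threshold are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def drop_near_duplicates(questions, embeddings, threshold=0.9):
    """Keep a question only if no previously kept question is too similar."""
    kept = []  # indices of questions retained so far
    for i, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[k]) < threshold for k in kept):
            kept.append(i)
    return [questions[i] for i in kept]

questions = [
    "What are the best museums to visit in Paris?",
    "Which museums are the best to visit in Paris?",
    "How do I get from the airport to central Paris?",
]
# Toy vectors: the first two questions point in nearly the same direction.
embeddings = [
    [0.90, 0.10, 0.00],
    [0.88, 0.12, 0.02],
    [0.10, 0.90, 0.20],
]
unique_questions = drop_near_duplicates(questions, embeddings)
print(unique_questions)
```

Because the pass is order-dependent, whichever phrasing comes first survives; sorting the input by some quality score beforehand would be one way to control which duplicate is kept.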
Generating answers to the questions
Now that we had successfully created, categorized, and shortlisted the questions, we needed to extract relevant answers from forum discussions. To achieve this, we embedded all of the forum posts.
We stored all forum embeddings in Qdrant, our vector database, allowing for fast and efficient retrieval. These embeddings not only power our FAQ generation but also support agentic workflows (automated decision-making) that power Tripadvisor’s AI assistant and other GenAI experiences.
To optimize the quality and efficiency of our retrieval process, we focused on improving two distinct areas:
- Embedding models: For retrieval, we tested two different embedding models, all-mpnet-base-v2 and bge-base-en-v1.5. We conducted an internal survey asking colleagues to rate the quality of generated answers. The results of this analysis indicated that bge-base-en-v1.5 performed slightly better, and therefore we chose it as our primary embedding model.
- Chunking strategy: We also experimented with semantic chunking vs traditional post-level chunking. We decided to use semantic chunking to generate text embeddings at the sentence level before grouping semantically related sentences into coherent chunks. Compared to post-level chunking, this semantic approach reduced the total volume of embeddings by 50%, significantly lowering the cost and improving retrieval efficiency with Qdrant.
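One common way to implement semantic chunking is to merge consecutive sentences for as long as neighboring sentence embeddings stay similar. The sketch below illustrates that variant; the exact grouping rule used in production is not described here, so the adjacent-sentence rule, the `semantic_chunks` name, the 0.8 threshold, and the toy vectors are all assumptions.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def semantic_chunks(sentences, embeddings, threshold=0.8):
    """Group consecutive sentences whose embeddings stay similar."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    pairs = zip(embeddings, embeddings[1:], sentences[1:])
    for prev_emb, curr_emb, sentence in pairs:
        if cosine(prev_emb, curr_emb) >= threshold:
            chunks[-1].append(sentence)  # same topic: extend current chunk
        else:
            chunks.append([sentence])    # topic shift: start a new chunk
    return [" ".join(chunk) for chunk in chunks]

sentences = [
    "Take the train from the airport.",
    "The train runs all night.",
    "For food, try the market near the station.",
    "The market stalls close early.",
]
# Toy 2-D vectors standing in for sentence embeddings.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
chunks = semantic_chunks(sentences, embeddings)
for chunk in chunks:
    print(chunk)
```

Here the four sentences collapse into two chunks, one about the train and one about the market, so only two vectors would need to be stored instead of four.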
The illustration below describes our semantic chunking approach:
Due to the nature of forum discussions, context around each post is crucial. For example, a traveler might agree or disagree with other posts in the thread. Without that context, there’s a risk of confusion regarding a traveler’s point of reference. To address this, we don’t simply feed the retrieved chunks into GPT. Instead, we include multiple surrounding posts to provide broader context, helping to achieve a more accurate and coherent response.
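Expanding a retrieved post to its neighbors can be as simple as taking a window around its position in the thread. The following is an illustrative sketch only: the `with_surrounding_posts` helper, the window size, and the sample thread are hypothetical.

```python
def with_surrounding_posts(thread_posts, hit_index, window=1):
    """Return the retrieved post plus up to `window` posts on each side."""
    start = max(0, hit_index - window)
    end = min(len(thread_posts), hit_index + window + 1)
    return thread_posts[start:end]

thread = [
    "Is the museum pass worth it?",
    "Yes, if you visit three or more museums.",
    "I disagree, the queues are the real issue.",
    "Agreed with the previous poster.",
    "Thanks, all!",
]
# Suppose retrieval matched post 3, which is meaningless on its own.
context = with_surrounding_posts(thread, hit_index=3, window=1)
for post in context:
    print(post)
```

With the neighbors included, "Agreed with the previous poster." can be read against the post it refers to, which is the kind of reference that would otherwise be lost.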
Once we had successfully chunked the answers, we used each question from the previous step to search Qdrant for relevant posts. We then gathered all relevant answers and fed them into GPT to generate a summarized response.
The key term here is summarize. We made sure to maintain a summarization tone, clearly conveying to travelers that the answers originated from forum discussions. Furthermore, we prioritized preserving the voice of the traveler, striving to represent their perspective as authentically as possible within the model’s capabilities.
Ensuring accuracy and transparency
When summarizing traveler-generated content from forums, it’s essential to ensure the AI-generated content accurately reflects the voice of the original forum post. We needed a machine learning approach to identify and share the most relevant supporting sentences from the retrieved posts. These supporting sentences serve as evidence, reinforcing the travel advice provided in the summarized answers.
Initially, we experimented with GPT to simultaneously generate a summarized answer and explicitly list the supporting sentences it used for that summary. However, we quickly identified multiple issues with this concurrent-task approach:
- First, the model struggled to perform both tasks simultaneously; asking it to list the supporting sentences significantly degraded the quality of the generated summaries.
- Second, the relevance and quality of the supporting sentences selected by GPT varied widely across geos and even across different questions, meaning that we didn’t consistently receive the most relevant sentences used in the generated summary.
After some experimentation, we found the following high-level approach to be a more scalable and stable solution:
- The most relevant context from the forum threads was retrieved using Qdrant to answer each question.
- GPT was prompted to answer the question using the retrieved context.
- The retrieved posts and generated answers were then split into individual sentences.
- Next, the data was compared at a sentence-to-sentence level by calculating cosine similarity between sentence embeddings, enabling the identification of semantic similarities between the generated answers and the forum posts.
- Finally, the top forum sentences with the highest similarity scores were retrieved and stored alongside the answer.
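The sentence-matching steps above can be sketched end to end in a few lines. To keep the example self-contained, a bag-of-words count vector stands in for a real sentence-embedding model, and the sentence splitter is deliberately naive; both are assumptions, as are the `supporting_sentences` name and the sample texts.

```python
import math
import re
from collections import Counter

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def embed(sentence):
    # Stand-in for a sentence-embedding model: bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def supporting_sentences(answer, posts, top_k=2):
    """Rank forum sentences by their best match to any answer sentence."""
    answer_vecs = [embed(s) for s in split_sentences(answer)]
    scored = []
    for post in posts:
        for sentence in split_sentences(post):
            vec = embed(sentence)
            score = max((cosine(vec, a) for a in answer_vecs), default=0.0)
            scored.append((score, sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

answer = (
    "Travelers recommend taking the train from the airport. "
    "Many suggest buying tickets in advance."
)
posts = [
    "We took the train from the airport and it was easy. Taxis were expensive.",
    "Buy your tickets in advance if you can!",
]
top = supporting_sentences(answer, posts)
for score, sentence in top:
    print(round(score, 3), sentence)
```

Because the matching runs after generation rather than inside the prompt, the summarization step is left to do one job, and the evidence selection becomes a deterministic similarity ranking.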
The following table shows a typical comparison of summarized answers and forum post sentences, along with their similarity scores. The highlighted text shows keywords and phrases that are common in both the original post and the generated answer:
This approach has proven highly effective and generates reliable results across multiple geos. As a result, we now provide travelers with trustworthy, AI-summarized answers, along with relevant, authentic forum posts.
Summing it all up
Much like our review summaries project from last fall, this effort proved to be both challenging and rewarding.
Initially, isolating the right content from within such a huge volume of forum discussions was very difficult. After exploring major open-source clustering algorithms, we took a calculated risk by developing our own clustering solution. Leveraging automated feedback tuning, the system optimized AI-summarized outputs into a clear hierarchy of globally applicable questions. Ultimately, we enhanced our geographical landing pages by surfacing highly sought-after, conversational insights in an easily consumable format.
We utilized GPT-4 Turbo, which we selected for its strong instruction-following ability in evaluation against other models. As model capabilities and cost structures continue to evolve, we plan to explore newer, more efficient options in future iterations.
Our tests for this Travel Advice section showed significant improvements in both SEO visibility and traveler interactions. The majority of travelers (over 70%) who provided feedback on the section found it helpful, reinforcing its value as a trusted resource. At the moment, we are only able to show a subset of the questions we’ve generated, but we’re actively exploring opportunities for deeper integration into the forums themselves.
The techniques we’ve developed are highly scalable and already demonstrate broader applicability. Forum embeddings now power Tripadvisor’s AI travel assistant, enabling accurate and context-aware responses to traveler queries. Looking ahead, we plan to expand the reach of this enriched content to our partners, ensuring travelers have seamless access to Tripadvisor’s insights at every stage of their journey — whether through traditional search or in the new generation of conversational experiences. Our recent is an early example of this approach in action, combining trusted insights from real travelers on Tripadvisor with a conversational search experience tailored to today’s travelers.
Acknowledgements
Bringing a project like this to life requires cross-functional collaboration. While this article focuses on the machine learning aspects, it would not have been possible without the expertise and efforts of our Product, SEO, Design, User Research, Analytics, QA, MLOps, and Web Engineering teams. We extend special thanks to Nadeem Almoayyed for his leadership and vision in guiding this product initiative. Together, their contributions were essential in transforming raw insights into a seamless, user-friendly experience for travelers.