ChatGPT-4, Claude Sonnet 4.7, Gemini 2.5 Pro, Mindtrip, Layla.ai, Wonderplan, Vacay and Voyspark Spark planned the same 14-day Japan trip. The results are not what the marketing claims.
ChatGPT-4 wins on conversational fluency but loses on factual specificity: it suggested three restaurants that closed in 2024 and a ryokan that has been a parking lot since 2022.
Claude Sonnet 4.7 produced the most culturally nuanced itinerary: it understood that "avoid Tokyo crowds" means heading to Yanaka and Kagurazaka, not skipping Tokyo entirely.
Mindtrip is the only tool with native booking integration: hotel suggestions click through to Booking.com and Hotels.com with real-time prices in the same session.
Layla.ai produces the most visually polished output (Instagram-ready maps and image galleries) but generic restaurant picks: the same five sushi spots in every test.
Gemini 2.5 Pro reads Google Maps reviews in real time and adjusts suggestions based on actual operating hours, a measurable advantage in Japan, where many restaurants close on different days.