Menu Passport

2025

Agentic AI system that turns a photo of a foreign restaurant menu into an enriched, translated menu using OCR, translation, and dish imagery.

Python
LangChain
GPT-4
React
TypeScript
GCP

The Problem

Dining at foreign restaurants is exciting but often inaccessible when you can't read the menu. Raw translation gives you words, not context — you still don't know what a dish actually looks like, whether it's spicy, or if it's vegetarian. The gap isn't translation; it's enrichment.

How It Works

Point your camera at any menu. Google Vision OCR extracts the text, and a LangChain agent powered by GPT-4 takes over: it filters the OCR results down to actual menu items, translates each one, fetches a representative photo via the Google Custom Search API's image search, and converts prices. The result is a rich, visual menu in your language, readable even if you've never encountered the cuisine. The app is live.
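The flow above can be sketched in plain Python. This is a minimal illustration, not the project's code: the names `parse_menu_line`, `enrich_menu`, and `EnrichedItem` are hypothetical, the translation and image lookups are injected as callables so no real API is called, and a simple "line ends in a price" heuristic stands in for the filtering that the GPT-4 agent performs in the real app.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumption: a menu item is a line that ends in a price such as "12,50" or "9.00".
# The real app lets the LLM agent decide; this regex is only a stand-in.
PRICE_RE = re.compile(r"(\d+(?:[.,]\d{1,2})?)\s*$")


@dataclass
class EnrichedItem:
    original: str
    translated: str
    image_url: Optional[str]
    price_converted: float


def parse_menu_line(line: str) -> Optional[Tuple[str, float]]:
    """Return (dish name, price) if the line looks like a menu item, else None."""
    text = line.strip()
    match = PRICE_RE.search(text)
    if match is None:
        return None
    name = text[: match.start()].strip(" .-\t")
    if not name:
        return None
    return name, float(match.group(1).replace(",", "."))


def enrich_menu(
    ocr_lines: List[str],
    translate: Callable[[str], str],
    find_image: Callable[[str], Optional[str]],
    exchange_rate: float,
) -> List[EnrichedItem]:
    """Filter OCR lines to menu items, then translate, illustrate, and convert each."""
    items = []
    for line in ocr_lines:
        parsed = parse_menu_line(line)
        if parsed is None:
            continue  # headers, opening hours, and other non-item text are dropped
        name, price = parsed
        items.append(
            EnrichedItem(
                original=name,
                translated=translate(name),
                image_url=find_image(name),
                price_converted=round(price * exchange_rate, 2),
            )
        )
    return items
```

Calling `enrich_menu(lines, translate=my_translate, find_image=my_search, exchange_rate=1.1)` with your own callables returns one `EnrichedItem` per detected dish. Injecting the callables keeps each step swappable, which mirrors the one-tool-per-step design described in the Architecture section.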

Architecture

The agent orchestrates 10 specialized tools, each handling a discrete step, including OCR parsing, language detection, per-item translation, description generation, image search, dietary classification, and structured output formatting. Passing structured context between tool calls and caching translation and image results keyed by dish name cut token costs by 60%, because repeated items across large menus no longer trigger redundant LLM and API calls. The React/TypeScript frontend is deployed on Vercel; the Python/FastAPI + LangChain backend runs on Render.
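Caching keyed by dish name could look roughly like the sketch below. It is an assumption-laden illustration rather than the project's implementation: `DishCache` and `normalize_dish_name` are hypothetical names, the cache is an in-memory dict, and the case and whitespace normalization of the key is an added assumption so that near-identical OCR spellings share one entry.

```python
from typing import Callable, Dict


def normalize_dish_name(name: str) -> str:
    """Build a cache key: case-insensitive, with runs of whitespace collapsed."""
    return " ".join(name.casefold().split())


class DishCache:
    """Wraps an expensive lookup (translation or image search) with a per-dish cache."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times the wrapped lookup was actually called

    def get(self, dish_name: str) -> str:
        key = normalize_dish_name(dish_name)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._fetch(dish_name)
        return self._store[key]
```

Wrapping the translation call and the image search call in separate `DishCache` instances means a dish that appears several times on a large menu costs one lookup each, and `misses` gives a quick way to see how many calls were avoided.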
