BACCO
An autonomous AI concierge for the ultra-high-net-worth luxury travel market — built on proprietary vector databases and personalized inference pipelines.
VISIT SITE →
<50ms
Vector Search Latency
10k+
Curated Experiences
24/7
AI Concierge
The luxury travel industry serves a clientele that expects bespoke service measured in seconds, not days. Traditional concierge services scale linearly with headcount: every new member means another human in the loop. BACCO needed an AI system capable of understanding hyper-personal preferences (a member who only flies private after 10am, prefers boutique hotels under 30 rooms, and avoids restaurants with white tablecloths) while delivering recommendations indistinguishable from those of a top human concierge with 20 years of experience. The challenge: build an inference pipeline that combines proprietary curation, real-time availability data, and member-specific context, and responds in under a second.
How we built it
01 · Discovery
Mapped the concierge mental model
Spent 3 weeks shadowing top human concierges to understand the implicit reasoning steps. Codified 40+ preference dimensions and 12 reasoning patterns into a structured ontology.
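To make that concrete, here is an illustrative sketch of how a slice of such an ontology might be typed. Every field name below is a hypothetical example of a preference dimension, not BACCO's actual schema:

```typescript
// Illustrative only: these dimensions echo the examples in the brief
// (private flights after 10am, boutique hotels, no white tablecloths).
type TimeOfDay = `${number}:${number}`;

interface MemberPreferences {
  flight: {
    privateOnly: boolean;
    earliestDeparture: TimeOfDay; // e.g. "10:00"
  };
  hotel: {
    maxRooms: number;             // e.g. 30 — "boutique hotels under 30 rooms"
    style: "boutique" | "resort" | "city";
  };
  dining: {
    avoid: string[];              // e.g. ["white tablecloths"]
    favoriteCuisines: string[];
  };
}

// One way to codify a reasoning pattern learned from shadowing, e.g.
// "if the member declines a category twice, stop suggesting it."
interface ReasoningPattern {
  id: string;
  trigger: (conversationHistory: string[]) => boolean;
  adjustment: Partial<MemberPreferences>;
}
```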
02 · Architecture
Vector + relational hybrid
Designed a dual-database architecture: Pinecone for semantic similarity across experiences, PostgreSQL for hard constraints (dates, locations, budget caps). Custom retrieval orchestrator merges both in real time.
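For illustration, a minimal sketch of what that merge step could look like, assuming the official Pinecone TypeScript client (@pinecone-database/pinecone) and node-postgres. The index name, table schema, and topK value are hypothetical, not BACCO's production configuration:

```typescript
import { Pinecone } from "@pinecone-database/pinecone";
import { Pool } from "pg";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const db = new Pool({ connectionString: process.env.DATABASE_URL });

// Sketch of an orchestrator: semantic recall from Pinecone, then a
// relational filter for hard constraints (location, budget).
async function searchExperiences(
  queryEmbedding: number[],
  constraints: { location: string; maxBudget: number }
) {
  // 1. Semantic similarity over the curated-experience index
  //    ("experiences" is a hypothetical index name).
  const semantic = await pc
    .index("experiences")
    .query({ vector: queryEmbedding, topK: 50, includeMetadata: true });

  const ids = semantic.matches.map((m) => m.id);

  // 2. Hard-constraint filter in PostgreSQL over the same candidates.
  const { rows } = await db.query(
    `SELECT id FROM experiences
      WHERE id = ANY($1) AND location = $2 AND price <= $3`,
    [ids, constraints.location, constraints.maxBudget]
  );

  // 3. Merge: keep Pinecone's similarity order, drop filtered-out ids.
  const allowed = new Set(rows.map((r) => r.id));
  return semantic.matches.filter((m) => allowed.has(m.id));
}
```

One appeal of this split: Postgres acts purely as a hard filter, so a constraint change (a new budget cap, a blackout date) never requires re-embedding or re-indexing anything on the vector side.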
03 · LLM Integration
Multi-model routing
Different models for different jobs: GPT-4 for conversational nuance, Claude for long-context preference synthesis, fine-tuned Llama for fast triage. Routing layer picks the right model per query type.
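As a rough sketch of what that routing layer might look like (the model identifiers and the classify() heuristic here are placeholders, not BACCO's production logic):

```typescript
// Placeholder routing layer: classify the query, pick a model.
type QueryType = "triage" | "conversation" | "preference-synthesis";

const MODEL_FOR: Record<QueryType, { provider: string; model: string }> = {
  triage: { provider: "self-hosted", model: "llama-finetuned" },   // fast triage
  conversation: { provider: "openai", model: "gpt-4" },            // nuance
  "preference-synthesis": { provider: "anthropic", model: "claude" }, // long context
};

function classify(query: string, contextTokens: number): QueryType {
  // Hypothetical heuristics — thresholds invented for illustration.
  if (contextTokens > 20_000) return "preference-synthesis";
  if (query.length < 80) return "triage";
  return "conversation";
}

export function route(query: string, contextTokens: number) {
  return MODEL_FOR[classify(query, contextTokens)];
}
```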
04 · Launch
Quiet rollout to top members
Soft launch to 50 high-net-worth members with weekly feedback loops. Iterated on prompt engineering and retrieval scoring based on real conversations.
What powers it
Next.js 14
App Router for streaming UI + edge runtime for low-latency conversations
Pinecone
Production-grade vector search with sub-50ms p95 latency
PostgreSQL + pgvector
Relational data + secondary vector store for hot queries (sketched below this list)
OpenAI + Anthropic + Llama
Multi-model routing for cost/quality optimization
LangGraph
Structured reasoning workflows with explicit decision trees
Vercel + AWS
Edge for chat interface, AWS for heavy ML inference jobs
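On the pgvector layer flagged above: a minimal sketch of what a hot-query path might look like, assuming node-postgres and a hypothetical hot_experiences table. pgvector's `<->` operator computes L2 distance, and casting the parameter with ::vector lets the driver pass the embedding as a string literal:

```typescript
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Sketch of a "hot query": nearest-neighbour search directly in
// Postgres, skipping the hop to Pinecone for frequently repeated
// lookups. Table and column names are assumptions.
async function hotSearch(embedding: number[], limit = 10) {
  // pgvector accepts vectors as a '[x,y,z]' string literal.
  const literal = `[${embedding.join(",")}]`;
  const { rows } = await pool.query(
    `SELECT id, name, embedding <-> $1::vector AS distance
       FROM hot_experiences
      ORDER BY embedding <-> $1::vector
      LIMIT $2`,
    [literal, limit]
  );
  return rows;
}
```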
BACCO is now serving members in real time with response quality that members consistently rate as equal to or better than human concierges in blind tests. The system has handled over 10,000 unique requests, learning from each interaction to refine member-specific recommendations. Average response time is under 800ms end-to-end, including LLM generation. Most importantly, BACCO scales without adding headcount: each new member requires near-zero marginal effort.
800ms
End-to-end response
94%
Member satisfaction
10k+
Requests handled
0
Human FTE added
5★
Avg rating
92%
Retention
“We turned a process that took our team hours into something that responds in under a second, without losing the personal touch our members expect.”
— BACCO Founding Team
Luxury Travel Tech