Overview
Stepped into the technical lead role for the hotel booking platform squad — the team that owns the company's highest-revenue customer-facing system. The platform is a multi-service stack that has evolved over years: a legacy Java/Spring monolith at the core, a growing set of newer services around it (wholesaler integrations, pricing, inventory, frontend apps), and shared infra managed via Kubernetes manifests and Terraform.
Dual mandate:
- Keep the legacy system healthy — coach team members through incident response, unfamiliar code paths, and the operational realities of a system nobody originally designed from scratch.
- Ship new features on top of it — drive the design and delivery of new integrations and platform extensions without destabilizing the revenue-critical core.
Team shape: A small cross-functional squad (backend-leaning, with frontend and infra responsibilities). My role is technical rather than people-management: architecture, code review, hands-on contribution, and mentorship across the group.
My Role
As tech lead, my responsibilities span both maintenance and forward-looking work:
Maintenance & Legacy Stewardship
- Triage production incidents on the legacy hotel platform; pair with engineers on root-cause analysis instead of fixing everything myself
- Review every non-trivial PR into the legacy codebase; flag risky patterns (transaction boundaries, N+1 queries, breaking API contracts) before they ship
- Maintain runbooks and onboarding docs so new team members can ramp up on unfamiliar legacy code without blocking on me
- Own the Kubernetes manifest + Terraform pipelines that the whole squad depends on, coaching team members through safe infra changes
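One of the risky patterns called out in review, the N+1 query, is easy to show in miniature. This is a hypothetical sketch, not code from the real platform: FakeDb stands in for the DAO layer, and the query counter makes the cost of the per-row lookup visible.

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the N+1 pattern flagged in review.
// FakeDb and all names are illustrative stand-ins, not the real codebase.
class NPlusOneSketch {

    static class FakeDb {
        final AtomicInteger queries = new AtomicInteger();
        final Map<Long, String> hotelNames = Map.of(1L, "Alpha", 2L, "Beta", 3L, "Gamma");

        String findName(long id) {                          // imagine one SQL query per call
            queries.incrementAndGet();
            return hotelNames.get(id);
        }

        Map<Long, String> findNames(Collection<Long> ids) { // imagine one WHERE id IN (...) query
            queries.incrementAndGet();
            Map<Long, String> out = new HashMap<>();
            for (long id : ids) out.put(id, hotelNames.get(id));
            return out;
        }
    }

    // Risky: N bookings -> N queries on top of the one that loaded the bookings.
    static List<String> perRow(FakeDb db, List<Long> bookingHotelIds) {
        List<String> names = new ArrayList<>();
        for (long id : bookingHotelIds) names.add(db.findName(id));
        return names;
    }

    // Preferred: batch the lookup once, then join in memory.
    static List<String> batched(FakeDb db, List<Long> bookingHotelIds) {
        Map<Long, String> byId = db.findNames(bookingHotelIds);
        List<String> names = new ArrayList<>();
        for (long id : bookingHotelIds) names.add(byId.get(id));
        return names;
    }
}
```

The review comment writes itself once the query count is visible: the first version scales its round trips with the result set, the second does not.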
Feature Delivery on the Legacy Stack
- Design new integrations (wholesaler connectors, pricing hooks, inventory sync) that plug into the existing monolith without rewriting it
- Break features into work items sized for different engineers' experience levels — juniors ship vertical slices, seniors own cross-cutting concerns
- Run design reviews before implementation; enforce "document the contract, then build" instead of code-first refactors that drift
Knowledge Transfer
- Run pairing sessions on the harder parts of the legacy codebase (batch jobs, transaction management, cross-service consistency)
- Write up architectural decisions as ADRs so the reasoning survives turnover
- Push the team toward self-sufficient debugging — teach the workflow (logs → metrics → traces → code) rather than hand out answers
Tech Stack
- Frontend: customer-facing web apps for the booking site, plus the newer wholesaler-facing frontends
- Backend: legacy Java/Spring booking monolith at the core, surrounded by newer services for wholesaler integrations, pricing, and inventory
- Infrastructure: Kubernetes manifests (GitOps via ArgoCD) and shared Terraform modules
- Process: design reviews before implementation, ADRs, runbooks, feature-flagged rollouts, layered PR review
Architecture
The platform the squad owns is not a single project — it is a living ecosystem. The lead work is less about building any one system and more about keeping multiple systems consistent with each other while the team evolves them in parallel.
Legacy Core: A long-lived Java/Spring booking monolith that serves the main customer-facing site. Still under active development, but the priority is stability and backward-compatible extension — not rewrite.
Wholesaler Integrations: New frontend + backend services that plug wholesaler inventory and pricing into the core. Built alongside (not inside) the monolith so failures are isolated and new code can use a modern stack.
Infrastructure Layer: Shared Kubernetes manifests (GitOps via ArgoCD) and Terraform modules that every service in the squad depends on. Changes here ripple across the whole platform, so the review bar is higher than for application code.
Next-Generation Platform Work: Greenfield services that will eventually take over responsibilities from the legacy core. Kept small and well-scoped — we do not block current delivery on speculative rewrites.
Key Challenges
1. Legacy Knowledge Is Tribal
Critical behavior of the legacy system lives in the heads of a few long-tenured engineers, not in documentation. Newcomers get stuck on the same landmines repeatedly — transaction scoping, implicit API contracts, undocumented cron dependencies.
2. Shipping Features Without Breaking Revenue
The legacy platform is the company's primary revenue channel. Any regression — a pricing bug, a booking failure, a slow endpoint during peak hours — is immediately visible in revenue. New features must integrate safely.
3. Uneven Team Experience
The squad mixes engineers at very different experience levels. Over-delegating risky work to juniors causes incidents; over-centralizing decisions on me creates a bottleneck and blocks their growth.
4. Multi-Repo, Multi-Env Operational Surface
The stack spans many repos (application code, wholesaler services, Kubernetes manifests, Terraform). A single feature often touches 3–5 repos across 2–3 environments. Coordinating this well is its own skill.
Solutions & Design Decisions
Codify the Tribal Knowledge
For every incident the team resolves, we capture a short writeup: what broke, how we found it, and what the landmine was. Over time these accumulate into a real runbook corpus. New engineers read the runbooks before touching related code, and the same class of incident stops recurring.
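The writeups follow a deliberately thin structure so people actually fill them in. A representative template (field names are illustrative, not a mandated format):

```markdown
# Runbook: <incident class, e.g. "booking batch job stalls on lock contention">
Date: YYYY-MM-DD          Severity: SEV-n          Services: <legacy core / connector / ...>

## What broke
One or two sentences of observable symptom (alert, error rate, customer impact).

## How we found it
The actual path taken: which dashboard, which log query, which trace.

## The landmine
Why the obvious fix is wrong, or what non-obvious dependency bit us.

## What we did
Mitigation first, then the durable fix (link PRs).
```

Keeping the "landmine" section mandatory is the point: it is the one field that turns an incident log into transferable knowledge.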
Layered Review — Not Just Approval
I treat PR review as a teaching surface, not a gatekeeping step. Comments explain why a pattern is risky (with a reference to the relevant runbook where one exists) rather than just saying "please change this." The reviewee ships the fix themselves and learns the pattern; the next time, they apply it before I ever have to catch it.
Vertical Slice Delegation
For each new feature, I break work into end-to-end vertical slices — an API endpoint plus its DB migration plus its frontend call plus its test. Juniors own a full slice and ship it to production; seniors own the cross-cutting design decisions (transactional boundaries, auth, observability). Everyone ends the sprint with something they shipped themselves.
Safe Change Patterns for the Legacy Core
For risky edits in the legacy monolith, the team follows a shared pattern: feature flag → dual-write / dual-read → verify parity in production → cut over → remove the flag. No one merges a behavior-changing patch into the hot path without a rollback lever.
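The dual-read stage of that pattern can be sketched in a few lines. This is a hedged illustration, not the production code: FeatureFlags, the pricing suppliers, and the parity counter are hypothetical stand-ins for the real flag service, pricing paths, and metrics.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch of flag -> dual-read -> verify parity -> cut over.
// All names are illustrative; the real system wires this through its own
// flag service and emits the mismatch count as a metric.
class DualReadSketch {

    static class FeatureFlags {
        volatile boolean dualReadEnabled = false; // stage 1: run both paths, compare
        volatile boolean cutOverToNew = false;    // stage 2: serve the new path
    }

    static final AtomicInteger parityMismatches = new AtomicInteger();

    static int price(FeatureFlags flags, Supplier<Integer> legacyPrice, Supplier<Integer> newPrice) {
        if (flags.cutOverToNew) {
            return newPrice.get();                // only after parity is verified in prod
        }
        int legacy = legacyPrice.get();           // legacy stays authoritative until cut-over
        if (flags.dualReadEnabled) {
            try {
                if (!Objects.equals(legacy, newPrice.get())) {
                    parityMismatches.incrementAndGet(); // alert on this counter
                }
            } catch (RuntimeException e) {
                // the new path must never break the hot path while legacy is live
            }
        }
        return legacy;
    }
}
```

The rollback lever is the whole point: at every stage, flipping a flag back restores the previous behavior without a deploy.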
Infra as the Shared Substrate
Kubernetes manifests and Terraform are treated as a team-owned substrate, not a "platform team" external dependency. Every team member has pushed at least one infra change with my review. That removes the "I don't touch infra" bottleneck and spreads the muscle across the squad.
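Concretely, each service is registered as an ArgoCD Application pointing at the manifests repo, so "pushing an infra change" means opening a PR against a path like the one below. This is a hypothetical example; the repo URL, project, and paths are illustrative, not the real ones.

```yaml
# Hypothetical ArgoCD Application for one squad service.
# All names, URLs, and paths are illustrative stand-ins.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: wholesaler-connector
  namespace: argocd
spec:
  project: hotel-platform
  source:
    repoURL: https://git.example.com/platform/k8s-manifests.git
    targetRevision: main
    path: services/wholesaler-connector/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: hotel-platform
  syncPolicy:
    automated:
      prune: true
      selfHeal: true   # drift from the repo is reverted, so ad-hoc cluster edits don't stick
```

With `selfHeal` on, the Git repo is the only door into the cluster, which is what makes reviewed, reversible infra PRs the default rather than a policy people have to remember.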
Results & Impact
Team Capability
- Every engineer in the squad has shipped to production independently on both the legacy core and the newer services
- Infra changes (Kubernetes / Terraform) are no longer gated on a single person — multiple engineers can review and ship safely
- On-call and incident-response load is genuinely distributed across the squad, not concentrated on me
Delivery
- Consistent delivery of new wholesaler integrations and platform features on top of the legacy stack without revenue-impacting regressions
- Infra and deploy changes land as reviewed, reversible PRs — no more ad-hoc cluster edits
- Architectural decisions are now written down (ADRs + runbooks) so the reasoning survives when people change teams
Legacy Health
- Recurring incident patterns decrease as runbooks accumulate and the team internalizes them
- The legacy monolith is still the revenue core, but it is no longer a single-person-knowledge system
Learnings
Leading Is a Force Multiplier, Not a Solo Act
The instinct when you know a legacy system well is to fix everything yourself. That is faster in the short term and strictly worse for the team in the long term. The real win is when an engineer resolves a whole class of incident without needing to ping you. Optimizing for that — even at the cost of slower individual PRs — is the job.
Runbooks Beat Oral Tradition
A short, honest runbook — "this is what broke, this is what we did, this is why the obvious fix is wrong" — is worth a dozen design docs. It is the artifact that survives engineer turnover and compounds over time.
Feature Flags Are a Leadership Tool
Not just an engineering pattern. When the whole team uses feature flags by default on the legacy core, risk goes down and juniors can ship things they would otherwise be scared to touch. The psychological effect on team velocity matters as much as the technical one.
Don't Rewrite the Core to Feel Productive
The strongest temptation as a new lead is to propose a rewrite. Resisting that — and instead extending the legacy system with well-scoped new services — is the boring-correct answer, and the team ships more because of it.
Scope of the Lead Role
What this project is (and isn't)
This is not a single product build — it is the ongoing technical leadership of a squad that owns a multi-repo, multi-service hotel booking platform. The day-to-day is a mix of: reviewing teammates' PRs, pairing on incidents, designing new features that extend the legacy core, and shipping hands-on code where it most unblocks the team. The other Adventure projects in this portfolio (Dynamic Pricing, Inventory Arbitrage, Hotel Center MCP) are concrete systems I built solo; this entry is about the umbrella role that sits over the whole squad.
What "Maintaining the Legacy System" Actually Means
The hotel platform is the company's revenue spine — it predates every service built around it, and most current customer traffic still flows through it. Maintaining it is not a "keep the lights on" background task; it is a first-class responsibility with real product and operational judgment.
Typical maintenance work the team handles:
- Production incidents (DB contention, scheduled-job failures, upstream supplier outages) — triaged with the engineer who picked up the page, not taken over
- Regression fixes triggered by upstream partner changes (wholesaler APIs, payment gateways, pricing contracts)
- Performance work on hot paths that cannot be rewritten — focused on pragmatic fixes (index additions, query tuning, caching) rather than architectural overhauls
- Safe dependency bumps — Spring Boot / Java version upgrades on a monolith that ships to production many times a day
What "New Features on the Legacy System" Actually Means
New feature work is designed to extend the legacy core, not to modify its shape. The pattern:
- Define the new capability as a separate service or module with a clear contract to the monolith
- Add a narrow, well-reviewed integration point in the legacy code
- Ship behind a feature flag; dual-run in production where it makes sense; cut over
- Retire the flag; document the integration point so the next engineer can extend it safely
This approach lets the squad ship meaningful new functionality (wholesaler channel expansions, pricing integrations, inventory hooks) without betting any single launch on a legacy-core refactor.
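The "clear contract to the monolith" step above usually reduces to one small interface. A hedged sketch, with hypothetical names throughout (the real integration points differ), of what that narrow seam looks like from the monolith's side:

```java
import java.util.Optional;

// Hypothetical sketch of a narrow integration point into the legacy core.
// WholesalerInventoryPort and all names here are illustrative stand-ins.
class IntegrationPortSketch {

    // The only surface the monolith sees: a small, documented contract.
    interface WholesalerInventoryPort {
        Optional<Integer> availableRooms(String hotelCode, String isoDate);
    }

    // Adapter to the new external service; lives outside the monolith's core.
    static class HttpWholesalerAdapter implements WholesalerInventoryPort {
        @Override
        public Optional<Integer> availableRooms(String hotelCode, String isoDate) {
            // A real adapter would call the wholesaler service over HTTP and
            // translate timeouts/errors into Optional.empty(), so the monolith
            // degrades gracefully instead of crashing when the new service is down.
            return Optional.empty();
        }
    }

    // Inside the monolith: new capability behind the port, legacy data as fallback.
    static int roomsWithFallback(WholesalerInventoryPort port, String hotel, String date, int legacyCount) {
        return port.availableRooms(hotel, date).orElse(legacyCount);
    }
}
```

Because the monolith only depends on the port, the new service can be rewritten, scaled, or rolled back without touching legacy code, and the failure mode is explicit: empty result, legacy fallback.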
How I Coach the Team
The review bar I hold
Not "is this code correct?" — that is the minimum. The higher bar is: after this lands, can the next engineer on this file do the right thing without pinging the author? That pushes reviews toward readability, naming, test coverage on intent (not just implementation), and comments that explain why, not what.
Concrete habits I use with the squad:
- Design-first for anything non-trivial. A short written proposal (goal, approach, alternatives, rollback plan) before any code. Beats surprise PRs.
- "Show me the failure mode" on every review. What breaks this change in production? If neither of us can name it, the change is not ready.
- Pair on the first scary PR. When a teammate is about to touch a legacy hot path for the first time, I pair through it live rather than reviewing the PR cold.
- ADRs for anything we will forget in 6 months. Thin, dated, decisions-with-context. The point is searchability, not polish.
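A "thin, dated, decisions-with-context" ADR fits on half a page. An illustrative template (the fields are my convention, not a standard the source mandates):

```markdown
# ADR-NNN: <decision, e.g. "Wholesaler connector runs as a separate service">
Date: YYYY-MM-DD    Status: accepted

## Context
Two or three sentences: the constraint that forced a decision.

## Decision
What we chose, stated plainly.

## Alternatives considered
One line each, with the reason rejected.

## Consequences
What gets easier, what gets harder, and the rollback path if we were wrong.
```

Searchability beats polish: a grep over dated half-page files answers "why is it like this?" faster than any wiki.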
What This Adds to the Portfolio
The other Adventure projects show what I build when I own every line of code. This entry shows what I do when the goal is team output, not individual output — and specifically, when the terrain is a living legacy system rather than a greenfield build. Both skill sets matter; this page exists because the solo-developer projects alone do not reflect the scope of the role.