Overview
Stepped into the technical lead role for the hotel booking platform squad — the team that owns the company's highest-revenue customer-facing system. The platform is a multi-service stack that has evolved over years: a legacy Java/Spring monolith at the core, a growing set of newer services around it (wholesaler integrations, pricing, inventory, frontend apps), and shared infra managed via Kubernetes manifests and Terraform.
Dual mandate:
- Keep the legacy system healthy — coach team members through incident response, unfamiliar code paths, and the operational realities of a system nobody originally designed from scratch.
- Ship new features on top of it — drive the design and delivery of new integrations and platform extensions without destabilizing the revenue-critical core.
Team shape: A small cross-functional squad (backend-leaning, with frontend and infra responsibilities). My role is technical rather than people-management: architecture, code review, hands-on contribution, and mentorship across the group.
My Role
As tech lead, my responsibilities span both maintenance and forward-looking work:
Maintenance & Legacy Stewardship
- Triage production incidents on the legacy hotel platform; pair with engineers on root-cause analysis instead of fixing everything myself
- Review every non-trivial PR into the legacy codebase; flag risky patterns (transaction boundaries, N+1 queries, breaking API contracts) before they ship
- Maintain runbooks and onboarding docs so new team members can ramp up on unfamiliar legacy code without blocking on me
- Own the Kubernetes manifest + Terraform pipelines that the whole squad depends on, coaching team members through safe infra changes
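One of the risky patterns called out in review, the N+1 query, is easy to show in miniature. This is a hypothetical sketch, not code from the real platform: FakeDb stands in for the DAO layer, and the query counter makes the cost of the per-row lookup visible.

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the N+1 pattern flagged in review.
// FakeDb and all names are illustrative stand-ins, not the real codebase.
class NPlusOneSketch {

    static class FakeDb {
        final AtomicInteger queries = new AtomicInteger();
        final Map<Long, String> hotelNames = Map.of(1L, "Alpha", 2L, "Beta", 3L, "Gamma");

        String findName(long id) {                          // imagine one SQL query per call
            queries.incrementAndGet();
            return hotelNames.get(id);
        }

        Map<Long, String> findNames(Collection<Long> ids) { // imagine one WHERE id IN (...) query
            queries.incrementAndGet();
            Map<Long, String> out = new HashMap<>();
            for (long id : ids) out.put(id, hotelNames.get(id));
            return out;
        }
    }

    // Risky: N bookings -> N queries on top of the one that loaded the bookings.
    static List<String> perRow(FakeDb db, List<Long> bookingHotelIds) {
        List<String> names = new ArrayList<>();
        for (long id : bookingHotelIds) names.add(db.findName(id));
        return names;
    }

    // Preferred: batch the lookup once, then join in memory.
    static List<String> batched(FakeDb db, List<Long> bookingHotelIds) {
        Map<Long, String> byId = db.findNames(bookingHotelIds);
        List<String> names = new ArrayList<>();
        for (long id : bookingHotelIds) names.add(byId.get(id));
        return names;
    }
}
```

The review comment writes itself once the query count is visible: the first version scales its round trips with the result set, the second does not.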
Feature Delivery on the Legacy Stack
- Design new integrations (wholesaler connectors, pricing hooks, inventory sync) that plug into the existing monolith without rewriting it
- Break features into work items sized for different engineers' experience levels — juniors ship vertical slices, seniors own cross-cutting concerns
- Run design reviews before implementation; enforce "document the contract, then build" instead of code-first refactors that drift
Knowledge Transfer
- Run pairing sessions on the harder parts of the legacy codebase (batch jobs, transaction management, cross-service consistency)
- Write up architectural decisions as ADRs so the reasoning survives turnover
- Push the team toward self-sufficient debugging — teach the workflow (logs → metrics → traces → code) rather than hand out answers
Tech Stack
- Frontend: customer-facing web apps for the booking site, plus the newer wholesaler-facing frontends
- Backend: legacy Java/Spring booking monolith at the core, surrounded by newer services for wholesaler integrations, pricing, and inventory
- Infrastructure: Kubernetes manifests (GitOps via ArgoCD) and shared Terraform modules
- Process: design reviews before implementation, ADRs, runbooks, feature-flagged rollouts, layered PR review
Architecture
The platform the squad owns is not a single project — it is a living ecosystem. The lead work is less about building any one system and more about keeping multiple systems consistent with each other while the team evolves them in parallel.
Legacy Core: A long-lived Java/Spring booking monolith that serves the main customer-facing site. Still under active development, but the priority is stability and backward-compatible extension — not rewrite.
Wholesaler Integrations: New frontend + backend services that plug wholesaler inventory and pricing into the core. Built alongside (not inside) the monolith so failures are isolated and new code can use a modern stack.
Infrastructure Layer: Shared Kubernetes manifests (GitOps via ArgoCD) and Terraform modules that every service in the squad depends on. Changes here ripple across the whole platform, so the review bar is higher than for application code.
Next-Generation Platform Work: Greenfield services that will eventually take over responsibilities from the legacy core. Kept small and well-scoped — we do not block current delivery on speculative rewrites.
Key Challenges
1. Legacy Knowledge Is Tribal
Critical behavior of the legacy system lives in the heads of a few long-tenured engineers, not in documentation. Newcomers get stuck on the same landmines repeatedly — transaction scoping, implicit API contracts, undocumented cron dependencies.
2. Shipping Features Without Breaking Revenue
The legacy platform is the company's primary revenue channel. Any regression — a pricing bug, a booking failure, a slow endpoint during peak hours — is immediately visible in revenue. New features must integrate safely.
3. Uneven Team Experience
The squad mixes engineers at very different experience levels. Over-delegating risky work to juniors causes incidents; over-centralizing decisions on me creates a bottleneck and blocks their growth.
4. Multi-Repo, Multi-Env Operational Surface
The stack spans many repos (application code, wholesaler services, Kubernetes manifests, Terraform). A single feature often touches 3–5 repos across 2–3 environments. Coordinating this well is its own skill.
Solutions & Design Decisions
Codify the Tribal Knowledge
For every incident the team resolves, we capture a short writeup: what broke, how we found it, and what the landmine was. Over time these accumulate into a real runbook corpus. New engineers read the runbooks before touching related code, and the same class of incident stops recurring.
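The writeups follow a deliberately thin structure so people actually fill them in. A representative template (field names are illustrative, not a mandated format):

```markdown
# Runbook: <incident class, e.g. "booking batch job stalls on lock contention">
Date: YYYY-MM-DD          Severity: SEV-n          Services: <legacy core / connector / ...>

## What broke
One or two sentences of observable symptom (alert, error rate, customer impact).

## How we found it
The actual path taken: which dashboard, which log query, which trace.

## The landmine
Why the obvious fix is wrong, or what non-obvious dependency bit us.

## What we did
Mitigation first, then the durable fix (link PRs).
```

Keeping the "landmine" section mandatory is the point: it is the one field that turns an incident log into transferable knowledge.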
Layered Review — Not Just Approval
I treat PR review as a teaching surface, not a gatekeeping step. Comments explain why a pattern is risky (with a reference to the relevant runbook where one exists) rather than just saying "please change this." The reviewee ships the fix themselves and learns the pattern; the next time, they apply it before I ever have to catch it.
Vertical Slice Delegation
For each new feature, I break work into end-to-end vertical slices — an API endpoint plus its DB migration plus its frontend call plus its test. Juniors own a full slice and ship it to production; seniors own the cross-cutting design decisions (transactional boundaries, auth, observability). Everyone ends the sprint with something they shipped themselves.
Safe Change Patterns for the Legacy Core
For risky edits in the legacy monolith, the team follows a shared pattern: feature flag → dual-write / dual-read → verify parity in production → cut over → remove the flag. No one merges a behavior-changing patch into the hot path without a rollback lever.
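The dual-read stage of that pattern can be sketched in a few lines. This is a hedged illustration, not the production code: FeatureFlags, the pricing suppliers, and the parity counter are hypothetical stand-ins for the real flag service, pricing paths, and metrics.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch of flag -> dual-read -> verify parity -> cut over.
// All names are illustrative; the real system wires this through its own
// flag service and emits the mismatch count as a metric.
class DualReadSketch {

    static class FeatureFlags {
        volatile boolean dualReadEnabled = false; // stage 1: run both paths, compare
        volatile boolean cutOverToNew = false;    // stage 2: serve the new path
    }

    static final AtomicInteger parityMismatches = new AtomicInteger();

    static int price(FeatureFlags flags, Supplier<Integer> legacyPrice, Supplier<Integer> newPrice) {
        if (flags.cutOverToNew) {
            return newPrice.get();                // only after parity is verified in prod
        }
        int legacy = legacyPrice.get();           // legacy stays authoritative until cut-over
        if (flags.dualReadEnabled) {
            try {
                if (!Objects.equals(legacy, newPrice.get())) {
                    parityMismatches.incrementAndGet(); // alert on this counter
                }
            } catch (RuntimeException e) {
                // the new path must never break the hot path while legacy is live
            }
        }
        return legacy;
    }
}
```

The rollback lever is the whole point: at every stage, flipping a flag back restores the previous behavior without a deploy.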
Infra as the Shared Substrate
Kubernetes manifests and Terraform are treated as a team-owned substrate, not a "platform team" external dependency. Every team member has pushed at least one infra change with my review. That removes the "I don't touch infra" bottleneck and spreads the muscle across the squad.
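Concretely, each service is registered as an ArgoCD Application pointing at the manifests repo, so "pushing an infra change" means opening a PR against a path like the one below. This is a hypothetical example; the repo URL, project, and paths are illustrative, not the real ones.

```yaml
# Hypothetical ArgoCD Application for one squad service.
# All names, URLs, and paths are illustrative stand-ins.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: wholesaler-connector
  namespace: argocd
spec:
  project: hotel-platform
  source:
    repoURL: https://git.example.com/platform/k8s-manifests.git
    targetRevision: main
    path: services/wholesaler-connector/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: hotel-platform
  syncPolicy:
    automated:
      prune: true
      selfHeal: true   # drift from the repo is reverted, so ad-hoc cluster edits don't stick
```

With `selfHeal` on, the Git repo is the only door into the cluster, which is what makes reviewed, reversible infra PRs the default rather than a policy people have to remember.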
Results & Impact
Team Capability
- Every engineer in the squad has shipped to production independently on both the legacy core and the newer services
- Infra changes (Kubernetes / Terraform) are no longer gated on a single person — multiple engineers can review and ship safely
- On-call and incident-response load is genuinely distributed across the squad, not concentrated on me
Delivery
- Consistent delivery of new wholesaler integrations and platform features on top of the legacy stack without revenue-impacting regressions
- Infra and deploy changes land as reviewed, reversible PRs — no more ad-hoc cluster edits
- Architectural decisions are now written down (ADRs + runbooks) so the reasoning survives when people change teams
Legacy Health
- Recurring incident patterns decrease as runbooks accumulate and the team internalizes them
- The legacy monolith is still the revenue core, but it is no longer a single-person-knowledge system
Learnings
Leading Is a Force Multiplier, Not a Solo Act
The instinct when you know a legacy system well is to fix everything yourself. That is faster in the short term and strictly worse for the team in the long term. The real win is when an engineer resolves a whole class of incident without needing to ping you. Optimizing for that — even at the cost of slower individual PRs — is the job.
Runbooks Beat Oral Tradition
A short, honest runbook — "this is what broke, this is what we did, this is why the obvious fix is wrong" — is worth a dozen design docs. It is the artifact that survives engineer turnover and compounds over time.
Feature Flags Are a Leadership Tool
Not just an engineering pattern. When the whole team uses feature flags by default on the legacy core, risk goes down and juniors can ship things they would otherwise be scared to touch. The psychological effect on team velocity matters as much as the technical one.
Don't Rewrite the Core to Feel Productive
The strongest temptation as a new lead is to propose a rewrite. Resisting that — and instead extending the legacy system with well-scoped new services — is the boring-correct answer, and the team ships more because of it.
Scope of the Lead Role
What this project is (and isn't)
This is not a single product build — it is the ongoing technical leadership of a squad that owns a multi-repo, multi-service hotel booking platform. The day-to-day is a mix of: reviewing teammates' PRs, pairing on incidents, designing new features that extend the legacy core, and shipping hands-on code where it most unblocks the team. The other Adventure projects in this portfolio (Dynamic Pricing, Inventory Arbitrage, Hotel Center MCP) are concrete systems I built solo; this entry is about the umbrella role that sits over the whole squad.
What "Maintaining the Legacy System" Actually Means
The hotel platform is the company's revenue spine — it predates every service built around it, and most current customer traffic still flows through it. Maintaining it is not a "keep the lights on" background task; it is a first-class responsibility with real product and operational judgment.
Typical maintenance work the team handles:
- Production incidents (DB contention, scheduled-job failures, upstream supplier outages) — triaged with the engineer who picked up the page, not taken over
- Regression fixes triggered by upstream partner changes (wholesaler APIs, payment gateways, pricing contracts)
- Performance work on hot paths that cannot be rewritten — focused on pragmatic fixes (index additions, query tuning, caching) rather than architectural overhauls
- Safe dependency bumps — Spring Boot / Java version upgrades on a monolith that ships to production many times a day
What "New Features on the Legacy System" Actually Means
New feature work is designed to extend the legacy core, not to modify its shape. The pattern:
- Define the new capability as a separate service or module with a clear contract to the monolith
- Add a narrow, well-reviewed integration point in the legacy code
- Ship behind a feature flag; dual-run in production where it makes sense; cut over
- Retire the flag; document the integration point so the next engineer can extend it safely
This approach lets the squad ship meaningful new functionality (wholesaler channel expansions, pricing integrations, inventory hooks) without betting any single launch on a legacy-core refactor.
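The "clear contract to the monolith" step above usually reduces to one small interface. A hedged sketch, with hypothetical names throughout (the real integration points differ), of what that narrow seam looks like from the monolith's side:

```java
import java.util.Optional;

// Hypothetical sketch of a narrow integration point into the legacy core.
// WholesalerInventoryPort and all names here are illustrative stand-ins.
class IntegrationPortSketch {

    // The only surface the monolith sees: a small, documented contract.
    interface WholesalerInventoryPort {
        Optional<Integer> availableRooms(String hotelCode, String isoDate);
    }

    // Adapter to the new external service; lives outside the monolith's core.
    static class HttpWholesalerAdapter implements WholesalerInventoryPort {
        @Override
        public Optional<Integer> availableRooms(String hotelCode, String isoDate) {
            // A real adapter would call the wholesaler service over HTTP and
            // translate timeouts/errors into Optional.empty(), so the monolith
            // degrades gracefully instead of crashing when the new service is down.
            return Optional.empty();
        }
    }

    // Inside the monolith: new capability behind the port, legacy data as fallback.
    static int roomsWithFallback(WholesalerInventoryPort port, String hotel, String date, int legacyCount) {
        return port.availableRooms(hotel, date).orElse(legacyCount);
    }
}
```

Because the monolith only depends on the port, the new service can be rewritten, scaled, or rolled back without touching legacy code, and the failure mode is explicit: empty result, legacy fallback.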
How I Coach the Team
The review bar I hold
Not "is this code correct?" — that is the minimum. The higher bar is: after this lands, can the next engineer on this file do the right thing without pinging the author? That pushes reviews toward readability, naming, test coverage on intent (not just implementation), and comments that explain why, not what.
Concrete habits I use with the squad:
- Design-first for anything non-trivial. A short written proposal (goal, approach, alternatives, rollback plan) before any code. Beats surprise PRs.
- "Show me the failure mode" on every review. What breaks this change in production? If neither of us can name it, the change is not ready.
- Pair on the first scary PR. When a teammate is about to touch a legacy hot path for the first time, I pair through it live rather than reviewing the PR cold.
- ADRs for anything we will forget in 6 months. Thin, dated, decisions-with-context. The point is searchability, not polish.
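A "thin, dated, decisions-with-context" ADR fits on half a page. An illustrative template (the fields are my convention, not a standard the source mandates):

```markdown
# ADR-NNN: <decision, e.g. "Wholesaler connector runs as a separate service">
Date: YYYY-MM-DD    Status: accepted

## Context
Two or three sentences: the constraint that forced a decision.

## Decision
What we chose, stated plainly.

## Alternatives considered
One line each, with the reason rejected.

## Consequences
What gets easier, what gets harder, and the rollback path if we were wrong.
```

Searchability beats polish: a grep over dated half-page files answers "why is it like this?" faster than any wiki.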
What This Adds to the Portfolio
The other Adventure projects show what I build when I own every line of code. This entry shows what I do when the goal is team output, not individual output — and specifically, when the terrain is a living legacy system rather than a greenfield build. Both skill sets matter; this page exists because the solo-developer projects alone do not reflect the scope of the role.