System Design for PMs: Understanding APIs and Webhooks
TL;DR
PMs must treat API and webhook design as a signal of judgment, not just a checklist of components; interviewers evaluate how you balance latency, reliability, and operational cost. In a Q3 debrief at a Series B SaaS company, the hiring manager rejected a candidate who listed REST endpoints without explaining failure handling, saying the answer showed poor product intuition. Strong answers contrast synchronous request‑response with asynchronous event‑driven patterns, call out concrete trade‑offs, and tie each technical choice to a user outcome or business metric.
Who This Is For
This guide is for product managers preparing for system design interviews at mid‑size tech firms or large enterprises where the interview loop includes a dedicated design round. If you have shipped at least one feature that involved integrating a third‑party service or building an internal service, you will find the scenarios familiar. Those transitioning from non‑technical roles or with less than two years of PM experience should first review the basics of request‑response protocols before attempting the trade‑off discussions outlined here.
How do APIs differ from webhooks in a product manager's system design interview?
In interview shorthand, an API call is a synchronous request‑response exchange in which a client calls a service and waits for the reply; a webhook is an asynchronous callback in which the service initiates an HTTP request to a client‑provided URL when an event occurs. In interviews, stating that APIs are “request‑response” and webhooks are “push notifications” earns only partial credit; you must explain why you would choose one over the other based on latency tolerance, failure recovery, and operational overhead.
For example, if a user action requires immediate feedback—like processing a payment—you would choose an API because waiting for a webhook introduces unpredictable delay. Conversely, if you need to notify external systems of a batch‑completed report, a webhook reduces polling traffic and scales better with many consumers. Interviewers listen for the explicit link between the pattern and a product goal such as “reducing checkout abandonment by 5 %” or “cutting infrastructure cost by handling 10k events per minute with a single endpoint.”
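To make the pull versus push distinction concrete, here is a minimal in‑process sketch. The `ReportService` class and its method names are hypothetical, and a plain Python callable stands in for the HTTP POST that a real webhook would send to a consumer's URL.

```python
from typing import Callable, Dict, List


class ReportService:
    """Hypothetical service offering a pull-style call and push-style callbacks."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}
        self._subscribers: List[Callable[[dict], None]] = []

    def start_report(self, report_id: str) -> None:
        self._status[report_id] = "running"

    def get_status(self, report_id: str) -> str:
        # Request-response: the caller asks and waits for the answer.
        return self._status.get(report_id, "unknown")

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        # Webhook registration: a real service would store a consumer URL here.
        self._subscribers.append(callback)

    def complete_report(self, report_id: str) -> None:
        self._status[report_id] = "done"
        event = {"type": "report.completed", "report_id": report_id}
        for notify in self._subscribers:
            notify(event)  # stands in for an HTTP POST to the consumer's URL


service = ReportService()
received: List[dict] = []
service.subscribe(received.append)

service.start_report("r1")
polled_before = service.get_status("r1")  # consumer has to ask
service.complete_report("r1")             # consumer is told
polled_after = service.get_status("r1")
```

With the pull style the consumer learns the state only when it asks; with the push style the event arrives in `received` without any request from the consumer.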
What key API design principles should PMs know for system design questions?
Core API design principles that interviewers expect you to mention are consistency, versioning, authentication, rate limiting, and error handling. Consistency means using standard HTTP verbs, predictable resource paths, and uniform JSON schemas across endpoints; in one debrief, a candidate who mixed snake_case and camelCase in the same response was flagged for lacking attention to detail. Versioning should be discussed as a way to evolve contracts without breaking existing integrations; common approaches include URL versioning (/v1/resource) or header‑based versioning, and you should note the trade‑off between client simplicity and server complexity.
Authentication is often covered by API keys, OAuth 2.0, or mutual TLS; you must state why you chose a scheme (e.g., OAuth 2.0 for third‑party access because it supports scoped tokens and revocation). Rate limiting protects the service from overload; mention concrete limits like 100 requests per second per API key and explain how you would signal throttling with 429 responses and a Retry‑After header. Error handling should follow the problem‑details JSON format of RFC 7807 (since superseded by RFC 9457), include meaningful messages, and avoid leaking stack traces. In a recent Google PM interview loop, a candidate who described rate limiting only as “we’ll add a threshold” received a follow‑up asking for the algorithm (token bucket vs fixed window) and the monitoring plan; the ability to name the algorithm and link it to an SLO (e.g., 99.9 % of requests under 100 ms latency) turned a weak answer into a strong one.
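The rate‑limiting and error‑handling points above can be sketched together. This is a minimal illustration, not a production limiter: the `TokenBucket` class, the `handle_request` helper, and the problem `type` URI are all hypothetical names chosen for the example, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
import json
import math
from typing import Dict, Tuple


class TokenBucket:
    """Token bucket: allows bursts up to `capacity`, refills at a steady rate."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity      # start with a full bucket
        self.last_refill = now

    def try_acquire(self, now: float) -> float:
        """Return 0.0 if the request is allowed, else seconds until a token is free."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_sec


def handle_request(bucket: TokenBucket, now: float) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for one request against the bucket."""
    wait = bucket.try_acquire(now)
    if wait == 0.0:
        return 200, {"Content-Type": "application/json"}, json.dumps({"ok": True})
    retry_after = math.ceil(wait)
    problem = {
        # Problem-details members; the type URI is a made-up example.
        "type": "https://example.com/problems/rate-limit-exceeded",
        "title": "Too Many Requests",
        "status": 429,
        "detail": f"Rate limit exceeded; retry in {retry_after} s.",
    }
    headers = {
        "Content-Type": "application/problem+json",
        "Retry-After": str(retry_after),
    }
    return 429, headers, json.dumps(problem)
```

For the limit quoted above, one bucket per API key with `capacity=100` and `refill_per_sec=100` would allow a burst of 100 requests and a sustained 100 requests per second. A fixed‑window counter is simpler to explain but can admit up to twice the limit across a window boundary, which is the usual reason to name the token bucket instead.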
When should I discuss webhooks versus polling in my answer?
Choose webhooks when the event is infrequent, the consumer can tolerate slight delay, and you want to eliminate unnecessary HTTP requests; choose polling when you need guaranteed delivery, the consumer cannot expose a public callback URL, or the event stream is high‑frequency and requires ordering guarantees. In a debrief at a fintech startup, the hiring manager pushed back on a candidate who recommended webhooks for real‑time fraud scoring because the consumer’s system was behind a strict corporate firewall that blocked inbound hooks; the candidate then pivoted to a hybrid approach—using webhooks for non‑critical notifications and a resilient polling loop with exponential backoff for the scoring pipeline—showing awareness of operational constraints.
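A resilient polling loop of the kind the candidate pivoted to might look roughly like the sketch below. The function names, the base delay of 1 second, and the 60‑second cap are assumptions for illustration; `fetch` and `sleep` are injected so that the loop can be exercised without real network calls or waiting.

```python
import random
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay before retry `attempt` (0-based): min(cap, base * 2**attempt).

    If `rng` is given, apply full jitter: a uniform value in [0, delay].
    """
    delay = min(cap, base * (2 ** attempt))
    if rng is not None:
        return rng.uniform(0.0, delay)
    return delay


def poll_until_ready(fetch: Callable[[], Optional[dict]],
                     sleep: Callable[[float], None],
                     max_attempts: int = 6) -> Optional[dict]:
    """Call fetch() until it returns a result, backing off after each empty reply."""
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return None  # caller decides how to escalate after giving up
```

In a real client, `sleep` would be `time.sleep` and `fetch` would wrap an outbound HTTP GET, which works behind a firewall that blocks inbound traffic because the consumer initiates every connection.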
Quantify the trade‑off: polling every 10 seconds generates 8 640 requests per day per client, or 8.64 million per day across 1 000 clients, whereas a single webhook POST per event can cut that traffic by two orders of magnitude when events are infrequent, but requires the provider to manage retry logic, dead‑letter queues, and idempotency. Interviewers reward answers that cite a concrete metric such as “reducing AWS API Gateway calls by 90 % saves roughly $1 200 per month at 10 million events.”
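The request‑volume comparison is simple enough to check with a few lines of arithmetic. A day has 86 400 seconds, so polling every 10 seconds works out to 8 640 requests per client per day. The figure of 50 events per client per day used for the webhook side is an assumption made only to illustrate the ratio.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86 400


def polling_requests_per_day(clients: int, interval_s: int) -> int:
    """Each client sends one request every `interval_s` seconds."""
    return clients * (SECONDS_PER_DAY // interval_s)


def webhook_requests_per_day(events_per_client: int, clients: int) -> int:
    """One POST per event per subscribed client (retries not counted)."""
    return events_per_client * clients


per_client = polling_requests_per_day(clients=1, interval_s=10)
fleet = polling_requests_per_day(clients=1_000, interval_s=10)

# Assumed event rate: 50 events per client per day.
pushed = webhook_requests_per_day(events_per_client=50, clients=1_000)
reduction_factor = fleet / pushed
```

Under that assumed event rate the webhook design sends 50 000 requests per day instead of 8.64 million, a reduction of roughly 170 times. The ratio shrinks as events become more frequent, which is why the event rate belongs in the answer.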
How do I explain rate limiting, authentication, and versioning in an API design?
Start by stating the goal: protect the service, secure access, and allow evolution without breaking consumers. For rate limiting, name the algorithm (token bucket is common for burst‑friendly limits), specify the limit (e.g., 5 000 requests per minute per API key), and describe the response (HTTP 429 with Retry‑After header) and client‑side handling (exponential backoff with jitter). Mention how you would monitor the limit (e.g., CloudWatch alarm on 429 rate > 1 % of traffic) and tie it to an SLO like “maintain 99.9 % successful request rate under peak load.” For authentication, choose a mechanism matched to the caller type: API keys for server‑to‑server service accounts, OAuth 2.0 authorization code flow for user‑facing apps, and short‑lived JWTs for microservice‑to‑microservice calls.
Explain token expiry, refresh strategies, and how you would revoke a compromised key (e.g., immediate deny‑list entry plus rotation within 5 minutes). For versioning, clarify that you will never break a published contract; instead, you increment the version number in the URL or header and maintain both versions for a deprecation window (typically 90 days). Provide a concrete example: “When we added pagination to /orders, we released v2 while keeping v1 active for 60 days, monitored usage via API gateway logs, and sent email reminders to the top 10 % of v1 consumers.” Interviewers look for the ability to connect each technical decision to a measurable outcome such as “deprecation plan reduced support tickets by 30 % in the following quarter.”
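The versioning example can be sketched as two handlers served side by side. Everything here is illustrative: the `ORDERS` data, the release date, and the handler names are hypothetical, and the `Sunset` response header (defined in RFC 8594) is one assumed way of advertising the end of the deprecation window alongside email reminders.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict, Optional, Tuple

ORDERS = [{"id": i} for i in range(1, 6)]                 # stand-in data
V2_RELEASED = datetime(2025, 1, 1, tzinfo=timezone.utc)   # hypothetical date
DEPRECATION_WINDOW = timedelta(days=90)


def orders_v1() -> Tuple[Dict[str, str], dict]:
    """Original contract: every order in one response, unchanged for old clients."""
    sunset = format_datetime(V2_RELEASED + DEPRECATION_WINDOW, usegmt=True)
    return {"Sunset": sunset}, {"orders": ORDERS}


def orders_v2(cursor: int = 0, limit: int = 2) -> Tuple[Dict[str, str], dict]:
    """New contract: paginated, with a cursor for the next page."""
    page = ORDERS[cursor:cursor + limit]
    next_cursor: Optional[int] = cursor + limit if cursor + limit < len(ORDERS) else None
    return {}, {"orders": page, "next_cursor": next_cursor}


# URL versioning: both contracts stay routable during the deprecation window.
ROUTES = {"/v1/orders": orders_v1, "/v2/orders": orders_v2}


def handle(path: str, **params) -> Tuple[Dict[str, str], dict]:
    return ROUTES[path](**params)
```

Counting calls per route in the gateway logs is what tells you which consumers still depend on v1 and therefore who should receive the migration reminders.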
What trade-offs do interviewers look for when evaluating API vs webhook choices?
Interviewers assess whether you recognize that every design decision introduces a set of opposing forces: latency versus complexity, reliability versus operational overhead, and control versus consumer flexibility. A strong answer enumerates at least three trade‑offs and then selects a path based on a product‑level objective. For instance, when designing a notification system for a social app, you might note that APIs give the caller immediate feedback and simplify error handling but increase server load under bursty traffic; webhooks reduce server load and scale naturally with fan‑out but require the consumer to manage security, retries, and duplicate delivery.
You would then state that because the product goal is to deliver notifications within two seconds with 99.9 % reliability, you chose an API for direct messages (low latency critical) and a webhook for activity‑feed updates (higher latency tolerable, fan‑out to many services). In a Microsoft PM interview debrief, the hiring manager praised a candidate who explicitly called out the “duplicate delivery” risk of webhooks and proposed an idempotency key stored in Redis for 24 hours, showing depth beyond the surface pattern. Conversely, candidates who merely listed “APIs are synchronous, webhooks are asynchronous” without linking to a metric or constraint received feedback that their answer lacked judgment.
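The duplicate‑delivery mitigation mentioned above can be sketched without a Redis server. The `IdempotencyStore` class below is a hypothetical in‑memory stand‑in for a Redis key written with a 24‑hour expiry (for instance via `SET key value NX EX 86400`); the event field name `idempotency_key` is likewise an assumption.

```python
import time
from typing import Callable, Dict


class IdempotencyStore:
    """In-memory stand-in for a Redis key with a TTL."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: Dict[str, float] = {}  # key -> expiry timestamp

    def first_time(self, key: str) -> bool:
        """Record the key and return True only if it is not already live."""
        now = self._clock()
        expiry = self._seen.get(key)
        if expiry is not None and expiry > now:
            return False
        self._seen[key] = now + self._ttl
        return True


def handle_webhook(store: IdempotencyStore, event: dict,
                   apply: Callable[[dict], None]) -> str:
    """Apply an event at most once per idempotency key within the TTL."""
    if not store.first_time(event["idempotency_key"]):
        return "duplicate_ignored"
    apply(event)
    return "processed"
```

The 24‑hour window needs to be at least as long as the provider's retry schedule; otherwise a late retry arrives after the key has expired and is applied a second time.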
Preparation Checklist
- Review the anatomy of a RESTful API: resources, HTTP verbs, status codes, and payload formats; practice drawing a simple CRUD diagram for a product feature you have shipped.
- Study common authentication schemes (API key, OAuth 2.0, JWT) and be ready to explain why you would pick one for a given actor (internal service, partner, end‑user).
- Memorize two rate‑limiting algorithms (token bucket and fixed window) and be able to sketch how you would set burst and replenish values to meet a defined SLA.
- Identify at least two real‑world scenarios where a webhook is preferable to polling and two where polling is safer; quantify the difference in request volume or latency.
- Work through a structured preparation system (the PM Interview Playbook covers API and webhook trade‑offs with real debrief examples) to internalize the judgment framework rather than memorizing a checklist.
- Practice explaining versioning and deprecation policies using a concrete feature you have evolved, noting the timeline and communication plan.
- Prepare a one‑sentence “product impact” statement for each technical choice you discuss (e.g., “This design reduces monthly SMS cost by $800 by moving to webhook‑based delivery”).
- Run a mock system design interview with a peer and ask them to focus only on whether you linked each technical decision to a user outcome or business metric.
- Record your answer, listen back, and edit out any sentence that does not contain a trade‑off or a judgment signal.
- Review recent engineering blogs from companies you target (e.g., Stripe’s API versioning guide, Slack’s webhook security post) to cite specific limits or patterns you observed.
Mistakes to Avoid
- BAD: “I would use an API because it’s standard and everyone knows how to call it.”
- GOOD: “I chose an API for the checkout flow because the latency SLA is 200 ms; a webhook introduces variable network delay that could increase abandonment, as shown in our A/B test where adding a 150 ms callback increased drop‑off by 3.2 %.”
The first answer signals familiarity with terminology but provides no judgment; the second ties the choice to a measurable product metric and shows you considered the alternative’s downside.
- BAD: “Webhooks are better because they reduce server load.”
- GOOD: “While webhooks cut average request volume by 70 % in our notification service, they shift failure handling to the consumer; we mitigated this by requiring an idempotency key, retrying failed deliveries with exponential backoff, and routing exhausted retries to a dead‑letter queue, which kept the end‑user latency under 500 ms for 99 % of events.”
The first statement omits the consumer‑side complexity; the second acknowledges the trade‑off and describes a concrete mitigation.
- BAD: “We will version the API by changing the URL when we need to.”
- GOOD: “We will use URL versioning (/v1/resource) and maintain both v1 and v2 for a 90‑day deprecation window, monitoring usage via API gateway logs; after 30 days we email the top 5 % of v1 consumers with a migration guide, reducing support tickets from version‑conflict incidents by 40 % in the following quarter.”
The first answer lacks a rollout plan and metrics; the second provides a timeline, communication tactic, and outcome measurement, which interviewers view as evidence of product judgment.
FAQ
How much time should I spend on the API/webhook portion of a system design interview?
Allocate roughly 12‑18 minutes of a 45‑minute system design round to discuss the communication layer, including API design, webhooks, polling, and related non‑functional concerns like security and observability. The remaining time should cover data storage, flow control, and scaling. If you notice the interviewer probing deeper on latency or failure scenarios, be ready to extend the discussion by 3‑5 minutes while summarizing the other sections briefly.
What salary range can I expect after passing a system design interview at a large tech company?
For a mid‑level product manager (L5/L6 equivalent) at firms like Google, Amazon, or Microsoft, the total compensation package typically ranges from $190 k to $260 k per year, consisting of a base salary between $150 k and $190 k, an annual bonus of 10‑20 %, and equity grants that vest over four years. These figures reflect publicly disclosed levels and recent offer data; actual numbers vary by location, performance, and specific org budget.
Is it ever acceptable to say I don’t know a detail like the exact rate‑limit algorithm?
It is acceptable to admit you do not recall the precise name of an algorithm, provided you immediately follow with a principled approach: describe the properties you need (e.g., burst allowance, smooth throttling) and propose a reasonable method (token bucket or leaky bucket) while noting you would confirm the exact implementation with the platform team. Interviewers value the ability to reason from first principles over rote memorization, and showing you know where to look up the detail demonstrates resourcefulness.