DeepMind SDE Onboarding and First 90 Days Tips 2026

TL;DR

The first 90 days as a software development engineer (SDE) at DeepMind are not about writing code quickly — they’re about understanding research velocity, navigating interdisciplinary teams, and calibrating to exploratory engineering. Integration into research pods begins on day three, not week two. The onboarding bottleneck isn’t access to tools — it’s time-to-first-merge in experimental codebases. Success hinges on judgment, not output volume.

Who This Is For

This is for new SDE hires at DeepMind starting in 2026, particularly those transitioning from product engineering roles at big tech firms. It applies to IC-3 to IC-5 roles in London, Edmonton, and Mountain View offices. If your background is in scalable backends or infrastructure but you lack exposure to ML training loops or paper-driven development, this outlines the unspoken expectations.

What happens during the first week of DeepMind SDE onboarding?

Day one includes legal paperwork, laptop provisioning, and a 90-minute security briefing, but the real work starts on day two with team immersion. You’re assigned a research buddy, not just a tech mentor. This person has at least one first-author publication at NeurIPS or ICML and spends 50% of their time coding. In a Q3 2025 debrief, a hiring manager rejected a candidate’s ramp-up plan because it ignored research syncs, calling it “product-team thinking in a hypothesis-testing environment.”

Not all onboarding tasks are documented. The internal “New SDE Ramp Checklist” lists 17 items, but only 11 appear in the official LMS. The missing six — like attending a model failure postmortem or reading last quarter’s internal ablation study — are performance differentiators. Attend one and you’re seen as curious. Attend three, and you’re assumed to have research instincts.

Your first code assignment will not be a bug fix. It’s typically a minor instrumentation patch to a distributed training job, such as adding logging for gradient sparsity. This isn’t about the logs; it’s about forcing you to understand how training jobs fail silently. In a debrief for an IC-4 candidate, the hiring committee (HC) noted, “They asked why we log sparsity, not how to log it. That’s the signal we want.”
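
What might such a patch look like? Here’s a minimal sketch in JAX, assuming a standard grad-producing training step; the internal logging API isn’t public, so Python’s logging module stands in for it, and all names are illustrative.

```python
import logging

import jax
import jax.numpy as jnp

logger = logging.getLogger("train")

def gradient_sparsity(grads, threshold=1e-8):
    """Fraction of gradient entries with magnitude below threshold."""
    leaves = jax.tree_util.tree_leaves(grads)
    total = sum(leaf.size for leaf in leaves)
    near_zero = sum(int(jnp.sum(jnp.abs(leaf) < threshold)) for leaf in leaves)
    return near_zero / total

# Inside the training step, after grads = jax.grad(loss_fn)(params, batch):
# sparsity = gradient_sparsity(grads)
# logger.info("step=%d grad_sparsity=%.4f", step, sparsity)
#
# Sparsity drifting toward 1.0 (all-zero gradients) is one classic
# silent-failure signature: the job keeps running but stops learning.
```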

Not speed, but judgment. Onboarding at DeepMind measures your ability to operate in uncertainty, not your familiarity with Python. The problem isn’t your technical speed; it’s your question hierarchy.

How are SDEs integrated into research teams during onboarding?

You are embedded into a research pod within seven days, not after a month-long bootcamp. These pods consist of 1–2 researchers, 1–2 SDEs, and a systems engineer. The ratio matters: SDEs are not support staff. They own data pipelines, training infrastructure, and evaluation tooling — areas that determine whether a hypothesis can be tested at all.

In one Q2 2025 team review, a pod stalled for 11 days because the new SDE assumed their role was to “implement researcher requests.” They waited for specs. The lead researcher said in the feedback, “We don’t write specs — we explore. If you’re not proposing three engineering paths for a 10x faster eval loop, you’re slowing us down.”

Ownership is distributed, not assigned. No one tells you to optimize the checkpointing logic — you’re expected to notice it’s inefficient. In a 2024 HC debate, a senior director blocked an SDE promotion because “they waited for permission to refactor the rollout script. That script cost us six training days last cycle. Initiative isn’t optional.”

Not integration, but agency. The metric isn’t “How quickly did you join the team?” but “How soon did you change its trajectory?” One IC-4 hire reduced synthetic data generation latency by 40% in week three by repurposing an internal compression library no one else knew existed. That wasn’t on their ramp plan — it was curiosity.

What technical systems do new SDEs need to learn immediately?

Within 72 hours, you must access the internal JAX codebase, submit a test job to the TPU fleet, and read the latest internal documentation on the distributed checkpointing protocol. These are non-negotiable. The company’s training infrastructure runs on internal clusters with custom schedulers tuned for sparse gradient updates, not general workloads.
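
The internal submission tooling isn’t public, but the JAX side of a first smoke test is standard. A hedged sketch of what a sane first test job might verify: that the job landed on accelerators at all, and that a collective runs across them. Everything here is illustrative, not internal code.

```python
import functools

import jax
import jax.numpy as jnp

# First check: did the job land on accelerators, or fall back to CPU?
devices = jax.devices()
print(f"{len(devices)} devices, platform={devices[0].platform}")

# Second check: one tiny data-parallel step. Each device sums its own
# shard locally, then the results are all-reduced across devices.
@functools.partial(jax.pmap, axis_name="i")
def parallel_sum(shard):
    return jax.lax.psum(jnp.sum(shard), axis_name="i")

# pmap maps over the leading axis: one slice per local device.
x = jnp.ones((len(devices), 1024))
print(parallel_sum(x))  # one identical total per device if the mesh is healthy
```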

In a 2025 postmortem, a new SDE submitted a job that consumed 1.3k TPU hours due to a misconfigured replication strategy. It wasn’t a fire, but it triggered a retrospective. The HC noted, “They didn’t read the ‘Common TPU Anti-Patterns’ doc. That doc exists because two IC-5s made the same error last year. Ignoring known failure modes is a culture fit issue.”

You’ll use Borg, not Kubernetes; XLA, not raw PyTorch; and Bigtable-backed metadata stores for experiment tracking. Git is not the source of truth; internal changelists via Piper are. One hire in 2025 lost three days trying to use GitHub-style branching. The feedback: “You’re thinking in versions. We think in atomic, reversible changes.”

Not syntax, but semantics. The problem isn’t that you don’t know JAX — it’s that you assume APIs reflect intent. At DeepMind, code is a research artifact. A single line change in a data loader can invalidate a month of experiments. That’s why code reviews take 48–72 hours, not four.
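
A toy illustration (not DeepMind code) of why a one-line data-loader change is treated as research-affecting: a change that never mentions shuffling can still reorder every batch an experiment sees, because the loader shares one RNG stream.

```python
import numpy as np

def load_epoch(seed, data, augment=False):
    """Toy loader: augmentation and shuffling share one RNG stream."""
    rng = np.random.default_rng(seed)
    if augment:
        _ = rng.normal(size=len(data))  # the "harmless" one-line addition
    # The extra draw above advances the RNG state, so the shuffle below
    # produces a completely different order for the same seed.
    return data[rng.permutation(len(data))]

data = np.arange(8)
print(load_epoch(0, data))                # one order
print(load_epoch(0, data, augment=True))  # a different order, same seed
# Runs before and after the change see different data streams, so their
# results are no longer comparable, even though "nothing changed" in the
# shuffling code itself.
```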

How should SDEs prioritize work in the first 90 days?

Your manager will give you a ramp plan with 12–15 tasks. Ignore the order. Prioritize based on research-team urgency, not task sequencing. In a 2025 performance review, an SDE completed all 15 tasks but was rated “Needs Improvement” because they deprioritized fixing a dataset bottleneck, which delayed a submission to ICLR.

Research deadlines dominate engineering milestones. If a paper is due in six weeks, everything shifts. One SDE in Edmonton rewrote a tokenizer pipeline in 72 hours because the original design caused irreproducible results. They didn’t wait for approval. Their manager later said, “That wasn’t on their plan. It was on our survival path.”

Not velocity, but impact alignment. The problem isn’t your productivity — it’s your prioritization framework. DeepMind doesn’t run on OKRs for individual contributors in research pods. Progress is measured by how much you’ve reduced uncertainty in a research direction.

You’ll face conflicting signals: your tech mentor wants you to document everything; your researcher wants fast prototypes. Resolve it by shipping minimal, auditable changes. One IC-3 hire built a model diffing tool in week four because manual comparison was causing errors. It wasn’t requested. It became standard within two months.
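
A sketch of the core of such a diffing tool, assuming checkpoints are loaded as JAX parameter pytrees with matching structure and shapes; the loading step is elided because the storage format is an assumption here, and the path utilities require a reasonably recent JAX.

```python
import jax
import jax.numpy as jnp

def diff_params(params_a, params_b, atol=0.0):
    """Max absolute difference per parameter, largest differences first.

    Assumes both checkpoints share the same pytree structure and shapes.
    """
    flat_a, _ = jax.tree_util.tree_flatten_with_path(params_a)
    flat_b = jax.tree_util.tree_leaves(params_b)
    report = []
    for (path, a), b in zip(flat_a, flat_b):
        max_diff = float(jnp.max(jnp.abs(a - b)))
        if max_diff > atol:
            report.append((max_diff, jax.tree_util.keystr(path), a.shape))
    return sorted(report, reverse=True)

# Usage, assuming ckpt_a and ckpt_b are loaded parameter pytrees:
# for max_diff, name, shape in diff_params(ckpt_a, ckpt_b):
#     print(f"{name} {shape}: max abs diff = {max_diff:.3e}")
```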

How is performance evaluated during the first 90 days?

Performance isn’t measured by completed ramp tasks or lines of code. It’s assessed through three silent signals: time-to-first-substantive-review-comment, frequency of inclusion in research design discussions, and whether you’re invited to the Friday “failure autopsy” meeting.

In a Q4 2025 HC, a candidate was downgraded because “they never commented on a design doc outside their immediate task.” Participation isn’t optional. Silence is interpreted as disengagement, not humility.

You’ll have a 30/60/90-day review, but the real feedback starts in week two. When you submit your first changelist, reviewers won’t just comment on code style — they’ll ask, “How does this affect reproducibility?” or “Have you considered impact on eval throughput?” If you can’t answer, you’re seen as a coder, not an SDE.

Not delivery, but insight generation. The problem isn’t that you shipped late — it’s that you didn’t anticipate the ripple. One hire in Mountain View was fast-tracked because they identified a data leakage issue in a preprocessing script during their second code review. They weren’t working on that module. They were just reading.
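
Checks like that are usually mundane once someone thinks to write them. A minimal sketch, assuming each example can be serialized to bytes for fingerprinting; the dataset names are placeholders.

```python
import hashlib

def example_key(example: bytes) -> str:
    """Stable fingerprint for one serialized example."""
    return hashlib.sha256(example).hexdigest()

def leakage_report(train_examples, eval_examples):
    """Count eval examples that also appear in the training set."""
    train_keys = {example_key(ex) for ex in train_examples}
    leaked = sum(1 for ex in eval_examples if example_key(ex) in train_keys)
    return leaked, len(eval_examples)

# Usage, assuming iterables of serialized examples:
# leaked, total = leakage_report(train_ds, eval_ds)
# print(f"{leaked}/{total} eval examples overlap with training data")
```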

How do SDEs build credibility with researchers?

Credibility is earned by asking diagnostic questions, not by solving assigned tasks. In a 2025 team survey, researchers ranked “engineers who ask why this ablation matters” higher than those who “deliver quickly.” Speed is table stakes. Judgment is promotion-worthy.

One new SDE in London attended a model scaling meeting and asked, “Are we optimizing for FLOPs or wall-clock time?” The principal investigator later said, “That question saved us two weeks. No one else had framed it that way.” The SDE was invited to lead the next infrastructure eval.

Not deference, but challenge. The problem isn’t that you’re not contributing — it’s that you’re not reframing. Researchers don’t want order-takers. They want co-thinkers. If you say, “I can build that,” you’re a vendor. If you say, “That approach may not scale to batch size 8k — here’s a test,” you’re a partner.

BAD example: An SDE implemented a requested data augmentation pipeline without asking about distribution shift risks. The model overfitted. The researcher noted, “They did what I asked. That’s not enough.”

GOOD example: Another SDE pushed back on a training frequency change, citing memory thrashing patterns from a prior project. They ran a microbenchmark. The team changed course. That SDE was included in the next paper’s acknowledgments.
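
The microbenchmark pattern behind that pushback is worth having at your fingertips: time a jitted step across batch sizes and watch where per-example cost stops improving. A sketch with a stand-in step function; the matmul is a placeholder for the real training step.

```python
import time

import jax
import jax.numpy as jnp

@jax.jit
def step(x, w):
    # Stand-in for one training step; swap in the real step_fn here.
    return jnp.tanh(x @ w).sum()

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (4096, 4096))

for batch in (512, 1024, 2048, 4096, 8192):
    x = jax.random.normal(key, (batch, 4096))
    step(x, w).block_until_ready()  # warm-up run excludes compile time
    t0 = time.perf_counter()
    for _ in range(10):
        step(x, w).block_until_ready()
    dt = (time.perf_counter() - t0) / 10
    print(f"batch={batch:5d}  {dt / batch * 1e6:.2f} us/example")
```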

Preparation Checklist

  • Complete the internal security certification before day one; it takes 8 hours, and system access stays blocked until it’s done
  • Study the last three NeurIPS papers from your target team — know their methods, not just results
  • Set up your workstation to compile JAX from source — prebuilt binaries won’t work on internal clusters
  • Schedule coffee chats with two SDEs on your team — ask about their biggest infrastructure regret
  • Read the internal “Common TPU Anti-Patterns” doc; it’s not public, but you’ll be expected to know it
  • Practice writing changelist descriptions that explain research impact, not just technical changes

Mistakes to Avoid

BAD: Waiting for a task before starting to explore the codebase

GOOD: On day two, run a failed training job locally to understand error patterns

BAD: Focusing on completing ramp tasks in order

GOOD: Re-prioritizing based on upcoming paper deadlines, even if it skips assigned items

BAD: Treating researchers as product managers who write specs

GOOD: Proactively proposing engineering trade-offs during hypothesis design meetings

Onboarding at DeepMind is not about ramp speed; it’s about research alignment. The first 90 days test whether you can operate in ambiguity, not whether you can code fast. Your value isn’t in delivery, but in reducing uncertainty for the team.

FAQ

What’s the most common reason new SDEs fail in the first 90 days?

They optimize for task completion, not research impact. In 2025, three SDEs were offboarded because they followed ramp plans rigidly while their pods missed deadlines. The issue wasn’t skill — it was misaligned priorities. DeepMind doesn’t reward checklist execution in research environments.

How much coding is expected in the first 30 days?

Expect to submit 3–5 changelists, but the number is irrelevant. What matters is whether your changes enable new experiments or prevent failures. One SDE shipped zero changes in month one but was praised for redesigning a monitoring dashboard that caught two critical bugs early.

Is there a formal training program for new SDEs?

There is a two-day orientation, but no multi-week bootcamp. Learning is on-the-job. The assumption is you already understand distributed systems — the gap is in research integration. If you need foundational ML infrastructure training, you’re not ready for DeepMind’s SDE role.

