Deutsche Telekom Data Scientist SQL and Coding Interview 2026

TL;DR

Deutsche Telekom’s 2026 data scientist interviews prioritize SQL fluency and coding precision over theoretical knowledge. The coding rounds test real-world data pipeline logic, not leetcode-style puzzles. Your ability to write clean, efficient SQL with clear business context will decide your outcome — not your grasp of advanced statistics.

Who This Is For

You are a data scientist with 1–5 years of experience targeting roles at Deutsche Telekom in Berlin, Bonn, or Darmstadt. You’ve passed initial screenings and now face technical evaluation. This guide is not for entry-level applicants without SQL production experience or those expecting academic case studies. It’s for professionals who’ve written SQL in live environments and want to avoid being filtered out in round two.

What does the Deutsche Telekom data scientist SQL interview actually test?

The interview tests whether you can translate ambiguous business questions into executable, readable SQL — not whether you can recite window function syntax. In a Q3 2025 debrief, a candidate was rejected despite producing correct output because their query used nested CTEs where a single LEFT JOIN would have sufficed. The feedback: “They solved the problem like a student, not an engineer.”

Deutsche Telekom’s data teams run lean. They need people who write SQL that others can maintain. The queries you’ll face involve customer churn prediction, subscription overlap analysis, and usage trend aggregation across terabytes of telecom data.

Not syntax memorization — but clarity under ambiguity.

Not perfect formatting — but logical flow that survives schema changes.

Not complex subqueries — but intentional joins that scale.

One hiring manager in Bonn told me: “If I can’t explain your query to a product manager in two sentences, it’s too complicated.” That’s the bar.

You’ll get one prompt with real table schemas — CUSTOMER_ACTIVITY_LOG, SUBSCRIPTION_HISTORY, NETWORK_USAGE — and be asked to identify users at risk of downgrading. There’s no “correct” answer, but there are wrong strategies: Cartesian products, SELECT *, or ignoring null handling in billing dates.
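
For a sense of the bar, here is a minimal sketch in PostgreSQL-flavored SQL. The column names (plan_monthly_mb, mb_used, cancellation_date) are invented for illustration, not the real schema:

```sql
-- Hypothetical sketch: flag active users whose recent usage suggests downgrade risk.
-- One readable LEFT JOIN; nulls handled explicitly; no SELECT *.
SELECT
    s.user_id,
    s.plan_tier,
    COALESCE(u.avg_daily_mb, 0) AS avg_daily_mb_last_30d
FROM SUBSCRIPTION_HISTORY AS s
LEFT JOIN (
    SELECT user_id, AVG(mb_used) AS avg_daily_mb
    FROM NETWORK_USAGE
    WHERE usage_date >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY user_id
) AS u ON u.user_id = s.user_id
WHERE s.cancellation_date IS NULL                                -- assumption: null means still active
  AND COALESCE(u.avg_daily_mb, 0) * 30 < s.plan_monthly_mb * 0.2; -- paying for far more than they use
```

A product manager can read it in one pass: active subscribers using under 20% of their monthly allowance.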

The scoring rubric weighs three things: correctness (30%), efficiency (40%), and readability (30%). A correct query that scans 10x more rows than necessary fails. So does a fast query no one else can debug.

How is the coding round different from the SQL round?

The coding round uses Python and evaluates data manipulation logic, not library knowledge. You’ll receive a dataset in CSV-like format and be asked to clean, aggregate, and return specific metrics — all in plain Python. No Pandas allowed. No NumPy. You write everything from scratch using lists, dictionaries, and basic loops.

In a March 2025 session, candidates were given a list of call records with malformed timestamps and duplicate entries. Task: calculate daily average call duration per user, excluding weekends. One candidate used a defaultdict; another used nested if-else blocks. Both passed — not for elegance, but because they handled edge cases: missing users, invalid durations, and timezone shifts.

Not algorithm speed — but correctness in messiness.

Not code golf — but explicit handling of real-world noise.

Not framework usage — but control over data flow.

The system runs your code against 50 test cases: 30 straightforward, 15 edge-case-heavy, 5 with intentional schema drift. Fail more than 8 edge cases, and you’re out — even if your logic “makes sense.”

Interviewers aren’t looking for optimal Big-O. They want to see you anticipate problems: What if a user has no calls? What if duration is negative? What if date parsing fails?

One rejected candidate wrote a clean solution but assumed all inputs were valid. The feedback: “They coded for the brochure, not the battlefield.”

You have 45 minutes. Most spend 10 minutes parsing, 25 coding, 10 testing. The ones who pass always write validation checks — even when not asked.
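
A minimal sketch of that March task in plain Python, under the stated constraints (no Pandas, no NumPy); the tuple record format is an assumption, since the real prompt isn’t public:

```python
from datetime import datetime

def daily_avg_call_duration(rows):
    """Average weekday call duration per (user, date).

    Assumes each row is a (user_id, timestamp_str, duration_str) tuple;
    the actual record format may differ.
    """
    seen = set()        # exact-duplicate filter
    totals = {}         # (user_id, date) -> [duration_sum, call_count]
    for row in rows:
        if row in seen:
            continue                      # duplicate entry
        seen.add(row)
        user_id, ts, dur = row
        try:
            when = datetime.fromisoformat(ts)   # raises on malformed timestamps
            duration = float(dur)
        except (ValueError, TypeError):
            continue                      # malformed timestamp or duration
        if duration < 0:
            continue                      # negative duration is invalid
        if when.weekday() >= 5:
            continue                      # 5 = Saturday, 6 = Sunday
        bucket = totals.setdefault((user_id, when.date()), [0.0, 0])
        bucket[0] += duration
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

Nothing here is clever; what matters is that duplicates, malformed fields, negative durations, and weekends each get an explicit guard.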

How many technical rounds are there, and what’s the timeline?

There are two technical rounds: a 60-minute SQL screen with a senior data scientist, followed by a 75-minute coding session with an engineering manager. Between them, you get a 3-day gap for preparation. The entire process from application to offer takes 18–24 days, including 2 non-technical rounds.

The SQL interview is first. Fail it, and you’re out. No second chances. The coding round occurs only after a “Strong Pass” or “Pass” on SQL. A “Weak Pass” triggers a follow-up task: optimize a slow-running production query.

Not volume of questions — but depth on one scenario.

Not theoretical breadth — but applied consistency.

Not speed alone — but structured communication.

In a 2024 hiring committee review, 42% of borderline candidates were rejected because they didn’t verbalize assumptions. One candidate paused mid-query to say, “I’m assuming cancellation_date being null means active — should I confirm?” That comment alone pushed them to “strong pass.”

Each round is scored independently: Strong Pass, Pass, Weak Pass, Fail. Two Strong Passes get fast-tracked. Two Passes go to hiring committee review. Any Fail = rejection. Weak Pass + Pass = additional task.

Debriefs happen within 24 hours. Hiring managers discuss not just output, but how you navigated uncertainty. Silence is penalized. Guessing is worse.

What kind of real-world problems do they use in the interview?

They use anonymized versions of live business problems — last year, it was predicting MVNO (mobile virtual network operator) churn using billing and usage patterns. The schema included CUSTOMER_PLANS, USAGE_DAILY, and SUPPORT_TICKETS. Task: identify customers likely to cancel in the next 30 days and propose two data-driven interventions.

Not abstract modeling — but actionable segmentation.

Not accuracy metrics — but business impact framing.

Not feature engineering — but signal isolation from noise.

In a Berlin debrief, a candidate built a perfect logistic regression pipeline — then failed. Why? The prompt didn’t ask for a model. It asked for a rule-based flagging system using existing fields. The hiring lead said: “They over-engineered a screwdriver into a power drill. We needed someone to tighten the bolt, not build a workshop.”

The right approach: find patterns in plan downgrade frequency, spikes in support tickets, or usage drops. One successful candidate grouped users by “plan mismatch”: those on premium plans with below-average usage. They flagged them for targeted offers.
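
A hypothetical version of that “plan mismatch” query, with invented column names (plan_tier, mb_used) layered on the schemas from the churn prompt:

```sql
-- Premium-tier users whose last-30-day usage sits below the premium average.
SELECT p.user_id, p.plan_tier, u.monthly_mb
FROM CUSTOMER_PLANS AS p
JOIN (
    SELECT user_id, SUM(mb_used) AS monthly_mb
    FROM USAGE_DAILY
    WHERE usage_date >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY user_id
) AS u ON u.user_id = p.user_id
WHERE p.plan_tier = 'premium'
  AND u.monthly_mb < (
      SELECT AVG(mb_used) * 30               -- rough tier-average monthly benchmark
      FROM USAGE_DAILY ud
      JOIN CUSTOMER_PLANS cp ON cp.user_id = ud.user_id
      WHERE cp.plan_tier = 'premium'
  );
```

Anything fancier than this, a model or a pipeline, would be answering a question the prompt didn’t ask.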

Interviewers watch for scoping. Do you ask clarifying questions? Do you define “likely to cancel”? Do you check for data lag in support tickets?

Another case from 2025 involved detecting fraudulent SIM swaps. The data had timestamps, device IDs, and location pings. Strong candidates checked for timezone inconsistencies and multi-login events. Weak ones jumped straight to anomaly detection algorithms.

These aren’t puzzles. They’re compressed versions of what Deutsche Telekom’s data team actually works on. If you can’t align your solution to operational feasibility, you won’t pass.

How should I prepare for the SQL and coding rounds?

Start by mastering time-series aggregation, self-joins for cohort analysis, and null-safe comparisons — the three most tested SQL concepts. For coding, practice parsing unstructured logs, handling date arithmetic, and building counters without libraries. Most candidates fail on date edge cases: leap years, daylight saving shifts, and mixed timezones.

Not broad practice — but targeted replication of telecom data patterns.

Not leetcode grinding — but business logic translation.

Not syntax drills — but schema navigation speed.
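
Two of those concepts, self-joins and null-safe comparison, fit in one short PostgreSQL-flavored sketch; snapshot_month as an integer month key is an invented convention:

```sql
-- Hypothetical: users whose plan changed between consecutive monthly snapshots.
-- A plain <> would silently drop rows where either plan_id is NULL;
-- IS DISTINCT FROM treats NULL as an ordinary, comparable value.
SELECT cur.user_id
FROM SUBSCRIPTION_HISTORY AS cur
JOIN SUBSCRIPTION_HISTORY AS prev
  ON prev.user_id = cur.user_id
 AND prev.snapshot_month = cur.snapshot_month - 1   -- self-join: this month vs. last
WHERE cur.plan_id IS DISTINCT FROM prev.plan_id;
```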

Work through a structured preparation system (the PM Interview Playbook covers telecom-specific SQL cases with real debrief examples from DACH-region tech firms).

Build a cheat sheet of five query templates: rolling averages, first-touch attribution, churn rate by cohort, plan overlap detection, and anomaly flagging. Memorize nothing — internalize the logic.
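
The rolling-average template, for instance, reduces to one window function; this sketch assumes a USAGE_DAILY table with one row per user per day:

```sql
-- Template: 7-day rolling average of usage per user.
SELECT
    user_id,
    usage_date,
    AVG(mb_used) OVER (
        PARTITION BY user_id
        ORDER BY usage_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_7d_avg_mb
FROM USAGE_DAILY;
```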

For coding, write 10 small programs without Pandas: deduplicate records, calculate percentiles manually, parse ISO timestamps, group by date ranges, and handle missing keys in dictionaries.
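
For the percentile exercise, one possible stdlib-only version uses linear interpolation between closest ranks (one convention among several):

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) of `values`,
    interpolating linearly between the closest ranks."""
    if not values:
        raise ValueError("percentile of empty list")
    if not 0 <= p <= 100:
        raise ValueError("p must be in [0, 100]")
    ordered = sorted(values)
    rank = (p / 100) * (len(ordered) - 1)   # fractional position in the sorted list
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac

# Example: the median of a small sample.
assert percentile([4, 1, 3, 2], 50) == 2.5
```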

Simulate conditions: 60 minutes, no autocomplete, one monitor. Use sample schemas from public telecom datasets or GitHub repos mimicking CDR (call detail records).

Practice verbalizing every assumption. Say it aloud: “I’m assuming that a user without a cancellation date is active. I’m treating zero usage as valid, not missing.” This mimics real interview expectations.

Finally, review Deutsche Telekom’s public data challenges — they’ve hosted Kaggle-style contests on customer segmentation. Those problems resemble actual interview prompts.

Mistakes to Avoid

  • BAD: Writing a 50-line SQL query with seven CTEs to solve a three-step problem.
  • GOOD: Using a single SELECT with conditional aggregation and clear column aliases (see the sketch after this list). One interviewer said: “If I need to scroll to understand your query, you’ve already lost.”
  • BAD: Assuming all input data is clean during coding.
  • GOOD: Writing explicit checks for nulls, negative values, and format mismatches — even if the sample data looks perfect. In a 2025 round, a candidate added try-except blocks around date parsing. That single move earned a Strong Pass.
  • BAD: Jumping into code without clarifying the goal.
  • GOOD: Asking: “Should I include test accounts?” or “Is the focus on precision or recall for churn flags?” One candidate asked whether to optimize for engineering maintainability or business insight speed. The interviewer noted: “That’s a principal data scientist question — not junior level.”
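
The conditional-aggregation pattern from the first GOOD above, as a hypothetical sketch with invented event types:

```sql
-- One pass over the activity log, no CTEs: per-user counts split by event type.
SELECT
    user_id,
    COUNT(*)                                                     AS total_events,
    SUM(CASE WHEN event_type = 'downgrade' THEN 1 ELSE 0 END)    AS downgrades,
    SUM(CASE WHEN event_type = 'support_call' THEN 1 ELSE 0 END) AS support_calls
FROM CUSTOMER_ACTIVITY_LOG
WHERE event_date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY user_id;
```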

FAQ

Do they use LeetCode-style algorithm questions?

No. The coding round focuses on data manipulation, not data structures. You won’t see binary trees or dynamic programming. You will see malformed CSVs, inconsistent timestamps, and nested event logs. The challenge is correctness in chaos, not algorithmic elegance.

Is Python the only coding language accepted?

Yes. All coding interviews are in Python 3.x. You cannot use R, Scala, or SQL. The environment is basic — no Jupyter, no libraries beyond built-ins. You’re expected to know how to iterate, parse strings, and manage dictionaries without external tools.

What’s the salary range for data scientists at Deutsche Telekom in 2026?

For mid-level roles (2–4 years experience), base salaries range from €68,000 to €82,000 in Bonn and Darmstadt, €74,000 to €89,000 in Berlin. Total compensation includes a performance bonus (8–12%) and benefits like subsidized mobile plans and public transit passes. Senior roles start at €95,000 with equity-like incentives.
