Green in CI - Broken in Production: 5 Reasons Why Manual Testing Is Better Than Automated
“It’s green in CI - ship it!”
Those words have launched more post-mortems than rockets. Test automation catches the mistakes you expect. Real users uncover everything else. Here are five situations where a living, breathing tester can torch your spotless coverage report.

⏱️ Schrödinger’s bug with a stopwatch ⏱️
On CI, every step runs in perfect sequence: click, wait, assert, done. Out in the wild, people double-tap, tab-smash, or open five windows in parallel. The millisecond gap your Playwright script glosses over? A power user slams it, two async calls collide, the database deadlocks, and your night turns into a war-room Zoom.
How manual QA saves the day:
- 🟩 They vary click speed, timing, and order (sometimes by accident) and stumble onto the hidden path that triggers the race.
- 🟩 They notice odd artifacts: a spinner that flashes for half a second, a console error that blinks and disappears, an empty network response the logger didn’t flag.
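If you still want the suite to poke at that gap, you can script the abuse once a human has found it. A minimal Playwright sketch in TypeScript; the `/checkout` route, `#submit` selector, and `.order-confirmation` element are hypothetical placeholders for your own app:

```typescript
import { test, expect } from '@playwright/test';

// Sketch of the double-tap scenario above. All routes and selectors are
// placeholders; the point is firing clicks faster than the usual
// click-wait-assert cadence ever would.
test('rapid double-submit creates exactly one order', async ({ page }) => {
  await page.goto('/checkout');
  // Two clicks back-to-back, no wait in between - the power-user slam.
  await page.locator('#submit').dblclick();
  // If the backend isn't idempotent, the two async calls race right here.
  await expect(page.locator('.order-confirmation')).toHaveCount(1);
});
```

Even then, this covers only the one collision you thought to encode; the manual tester keeps finding the ones you didn't.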
💻 The Pristine-Environment delusion 💻
Your Docker image is a snow-white fairy tale: single Chrome version, no extensions, no cache, stable 100 Mbps link, en-US locale.
Real devices look like a garage sale:
- 🟩 Corporate VPN + split DNS
- 🟩 Four ad blockers and a crypto-wallet extension
- 🟩 125% display scaling on a 13-inch laptop
- 🟩 Browser language set to Ukrainian, OS set to English, and a quirky third-party IME
Automation cruises through that controlled sandbox. Meanwhile, Olga on Windows 11 + Edge 125 opens your page, the font loader chokes on the VPN, CSS fails to download, and the layout explodes. Only a manual test run in “dirty-world mode” exposes it.
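Part of that garage sale can be dragged into CI. A hedged Playwright sketch - the locale, timezone, and scaling values just mirror the list above, and the `header` assertion is a placeholder:

```typescript
import { test, expect } from '@playwright/test';

// "Dirty-world" test profile. Values mirror the list above; swap in
// whatever your real users actually run.
test.use({
  locale: 'uk-UA',                          // browser in Ukrainian
  timezoneId: 'Europe/Kyiv',
  viewport: { width: 1280, height: 720 },   // 13-inch laptop territory
  deviceScaleFactor: 1.25,                  // 125% display scaling
});

test('layout survives a non-default locale and scaling', async ({ page }) => {
  await page.goto('/');
  // Placeholder assertion: the header should still render after reflow.
  await expect(page.locator('header')).toBeVisible();
});
```

It still won't reproduce the VPN's split DNS or the crypto-wallet extension, which is exactly why the manual pass stays on the checklist.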
👨 The Art of doing everything wrong 👨
Happy-path scripts:
- 🟥 Fill every field with valid data
- 🟥 Click Submit once
- 🟥 Expect 200 OK
- 🟥 Profit
Actual humans:
- 🟩 Paste 300 emoji into the “Company name” field
- 🟩 Upload a 500 MB CSV titled `THIS-IS-THE-REAL-ONE-FINAL-V2.xlsx`
- 🟩 Double-click Submit because the spinner took 0.8 s too long
- 🟩 Hit Back, Forward, Refresh, then Forward again for good measure
These “nobody would do that” moves are exactly what testers try first. Automation politely declines; manual testers revel in the chaos.
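Some of those moves are scriptable after the fact, once a human has named them. Another sketch with placeholder routes and selectors:

```typescript
import { test, expect } from '@playwright/test';

// Chaos-input sketch: "nobody would do that," encoded. '/signup',
// '#company-name', and '.validation-error' are placeholders.
test('company name field survives 300 emoji', async ({ page }) => {
  await page.goto('/signup');
  await page.locator('#company-name').fill('🔥'.repeat(300));
  await page.locator('#submit').click();
  // Graceful rejection is the pass condition; a 500 or a hung spinner is not.
  await expect(page.locator('.validation-error')).toBeVisible();
});
```

The catch: you can only automate the chaos you have already imagined.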
🐛 Flaky Bugs aren’t always Flaky Tests 🐛
When a nightly run fails at random, devs shrug: “false alarm.” A manual tester reproduces the glitch three times in 20 minutes, notes the circumstances, grabs logs, and cross-links the stack trace to a suspect commit.
Why automation shrugs:
- 🟥 Timing jitter hides the real defect behind retry logic.
- 🟥 CI retries the job and declares it green, wiping evidence.
- 🟥 Metrics show “99% pass rate” - great for dashboards, terrible for users.
Why humans persist:
- 🟩 They see the screen flicker, the network tab turn red, or the cursor vanish.
- 🟩 They dig into DevTools, record a video, and hand you a repro that makes the bug die in daylight.
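Part of why CI shrugs is configuration, not fate. In Playwright, a couple of config lines decide whether any evidence survives a flaky run - here is a sketch of settings that at least keep the artifacts a human would have collected:

```typescript
import { defineConfig } from '@playwright/test';

// With retries enabled, a race that fails one run in three still ends
// the job green. If you keep retries, at least keep the evidence too.
export default defineConfig({
  retries: 2,
  use: {
    trace: 'on-first-retry',      // full trace of the run that actually failed
    video: 'retain-on-failure',   // the recording a manual tester makes by hand
  },
});
```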
👨‍🦯 UX crimes assertions can’t see 👨‍🦯
Your test asserts that the button exists and is clickable. That doesn’t mean anyone can find or read it.
- 80% gray text on white? Automation: “element present.” Users: “where’s the label?”
- Button jumps two pixels when hovered? Automation: “same DOM node.” Users: “why is the UI dancing?”
- Tooltip spawns under the cursor so you can’t click? Automation: “aria role valid.” Users: rage-quit.
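Contrast is the one crime on that list a scanner can partially catch. A sketch using the @axe-core/playwright package: it flags the gray-on-white text an “element present” assertion sails past, but it still won't see the dancing button or the suicidal tooltip:

```typescript
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

// Accessibility scan sketch. Axe catches WCAG contrast failures; it does
// not catch layout jitter or tooltips that spawn under the cursor.
test('no contrast violations on the landing page', async ({ page }) => {
  await page.goto('/');
  const results = await new AxeBuilder({ page })
    .withTags(['wcag2aa'])   // tag set that includes the color-contrast rule
    .analyze();
  expect(results.violations).toEqual([]);
});
```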
"Selenium smiles - eyeballs bleed" (c) Confucius
