Phishing for clicks is easy.
Phishing for credentials is a little harder.
Phishing for shells is the money shot.
But phishing defence has always been done in half measures. Let's break down how to do it properly, and the delta between the status quo and proper.
What control participation looks like, and where it exists.
Controls don't deploy themselves. They land through workstreams owned by named teams, with security architecture in the room. Each workstream below names the teams that own it and what they deliver. The recurring loop check on each (done? when? changed?) sits with the methodology phases and SOC loop further down, where the operational loop lives.
DMARC and DNS hygiene. Owners: Security architecture · Network and DNS · Mail platform
SPF and DKIM enforced. DMARC moved from monitor to quarantine to reject, with the path actually walked. Subdomain delegation hardened. Reports actively monitored, not piped to a forgotten mailbox (a record-check sketch follows these workstreams).
Mail gateway and content controls. Owners: Security architecture · Mail platform · Endpoint security
Attachment sandboxing with dynamic analysis. Link rewriting with time-of-click rescan. Impersonation protection. BEC anomalous-send detection. External sender flagging. Autoforward to external blocked.
Network defence layer. Owners: Network engineering · Security architecture · SOC
Egress filtering deny-by-default. DNS sinkholing for known indicators. Web filtering with category enforcement. Segmentation tested against assumed-breach. IDS and IPS rule-base reviewed, not left on factory defaults.
Endpoint hardening. Owners: Endpoint engineering · Security architecture · IT operations
Filetype blocking driven by the endpoint review of how each type behaves under duress, not a fixed list (currently ISO, LNK, HTA, OneNote, XLL on standard Windows builds; the list moves as adversary tooling moves). ASR rules deployed and audited. AppLocker or WDAC in enforced mode. Office macros blocked from internet zone. Local admin removed by default. Mark-of-the-Web propagating to archives.
OS and application hardening. Owners: Platform engineering · Security architecture · Application security
CIS benchmarks adopted and audited. Patching SLAs documented and met. Container image hardening with a refresh cadence. Runtime protection where the workload warrants. Application allow-listing where the threat model demands.
Identity and access. Owners: Identity · Security architecture · IT operations
MFA on every account, no exceptions. FIDO2 phishing-resistant for privileged users. Legacy authentication disabled. Conditional access with named locations. OAuth app consent governed. Privileged access workstations for tier zero. Just-in-time elevation.
Detection engineering and the SOC loop. Owners: SOC · Detection engineering · Security architecture · Security comms
Logging coverage audited against MITRE ATT&CK. SIEM correlation rules tested, not just deployed. One-button reporter in the mail client. Triage SLA documented and met. Loop closure on every report, with feedback to the reporter and IOCs feeding the next assessment scope.
Workforce education and reporting culture. Owners: Security comms · SOC (intel source) · People and HR · Awareness team
Education drawn from real reported lures, not vendor templates. Monthly comms that name the techniques landing against the sector. Reporting acknowledged within hours, every time. Just-in-time micro-prompts at the decision point, not quarterly modules. Success measured by reporting rate and time-to-report, not click rate. People are part of the defence, once the defence exists.
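The DMARC workstream is the easiest one to start verifying today. A minimal sketch, assuming the dnspython package is installed and using a placeholder domain, that checks whether SPF and DMARC records are published and whether the DMARC policy has actually reached reject:

```python
# Minimal SPF/DMARC posture check. Assumes the dnspython package
# (pip install dnspython); "example.co.uk" is a placeholder domain.
import dns.resolver

def txt_records(name: str) -> list[str]:
    """Return all TXT strings published at a DNS name, or [] if none."""
    try:
        answers = dns.resolver.resolve(name, "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return []
    return [b"".join(r.strings).decode() for r in answers]

def check_domain(domain: str) -> None:
    spf = [r for r in txt_records(domain) if r.startswith("v=spf1")]
    dmarc = [r for r in txt_records(f"_dmarc.{domain}") if r.startswith("v=DMARC1")]

    print(f"SPF:   {spf[0] if spf else 'MISSING'}")
    print(f"DMARC: {dmarc[0] if dmarc else 'MISSING'}")
    if dmarc and "p=reject" not in dmarc[0]:
        print("DMARC policy is not at reject yet; the path still needs walking.")

if __name__ == "__main__":
    check_domain("example.co.uk")
```

Run it against every sending domain and subdomain you own, not just the primary brand domain; delegated subdomains are where the gaps usually hide.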
If you can't answer all three questions for each workstream above, there's room to move before the user is the right person to blame. The rest of the page is the case for why, and what to do instead.
Traditional simulation vs technical assessment.
Send a fake lure, count who clicks, send the clickers to training. That's been the default for twenty years. The market got to standard practice before anyone checked whether it works. Here's the case both ways.
- Vendor or in-house generates a phishing template, lands it in a defined population of inboxes.
- Tracking pixels and link-rewrites measure opens, clicks, and credential submissions.
- Users who click receive an immediate "you have been phished" interstitial and a training module.
- Aggregate metrics roll up to a quarterly click-rate dashboard for leadership.
The ETH Zurich CCS 2024 follow-up pinned whatever effect simulations have to the nudge itself, the periodic reminder that phishing is real, not to the training content. Most employees don't read the training.
Mature vendor platforms automate templating, scheduling, reporting, and even role-based targeting. Marginal cost per campaign is small. Capital outlay falls on the licence.
PCI-DSS, ISO 27001 Annex A, NIST CSF, HIPAA. All reference awareness training and expect evidence that it happened. A click-rate dashboard answers the question with a number, even if the number doesn't predict anything real.
"Click rate fell from 14% to 6% this year" reads as progress to a non-technical audience. The number is concrete, comparable across quarters, easy to chart. Whether it predicts a real-world outcome is a separate question. Boards rarely ask it.
A live simulation is a recurring touchpoint between the security team and the workforce. Run well, it normalises security comms. Run badly, it builds resentment.
The 2025 large-scale empirical assessment grounded in the NIST Phish Scale, with 12,511 participants, found no statistically significant impact of training modality on either click rates or reporting behaviour across the conditions tested.
The 14,000-employee, 15-month ETH Zurich study at IEEE S&P 2022 reported that embedded training as commonly deployed in industry does not make employees more resilient and, in places, made them more susceptible. This is now corroborated across multiple field studies.
Harder lures produce higher click rates, easier lures lower. No industry-standard methodology exists for calculating click rates. A team that wants a better number can simply pick easier lures next quarter and report improvement.
The UK NCSC has stated in its public guidance for over half a decade that punishment-oriented programmes suppress reporting: users who fear reprisals will not report mistakes promptly, if at all. The behaviour the security team needs most is the one the programme discourages.
A hospital case study found employee workload was the dominant predictor of phishing vulnerability. Staff intended to detect attacks but could not under load. Any programme that frames clicking as a knowledge or competence failure is testing the wrong thing.
Real operators use lures that work against the specific organisation: invoice fraud against finance, CV attachments against HR, internal-looking calendar invites against engineering. Vendor template libraries test against a threat that no longer exists in current operations.
"Sarah clicked again" is not a work order. It identifies no control gap, no configuration drift, no policy correction. The hours spent on the simulation programme are hours not spent on DMARC enforcement, attachment policy, FIDO2 rollout, or conditional access, which would.
Three phases. Look at the endpoint's filetype handling. Look at the mail gateway and the defence layers behind it. Then take a working credential as given and work the success paths without bothering with social engineering. Solicitation is the awareness team's problem, not the assessor's.
Phase 00 · Endpoint build review.
What payloads can actually fire on this endpoint? Walk through the filetypes that show up in current adversary tooling and document how each behaves under duress on the standard build. The output is the filetype block list, refreshed as the threat moves (a Mark-of-the-Web check sketch follows these phase cards).
Owners: Endpoint engineering · Security architecture · IT operations · EDR operations
- Filetype execution policy for ISO, LNK, HTA, SVG, CHM, XLL, OneNote
- Office macro policy, MOTW propagation, protected view
- ASR rules, SmartScreen, browser download handling
- AppLocker or WDAC posture, LOLBin reachability
- EDR detection coverage for known initial access techniques
- Local privilege boundaries, sudoers, UAC, autoelevation paths
Loop check: Done? When? Changed?
Phase 01 · Mail gateway and defence layers.
The configuration review the vendor won't run on themselves. What actually lands in the inbox, what gets sandboxed, what gets stripped at the gateway.
Owners: Mail platform · Security architecture · Network and DNS · SOC
- SPF, DKIM, DMARC enforcement posture (quarantine vs reject)
- Attachment sandboxing depth, dynamic analysis, file unwrap
- Link rewriting, time-of-click rescan, browser isolation handoff
- External sender flagging, impersonation protection, anti-spoof
- BEC detection, anomalous-send patterns, autoforwarding controls
- Transport rules, attachment block lists, executable handling
Loop check: Done? When? Changed?
Phase 02 · Assumed-compromise account testing.
Credential in hand on day one. No social engineering, no waiting on a click. The question is what an attacker with creds can actually do once they're inside.
Owners: Security architecture · Identity · SOC · Application security · External assessor (CHECK / CREST)
- MFA bypass viability: legacy auth, app passwords, device gaps
- Conditional access posture, named locations, session controls
- Token theft, AiTM proxy viability, primary refresh token reach
- OAuth consent attacks, app registration permissions
- Internal phish viability, mailbox rule abuse, SharePoint reach
- Data exfil paths, autoforward to external, eDiscovery export
Loop check: Done? When? Changed?
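One concrete check from the Phase 00 walk-through: whether Mark-of-the-Web survives on a downloaded or extracted file, since Protected View, SmartScreen, and macro blocking all key off it. A minimal sketch, Windows and NTFS only, with a placeholder path:

```python
# Read the Mark-of-the-Web (Zone.Identifier) alternate data stream that
# Windows attaches to files downloaded from the internet. ZoneId=3 means
# Internet zone; its absence on a freshly extracted archive member is a
# finding (MOTW not propagating). Windows/NTFS only; the path is a placeholder.
def read_motw(path: str) -> dict[str, str]:
    """Return the Zone.Identifier fields for a file, or {} if there is no MOTW."""
    try:
        with open(path + ":Zone.Identifier", "r", errors="replace") as ads:
            text = ads.read()
    except OSError:
        return {}
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip()] = value.strip()
    return fields

if __name__ == "__main__":
    motw = read_motw(r"C:\Users\alice\Downloads\invoice.iso")
    if motw.get("ZoneId") == "3":
        print("MOTW present: Internet zone. Protected View / SmartScreen apply.")
    else:
        print("No MOTW: downstream controls that key off it will not fire.")
```

Run it against a file downloaded directly, then against the same file extracted from an ISO, a ZIP, and a 7z, and compare. The propagation gaps are the finding.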
"Block ISO and LNK at the gateway." "Move DMARC from quarantine to reject." "Disable legacy auth in conditional access." Every finding lands on someone with the authority and tooling to fix it. The remediation path is concrete and verifiable.
Credentials get stolen. Eventually. The defensible question is not whether a user will be phished, it is what an attacker with a working credential can do next. Assumed-compromise testing maps that surface directly to MITRE ATT&CK techniques the team can detect and contain.
Whether a user clicks is a function of workload, fatigue, and pretext quality, not of control posture. Decoupling the assessment from solicitation tests the controls cleanly. Solicitation is what the awareness team is for.
The controls a technical assessment exercises (filetype handling, gateway posture, MFA enforcement, conditional access) defend against phishing, smishing, drive-by download, malicious removable media, and supply-chain compromise. Awareness training defends only against the user-decision moment.
No employee is named, scored, or trained as a result of this assessment. Reporting culture is unaffected. The NCSC's stated concern about punitive simulation programmes is structurally avoided.
Every finding has a configuration artefact attached, every retest is a diff against a known state. Reports survive auditor scrutiny in a way that a click-rate trend graph does not.
The remediation backlog naturally lands on identity, mail gateway, and endpoint engineering, which is where the evidence base says investment produces resilience. Hours that would have gone on quarterly campaigns move to FIDO2 rollout, attachment policy, and conditional access.
Endpoint build review, gateway configuration audit, and assumed-compromise testing each require skilled hands. Mid-market organisations may need to engage an assessor. CHECK or CREST-aligned engagement is sensible for the assumed-compromise phase.
Skilled assessor days cost more than a SaaS simulation licence. The cost lands in a single procurement decision rather than spread across operations, which is harder to justify on a budget line even where total annual cost is comparable.
Some compliance frameworks ask whether a phishing simulation has been conducted in the prior period. A technical assessment is a stronger answer, but the box still has a specific name on it. The mitigation is to keep a lightweight awareness programme and align it with the assessment findings.
A workforce still needs to know what to report and how. The technical assessment removes the assessment from the workforce, but the reporting culture, the report button, and the response loop still need to be designed. NCSC's layered defence still applies.
Phishing as red team initial access.
Same logic for commissioned red team engagements. Phishing as the entry vector is the most expensive and least useful way to start. Assume-breach is the better default. Even the frameworks that used to insist on full-chain testing now say so.
- Threat intelligence phase identifies plausible attacker pretexts and likely targets within the workforce.
- Infrastructure setup: registered domains, SSL, mail relay, payload development, evasion testing against the target's email security stack.
- Lures delivered to selected targets. The team waits for clicks, credential submissions, or sandboxed execution.
- Successful initial access feeds the post-compromise phase, with whatever time remains.
For regulated scenarios where the threat model explicitly includes external initial access, end-to-end testing answers a specific question. TIBER-EU's standard variant covers this. The trade-off is that the answer is rarely surprising.
The gateway, sandboxing, and impersonation controls get an end-to-end workout that a configuration review cannot fully replicate. The catch is that this is also achievable with a much narrower mail security assessment for a fraction of the cost.
CBEST and DORA both reference threat-intelligence-led testing that includes initial access scenarios. For financial entities subject to those regimes, full-chain testing is sometimes expected. The expectation is itself softening: TIBER variants now explicitly allow assume-breach.
Every credible threat report from Mandiant, CrowdStrike, and Verizon over the last decade shows that determined adversaries achieve initial access. Spending a quarter of a multi-week engagement re-establishing this point surfaces no actionable new finding. The defensible question is what happens after.
If the SOC catches the phish, which is the entire point of the awareness and email security programme, the red team has nowhere to go. The client paid for a multi-week engagement and got one finding about their mail filter.
Registered domains, SSL certificates, mail relay configuration, payload development, anti-detection tuning. None of it carries to the next engagement. One to two weeks of senior engineer time goes on building the entry vehicle rather than on testing the client's controls.
"We phished a finance contractor" tells you very little about whether the same approach would work against a different role next quarter. "Lateral movement from a low-priv user succeeded via SeImpersonatePrivilege abuse" tells you everything, because the technique applies to every user with that privilege.
Whether a user clicks depends on their workload, fatigue, pretext quality, and the time of day the lure lands. The same engagement run twice produces different results. Assumed-breach starts from the same position every time, which makes year-over-year improvement measurable.
A red team phish lands in the same inboxes that the awareness team is trying to measure separately. The signals contaminate each other. Real phishing reports become indistinguishable from red team simulations, and trend data gets noisy.
In a six-week engagement, post-compromise work might get two to three weeks. The same two to three weeks against the same internal estate, with a foothold provided on day one, is the assumed-breach engagement. The client pays roughly a third more for the privilege of seeing the email security stack tested in the same engagement, which a much cheaper assessment could have covered.
- The client provides a low-privilege foothold. A standard user account, an unprivileged workstation in the target environment, or both.
- The engagement starts on day one in the post-compromise position. No infrastructure setup, no waiting for clicks.
- Lateral movement, privilege escalation, persistence, and impact testing proceed against the real internal control plane.
- Findings map directly to MITRE ATT&CK techniques the SOC can detect and contain. The blue team gets actionable improvement work.
No infrastructure setup, no lure development, no waiting on clicks. The full window goes to lateral movement, privilege escalation, persistence, and impact. The deliverable density is higher per pound spent.
Same starting position, same scope, same measurement framework. Improvement actually shows up. Two consecutive phishing-led engagements with different lure performance tell you very little about whether the internal controls got better.
TIBER variants now formally allow the initial entry phase to be skipped. Operational reality has caught up with the methodology. CBEST has long permitted intelligence-led scenarios that begin from documented footholds. The framework consensus is moving in this direction.
In phishing-led testing, detection is the failure condition. In assumed-breach, detection is a finding to celebrate and refine. The blue team's catches become rule-tuning material in the closing purple-team session.
"T1078.002 Domain Accounts," "T1558.003 Kerberoasting," "T1484.001 Group Policy Modification." Each finding lands in a framework the SOC already uses for detection engineering. The remediation backlog writes itself.
No infrastructure cost, shorter engagement window, no failure-mode rework. The savings come out of the part of the budget that was producing the least new information anyway.
The workforce is untouched. The phishing-report channel keeps its signal-to-noise ratio. The two programmes (red team and awareness) can run on independent schedules without interfering with each other's metrics.
The email security stack, perimeter web exposure, and edge identity controls go untested in a pure assumed-breach engagement. The mitigation is to commission separate, narrower assessments for those (which is also cheaper than rolling them into the red team scope).
Some procurement and risk functions struggle with this. The provided credential needs scoping, time-boxing, and revocation procedures. This is a process problem, not a methodology problem, and it solves itself once a single engagement has been run through it.
Senior stakeholders who think "red team" means "we tested whether attackers could get in" may need re-education. The honest answer (they will, eventually, and what matters is what happens next) is harder to put on a slide than a click-rate trend graph.
Assumed-breach is most valuable when the SOC and detection engineering function are mature enough to absorb the findings and improve. For organisations without that capability, the engagement still produces a remediation backlog, but the deeper purple-team value is harder to realise.
The SOC feedback loop.
A good phish that gets past the gateway, the sandbox, the link rewriter, and the EDR, and lands in someone's inbox, isn't a control failure. It's free, current, targeted threat intel about what's working against you right now. Most programmes have nowhere to put it. Without a loop, everything else is guesswork.
- Report. The lure landed. The user now has a sample of what just bypassed everything you've got.
- Triage. SOC picks it up, decides benign, suspicious, or malicious. Real adversary stuff jumps the queue.
- Extract. IOCs, TTPs, sender infrastructure, attachment behaviour, link patterns, who got targeted. Pulled out in a structured format that the rest of the stack can ingest (a parsing sketch follows this list).
- Block and hunt. Block what landed. Block what looks like it. Hunt the inbox set and the SIEM history for earlier related deliveries that nobody reported.
- Close the loop. Tell the reporter what their report did. Update the threat model. Share with peers through an ISAC, sector group, or NCSC where you can. Feed the IOC and TTP set into the next assumed-breach engagement so it's testing against the real adversary.
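A minimal sketch of the extract step using only the Python standard library: parse a reported message saved as .eml, pull the sender and receiving chain from the headers, regex out URLs, and hash attachments. The file name and output shape are illustrative:

```python
# Minimal extract-step sketch, stdlib only: parse a reported .eml, pull
# sender details, URLs, and attachment hashes into a structure the rest of
# the stack can ingest. File name and output shape are illustrative.
import email
import email.policy
import hashlib
import json
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_iocs(eml_path: str) -> dict:
    with open(eml_path, "rb") as fh:
        msg = email.message_from_binary_file(fh, policy=email.policy.default)

    urls, attachments = set(), []
    for part in msg.walk():
        filename = part.get_filename()
        payload = part.get_payload(decode=True)
        if filename and payload:
            attachments.append({
                "name": filename,
                "sha256": hashlib.sha256(payload).hexdigest(),
            })
        elif payload and part.get_content_type() in ("text/plain", "text/html"):
            urls.update(URL_RE.findall(payload.decode(errors="replace")))

    return {
        "from": msg.get("From"),
        "return_path": msg.get("Return-Path"),
        "received": msg.get_all("Received", []),  # sending infrastructure chain
        "subject": msg.get("Subject"),
        "urls": sorted(urls),
        "attachments": attachments,
    }

if __name__ == "__main__":
    print(json.dumps(extract_iocs("reported_phish.eml"), indent=2))
```

The structured output is what makes the block-and-hunt step cheap: hashes and URLs feed the gateway and EDR block lists, the Received chain feeds the SIEM hunt.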
Side by side on the dimensions that matter.
| Dimension | Traditional simulation | Technical assessment |
|---|---|---|
| Actionability of findings: can a named person fix the finding by changing a configuration, policy, or control? | Names of clickers, no control gaps surfaced | Specific configuration changes per finding |
| Threat realism: does the test reflect what real operators are doing against this organisation in the current period? | Generic vendor templates | Assumed-compromise model maps to current TTPs |
| Workforce impact: does the test treat the workforce as the threat surface or as the sensor network? | Threat surface, blame risk | Unaffected, reporting culture preserved |
| Reporting culture: does the programme encourage or suppress timely reporting of real suspicious emails? | Suppresses (per NCSC) | Neutral, reporting can be built alongside |
| Evidence base: is the practice supported by peer-reviewed empirical study or by vendor case studies? | Vendor case studies, multiple null results in field studies | Standard pen-test methodology, NCSC-aligned |
| Cost profile: where does the spend land, an operational subscription or a skilled engagement? | Low recurring spend, hidden operational cost | Higher per-engagement, focused remediation spend follows |
| Compliance fit: does the practice answer the specific question auditors are asked to ask? | Direct fit with "simulation conducted" items | Stronger evidence, may require lightweight awareness alongside |
| Defence range: does the practice defend against other initial access vectors too, or only phishing? | Phishing-only | Phishing, smishing, drive-by, removable media, supply chain |
Where does your programme sit on the spectrum?
Nine questions, scored 0 to 3 each. Find out where the programme sits between simulation theatre and assessment-led resilience with a closed loop.
Answer straight. The score is for you, not an auditor.
What to do now, who owns it, and why it matters.
This is the takeaway. If the rest of the page was context, this is the work order. Each action names a responsible team, a regulatory or evidence-based reason, and a timeline. Print it, forward it, or paste it into a board paper. The reasoning is above. The obligations are below.
Deploy DMARC at reject, with SPF and DKIM enforced. Owners: Security architecture · Network and DNS · Mail platform
DMARC at monitor or quarantine is not enforcement. Move to reject. Monitor the aggregate reports for legitimate senders you missed, then enforce. This single control eliminates domain-spoofed phishing against your brand and reduces impersonation lures against your people.
Obligation: NCSC recommends DMARC enforcement for all UK organisations. PCI-DSS 4.0 references email authentication. NIST CSF PR.DS and ID.RA both map here. NHS DSPT Objective E references anti-spoofing controls.
Timeline: Monitor within 2 weeks. Quarantine within 6 weeks. Reject within 12 weeks. If you have been at monitor for more than 90 days, the data is stale and you are procrastinating.
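A sketch of the monitoring step, assuming a standard RFC 7489 aggregate (RUA) report already unzipped to disk: summarise which source IPs are failing both SPF and DKIM, which is the list of legitimate senders to fix, or spoofers to ignore, before moving to reject. The file name is illustrative:

```python
# Sketch: summarise one DMARC aggregate (RUA) report so legitimate senders
# that would fail at p=reject can be found and fixed first. Assumes the
# standard RFC 7489 aggregate XML, already unzipped; file name illustrative.
import xml.etree.ElementTree as ET
from collections import Counter

def failing_sources(report_path: str) -> Counter:
    """Count messages per source IP where both SPF and DKIM failed DMARC evaluation."""
    root = ET.parse(report_path).getroot()
    failures = Counter()
    for record in root.iter("record"):
        row = record.find("row")
        policy = row.find("policy_evaluated")
        if policy.findtext("dkim") == "fail" and policy.findtext("spf") == "fail":
            source = row.findtext("source_ip")
            count = int(row.findtext("count", "1"))
            failures[source] += count
    return failures

if __name__ == "__main__":
    for ip, n in failing_sources("dmarc_aggregate.xml").most_common(20):
        print(f"{ip:>15}  {n} messages failing both SPF and DKIM")
```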
Review the mail gateway configuration against current adversary tooling. Owners: Mail platform · Security architecture · SOC
Attachment sandboxing with dynamic analysis. Link rewriting with time-of-click rescan. External sender flagging. BEC anomalous-send detection. Autoforward to external blocked. If you are running the vendor's default configuration, you are running the configuration the vendor's sales team chose, not the configuration your threat model requires.
Obligation: ISO 27001 Annex A (A.8.23 web filtering, A.8.24 cryptography controls). Cyber Essentials Plus boundary firewalls and internet gateways. NIS Regulations Article 14 security measures for operators of essential services.
Timeline: Configuration audit within 4 weeks. Remediation of critical findings within 8 weeks. Annual revalidation.
Map the endpoint filetype execution policy against current initial access techniques. Owners: Endpoint engineering · Security architecture · IT operations
Walk through ISO, LNK, HTA, SVG, CHM, XLL, OneNote, and whatever the current quarter's adversary tooling favours. Document how each behaves on the standard build. Block what should not execute. Update the list as the threat moves. This is not a one-time exercise.
Obligation: Cyber Essentials Plus requires malware protection and secure configuration. CIS Benchmarks specify application control. NIST CSF PR.IP-1 covers baseline configurations. The refresh cadence is yours to set, but quarterly is the minimum defensible interval.
Timeline: Initial review within 6 weeks. Quarterly refresh thereafter.
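The quarterly refresh is a diff. A sketch, with both lists as illustrative placeholders, comparing what the current threat review says is being delivered against what the deployed policy actually blocks:

```python
# Sketch: the quarterly filetype refresh as a diff between what the current
# threat review says adversaries are delivering and what the deployed policy
# blocks. Both sets are illustrative placeholders; the real inputs are the
# threat review and an export of the live policy.
CURRENT_QUARTER_ABUSED = {".iso", ".img", ".lnk", ".hta", ".svg", ".chm",
                          ".xll", ".one", ".vhd", ".js"}
DEPLOYED_BLOCK_LIST = {".iso", ".lnk", ".hta", ".chm", ".xll", ".one"}

def policy_gaps(abused: set[str], blocked: set[str]) -> dict[str, set[str]]:
    return {
        "missing_from_policy": abused - blocked,  # the new work order
        "blocked_but_quiet": blocked - abused,    # keep, but note the drift
    }

if __name__ == "__main__":
    gaps = policy_gaps(CURRENT_QUARTER_ABUSED, DEPLOYED_BLOCK_LIST)
    for ext in sorted(gaps["missing_from_policy"]):
        print(f"Not blocked yet: {ext}")
```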
Enforce MFA on every account. Phishing-resistant MFA for privileged users. Owners: Identity · Security architecture · IT operations
No exceptions. App passwords disabled. Legacy authentication disabled. Conditional access with named locations and device compliance. FIDO2 or certificate-based authentication for any account with administrative privilege. If your MFA rollout has been "in progress" for more than two quarters, the project has stalled and needs re-scoping.
Obligation: Cyber Essentials Plus requires MFA for cloud services and administrator accounts. NIST CSF PR.AC-7. FCA operational resilience expectations under SM&CR. NHS DSPT Objective E. PCI-DSS 4.0 MFA requirements expand in 2025.
Timeline: Standard user MFA within 4 weeks if not already enforced. FIDO2 for privileged users within 12 weeks. Legacy auth disabled within 6 weeks.
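Verify "legacy auth disabled" against evidence rather than intent. A sketch that scans an exported sign-in log for legacy protocols still authenticating successfully; the column names and client-app labels are assumptions about the export format and will need adjusting to whatever your identity platform actually emits:

```python
# Sketch: find legacy protocols that still succeed, from an exported sign-in
# log. Column names and client-app labels are assumptions about the export
# format; adjust them to match your identity platform's output.
import csv
from collections import Counter

LEGACY_CLIENT_APPS = {"IMAP4", "POP3", "SMTP", "Authenticated SMTP",
                      "Exchange ActiveSync", "MAPI over HTTP", "Other clients"}

def legacy_signins(csv_path: str) -> Counter:
    """Count successful sign-ins per user that used a legacy client app."""
    hits = Counter()
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        for row in csv.DictReader(fh):
            app = row.get("Client app", "")
            status = row.get("Status", "")
            if app in LEGACY_CLIENT_APPS and status.lower() == "success":
                hits[row.get("User", "unknown")] += 1
    return hits

if __name__ == "__main__":
    for user, n in legacy_signins("signin_log_export.csv").most_common():
        print(f"{user}: {n} successful legacy-auth sign-ins")
```

Any non-zero result means the conditional access policy is not doing what the project plan says it is.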
Deploy a one-button phishing reporter in the mail client. Owners: Security comms · Mail platform · SOC
The button must submit the email in a structured format to a real SOC queue, not forward it to a mailbox. Every submission must receive an automated acknowledgement within minutes. The reporter must never be named, scored, or sanctioned based on what they report. If the button exists but nobody reads the queue, the button is theatre.
Obligation: NCSC guidance explicitly recommends making reporting easy, fast, and free of blame. ISO 27001 A.6.8 (information security event reporting). The reporting rate and time-to-first-report are your primary phishing metrics from this point forward, not click rate.
Timeline: Button deployed within 4 weeks. SOC triage SLA defined within 6 weeks. First reporting-rate baseline measured within 12 weeks.
Build the SOC feedback loop. Owners: SOC · Detection engineering · Security comms · Threat intel
Every reported phish that bypassed controls: triage, extract IOCs and TTPs, block at gateway and endpoint, hunt the inbox set for earlier related deliveries, close the loop with the reporter. The reporter gets told what their report did. The controls get updated. The next assessment tests against what actually landed.
Obligation: MITRE's threat-informed defence model. NIST CSF DE.AE (anomalies and events), DE.CM (continuous monitoring), RS.AN (analysis). NIS Regulations continuous improvement requirement. This is the loop that turns individual reports into organisational resilience.
Timeline: Basic loop (triage + block) within 8 weeks. Full loop (extract + hunt + reporter feedback + comms) within 16 weeks. Quarterly review of loop effectiveness.
Commission an assumed-breach assessment. Owners: Security architecture · Identity · SOC · External assessor (CHECK / CREST)
Start from a low-privilege credential on day one. No social engineering, no waiting for clicks. Test lateral movement, privilege escalation, persistence, and impact against the real internal control plane. Map findings to MITRE ATT&CK techniques the SOC can detect and contain. The exercise tests what an attacker can actually do once they are inside, which is the question that matters.
Obligation: TIBER-EU and CBEST both support assume-breach entry. NIST CSF PR.PT and DE.CM. Cyber Essentials Plus scope covers internal vulnerability assessment. FCA operational resilience expectations. The assumed-breach model is now the framework consensus, not the alternative position.
Timeline: Scope and procure within 8 weeks. Execute within the following 4 weeks. Remediation of critical findings within 8 weeks of report delivery. Annual cadence thereafter.
Replace click rate with reporting rate as your primary metric. Owners: Security comms · SOC · Leadership
Click rate is gameable by lure difficulty, produces no actionable finding, and predicts nothing about real-world resilience. Reporting rate measures the behaviour you actually need: people telling you when something suspicious arrives. Time-to-first-report measures how fast your human sensor network responds. Both improve when the loop is visible and reporters are thanked.
Obligation: No framework mandates click rate as a metric. Several (NCSC, NIST CSF) reference reporting culture as a resilience indicator. The switch is a governance decision, not a technical one. Present it to the board alongside the evidence base (Lain et al. 2022, CCS 2024, arXiv 2025).
Timeline: Board paper within 4 weeks. Metric switch within 8 weeks. First quarterly report on reporting rate within 16 weeks.
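The two replacement metrics are cheap to compute from the report queue. A sketch with illustrative numbers:

```python
# Sketch of the two replacement metrics for one identified campaign:
# reporting rate (what fraction of the delivery set was reported) and
# time-to-first-report (how fast the human sensor network fired).
# The numbers and timestamps are illustrative.
from datetime import datetime, timedelta

delivered = 42                                 # mailboxes that received the lure
delivered_at = datetime(2025, 3, 4, 9, 12)
reports = [delivered_at + timedelta(minutes=m) for m in (7, 11, 26, 43, 118)]

reporting_rate = len(reports) / delivered
time_to_first_report = min(reports) - delivered_at

print(f"Reporting rate:       {reporting_rate:.0%}")
print(f"Time to first report: {time_to_first_report}")
```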
Redesign workforce education around real reported lures, not vendor templates. Owners: Security comms · SOC (intel source) · Awareness team
Monthly comms that name the techniques currently landing against your sector. Draw the examples from the SOC loop, not from a vendor's template library. Just-in-time micro-prompts at the decision point, not quarterly CBT modules. Acknowledge every report. Celebrate reporting publicly. Never name or sanction a clicker. The workforce is part of the defence once the defence exists — not before.
Obligation: NCSC guidance states that punitive programmes suppress reporting. ISO 27001 A.6.3 (information security awareness, education, and training). The training content must be drawn from current threat intelligence, not from a static curriculum. The ETH Zurich studies (2022, 2024) and the 2025 NIST Phish Scale study provide the evidence base for this redesign.
Timeline: First real-lure comms within 4 weeks of SOC loop going live. Monthly cadence. Quarterly review of reporting rate trend. Annual assessment of programme effectiveness against the diagnostic on this page.
Sources.
- Phishing in Organizations: Findings from a Large-Scale and Long-Term Study. IEEE Symposium on Security and Privacy (S&P), 2022. 14,000 employees, 15 months. arxiv.org/pdf/2112.07498
- Content, Nudges and Incentives: A Study on the Effectiveness and Perception of Embedded Phishing Training. ACM CCS 2024. Distinguished Paper Award. 4,554 participants. arxiv.org/abs/2409.01378
- A Large-Scale Empirical Assessment of Multi-Modal Training Grounded in the NIST Phish Scale. arXiv preprint, 2025. 12,511 participants. arxiv.org/html/2506.19899v1
- Phishing simulation exercise in a large hospital: a case study. PubMed Central. Workload-driven susceptibility analysis. ncbi.nlm.nih.gov/pmc/articles/PMC8935590
- Phishing attacks: defending your organisation. Official guidance, multi-layer mitigation framework, updated 2025. ncsc.gov.uk/guidance/phishing
- TIBER-EU: Threat Intelligence-based Ethical Red Teaming. Framework for controlled, intelligence-led red team testing of financial entities, supporting DORA compliance. ecb.europa.eu/paym/cyber-resilience/tiber-eu
- TIBER-NL and TIBER-Rijk assume-breach variants. Explicit framework provision for skipping the initial entry phase where operationally or legally appropriate. tiber.info/documentation
- Threat-Informed Defense. Continuous feedback loop between adversary behaviour, control visibility, and operational readiness. ATT&CK is the underlying taxonomy. mitre.org/focus-areas/cybersecurity/threat-informed-defense