Phishing for clicks is easy.
Phishing for credentials is a little harder.
Phishing for shells is the money shot.
But phishing defence has always been done in half measures. Let's break down how to do it properly, and the delta between the status quo and proper.
What control participation looks like, and where it exists.
Controls don't deploy themselves. They land through workstreams owned by named teams, with security architecture in the room. Each workstream below names the teams that own it and what they deliver. The recurring loop check on each (done? when? changed?) sits with the methodology phases and SOC loop further down, where the operational loop lives.
DMARC and DNS hygiene. Owners: Security architecture · Network and DNS · Mail platform
SPF and DKIM enforced. DMARC moved from monitor to quarantine to reject, with the path actually walked. Subdomain delegation hardened. Reports actively monitored, not piped to a forgotten mailbox (a record-check sketch follows these workstreams).
Mail gateway and content controls. Owners: Security architecture · Mail platform · Endpoint security
Attachment sandboxing with dynamic analysis. Link rewriting with time-of-click rescan. Impersonation protection. BEC anomalous-send detection. External sender flagging. Autoforward to external blocked.
Network defence layer. Owners: Network engineering · Security architecture · SOC
Egress filtering deny-by-default. DNS sinkholing for known indicators. Web filtering with category enforcement. Segmentation tested against assumed-breach. IDS and IPS rule-base reviewed, not left on factory defaults.
Endpoint hardening. Owners: Endpoint engineering · Security architecture · IT operations
Filetype blocking driven by the endpoint review of how each type behaves under duress, not a fixed list (currently ISO, LNK, HTA, OneNote, XLL on standard Windows builds; the list moves as adversary tooling moves). ASR rules deployed and audited. AppLocker or WDAC in enforced mode. Office macros blocked from internet zone. Local admin removed by default. Mark-of-the-Web propagating to archives.
OS and application hardening. Owners: Platform engineering · Security architecture · Application security
CIS benchmarks adopted and audited. Patching SLAs documented and met. Container image hardening with a refresh cadence. Runtime protection where the workload warrants. Application allow-listing where the threat model demands.
Identity and access. Owners: Identity · Security architecture · IT operations
MFA on every account, no exceptions. FIDO2 phishing-resistant for privileged users. Legacy authentication disabled. Conditional access with named locations. OAuth app consent governed. Privileged access workstations for tier zero. Just-in-time elevation.
Detection engineering and the SOC loop. Owners: SOC · Detection engineering · Security architecture · Security comms
Logging coverage audited against MITRE ATT&CK. SIEM correlation rules tested, not just deployed. One-button reporter in the mail client. Triage SLA documented and met. Loop closure on every report, with feedback to the reporter and IOCs feeding the next assessment scope.
Workforce education and reporting culture. Owners: Security comms · SOC (intel source) · People and HR · Awareness team
Education drawn from real reported lures, not vendor templates. Monthly comms that name the techniques landing against the sector. Reporting acknowledged within hours, every time. Just-in-time micro-prompts at the decision point, not quarterly modules. Success measured by reporting rate and time-to-report, not click rate. People are part of the defence, once the defence exists.
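The DMARC workstream is the easiest one to start verifying today. A minimal sketch, assuming the dnspython package is installed and using a placeholder domain, that checks whether SPF and DMARC records are published and whether the DMARC policy has actually reached reject:

```python
# Minimal SPF/DMARC posture check. Assumes the dnspython package
# (pip install dnspython); "example.co.uk" is a placeholder domain.
import dns.resolver

def txt_records(name: str) -> list[str]:
    """Return all TXT strings published at a DNS name, or [] if none."""
    try:
        answers = dns.resolver.resolve(name, "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return []
    return [b"".join(r.strings).decode() for r in answers]

def check_domain(domain: str) -> None:
    spf = [r for r in txt_records(domain) if r.startswith("v=spf1")]
    dmarc = [r for r in txt_records(f"_dmarc.{domain}") if r.startswith("v=DMARC1")]

    print(f"SPF:   {spf[0] if spf else 'MISSING'}")
    print(f"DMARC: {dmarc[0] if dmarc else 'MISSING'}")
    if dmarc and "p=reject" not in dmarc[0]:
        print("DMARC policy is not at reject yet; the path still needs walking.")

if __name__ == "__main__":
    check_domain("example.co.uk")
```

Run it against every sending domain and subdomain you own, not just the primary brand domain; delegated subdomains are where the gaps usually hide.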
If you can't answer all three questions for each workstream above, there's room to move before the user is the right person to blame. The rest of the page is the case for why, and what to do instead.
Traditional simulation vs technical assessment.
Send a fake lure, count who clicks, send the clickers to training. That's been the default for twenty years. The market got to standard practice before anyone checked whether it works. Here's the case both ways.
- Vendor or in-house generates a phishing template, lands it in a defined population of inboxes.
- Tracking pixels and link-rewrites measure opens, clicks, and credential submissions.
- Users who click receive an immediate "you have been phished" interstitial and a training module.
- Aggregate metrics roll up to a quarterly click-rate dashboard for leadership.
The ETH Zurich CCS 2024 follow-up pinned whatever effect simulations have to the nudge itself, the periodic reminder that phishing is real, not to the training content. Most employees don't read the training.
Mature vendor platforms automate templating, scheduling, reporting, and even role-based targeting. Marginal cost per campaign is small. Capital outlay falls on the licence.
PCI-DSS, ISO 27001 Annex A, NIST CSF, HIPAA. All reference awareness training and expect evidence that it happened. A click-rate dashboard answers the question with a number, even if the number doesn't predict anything real.
"Click rate fell from 14% to 6% this year" reads as progress to a non-technical audience. The number is concrete, comparable across quarters, easy to chart. Whether it predicts a real-world outcome is a separate question. Boards rarely ask it.
A live simulation is a recurring touchpoint between the security team and the workforce. Run well, it normalises security comms. Run badly, it builds resentment.
The 2025 large-scale empirical assessment grounded in the NIST Phish Scale, with 12,511 participants, found no statistically significant impact of training modality on either click rates or reporting behaviour across the conditions tested.
The 14,000-employee, 15-month ETH Zurich study at IEEE S&P 2022 reported that embedded training as commonly deployed in industry does not make employees more resilient and, in places, made them more susceptible. This is now corroborated across multiple field studies.
Harder lures produce higher click rates, easier lures lower. No industry-standard methodology exists for calculating click rates. A team that wants a better number can simply pick easier lures next quarter and report improvement.
The UK NCSC has stated in its public guidance for over half a decade that punishment-oriented programmes suppress reporting: users who fear reprisals will not report mistakes promptly, if at all. The behaviour the security team needs most is the one the programme discourages.
A hospital case study found employee workload was the dominant predictor of phishing vulnerability. Staff intended to detect attacks but could not under load. Any programme that frames clicking as a knowledge or competence failure is testing the wrong thing.
Real operators use lures that work against the specific organisation: invoice fraud against finance, CV attachments against HR, internal-looking calendar invites against engineering. Vendor template libraries test against a threat that no longer exists in current operations.
"Sarah clicked again" is not a work order. It identifies no control gap, no configuration drift, no policy correction. The hours spent on the simulation programme are hours not spent on DMARC enforcement, attachment policy, FIDO2 rollout, or conditional access, which would.
Three phases. Look at the endpoint's filetype handling. Look at the mail gateway and the defence layers behind it. Then take a working credential as given and work the success paths without bothering with social engineering. Solicitation is the awareness team's problem, not the assessor's.
Phase 00 · Endpoint build review.
What payloads can actually fire on this endpoint? Walk through the filetypes that show up in current adversary tooling and document how each behaves under duress on the standard build. The output is the filetype block list, refreshed as the threat moves (a Mark-of-the-Web check sketch follows these phase cards).
Owners: Endpoint engineering · Security architecture · IT operations · EDR operations
- Filetype execution policy for ISO, LNK, HTA, SVG, CHM, XLL, OneNote
- Office macro policy, MOTW propagation, protected view
- ASR rules, SmartScreen, browser download handling
- AppLocker or WDAC posture, LOLBin reachability
- EDR detection coverage for known initial access techniques
- Local privilege boundaries, sudoers, UAC, autoelevation paths
Loop check: Done? When? Changed?
Phase 01 · Mail gateway and defence layers.
The configuration review the vendor won't run on themselves. What actually lands in the inbox, what gets sandboxed, what gets stripped at the gateway.
Owners: Mail platform · Security architecture · Network and DNS · SOC
- SPF, DKIM, DMARC enforcement posture (quarantine vs reject)
- Attachment sandboxing depth, dynamic analysis, file unwrap
- Link rewriting, time-of-click rescan, browser isolation handoff
- External sender flagging, impersonation protection, anti-spoof
- BEC detection, anomalous-send patterns, autoforwarding controls
- Transport rules, attachment block lists, executable handling
Loop check: Done? When? Changed?
Phase 02 · Assumed-compromise account testing.
Credential in hand on day one. No social engineering, no waiting on a click. The question is what an attacker with creds can actually do once they're inside.
Owners: Security architecture · Identity · SOC · Application security · External assessor (CHECK / CREST)
- MFA bypass viability: legacy auth, app passwords, device gaps
- Conditional access posture, named locations, session controls
- Token theft, AiTM proxy viability, primary refresh token reach
- OAuth consent attacks, app registration permissions
- Internal phish viability, mailbox rule abuse, SharePoint reach
- Data exfil paths, autoforward to external, eDiscovery export
Loop check: Done? When? Changed?
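One concrete check from the Phase 00 walk-through: whether Mark-of-the-Web survives on a downloaded or extracted file, since Protected View, SmartScreen, and macro blocking all key off it. A minimal sketch, Windows and NTFS only, with a placeholder path:

```python
# Read the Mark-of-the-Web (Zone.Identifier) alternate data stream that
# Windows attaches to files downloaded from the internet. ZoneId=3 means
# Internet zone; its absence on a freshly extracted archive member is a
# finding (MOTW not propagating). Windows/NTFS only; the path is a placeholder.
def read_motw(path: str) -> dict[str, str]:
    """Return the Zone.Identifier fields for a file, or {} if there is no MOTW."""
    try:
        with open(path + ":Zone.Identifier", "r", errors="replace") as ads:
            text = ads.read()
    except OSError:
        return {}
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip()] = value.strip()
    return fields

if __name__ == "__main__":
    motw = read_motw(r"C:\Users\alice\Downloads\invoice.iso")
    if motw.get("ZoneId") == "3":
        print("MOTW present: Internet zone. Protected View / SmartScreen apply.")
    else:
        print("No MOTW: downstream controls that key off it will not fire.")
```

Run it against a file downloaded directly, then against the same file extracted from an ISO, a ZIP, and a 7z, and compare. The propagation gaps are the finding.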
"Block ISO and LNK at the gateway." "Move DMARC from quarantine to reject." "Disable legacy auth in conditional access." Every finding lands on someone with the authority and tooling to fix it. The remediation path is concrete and verifiable.
Credentials get stolen. Eventually. The defensible question is not whether a user will be phished, it is what an attacker with a working credential can do next. Assumed-compromise testing maps that surface directly to MITRE ATT&CK techniques the team can detect and contain.
Whether a user clicks is a function of workload, fatigue, and pretext quality, not of control posture. Decoupling the assessment from solicitation tests the controls cleanly. Solicitation is what the awareness team is for.
The controls a technical assessment exercises (filetype handling, gateway posture, MFA enforcement, conditional access) defend against phishing, smishing, drive-by download, malicious removable media, and supply-chain compromise. Awareness training defends only against the user-decision moment.
No employee is named, scored, or trained as a result of this assessment. Reporting culture is unaffected. The NCSC's stated concern about punitive simulation programmes is structurally avoided.
Every finding has a configuration artefact attached, every retest is a diff against a known state. Reports survive auditor scrutiny in a way that a click-rate trend graph does not.
The remediation backlog naturally lands on identity, mail gateway, and endpoint engineering, which is where the evidence base says investment produces resilience. Hours that would have gone on quarterly campaigns move to FIDO2 rollout, attachment policy, and conditional access.
Endpoint build review, gateway configuration audit, and assumed-compromise testing each require skilled hands. Mid-market organisations may need to engage an assessor. CHECK or CREST-aligned engagement is sensible for the assumed-compromise phase.
Skilled assessor days cost more than a SaaS simulation licence. The cost lands in a single procurement decision rather than spread across operations, which is harder to justify on a budget line even where total annual cost is comparable.
Some compliance frameworks ask whether a phishing simulation has been conducted in the prior period. A technical assessment is a stronger answer, but the box still has a specific name on it. The mitigation is to keep a lightweight awareness programme and align it with the assessment findings.
A workforce still needs to know what to report and how. The technical assessment removes the assessment from the workforce, but the reporting culture, the report button, and the response loop still need to be designed. NCSC's layered defence still applies.
Phishing as red team initial access.
Same logic for commissioned red team engagements. Phishing as the entry vector is the most expensive and least useful way to start. Assume-breach is the better default. Even the frameworks that used to insist on full-chain testing now say so.
- Threat intelligence phase identifies plausible attacker pretexts and likely targets within the workforce.
- Infrastructure setup: registered domains, SSL, mail relay, payload development, evasion testing against the target's email security stack.
- Lures delivered to selected targets. The team waits for clicks, credential submissions, or sandboxed execution.
- Successful initial access feeds the post-compromise phase, with whatever time remains.
For regulated scenarios where the threat model explicitly includes external initial access, end-to-end testing answers a specific question. TIBER-EU's standard variant covers this. The trade-off is that the answer is rarely surprising.
The gateway, sandboxing, and impersonation controls get an end-to-end workout that a configuration review cannot fully replicate. The catch is that this is also achievable with a much narrower mail security assessment for a fraction of the cost.
CBEST and DORA both reference threat-intelligence-led testing that includes initial access scenarios. For financial entities subject to those regimes, full-chain testing is sometimes expected. The expectation is itself softening: TIBER variants now explicitly allow assume-breach.
Every credible threat report from Mandiant, CrowdStrike, and Verizon over the last decade shows that determined adversaries achieve initial access. Spending a quarter of a multi-week engagement re-establishing this point surfaces no actionable new finding. The defensible question is what happens after.
If the SOC catches the phish, which is the entire point of the awareness and email security programme, the red team has nowhere to go. The client paid for a multi-week engagement and got one finding about their mail filter.
Registered domains, SSL certificates, mail relay configuration, payload development, anti-detection tuning. None of it carries to the next engagement. One to two weeks of senior engineer time goes on building the entry vehicle rather than on testing the client's controls.
"We phished a finance contractor" tells you very little about whether the same approach would work against a different role next quarter. "Lateral movement from a low-priv user succeeded via SeImpersonatePrivilege abuse" tells you everything, because the technique applies to every user with that privilege.
Whether a user clicks depends on their workload, fatigue, pretext quality, and the time of day the lure lands. The same engagement run twice produces different results. Assumed-breach starts from the same position every time, which makes year-over-year improvement measurable.
A red team phish lands in the same inboxes that the awareness team is trying to measure separately. The signals contaminate each other. Real phishing reports become indistinguishable from red team simulations, and trend data gets noisy.
In a six-week engagement, post-compromise work might get two to three weeks. The same two to three weeks against the same internal estate, with a foothold provided on day one, is the assumed-breach engagement. The client pays roughly a third more for the privilege of seeing the email security stack tested in the same engagement, which a much cheaper assessment could have covered.
- The client provides a low-privilege foothold. A standard user account, an unprivileged workstation in the target environment, or both.
- The engagement starts on day one in the post-compromise position. No infrastructure setup, no waiting for clicks.
- Lateral movement, privilege escalation, persistence, and impact testing proceed against the real internal control plane.
- Findings map directly to MITRE ATT&CK techniques the SOC can detect and contain. The blue team gets actionable improvement work.
No infrastructure setup, no lure development, no waiting on clicks. The full window goes to lateral movement, privilege escalation, persistence, and impact. The deliverable density is higher per pound spent.
Same starting position, same scope, same measurement framework. Improvement actually shows up. Two consecutive phishing-led engagements with different lure performance tell you very little about whether the internal controls got better.
TIBER variants now formally allow the initial entry phase to be skipped. Operational reality has caught up with the methodology. CBEST has long permitted intelligence-led scenarios that begin from documented footholds. The framework consensus is moving in this direction.
In phishing-led testing, detection is the failure condition. In assumed-breach, detection is a finding to celebrate and refine. The blue team's catches become rule-tuning material in the closing purple-team session.
"T1078.002 Domain Accounts," "T1558.003 Kerberoasting," "T1484.001 Group Policy Modification." Each finding lands in a framework the SOC already uses for detection engineering. The remediation backlog writes itself.
No infrastructure cost, shorter engagement window, no failure-mode rework. The savings come out of the part of the budget that was producing the least new information anyway.
The workforce is untouched. The phishing-report channel keeps its signal-to-noise ratio. The two programmes (red team and awareness) can run on independent schedules without interfering with each other's metrics.
The email security stack, perimeter web exposure, and edge identity controls go untested in a pure assumed-breach engagement. The mitigation is to commission separate, narrower assessments for those (which is also cheaper than rolling them into the red team scope).
Some procurement and risk functions struggle with this. The provided credential needs scoping, time-boxing, and revocation procedures. This is a process problem, not a methodology problem, and it solves itself once a single engagement has been run through it.
Senior stakeholders who think "red team" means "we tested whether attackers could get in" may need re-education. The honest answer (they will, eventually, and what matters is what happens next) is harder to put on a slide than a click-rate trend graph.
Assumed-breach is most valuable when the SOC and detection engineering function are mature enough to absorb the findings and improve. For organisations without that capability, the engagement still produces a remediation backlog, but the deeper purple-team value is harder to realise.
The SOC feedback loop.
A good phish that gets past the gateway, the sandbox, the link rewriter, and the EDR, and lands in someone's inbox, isn't a control failure. It's free, current, targeted threat intel about what's working against you right now. Most programmes have nowhere to put it. Without a loop, everything else is guesswork.
- Report. The lure landed. The user now has a sample of what just bypassed everything you've got.
- Triage. SOC picks it up, decides benign, suspicious, or malicious. Real adversary stuff jumps the queue.
- Extract. IOCs, TTPs, sender infrastructure, attachment behaviour, link patterns, who got targeted. Pulled out in a structured format that the rest of the stack can ingest (a parsing sketch follows this list).
- Block and hunt. Block what landed. Block what looks like it. Hunt the inbox set and the SIEM history for earlier related deliveries that nobody reported.
- Close the loop. Tell the reporter what their report did. Update the threat model. Share with peers through an ISAC, sector group, or NCSC where you can. Feed the IOC and TTP set into the next assumed-breach engagement so it's testing against the real adversary.
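A minimal sketch of the extract step using only the Python standard library: parse a reported message saved as .eml, pull the sender and receiving chain from the headers, regex out URLs, and hash attachments. The file name and output shape are illustrative:

```python
# Minimal extract-step sketch, stdlib only: parse a reported .eml, pull
# sender details, URLs, and attachment hashes into a structure the rest of
# the stack can ingest. File name and output shape are illustrative.
import email
import email.policy
import hashlib
import json
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_iocs(eml_path: str) -> dict:
    with open(eml_path, "rb") as fh:
        msg = email.message_from_binary_file(fh, policy=email.policy.default)

    urls, attachments = set(), []
    for part in msg.walk():
        filename = part.get_filename()
        payload = part.get_payload(decode=True)
        if filename and payload:
            attachments.append({
                "name": filename,
                "sha256": hashlib.sha256(payload).hexdigest(),
            })
        elif payload and part.get_content_type() in ("text/plain", "text/html"):
            urls.update(URL_RE.findall(payload.decode(errors="replace")))

    return {
        "from": msg.get("From"),
        "return_path": msg.get("Return-Path"),
        "received": msg.get_all("Received", []),  # sending infrastructure chain
        "subject": msg.get("Subject"),
        "urls": sorted(urls),
        "attachments": attachments,
    }

if __name__ == "__main__":
    print(json.dumps(extract_iocs("reported_phish.eml"), indent=2))
```

The structured output is what makes the block-and-hunt step cheap: hashes and URLs feed the gateway and EDR block lists, the Received chain feeds the SIEM hunt.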
Side by side on the dimensions that matter.
| Dimension | Traditional simulation | Technical assessment |
|---|---|---|
| Actionability of findings: can a named person fix the finding by changing a configuration, policy, or control? | Names of clickers, no control gaps surfaced | Specific configuration changes per finding |
| Threat realism: does the test reflect what real operators are doing against this organisation in the current period? | Generic vendor templates | Assumed-compromise model maps to current TTPs |
| Workforce impact: does the test treat the workforce as the threat surface or as the sensor network? | Threat surface, blame risk | Unaffected, reporting culture preserved |
| Reporting culture: does the programme encourage or suppress timely reporting of real suspicious emails? | Suppresses (per NCSC) | Neutral, reporting can be built alongside |
| Evidence base: is the practice supported by peer-reviewed empirical study or by vendor case studies? | Vendor case studies, multiple null results in field studies | Standard pen-test methodology, NCSC-aligned |
| Cost profile: where does the spend land, an operational subscription or a skilled engagement? | Low recurring spend, hidden operational cost | Higher per-engagement, focused remediation spend follows |
| Compliance fit: does the practice answer the specific question auditors are asked to ask? | Direct fit with "simulation conducted" items | Stronger evidence, may require lightweight awareness alongside |
| Defence range: does the practice defend against other initial access vectors too, or only phishing? | Phishing-only | Phishing, smishing, drive-by, removable media, supply chain |
Where does your programme sit on the spectrum?
Nine questions, scored 0 to 3 each. Find out where the programme sits between simulation theatre and assessment-led resilience with a closed loop.
Answer straight. The score is for you, not an auditor.
What to do now, who owns it, and why it matters.
This is the takeaway. If the rest of the page was context, this is the work order. Each action names a responsible team, a regulatory or evidence-based reason, and a timeline. Print it, forward it, or paste it into a board paper. The reasoning is above. The obligations are below.
Deploy DMARC at reject, with SPF and DKIM enforced. Owners: Security architecture · Network and DNS · Mail platform
DMARC at monitor or quarantine is not enforcement. Move to reject. Monitor the aggregate reports for legitimate senders you missed, then enforce. This single control eliminates domain-spoofed phishing against your brand and reduces impersonation lures against your people.
Obligation: NCSC recommends DMARC enforcement for all UK organisations. PCI-DSS 4.0 references email authentication. NIST CSF PR.DS and ID.RA both map here. NHS DSPT Objective E references anti-spoofing controls.
Timeline: Monitor within 2 weeks. Quarantine within 6 weeks. Reject within 12 weeks. If you have been at monitor for more than 90 days, the data is stale and you are procrastinating.
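A sketch of the monitoring step, assuming a standard RFC 7489 aggregate (RUA) report already unzipped to disk: summarise which source IPs are failing both SPF and DKIM, which is the list of legitimate senders to fix, or spoofers to ignore, before moving to reject. The file name is illustrative:

```python
# Sketch: summarise one DMARC aggregate (RUA) report so legitimate senders
# that would fail at p=reject can be found and fixed first. Assumes the
# standard RFC 7489 aggregate XML, already unzipped; file name illustrative.
import xml.etree.ElementTree as ET
from collections import Counter

def failing_sources(report_path: str) -> Counter:
    """Count messages per source IP where both SPF and DKIM failed DMARC evaluation."""
    root = ET.parse(report_path).getroot()
    failures = Counter()
    for record in root.iter("record"):
        row = record.find("row")
        policy = row.find("policy_evaluated")
        if policy.findtext("dkim") == "fail" and policy.findtext("spf") == "fail":
            source = row.findtext("source_ip")
            count = int(row.findtext("count", "1"))
            failures[source] += count
    return failures

if __name__ == "__main__":
    for ip, n in failing_sources("dmarc_aggregate.xml").most_common(20):
        print(f"{ip:>15}  {n} messages failing both SPF and DKIM")
```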
Review the mail gateway configuration against current adversary tooling. Owners: Mail platform · Security architecture · SOC
Attachment sandboxing with dynamic analysis. Link rewriting with time-of-click rescan. External sender flagging. BEC anomalous-send detection. Autoforward to external blocked. If you are running the vendor's default configuration, you are running the configuration the vendor's sales team chose, not the configuration your threat model requires.
Obligation: ISO 27001 Annex A (A.8.23 web filtering, A.8.24 cryptography controls). Cyber Essentials Plus boundary firewalls and internet gateways. NIS Regulations Article 14 security measures for operators of essential services.
Timeline: Configuration audit within 4 weeks. Remediation of critical findings within 8 weeks. Annual revalidation.
Map the endpoint filetype execution policy against current initial access techniques. Owners: Endpoint engineering · Security architecture · IT operations
Walk through ISO, LNK, HTA, SVG, CHM, XLL, OneNote, and whatever the current quarter's adversary tooling favours. Document how each behaves on the standard build. Block what should not execute. Update the list as the threat moves. This is not a one-time exercise.
Obligation: Cyber Essentials Plus requires malware protection and secure configuration. CIS Benchmarks specify application control. NIST CSF PR.IP-1 covers baseline configurations. The refresh cadence is yours to set, but quarterly is the minimum defensible interval.
Timeline: Initial review within 6 weeks. Quarterly refresh thereafter.
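The quarterly refresh is a diff. A sketch, with both lists as illustrative placeholders, comparing what the current threat review says is being delivered against what the deployed policy actually blocks:

```python
# Sketch: the quarterly filetype refresh as a diff between what the current
# threat review says adversaries are delivering and what the deployed policy
# blocks. Both sets are illustrative placeholders; the real inputs are the
# threat review and an export of the live policy.
CURRENT_QUARTER_ABUSED = {".iso", ".img", ".lnk", ".hta", ".svg", ".chm",
                          ".xll", ".one", ".vhd", ".js"}
DEPLOYED_BLOCK_LIST = {".iso", ".lnk", ".hta", ".chm", ".xll", ".one"}

def policy_gaps(abused: set[str], blocked: set[str]) -> dict[str, set[str]]:
    return {
        "missing_from_policy": abused - blocked,  # the new work order
        "blocked_but_quiet": blocked - abused,    # keep, but note the drift
    }

if __name__ == "__main__":
    gaps = policy_gaps(CURRENT_QUARTER_ABUSED, DEPLOYED_BLOCK_LIST)
    for ext in sorted(gaps["missing_from_policy"]):
        print(f"Not blocked yet: {ext}")
```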
Enforce MFA on every account. Phishing-resistant MFA for privileged users. Owners: Identity · Security architecture · IT operations
No exceptions. App passwords disabled. Legacy authentication disabled. Conditional access with named locations and device compliance. FIDO2 or certificate-based authentication for any account with administrative privilege. If your MFA rollout has been "in progress" for more than two quarters, the project has stalled and needs re-scoping.
Obligation: Cyber Essentials Plus requires MFA for cloud services and administrator accounts. NIST CSF PR.AC-7. FCA operational resilience expectations under SM&CR. NHS DSPT Objective E. PCI-DSS 4.0 MFA requirements expand in 2025.
Timeline: Standard user MFA within 4 weeks if not already enforced. FIDO2 for privileged users within 12 weeks. Legacy auth disabled within 6 weeks.
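Verify "legacy auth disabled" against evidence rather than intent. A sketch that scans an exported sign-in log for legacy protocols still authenticating successfully; the column names and client-app labels are assumptions about the export format and will need adjusting to whatever your identity platform actually emits:

```python
# Sketch: find legacy protocols that still succeed, from an exported sign-in
# log. Column names and client-app labels are assumptions about the export
# format; adjust them to match your identity platform's output.
import csv
from collections import Counter

LEGACY_CLIENT_APPS = {"IMAP4", "POP3", "SMTP", "Authenticated SMTP",
                      "Exchange ActiveSync", "MAPI over HTTP", "Other clients"}

def legacy_signins(csv_path: str) -> Counter:
    """Count successful sign-ins per user that used a legacy client app."""
    hits = Counter()
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        for row in csv.DictReader(fh):
            app = row.get("Client app", "")
            status = row.get("Status", "")
            if app in LEGACY_CLIENT_APPS and status.lower() == "success":
                hits[row.get("User", "unknown")] += 1
    return hits

if __name__ == "__main__":
    for user, n in legacy_signins("signin_log_export.csv").most_common():
        print(f"{user}: {n} successful legacy-auth sign-ins")
```

Any non-zero result means the conditional access policy is not doing what the project plan says it is.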
Deploy a one-button phishing reporter in the mail client. Owners: Security comms · Mail platform · SOC
The button must submit the email in a structured format to a real SOC queue, not forward it to a mailbox. Every submission must receive an automated acknowledgement within minutes. The reporter must never be named, scored, or sanctioned based on what they report. If the button exists but nobody reads the queue, the button is theatre.
Obligation: NCSC guidance explicitly recommends making reporting easy, fast, and free of blame. ISO 27001 A.6.8 (information security event reporting). The reporting rate and time-to-first-report are your primary phishing metrics from this point forward, not click rate.
Timeline: Button deployed within 4 weeks. SOC triage SLA defined within 6 weeks. First reporting-rate baseline measured within 12 weeks.
Build the SOC feedback loop. Owners: SOC · Detection engineering · Security comms · Threat intel
Every reported phish that bypassed controls: triage, extract IOCs and TTPs, block at gateway and endpoint, hunt the inbox set for earlier related deliveries, close the loop with the reporter. The reporter gets told what their report did. The controls get updated. The next assessment tests against what actually landed.
Obligation: MITRE's threat-informed defence model. NIST CSF DE.AE (anomalies and events), DE.CM (continuous monitoring), RS.AN (analysis). NIS Regulations continuous improvement requirement. This is the loop that turns individual reports into organisational resilience.
Timeline: Basic loop (triage + block) within 8 weeks. Full loop (extract + hunt + reporter feedback + comms) within 16 weeks. Quarterly review of loop effectiveness.
Commission an assumed-breach assessment. Owners: Security architecture · Identity · SOC · External assessor (CHECK / CREST)
Start from a low-privilege credential on day one. No social engineering, no waiting for clicks. Test lateral movement, privilege escalation, persistence, and impact against the real internal control plane. Map findings to MITRE ATT&CK techniques the SOC can detect and contain. The exercise tests what an attacker can actually do once they are inside, which is the question that matters.
Obligation: TIBER-EU and CBEST both support assume-breach entry. NIST CSF PR.PT and DE.CM. Cyber Essentials Plus scope covers internal vulnerability assessment. FCA operational resilience expectations. The assumed-breach model is now the framework consensus, not the alternative position.
Timeline: Scope and procure within 8 weeks. Execute within the following 4 weeks. Remediation of critical findings within 8 weeks of report delivery. Annual cadence thereafter.
Replace click rate with reporting rate as your primary metric. Owners: Security comms · SOC · Leadership
Click rate is gameable by lure difficulty, produces no actionable finding, and predicts nothing about real-world resilience. Reporting rate measures the behaviour you actually need: people telling you when something suspicious arrives. Time-to-first-report measures how fast your human sensor network responds. Both improve when the loop is visible and reporters are thanked.
Obligation: No framework mandates click rate as a metric. Several (NCSC, NIST CSF) reference reporting culture as a resilience indicator. The switch is a governance decision, not a technical one. Present it to the board alongside the evidence base (Lain et al. 2022, CCS 2024, arXiv 2025).
Timeline: Board paper within 4 weeks. Metric switch within 8 weeks. First quarterly report on reporting rate within 16 weeks.
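The two replacement metrics are cheap to compute from the report queue. A sketch with illustrative numbers:

```python
# Sketch of the two replacement metrics for one identified campaign:
# reporting rate (what fraction of the delivery set was reported) and
# time-to-first-report (how fast the human sensor network fired).
# The numbers and timestamps are illustrative.
from datetime import datetime, timedelta

delivered = 42                                 # mailboxes that received the lure
delivered_at = datetime(2025, 3, 4, 9, 12)
reports = [delivered_at + timedelta(minutes=m) for m in (7, 11, 26, 43, 118)]

reporting_rate = len(reports) / delivered
time_to_first_report = min(reports) - delivered_at

print(f"Reporting rate:       {reporting_rate:.0%}")
print(f"Time to first report: {time_to_first_report}")
```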
Redesign workforce education around real reported lures, not vendor templates. Owners: Security comms · SOC (intel source) · Awareness team
Monthly comms that name the techniques currently landing against your sector. Draw the examples from the SOC loop, not from a vendor's template library. Just-in-time micro-prompts at the decision point, not quarterly CBT modules. Acknowledge every report. Celebrate reporting publicly. Never name or sanction a clicker. The workforce is part of the defence once the defence exists — not before.
Obligation: NCSC guidance states that punitive programmes suppress reporting. ISO 27001 A.6.3 (information security awareness, education, and training). The training content must be drawn from current threat intelligence, not from a static curriculum. The ETH Zurich studies (2022, 2024) and the 2025 NIST Phish Scale study provide the evidence base for this redesign.
Timeline: First real-lure comms within 4 weeks of SOC loop going live. Monthly cadence. Quarterly review of reporting rate trend. Annual assessment of programme effectiveness against the diagnostic on this page.
Sources.
- Phishing in Organizations: Findings from a Large-Scale and Long-Term Study. IEEE Symposium on Security and Privacy (S&P), 2022. 14,000 employees, 15 months. arxiv.org/pdf/2112.07498
- Content, Nudges and Incentives: A Study on the Effectiveness and Perception of Embedded Phishing Training. ACM CCS 2024. Distinguished Paper Award. 4,554 participants. arxiv.org/abs/2409.01378
- A Large-Scale Empirical Assessment of Multi-Modal Training Grounded in the NIST Phish Scale. arXiv preprint, 2025. 12,511 participants. arxiv.org/html/2506.19899v1
- Phishing simulation exercise in a large hospital: a case study. PubMed Central. Workload-driven susceptibility analysis. ncbi.nlm.nih.gov/pmc/articles/PMC8935590
- Phishing attacks: defending your organisation. Official guidance, multi-layer mitigation framework, updated 2025. ncsc.gov.uk/guidance/phishing
- TIBER-EU: Threat Intelligence-based Ethical Red Teaming. Framework for controlled, intelligence-led red team testing of financial entities, supporting DORA compliance. ecb.europa.eu/paym/cyber-resilience/tiber-eu
- TIBER-NL and TIBER-Rijk assume-breach variants. Explicit framework provision for skipping the initial entry phase where operationally or legally appropriate. tiber.info/documentation
- Threat-Informed Defense. Continuous feedback loop between adversary behaviour, control visibility, and operational readiness. ATT&CK is the underlying taxonomy. mitre.org/focus-areas/cybersecurity/threat-informed-defense