The complete MFF guide

Everything about the framework, in one place — from the epistemic labels to the shields, in plain language. Free and open to everyone.

Basics — what MFF is

MFF is a structured set of instructions you give to any AI so it stops sounding uniformly confident. Under MFF, the AI must declare how certain it is, cite verifiable sources or admit it can't, and follow defence rules designed to prevent invented facts. The same tools the experts use, for everyone.

The 6 epistemic labels

Every claim in an MFF response carries an epistemic label: a colour-coded reliability signal the model must apply. No more uniformly confident output.

Verified

Grounded in cited sources or demonstrable evidence.

Plausible

Consistent with available knowledge, not directly verified.

Uncertain

Insufficient evidence to assess — the model says so explicitly.

Speculative

Weakly supported; inference from analogy or partial data.

Likely false

Contradicted by available evidence — flagged, not hidden.

Not assessable

Insufficient information to evaluate. The model admits the limit.

The 7 shields (L1–L7)

Each shield — also called level or module — protects against a specific type of error. You can activate them individually or combine them. The first is always on.

L1 — Forced Citation

Show me the source

Always active. The AI cannot label information as 🟢 CERTAIN without a citable source: if the source is missing, the label is automatically downgraded to 🔵 PROBABLE.

Example: You ask "what's the daily dose of vitamin C?" — AI cites EFSA or the Department of Health. No source → no "certain".
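
The downgrade rule above can be sketched in a few lines of Python. This is a minimal illustration, not MFF's actual implementation: the `Claim` class and the `apply_forced_citation` function are hypothetical names introduced here, and only the two labels named in the rule are modelled.

```python
from dataclasses import dataclass, replace
from typing import Optional

CERTAIN = "🟢 CERTAIN"
PROBABLE = "🔵 PROBABLE"


@dataclass(frozen=True)
class Claim:
    """Hypothetical container for one labelled statement."""
    text: str
    label: str
    source: Optional[str] = None  # e.g. a URL or document name; None if missing


def apply_forced_citation(claim: Claim) -> Claim:
    """Downgrade CERTAIN to PROBABLE when no citable source is attached."""
    has_source = bool(claim.source and claim.source.strip())
    if claim.label == CERTAIN and not has_source:
        return replace(claim, label=PROBABLE)
    return claim
```

A claim that arrives as 🟢 CERTAIN with a source keeps its label; the same claim without a source comes back as 🔵 PROBABLE.
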
L2 — External Verification Zone (EVZ)

Tell me what to check

When a claim has operational impact and reliability lower than [🔵 PROBABLE], the AI attaches an ── EVZ ── block explaining how, where and why to verify it externally.

Example: You're looking for a rental and ask about lease laws. At the bottom you find: "── EVZ ── ⚠️ Verify with the Revenue Agency: tax thresholds may have been updated for 2026. Urgency: HIGH before signing."
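
The trigger for this block can be sketched as a small Python function. It is only an illustration under stated assumptions: the function name, its parameters and the exact wording of the block are hypothetical, and "lower than 🔵 PROBABLE" is read here as "any label other than 🟢 CERTAIN or 🔵 PROBABLE".

```python
from typing import Optional

# Labels at or above the threshold named in the rule; anything else counts as lower.
AT_OR_ABOVE_PROBABLE = {"🟢 CERTAIN", "🔵 PROBABLE"}


def build_evz_block(label: str, has_operational_impact: bool,
                    where: str, why: str, urgency: str) -> Optional[str]:
    """Return an EVZ block as text, or None when the rule does not trigger."""
    if not has_operational_impact or label in AT_OR_ABOVE_PROBABLE:
        return None
    # Hypothetical layout, modelled on the rental example in the text.
    return f"── EVZ ── ⚠️ Verify with {where}: {why}. Urgency: {urgency}."
```

Both conditions must hold: a low-reliability claim with no practical consequence gets no block, and neither does a consequential claim that is already 🔵 PROBABLE or better.
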
L3 — Falsification

When this answer doesn't hold

For every important answer, AI explicitly declares the conditions that would disprove it ("This answer is invalid if..."). Popper's principle applied: if you don't know how it could be wrong, you don't know how much to trust it.

Example: You ask if a fixed-rate mortgage is a good idea today. AI answers, then adds: "This answer is invalid if: (1) the ECB raises rates more than expected in the next 6 months, (2) your income doesn't stay stable, (3) you plan to pay off early within 5 years."
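
The shape of a falsification clause can be shown with a short Python helper. This is a sketch only: `falsification_clause` is a hypothetical name, and the output format simply mirrors the mortgage example above.

```python
from typing import Sequence


def falsification_clause(conditions: Sequence[str]) -> str:
    """Format the conditions that would invalidate an answer."""
    if not conditions:
        # Assumption: an answer covered by L3 must state at least one condition.
        raise ValueError("at least one falsifying condition is required")
    numbered = ", ".join(f"({i}) {c}" for i, c in enumerate(conditions, start=1))
    return f"This answer is invalid if: {numbered}."
```

Refusing an empty list reflects the idea behind the shield: an answer with no stated way of being wrong gives you nothing to calibrate trust against.
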
L4 — Web Grounding

Search the internet before answering

When you toggle the 🌐 button in chat, the AI searches the web live before answering (through the MWAL engine, using Tavily on the MFF side or your own BYOK Brave or Tavily keys). Without a real search, the AI cannot apply the 🌐 CONFIRMED:WEB label: no fabricated sources.

Example: You ask "what's the Italian VAT rate on books today?" You press 🌐. AI searches online and answers: "[🌐 CONFIRMED:WEB — source: agenziaentrate.gov.it] VAT on paper books in Italy is 4%."

L5 — Peer Review

Who watches the watchers

The answer goes through an external check: either a qualified human review (variant A) or a second independent LLM (variant B). Never the same AI rechecking itself — that would make no sense.

Example: You're drafting a contract. Variant A (Human): AI completes the text, then suggests "I recommend having this clause reviewed by a lawyer". Variant B (LLM): the answer is rechecked by a second AI model before being delivered to you.

L6 — Epistemic Drift

Warn me if you contradict yourself

Every 5 turns the AI re-reads its own previous answers and looks for 🟢 CERTAIN labels without a source, or claims that contradict the initial ones. If it finds two or more drifts, it emits a ⚠️ DRIFT ALERT L6 and recalibrates.

Example: After 20 messages about an investment, AI: "⚠️ DRIFT ALERT L6: at turn 4 I said the expected return was 4-6%; now I'm treating 8% as certain without a new source. Recalibrating: back to 4-6% until you have updated data."
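
The check described above (every 5 turns, alert on two or more drifts) can be sketched in Python. This is a simplified illustration, not the framework's own logic: the `Claim` fields, the `topic` key used to match related claims, and the "different value means contradiction" test are all assumptions made for the example. In practice the model performs this comparison in natural language.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CHECK_EVERY = 5      # turns between checks, as stated in the text
ALERT_THRESHOLD = 2  # drifts needed to raise the alert, as stated in the text


@dataclass
class Claim:
    turn: int
    topic: str   # hypothetical key used to match related claims
    value: str
    label: str
    source: Optional[str] = None


def find_drifts(history: List[Claim]) -> List[str]:
    """List unsourced CERTAIN labels and claims that differ from the first one on a topic."""
    drifts: List[str] = []
    first_value: Dict[str, str] = {}
    for claim in history:
        if claim.label == "🟢 CERTAIN" and not claim.source:
            drifts.append(f"turn {claim.turn}: CERTAIN without a source ({claim.topic})")
        initial = first_value.setdefault(claim.topic, claim.value)
        if claim.value != initial:
            drifts.append(f"turn {claim.turn}: '{claim.value}' contradicts '{initial}' ({claim.topic})")
    return drifts


def drift_alert(history: List[Claim], current_turn: int) -> Optional[str]:
    """Return an alert on check turns when two or more drifts are found."""
    if current_turn % CHECK_EVERY != 0:
        return None
    drifts = find_drifts(history)
    if len(drifts) < ALERT_THRESHOLD:
        return None
    return "⚠️ DRIFT ALERT L6: " + "; ".join(drifts)
```

In the investment example, the 8% figure counts twice: once as a 🟢 CERTAIN label with no source and once as a contradiction of the earlier 4-6% estimate, which is enough to reach the threshold.
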
L7 — Multi-Source Validation (MSV)

Compare multiple sources

The AI doesn't stop at the first source. You choose the second source yourself: when creating a session, you select a peer model from a different provider (e.g. session on Claude, peer on GPT-4), and MFF automatically routes the same question to that model for comparison. The three variants (A: web + peer AI, B: simulated by the same model, C: external input you bring) describe how the ── MSV ── block gets filled.

Example: Primary session on Claude, peer model on GPT-4. You ask if a supplement is safe. Claude answers citing EFSA; GPT-4 is automatically queried and flags university studies on rare side effects. You get a ── MSV ── block with CONVERGENCE: MEDIUM and a recalibrated final label.

NIST-RMF — Standard

The international standard

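NIST is the U.S. National Institute of Standards and Technology, a federal standards body (similar in role to ISO, though not specific to AI). Its AI Risk Management Framework (AI RMF) is a widely referenced voluntary guide for using AI responsibly in critical domains.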

When it activates: automatically for domains like Medicine, Security, Engineering, Legal, Scientific Research. It aims to keep the conversation within globally recognized safety parameters.

PAVA — anti-degradation

PAVA · R17: the shield that never sleeps

Beyond the seven shields, the PAVA anti-degradation protocol (R17) stops the model from quietly dropping the rules in long conversations. It re-checks integrity every few turns — so turn 40 is as disciplined as turn 1.

When to enable it: long transcriptions, faithful reproductions of documents, code to copy in full, complex prompts to follow to the end. Without PAVA, the AI tends to start "cutting corners" around turn 30.

APEX prompt & activation

The APEX prompt (currently v1.5.3) is the full activation of the framework. You can generate a custom one for your case with the generator, or use the app, which applies it automatically to every message.

Glossary (plain language)

Every technical term, paired with a plain-language bridge so anyone can follow.

🤖 AI terms in general
AI / Artificial Intelligence
A program capable of generating human-like answers by analyzing huge amounts of text. It doesn't "think" like us: it predicts the most likely next word, one after another.
LLM
Large Language Model. The technical "brain" behind ChatGPT, Claude, Gemini.
Prompt
The instruction you give the AI. Everything you type in the chat is a prompt.
Token
The minimum unit of text the AI processes. A long word can be 2-3 tokens. AIs have a token limit per conversation.
Context
The memory of the current conversation. When AI "forgets" things said earlier, it's because it has exceeded its context limit.
Multimodal
An AI that understands not just text, but also images, audio, video, documents.
Hallucination
When AI invents information that looks real but isn't. Like daydreaming and believing it.
Bias
Systematic distortion in answers. AI can inherit prejudices from the data it was trained on.
Training
The phase where AI "studies" billions of texts. For any given model it ends before release, which is why AI may not know recent things.
Cut-off
The deadline beyond which AI knows nothing. Example: cut-off January 2024 = no knowledge of post-January 2024 events.
🎓 Philosophical and scientific terms
Epistemic / Epistemology
From Greek "episteme" = knowledge. About how much and how we know something. "Epistemic level" = how confident you are about a claim. Example: "I know 2+2=4" (high) vs "I think it'll rain tomorrow" (low).
Exegesis
Critical interpretation of a text. Not repeating the words, but explaining the actual meaning. Example: the exegesis of a contract clause explains its practical consequences.
Inference
A conclusion drawn from known information. "The floor is wet → I infer it has rained." Not certainty, just probable deduction.
Speculation
Unproven hypothesis, based on reasoning but not evidence. "Maybe the company will fail next year."
Estimate
Approximate calculation based on partial data. Halfway between certainty and speculation.
Etiology
The study of the causes of a phenomenon (especially in medicine). "What is the cause of this disease?"
Validation
A verification process that confirms whether something is correct, through evidence or checks.
Primary source
The original source: the scientific study, the law, the official document. Not a third-party summary.
Secondary source
A reworking or summary of primary sources. Wikipedia is a secondary source.
Cross-check
Cross-comparison between multiple sources to verify the consistency of the information.
Peer review
Scientific review done by peers, experts in the same field. The quality standard of serious research.
🛡️ Framework-specific terms
Framework
A structured set of rules. Like a recipe: it isn't the dish, it's the method to make it well.
MFF
MarcoFLY Framework. The protocol's name.
MFF-EX
EXecutable. The "operational" part of the framework — the instructions AI executes.
MFF-EL
Epistemic Layer. The rules on how to handle certainty.
APEX
Complete and optimized version of the framework. Like the "Pro" version of a software.
L1 → L7
The seven Layers of defense against errors and hallucinations. L1 always on, the others optional.
EVZ (also written ZVE)
External Verification Zone. The box AI creates to flag what should be verified elsewhere.
PAVA
Protocol for Alerts and Reliability Assessment. The system that automatically labels the certainty of every claim.
Epistemic Drift
When AI "slides" gradually from certain facts to assumptions, without noticing.
Domain
The thematic area of the conversation (medical, legal, scientific, daily…). The framework adapts to the domain.
Module / Layer / Level
Synonyms in the framework. They refer to the individual L1-L7 "shields".
Shield
Evocative term for "defensive module". Each layer is a shield against a type of error.
MSV (also written VMS)
Multi-Source Validation. The L7 module that compares multiple sources.
Forced Citation
The L1 rule: AI must always declare where information comes from.
Contextual Exegesis
The L4 rule: separate facts from interpretations made by the AI itself.
MWAL
MarcoFLY Web Access Layer. The module that brings web search into the chat when a real-time source is needed.
📋 The six certainty levels (MFF-EL labels)
🟢 CERTAIN
Direct source, official document, no doubt. AI has concrete in-session evidence (URL, log, verified document). What you read is real.
🔵 PROBABLE
Solid logic, certainty not guaranteed. The model is confident — but the exact source isn't in session. Well-structured deduction, not a proven fact.
🟡 MAYBE
A hypothesis, not an answer. The model estimates, it doesn't know. Significant error margin. Useful as a starting point — dangerous as a finish line.
🟠 DEPENDS
True only under certain conditions. Read the premises before acting. The statement holds — but only if the variables occur.
🔴 IDK
The most honest answer an AI can give you. Insufficient data, contradictory sources. Instead of inventing, MFF stops. A well-packaged hallucination is more dangerous than silence.
⚫ CANNOT
It's not reticence, it's structural honesty. The request goes beyond the cut-off, falls outside the training set, or runs against safety policies. AI doesn't improvise, bypass, or invent. It stops.
🏛️ International standards cited
NIST-RMF
National Institute of Standards and Technology, Risk Management Framework. NIST is the U.S. federal agency for measurement and technical standards; its AI Risk Management Framework is a widely referenced voluntary guide for using AI safely.
TrAM
Trustworthiness Assessment Model. Scientific model (Schlicker et al. 2025) for evaluating the reliability of automated systems. It explains the difference between trusting upfront and trusting after verification.
AT / PT (in TrAM)
Antecedent Trust / Post Trust. AT = the trust you have before using a tool. PT = the trust you have after using and verifying it.
OSF / PsyArXiv
Open-access scientific repositories where research is published. The "public library" of modern science.
💻 Minor technical terms
Open Source
Public code, inspectable and modifiable by anyone.
CC BY-NC-ND 4.0
The license under which the site is distributed: free to read and share as long as you credit the author. No commercial use, no modified versions.
PWA
Progressive Web App. The site behaves like an app: it can be "installed" on the phone.
Repository / Repo
Public online folder (e.g. GitHub) where a project's code lives.
BYOK
Bring Your Own Key. You use your own personal API key: your conversations stay yours and never pass through MFF servers.
SSE
Server-Sent Events. Real-time text streaming: the words arrive as they are written.
ORCID
A unique identifier for researchers — the ID card of the scientific world, recognised by universities and journals.
🏷️ Abbreviations and tags used by the Framework
📎 Source suffixes (appear after the colored label)
[doc]
Official document (law, contract, paper, manual, public document).
[test]
Direct testimony (interview, statement, deposition).
[comm]
Official communication from an institution, company, or agency.
[log]
Technical log or registry (system events, audit trail, monitoring).
[emp]
Empirical data (controlled experiment, measurement, direct observation).
[lit]
Scientific literature (peer-reviewed paper, academic book, review).
[leg]
Legal or regulatory source (ruling, decree, regulation, directive).
[cert]
Verifiable formal certification (attestation, ISO, third-party certifier).
🧠 Memory and calibration
[mem]
Unverified memory: AI remembers it from training, but there is no proof in the current session. Always verify it.
[p:xx%]
Subjective probability estimated by the model. Example: [p:75%] = 75% internal confidence.
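
If you want to read the `[p:xx%]` tag programmatically, for instance to sort or filter claims by stated confidence, a regular expression is enough. The sketch below is an assumption-laden illustration: `parse_probability` is a hypothetical helper, and it assumes the tag appears exactly as written in this glossary, with an integer percentage.

```python
import re
from typing import Optional

# Matches tags such as [p:75%]; one to three digits are accepted.
P_TAG = re.compile(r"\[p:(\d{1,3})%\]")


def parse_probability(text: str) -> Optional[float]:
    """Return the first [p:xx%] tag in text as a fraction between 0 and 1, or None."""
    match = P_TAG.search(text)
    if match is None:
        return None
    percent = int(match.group(1))
    if percent > 100:
        return None  # assumption: values above 100% are treated as malformed
    return percent / 100
```

The number is the model's own subjective estimate, so treat the parsed value as a hint for prioritising what to verify, not as a measured probability.
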
⚙️ Pipeline phase tags
[PHASE E]
Epistemic Phase: AI produces the answer and labels it with certainty markers.
[PHASE X]
Exegetical Phase: AI interprets and explains, separating facts from inferences.
🛡️ Module tags (defense levels)
[L1]…[L7]
Reference to the defense level activated for the specific claim.
[ACTIVE] / [ALWAYS ACTIVE]
Module status in the current session. L1 is always active, others as configured.
[SIM] / [EXT]
MSV (VMS) module mode. SIM = simulated (AI validates autonomously). EXT = external (requires user input).
📦 Special blocks in responses
── EVZ ── (also written ── ZVE ──)
External Verification Zone. Box that lists what to check outside the chat and how urgently.
── MSV ── (also written ── VMS ──)
Multi-Source Validation. Box showing the comparison between different sources with convergence score.
⚠️ DRIFT ALERT L6
Epistemic drift alert: AI is sliding from facts to assumptions. It stops and recalibrates.
State Card
Periodic session summary: project, domain, active levels, main thesis, external validations.
PENDING · R1–R20
PENDING = label suspended pending external verification. R1–R20 = internal operational rules (e.g. R12: no 🟢 CERTAIN without source).
FAQ
Is MFF free?

Yes. It's a free public beta — donationware. You bring your own AI key (BYOK); MFF doesn't charge for AI usage.

Do I need to be technical?

No. MFF is built for anyone — students, researchers, professionals, curious people. Every technical term has a plain-language explanation.

Which AI does it work with?

Any of them — ChatGPT, Claude, Gemini, Mistral and more. The app supports 13+ providers; the prompt works on any model you paste it into.

Does MFF see my conversations?

No. With BYOK your key and your chats stay yours: nothing passes through MFF servers. MFF collects only anonymous page-view analytics, with no profiling.