SatyaX Methodology

Learn how SatyaX evaluates credibility using writing patterns, evidence retrieval, source reliability, and AI-assisted reasoning.

Layer 1
Writing Style Analysis (BERT)
SatyaX uses a BERT model fine-tuned on 40,000+ real and fake news articles to evaluate how content is written — not whether it is factually true. This produces four linguistic risk metrics.
Sensationalism Risk — detects alarm-driven, emotionally charged language
Clickbait Probability — identifies attention-manipulation patterns
Emotional Language Score — measures sentiment-driven word choice
Style Confidence — overall model confidence in the classification

⚠ BERT evaluates linguistic patterns only. It does not verify factual accuracy. Short, neutral, factual statements often score high on sensationalism; this is a known limitation.
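The text above does not say how Style Confidence is derived. A common convention for classifiers is to report the highest softmax probability, and the following is a minimal sketch under that assumption. The function names and the two-class logits are hypothetical and are not taken from SatyaX.

```python
import math

def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def style_confidence(logits):
    """Return (predicted_index, confidence), where confidence is the top probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Hypothetical logits for a two-class head: index 0 = real style, index 1 = fake style.
label, confidence = style_confidence([0.3, 2.1])
```

A high value here only means the model is sure about the writing style it sees. It says nothing about whether the claims are true, which is why the later layers exist.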

Layer 2
Real-Time Evidence Retrieval
SatyaX uses the first 30 words of your input as a DuckDuckGo News query and returns up to 5 related real-world articles. Each source is scored for reliability against a domain trust database.
No API key required — DuckDuckGo search is always-on
Domain reliability scored 1–10: Reuters=10, BBC=9, Unknown=5, Tabloids=2–3
Results are cross-referenced with the Gemini AI analysis
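The query truncation and the domain lookup described above can be sketched as follows. This is an illustration rather than SatyaX's actual code: the function names, the table layout, and the parent-domain matching rule are assumptions, and `tabloid.example` is a made-up placeholder domain. Only the Reuters, BBC, and unknown-domain scores come from the list above.

```python
from urllib.parse import urlparse

# Illustrative trust table. The tabloid entry is a hypothetical placeholder.
DOMAIN_TRUST = {
    "reuters.com": 10,
    "bbc.com": 9,
    "tabloid.example": 2,
}
UNKNOWN_SCORE = 5

def build_query(text, max_words=30):
    """Use the first max_words whitespace-separated words as the search query."""
    return " ".join(text.split()[:max_words])

def domain_score(url):
    """Look up a 1-10 reliability score for the URL's host, defaulting to 5."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Match the host itself or any parent domain, e.g. news.bbc.com -> bbc.com.
    parts = host.split(".")
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_TRUST:
            return DOMAIN_TRUST[candidate]
    return UNKNOWN_SCORE
```

For example, `domain_score("https://www.reuters.com/world/")` returns 10, while a domain missing from the table falls back to the neutral score of 5.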
Layer 3
AI Verification Engine (Gemini)
Google Gemini Flash performs a deep, paragraph-level analysis of the content combined with the retrieved evidence. It returns six types of output:
Verdict — REAL / FAKE / MISLEADING / PARTIALLY TRUE / CONTEXT NEEDED / UNVERIFIED
Truth Score — 0–100 numeric credibility rating
ELI5 Summary — one-sentence verdict in plain English
Key Claims — individual verifiable claims extracted and checked
Paragraph Analysis — each section of the article classified independently
Confusion Clarification — what is accurate, what is missing, what you should know
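One way to hold these six outputs in code is a small record type with validation on the two constrained fields. The sketch below assumes the model's reply arrives as JSON with the field names shown. Those names, the `Analysis` class, and `parse_analysis` are hypothetical and do not describe SatyaX's real schema.

```python
import json
from dataclasses import dataclass, field

VERDICTS = {"REAL", "FAKE", "MISLEADING", "PARTIALLY TRUE", "CONTEXT NEEDED", "UNVERIFIED"}

@dataclass
class Analysis:
    verdict: str
    truth_score: int
    eli5_summary: str
    key_claims: list = field(default_factory=list)
    paragraph_analysis: list = field(default_factory=list)
    confusion_clarification: str = ""

def parse_analysis(raw_json):
    """Parse a JSON reply and reject verdicts or scores outside the documented ranges."""
    data = json.loads(raw_json)
    verdict = str(data["verdict"]).strip().upper()
    if verdict not in VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    score = int(data["truth_score"])
    if not 0 <= score <= 100:
        raise ValueError(f"truth score out of range: {score}")
    return Analysis(
        verdict=verdict,
        truth_score=score,
        eli5_summary=data.get("eli5_summary", ""),
        key_claims=data.get("key_claims", []),
        paragraph_analysis=data.get("paragraph_analysis", []),
        confusion_clarification=data.get("confusion_clarification", ""),
    )
```

Validating the verdict and score at the boundary means a malformed model reply fails loudly instead of reaching the user as a confident-looking result.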
Layer 4
URL Content Extraction (5-Layer Pipeline)
When a URL is submitted, SatyaX attempts to extract article text through five progressive fallback layers:
Layer 1 — Trafilatura: High-precision content extraction (primary)
Layer 2 — newspaper3k: News-site optimised extractor
Layer 3 — BeautifulSoup: Raw HTML paragraph extraction
Layer 4 — Google Cache: For bot-protected sites (Reuters, NYT)
Layer 5 — Archive.org: For paywalled or recently blocked pages
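The progressive fallback above follows a familiar pattern: try each extractor in order and stop at the first one that returns enough text. The sketch below shows that control flow only. The extractors are offline stubs, not calls into Trafilatura, newspaper3k, or BeautifulSoup, and the `min_chars` threshold of 200 is an assumed value, not one documented by SatyaX.

```python
def extract_with_fallbacks(url, extractors, min_chars=200):
    """Try each (name, extractor) pair in order and return the first usable result.

    An extractor is any callable taking a URL and returning text or None.
    Exceptions and too-short results both count as a miss.
    """
    for name, extractor in extractors:
        try:
            text = extractor(url)
        except Exception:
            continue  # treat any failure as a signal to fall through
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    return None, ""

# Stub extractors standing in for the real libraries, so the sketch runs offline.
def failing_extractor(url):
    raise RuntimeError("blocked")

def empty_extractor(url):
    return ""

def working_extractor(url):
    return "Article body. " * 20

pipeline = [
    ("trafilatura", failing_extractor),
    ("newspaper3k", empty_extractor),
    ("beautifulsoup", working_extractor),
]
source, text = extract_with_fallbacks("https://example.com/story", pipeline)
```

Returning the name of the layer that succeeded alongside the text makes it easy to log which fallback was needed for a given site.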
Verdict Framework
Six-Category Verdict System
SatyaX avoids binary REAL/FAKE labeling. Each of the six verdicts carries a distinct meaning:
REAL — Accurate, verified, or strongly supported by evidence
🚫 FAKE — False, fabricated, or directly contradicted by evidence
⚠️ MISLEADING — Technically true but framed to deceive or distort
🔶 PARTIALLY TRUE — Some claims correct, others false or unverifiable
💬 CONTEXT NEEDED — Accurate but missing critical background context
🔍 UNVERIFIED — Insufficient evidence to confirm or deny at analysis time
SatyaX is an AI-assisted tool. Results are based on publicly available information at analysis time. AI systems can make errors, especially for recent events beyond their knowledge cutoff. SatyaX is not a substitute for expert journalism, professional fact-checking, or legal judgment. Always cross-reference critical claims with multiple trusted primary sources.