SatyaX Methodology

Learn how SatyaX evaluates credibility using writing patterns, evidence retrieval, source reliability, and AI-assisted reasoning.

Layer 1

Writing Style Analysis (BERT)

SatyaX uses a BERT model fine-tuned on 40,000+ real and fake news articles to evaluate how content is written — not whether it is factually true. This produces four linguistic risk metrics.

›Sensationalism Risk — detects alarm-driven, emotionally charged language

›Clickbait Probability — identifies attention-manipulation patterns

›Emotional Language Score — measures sentiment-driven word choice

›Style Confidence — overall model confidence in the classification

⚠ BERT evaluates linguistic patterns only. It does not verify factual accuracy. Short, neutral, factual statements can still receive a high Sensationalism Risk score. This is a known limitation of the model.
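As a rough illustration of how a metric such as Style Confidence could be derived, the sketch below turns raw classifier outputs (logits) into probabilities and reports the largest one. It assumes a two-class head (real vs. fake) and max-softmax as the confidence measure. Neither detail is confirmed by this page, and the function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def style_confidence(logits: List[float]) -> float:
    """Assumed definition: probability of the most likely class."""
    return max(softmax(logits))
```

Under this assumed definition, two equal logits give a confidence of 0.5, meaning the classifier cannot separate the classes on style alone.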

Layer 2

Real-Time Evidence Retrieval

SatyaX queries DuckDuckGo News with the first 30 words of your input and returns up to 5 related real-world articles. Each source is then scored for reliability against a domain trust database.

›No API key required — DuckDuckGo search is always-on

›Domain reliability scored 1–10: Reuters=10, BBC=9, Unknown=5, Tabloids=2–3

›Results are cross-referenced with the Gemini AI analysis
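The query-building and domain-scoring steps above might look roughly like the following sketch. The scores for Reuters, BBC, and unknown domains come from the list above. The tabloid domain, the table layout, and the function names are assumptions for illustration, and the search call itself is left out.

```python
from urllib.parse import urlparse

# Reuters, BBC and the unknown-domain default follow the scores listed above.
# The tabloid entry is a hypothetical placeholder, not a real rating.
DOMAIN_TRUST = {
    "reuters.com": 10,
    "bbc.com": 9,
    "bbc.co.uk": 9,
    "example-tabloid.com": 3,  # hypothetical
}
UNKNOWN_SCORE = 5


def build_query(text: str, max_words: int = 30) -> str:
    """Return the first `max_words` whitespace-separated words of `text`."""
    return " ".join(text.split()[:max_words])


def domain_reliability(url: str) -> int:
    """Look up a 1-10 trust score for the URL's host, defaulting to 5."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Try the full host first, then each parent domain
    # (www.bbc.co.uk -> bbc.co.uk -> co.uk -> uk).
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_TRUST:
            return DOMAIN_TRUST[candidate]
    return UNKNOWN_SCORE
```

For example, `domain_reliability("https://www.reuters.com/world/")` returns 10, while a host that is not in the table falls back to 5.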

Layer 3

AI Verification Engine (Gemini)

Google Gemini Flash performs a deep, paragraph-level analysis of the content combined with the retrieved evidence. It returns six types of output:

›Verdict — REAL / FAKE / MISLEADING / PARTIALLY TRUE / CONTEXT NEEDED / UNVERIFIED

›Truth Score — 0–100 numeric credibility rating

›ELI5 Summary — one-sentence verdict in plain English

›Key Claims — individual verifiable claims extracted and checked

›Paragraph Analysis — each section of the article classified independently

›Confusion Clarification — what is accurate, what is missing, what you should know
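The six outputs listed above could be held in a small typed structure and validated before display. This sketch assumes the model's reply arrives as a JSON object. The key names, the fallback to UNVERIFIED, and the clamping of the score are illustrative choices, not SatyaX's documented behaviour.

```python
import json
from dataclasses import dataclass, field

VERDICTS = {
    "REAL", "FAKE", "MISLEADING",
    "PARTIALLY TRUE", "CONTEXT NEEDED", "UNVERIFIED",
}


@dataclass
class AnalysisResult:
    verdict: str
    truth_score: int
    eli5_summary: str = ""
    key_claims: list = field(default_factory=list)
    paragraph_analysis: list = field(default_factory=list)
    confusion_clarification: str = ""


def parse_analysis(raw_json: str) -> AnalysisResult:
    """Parse a JSON reply (hypothetical key names) into an AnalysisResult."""
    data = json.loads(raw_json)

    verdict = str(data.get("verdict", "")).strip().upper()
    if verdict not in VERDICTS:
        verdict = "UNVERIFIED"  # assumption: unknown labels are not trusted

    try:
        score = int(float(data.get("truth_score", 0)))
    except (TypeError, ValueError, OverflowError):
        score = 0  # assumption: an unreadable score is treated as 0
    score = max(0, min(100, score))  # keep within the 0-100 range

    return AnalysisResult(
        verdict=verdict,
        truth_score=score,
        eli5_summary=str(data.get("eli5_summary", "")),
        key_claims=list(data.get("key_claims", [])),
        paragraph_analysis=list(data.get("paragraph_analysis", [])),
        confusion_clarification=str(data.get("confusion_clarification", "")),
    )
```

Validating the verdict against a fixed set keeps a malformed or unexpected model reply from surfacing as a seventh, undefined category.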

Layer 4

URL Content Extraction (5-Layer Pipeline)

When a URL is submitted, SatyaX attempts to extract article text through five progressive fallback layers:

›Layer 1 — Trafilatura: High-precision content extraction (primary)

›Layer 2 — newspaper3k: News-site optimised extractor

›Layer 3 — BeautifulSoup: Raw HTML paragraph extraction

›Layer 4 — Google Cache: For bot-protected sites (Reuters, NYT)

›Layer 5 — Archive.org: For paywalled or recently blocked pages
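A progressive fallback chain like the one above can be sketched as a loop over extractor functions that stops at the first usable result. The extractors here are stand-in stubs rather than real calls to Trafilatura, newspaper3k, or BeautifulSoup, and the `min_chars` threshold is an assumed heuristic for "usable" text.

```python
from typing import Callable, List, Optional, Tuple

Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    url: str,
    extractors: List[Tuple[str, Extractor]],
    min_chars: int = 200,  # assumed cutoff for a usable article body
) -> Optional[Tuple[str, str]]:
    """Try each extractor in order; return (layer_name, text) or None."""
    for name, extractor in extractors:
        try:
            text = extractor(url)
        except Exception:
            continue  # a failing layer simply hands over to the next one
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    return None


# Stand-in extractors that simulate the three possible outcomes.
def failing_extractor(url: str) -> Optional[str]:
    raise RuntimeError("blocked")


def empty_extractor(url: str) -> Optional[str]:
    return ""


def working_extractor(url: str) -> Optional[str]:
    return "Article body. " * 20


PIPELINE = [
    ("trafilatura", failing_extractor),
    ("newspaper3k", empty_extractor),
    ("beautifulsoup", working_extractor),
]
```

With these stubs, `extract_with_fallback("https://example.com/a", PIPELINE)` skips the first two layers and returns the text tagged with `"beautifulsoup"`. Recording which layer succeeded is useful for diagnosing sites that routinely need the later fallbacks.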

Verdict Framework

Six-Category Verdict System

SatyaX avoids binary REAL/FAKE labeling. All six verdicts carry distinct meaning:

✅REAL — Accurate, verified, or strongly supported by evidence

🚫FAKE — False, fabricated, or directly contradicted by evidence

⚠️MISLEADING — Technically true but framed to deceive or distort

🔶PARTIALLY TRUE — Some claims correct, others false or unverifiable

💬CONTEXT NEEDED — Accurate but missing critical background context

🔍UNVERIFIED — Insufficient evidence to confirm or deny at analysis time

SatyaX is an AI-assisted tool. Results are based on publicly available information at analysis time. AI systems can make errors, especially for recent events beyond their knowledge cutoff. SatyaX is not a substitute for expert journalism, professional fact-checking, or legal judgment. Always cross-reference critical claims with multiple trusted primary sources.