How SatyaX Works

1

Input Collection

What can you submit?

Users can submit any form of informational content. The system accepts both direct text input and news article links for analysis.

📰 News article text 💬 Headline 📱 Social media post 🔗 Article URL 🎭 Satire article

Goal: Route any piece of information, whatever its form, into one standardized analysis pipeline.
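As a rough illustration of that goal, the sketch below normalizes a submission into a single record type. The `Submission` class, its field names, and the URL check are assumptions made for this example, not SatyaX's actual code.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Submission:
    """Hypothetical normalized input record (names are illustrative)."""
    kind: str     # "url" or "text"
    payload: str  # the URL itself, or the raw text to analyze


def normalize_input(raw: str) -> Submission:
    """Classify a raw submission as an article URL or direct text."""
    cleaned = raw.strip()
    if not cleaned:
        raise ValueError("submission is empty")

    is_url = False
    # Only a single token can be a URL; headlines and posts contain spaces.
    if len(cleaned.split()) == 1:
        try:
            parsed = urlparse(cleaned)
            is_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
        except ValueError:
            # Malformed input (for example, broken brackets) is treated as text.
            is_url = False

    return Submission(kind="url" if is_url else "text", payload=cleaned)
```

Under these assumptions, a URL submission would go on to content extraction, while a text submission would skip straight to analysis.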

2

Content Extraction

URL? We scrape the full article.

For URLs, SatyaX extracts the full article text using a 5-layer cascading scraper pipeline. For direct text, content moves immediately to the next stage.

Layer 1: Trafilatura Layer 2: newspaper3k Layer 3: BeautifulSoup Layer 4: Google Cache Layer 5: Archive.org

Only the article body is analyzed. Noise is removed before any AI processing begins.
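One way to picture the cascade is as an ordered list of extractors, where each layer runs only if the previous ones failed. This is a minimal sketch, not the real scraper: the `Extractor` type, the `extract_article` function, and the `min_chars` threshold are hypothetical, and each layer is passed in as a plain callable rather than calling any library directly.

```python
from typing import Callable, Optional, Sequence, Tuple

# An extractor takes a URL and returns article text, or None on failure.
Extractor = Callable[[str], Optional[str]]


def extract_article(
    url: str,
    layers: Sequence[Tuple[str, Extractor]],
    min_chars: int = 200,  # assumed threshold for "looks like a full article"
) -> Tuple[str, str]:
    """Try each layer in order; return (layer_name, text) from the first success."""
    for name, extractor in layers:
        try:
            text = extractor(url)
        except Exception:
            # A failing layer must not stop the cascade; fall through to the next.
            continue
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    raise RuntimeError(f"all {len(layers)} layers failed for {url}")
```

In such a design, the five layers named above would each be wrapped in a small function and passed as `layers` in priority order, so the cheaper and more precise extractors get the first attempt.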

3

Writing Style Analysis (BERT)

Linguistic pattern detection. Not truth detection.

A BERT model fine-tuned on 40,000+ real and fake news articles performs the linguistic analysis. Important: this module does not determine whether a claim is true or false. It identifies writing characteristics commonly associated with misinformation.

📊 Sensationalism Risk 🎣 Clickbait Probability ❤️‍🔥 Emotional Language 📰 Journalistic Similarity 🎭 Manipulative Patterns

Example: "Scientists HATE this one secret trick!"
→ High Clickbait Probability · High Sensationalism · Low Journalistic Similarity
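The five style signals could be produced in several ways. The sketch below assumes, purely for illustration, that the model emits one raw logit per signal and that each logit is turned into an independent probability with a sigmoid. The label order, the function names, and the multi-label setup are assumptions, not a description of the actual model head.

```python
import math
from typing import Dict, Sequence

# Label names follow the list above; their order here is an assumption.
STYLE_LABELS = (
    "sensationalism_risk",
    "clickbait_probability",
    "emotional_language",
    "journalistic_similarity",
    "manipulative_patterns",
)


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def style_scores(logits: Sequence[float]) -> Dict[str, float]:
    """Map one raw logit per label to an independent probability in [0, 1]."""
    if len(logits) != len(STYLE_LABELS):
        raise ValueError("expected one logit per style label")
    # One sigmoid per label: the signals are not mutually exclusive.
    return {label: _sigmoid(z) for label, z in zip(STYLE_LABELS, logits)}
```

Treating the signals as independent lets a headline score high on clickbait and sensationalism at the same time, as in the example above.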

4

Claim Extraction

Every verifiable claim, isolated.

The AI identifies each factual claim in the content and extracts it separately. Verifying claims one at a time is more accurate and more targeted than treating the entire article as one unit.

Input: "NASA confirms the Moon is made of cheese."

Extracted:
Claim #1 — NASA confirmed the Moon is made of cheese.
→ Each claim is verified independently in Step 7.
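To show the shape of the output, here is a deliberately naive stand-in that treats every declarative sentence as a candidate claim. The real extraction is done by an AI model, so this sentence splitter only illustrates the idea of numbered, independent claims. The `Claim` class and the function name are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    claim_id: int  # 1-based position, matching "Claim #1" above
    text: str


def split_into_candidate_claims(content: str) -> List[Claim]:
    """Naive stand-in: one candidate claim per declarative sentence."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", content.strip())
    claims: List[Claim] = []
    for sentence in sentences:
        # Questions and empty fragments make no checkable assertion.
        if not sentence or sentence.endswith("?"):
            continue
        claims.append(Claim(claim_id=len(claims) + 1, text=sentence))
    return claims
```

Each `Claim` can then be passed through evidence retrieval and verification on its own.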

5

Real-Time Evidence Retrieval

Live web search. Not a static database.

The system searches the live web for supporting and contradicting evidence. This is not a pre-built fact database — results reflect current publicly available information.

📰 News reports 🏛️ Official announcements 🏗️ Government publications 🔬 Scientific sources 🌐 Public web

Goal: Find independent, real-time evidence related to each claim — both confirming and contradicting.
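A minimal sketch of the "independent evidence" idea, assuming a search backend that returns results as dictionaries with a `url` key: keep at most one result per domain so that a single outlet cannot dominate the evidence set. The `SearchFn` interface, the `limit` parameter, and the deduplication rule are illustrative assumptions. The sketch does not attempt to label results as confirming or contradicting.

```python
from typing import Callable, Dict, List
from urllib.parse import urlparse

# Hypothetical search interface: query in, list of {"url": ..., "snippet": ...} out.
SearchFn = Callable[[str], List[Dict[str, str]]]


def retrieve_evidence(claim: str, search: SearchFn, limit: int = 5) -> List[Dict[str, str]]:
    """Collect up to `limit` search results, keeping one result per domain."""
    seen_domains = set()
    evidence: List[Dict[str, str]] = []
    for result in search(claim):
        domain = urlparse(result["url"]).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]
        # Skip results without a host and repeats from the same outlet.
        if not domain or domain in seen_domains:
            continue
        seen_domains.add(domain)
        evidence.append({**result, "domain": domain})
        if len(evidence) == limit:
            break
    return evidence
```

Passing the search function in as a parameter keeps the sketch independent of any particular search provider.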

6

Source Reliability Assessment

Not all sources are equal.

Retrieved sources are evaluated for credibility based on domain reputation, editorial standards, historical reliability, and source authority. Each source is scored 1–10 and displayed alongside results.

✓ Trusted 9–10 ~ Moderate 5–8 ⚠ Low 1–4

reuters.com → 10/10 Trusted | bbc.com → 9/10 Trusted | unknown-blog.net → 4/10 Low Reliability
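The tier boundaries above translate directly into a small lookup. The function name is hypothetical; the thresholds are the ones listed.

```python
def reliability_tier(score: int) -> str:
    """Map a 1-10 source score to its tier label."""
    if not 1 <= score <= 10:
        raise ValueError("source score must be between 1 and 10")
    if score >= 9:
        return "Trusted"
    if score >= 5:
        return "Moderate"
    return "Low"
```

With these thresholds, a source scored 10 or 9 is Trusted and a source scored 4 is Low, matching the examples above.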

7

AI Fact Verification Engine

Evidence-based reasoning. Not style-based guessing.

The verification engine (powered by Google Gemini) cross-references the original content, extracted claims, retrieved evidence, and source quality to determine a verdict. Unlike traditional classifiers, which judge content by its wording alone, the engine bases each verdict on the retrieved evidence.

✅ REAL

🚫 FAKE

⚠️ MISLEADING

🔶 PARTIALLY TRUE

💬 CONTEXT NEEDED

🔍 UNVERIFIED
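The six verdicts form a closed set, which an enumeration captures well. In this sketch the `Verdict` enum mirrors the labels above, while `parse_verdict` and its fallback to UNVERIFIED for unrecognized labels are assumptions about how a free-text model reply might be handled.

```python
from enum import Enum


class Verdict(Enum):
    REAL = "REAL"
    FAKE = "FAKE"
    MISLEADING = "MISLEADING"
    PARTIALLY_TRUE = "PARTIALLY TRUE"
    CONTEXT_NEEDED = "CONTEXT NEEDED"
    UNVERIFIED = "UNVERIFIED"


def parse_verdict(label: str) -> Verdict:
    """Map a free-text label to a Verdict, ignoring case, spacing, and underscores."""
    normalized = " ".join(label.replace("_", " ").upper().split())
    for verdict in Verdict:
        if verdict.value == normalized:
            return verdict
    # Assumed fallback: anything unrecognized is treated as not verified.
    return Verdict.UNVERIFIED
```

Falling back to UNVERIFIED rather than raising keeps an unexpected reply from being shown as a confident verdict.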

8

Truth Score Generation

A credibility score from 0 to 100.

A credibility score is generated based on evidence strength, source quality, and claim verifiability. The score is a credibility indicator, not an absolute truth measurement.

80–100

High Credibility

Well-supported by evidence with strong source quality.

50–79

Mixed Credibility

Contains some valid information but lacks context or supporting evidence.

0–49

Low Credibility

Major inaccuracies or unsupported/misleading claims.
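The bands above can be sketched as a simple threshold function. How the three inputs are combined is not specified here, so `truth_score` below uses an equal-weight average purely as a placeholder assumption. Only the band boundaries in `credibility_band` come from the text.

```python
def truth_score(evidence_strength: float, source_quality: float, claim_verifiability: float) -> int:
    """Combine three factors in [0, 1] into a 0-100 score (equal weights assumed)."""
    factors = (evidence_strength, source_quality, claim_verifiability)
    if any(not 0.0 <= f <= 1.0 for f in factors):
        raise ValueError("each factor must be between 0 and 1")
    return round(100 * sum(factors) / len(factors))


def credibility_band(score: float) -> str:
    """Map a 0-100 score to the band names listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "High Credibility"
    if score >= 50:
        return "Mixed Credibility"
    return "Low Credibility"
```

Whatever the actual weighting, the result remains a credibility indicator rather than a measurement of truth.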

9

Explainable AI Output

Not just a verdict. A transparent explanation.

The platform provides transparent reasoning. The goal is not only to classify content but to explain why a conclusion was reached. Users receive the full evidence trail.

📋 Verdict 📊 Truth Score 🎯 Confidence Level 🔎 Evidence Found 🎯 Key Claims 📄 Paragraph Analysis 🛡️ Source Reliability 🧒 ELI5 Explanation
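The eight output elements could be carried in one structured record. The sketch below is an assumed shape, not the platform's actual response format: the `AnalysisReport` class, its field names, and its field types are hypothetical and simply mirror the list above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class AnalysisReport:
    """Hypothetical container for the output elements listed above."""
    verdict: str
    truth_score: int
    confidence_level: str
    evidence_found: List[str] = field(default_factory=list)
    key_claims: List[str] = field(default_factory=list)
    paragraph_analysis: List[str] = field(default_factory=list)
    source_reliability: Dict[str, int] = field(default_factory=dict)  # domain -> 1-10 score
    eli5_explanation: str = ""

    def to_json(self) -> str:
        """Serialize the full report, keeping non-ASCII text readable."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)
```

Keeping the verdict, the evidence, and the explanation in one record makes it straightforward to show users the full trail behind a conclusion.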
