Signal Methodology Overview

AlphaPulse reads everything markets are saying - news wires, analyst commentary, social posts, regulatory filings - and boils each topic down to two scores: the Pulse Strength (how bullish or bearish the coverage reads) and the Attention Pulse (how much it's being talked about).

New content flows in continuously. Each article runs through AI models tuned for finance and geopolitics, which estimate its sentiment and weight the source by how reliable that outlet has been historically. The raw scores are then normalised, aggregated, and quality-checked before the gauge updates.

Design principle

AlphaPulse measures what people are saying about markets, not what prices are doing. That makes it a leading indicator - moves often show up in coverage and sentiment before they show up on a price chart, so it complements price data rather than echoing it.

Example gauge readings: 72 (high sentiment), 48 (neutral), 21 (low sentiment), 81 (high attention).

How Sentiment Works

Sentiment is the numerical distillation of opinion from text. For every article, post, or filing the pipeline ingests, a model assigns a score in $[-1, +1]$ - where −1 is maximally negative, 0 is neutral, and +1 is maximally positive - relative to a specific tracked topic such as Bitcoin, Gold, or Trump Policy.

The same article can carry different sentiment scores for different topics. A piece headlined "Trump sorgt für neue Turbulenzen, Bitcoin reagiert prompt" ("Trump causes new turbulence, Bitcoin reacts promptly") registers negative sentiment on Trump Policy and slightly negative sentiment on Bitcoin at the same time - each topic is scored independently once the topic router has assigned the article to it.

Infographic showing how individual news and social sources are scored for sentiment and aggregated into the AlphaPulse signal

As the infographic above illustrates, sources span languages, regions, and media types simultaneously - a German newswire, a Russian social post, and a Chinese broadcaster can all influence the same topic signal in the same update cycle. This cross-lingual aggregation is intentional: market-moving information is written in the language it originates in, and translating it introduces lag.

  • Per-document score (−1 → +1): raw sentiment logit assigned by the domain-adapted transformer model.
  • Source weighting ($w \in (0, 1]$): authority weight derived from reach, editorial standards, and historical accuracy.
  • Aggregation window (rolling): weighted scores accumulate over a rolling window before normalisation to 0–100.
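
The three quantities above can be combined in a few lines. The following is a minimal sketch, not the production method: it assumes a weighted mean is used to combine documents (the text only says that weighted scores "accumulate"), and the names `ScoredDocument` and `weighted_aggregate` as well as the 24-hour window length are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ScoredDocument:
    """One document's sentiment for a single topic (hypothetical structure)."""
    published: datetime
    score: float   # per-document sentiment in [-1, +1]
    weight: float  # source authority weight in (0, 1]


def weighted_aggregate(docs, now, window=timedelta(hours=24)):
    """Weighted mean of the scores published within `window` before `now`.

    Returns None when no document falls inside the window.
    Both the weighted mean and the 24-hour default are assumptions.
    """
    recent = [d for d in docs if now - window <= d.published <= now]
    total_weight = sum(d.weight for d in recent)
    if total_weight == 0:
        return None
    return sum(d.weight * d.score for d in recent) / total_weight


now = datetime(2024, 1, 2, 12, 0)
docs = [
    ScoredDocument(now - timedelta(hours=1), score=-0.6, weight=1.0),
    ScoredDocument(now - timedelta(hours=5), score=0.2, weight=0.5),
    ScoredDocument(now - timedelta(days=3), score=0.9, weight=1.0),  # outside the window
]
raw = weighted_aggregate(docs, now)
print(f"raw aggregate: {raw:.3f}")
```

Dividing by the total weight keeps the raw aggregate inside $[-1, +1]$ regardless of how many documents arrive, which is one reason a weighted mean is a natural choice here.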

Why multiple languages?

Over 70% of financial market events originate outside English-language media. Restricting ingestion to English introduces a systematic blind spot. AlphaPulse processes sources in 72 languages and normalises scores to the same scale before aggregation.

NLP Processing Pipeline

Each ingested document passes through a four-stage pipeline:

  1. Topic routing - a lightweight classifier maps each document to one or more of the 100+ tracked topics using keyword priors and embedding similarity. Documents with confidence below 0.60 are discarded.
  2. Sentiment scoring - a domain-adapted transformer model (financial BERT family) assigns a logit in $[-1, +1]$. The logit is calibrated against a held-out labelled set of financial news updated quarterly.
  3. Source weighting - each source has an authority weight $w_s \in (0, 1]$ derived from reach, editorial standards, and historical signal quality. The weighted contribution of document $i$ is $w_{s_i} \cdot \hat{y}_i$, where $\hat{y}_i$ is the document's sentiment score from stage 2.
  4. Aggregation & normalisation - weighted scores are aggregated over a rolling window and mapped to the 0–100 percentile scale described below.
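
Stages 1 and 3 of the list above can be sketched as follows. This is an illustration under stated assumptions: the dictionary layout of a document and the function names `route` and `contributions_by_topic` are hypothetical, and the classifier confidences and sentiment scores are taken as given inputs rather than produced by real models.

```python
from collections import defaultdict

MIN_ROUTING_CONFIDENCE = 0.60  # threshold stated in stage 1


def route(topic_confidences):
    """Keep only topics whose routing confidence reaches the threshold."""
    return [t for t, c in topic_confidences.items() if c >= MIN_ROUTING_CONFIDENCE]


def contributions_by_topic(documents):
    """Collect the weighted contributions w_s * y_hat for each routed topic.

    Each document is a dict with hypothetical keys:
      'source_weight': w_s in (0, 1]
      'routing':       {topic: classifier confidence}
      'sentiment':     {topic: calibrated score in [-1, +1]}
    """
    out = defaultdict(list)
    for doc in documents:
        for topic in route(doc["routing"]):
            out[topic].append(doc["source_weight"] * doc["sentiment"][topic])
    return dict(out)


documents = [
    {
        "source_weight": 0.8,
        "routing": {"Bitcoin": 0.91, "Trump Policy": 0.75, "Gold": 0.30},
        "sentiment": {"Bitcoin": -0.25, "Trump Policy": -0.5, "Gold": 0.0},
    },
    {
        "source_weight": 0.5,
        "routing": {"Bitcoin": 0.55},  # below 0.60, so discarded
        "sentiment": {"Bitcoin": 0.5},
    },
]
result = contributions_by_topic(documents)
print(result)
```

In this example the first document contributes to both Bitcoin and Trump Policy, with a different score for each, while its low-confidence Gold match and the whole second document are dropped at the routing stage.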

Score Normalisation (0–100)

Raw aggregate scores are converted to percentile ranks over a trailing 24-month window. Let $\mu$ and $\sigma$ be the rolling mean and standard deviation of the raw score for a given topic. The displayed score is:

$$S = 50 + 50 \cdot \frac{x - \mu}{\max(\sigma,\,\epsilon)}$$

where $x$ is the current raw aggregate, $\epsilon$ is a small regularisation constant that prevents division by zero for low-activity topics, and the result is clipped to $[0, 100]$.
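
The formula above translates directly into code. The sketch below is an assumption-laden illustration: the function name `normalise`, the value of `EPSILON`, and the use of the population standard deviation are all hypothetical choices, since the text only specifies a "small" constant and a rolling standard deviation. Taken literally, the formula reaches 0 or 100 once $x$ is one standard deviation from the mean, which the clipping step then enforces.

```python
import statistics

EPSILON = 1e-6  # hypothetical value; the text only says "small"


def normalise(x, history, eps=EPSILON):
    """Map a raw aggregate x to the 0-100 scale: 50 + 50 * (x - mu) / max(sigma, eps).

    `history` holds the topic's raw aggregates over the trailing window.
    The population standard deviation is an assumption.
    """
    mu = statistics.mean(history)
    sigma = statistics.pstdev(history)
    s = 50 + 50 * (x - mu) / max(sigma, eps)
    return min(100.0, max(0.0, s))  # clip to [0, 100]


history = [-0.2, 0.0, 0.2, 0.0]
for x in (0.0, 0.05, 0.5, -0.5):
    print(f"x = {x:+.2f} -> S = {normalise(x, history):.1f}")

# A flat history has sigma = 0; eps keeps the division well defined.
print(normalise(0.0, [0.0, 0.0, 0.0, 0.0]))
```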

Cross-topic comparability

Because every topic is normalised to its own rolling baseline, a score of 70 means the same thing across Bitcoin and German Bunds - both are in the 70th percentile of their own sentiment history. Absolute raw scores are not comparable; the percentile rank is.

Sensitivity Filter - Square Root of Time Variance Threshold

The Sensitivity slider (range 1–5) in the signal table controls the noise-rejection threshold for change columns. It implements a Square Root of Time scaling rule borrowed from quantitative risk management: noise in a time series grows proportionally to $\sqrt{t}$, not linearly.

A change $\Delta$ over a lookback window of $d$ days is highlighted as significant only when:

$$|\Delta| \geq C \cdot \sqrt{d}$$

where $C$ is the chosen sensitivity level. For the default $C = 2$:

Window     √d      Min |Δ| at C = 2
1 day      1.00    2.0 pts
7 days     2.65    5.3 pts
30 days    5.48    11.0 pts
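
The thresholds for other windows and sensitivity levels follow from the same rule. Here is a minimal sketch; the function names `min_significant_change` and `is_significant` are hypothetical and not part of any published AlphaPulse interface.

```python
import math


def min_significant_change(days, sensitivity=2):
    """Smallest |delta| that is flagged as significant: C * sqrt(d)."""
    if not 1 <= sensitivity <= 5:
        raise ValueError("sensitivity must be between 1 and 5")
    if days <= 0:
        raise ValueError("days must be positive")
    return sensitivity * math.sqrt(days)


def is_significant(delta, days, sensitivity=2):
    """True when the change clears the square-root-of-time threshold."""
    return abs(delta) >= min_significant_change(days, sensitivity)


for d in (1, 7, 30):
    print(f"{d:>2} days: sqrt(d) = {math.sqrt(d):.2f}, "
          f"threshold = {min_significant_change(d):.2f} pts")

# The same 5-point move is significant over 1 day but not over 30 days.
print(is_significant(5, days=1), is_significant(5, days=30))
```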

Why this matters

A 5-point move over one day is more surprising than a 5-point move over 30 days - the 30-day window has a larger variance budget. The √T rule captures this intuition formally, so the significance threshold automatically scales with the lookback period rather than requiring separate thresholds per timeframe.

At $C = 1$: maximum reactivity - even small moves are flagged; useful for short-term scanning but noisier. At $C = 5$: only regime-level dislocations are highlighted, suitable for weekly or macro-cycle monitoring.

Confidence & Significance Thresholds

In addition to the user-facing Sensitivity filter, the pipeline applies internal quality gates before publishing any score. A topic score is only published when:

  • At least 12 unique source documents contributed to the aggregate window.
  • Source diversity exceeds a minimum entropy threshold (no single source contributes more than 40% of the weighted mass).
  • The rolling $\sigma$ is non-zero (topic has shown historical variance - prevents spurious scores for newly tracked topics).
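
The three gates above could be checked along these lines. This is a sketch under assumptions: the tuple layout of a contribution and the function name `passes_quality_gates` are hypothetical, "weighted mass" is interpreted as the sum of source weights, and the diversity gate is implemented as the 40% maximum-share rule given in parentheses rather than as an explicit entropy calculation.

```python
from collections import defaultdict

MIN_DOCUMENTS = 12       # minimum unique documents per window
MAX_SOURCE_SHARE = 0.40  # maximum share of weighted mass for one source


def passes_quality_gates(contributions, rolling_sigma):
    """Return True when a topic score may be published.

    contributions: iterable of (doc_id, source_name, weight) tuples
                   (hypothetical shape; weight is the source authority weight).
    rolling_sigma: rolling standard deviation of the topic's raw score.
    """
    contributions = list(contributions)

    # Gate 1: enough unique documents in the window.
    unique_docs = {doc_id for doc_id, _source, _weight in contributions}
    if len(unique_docs) < MIN_DOCUMENTS:
        return False

    # Gate 2: no single source dominates the weighted mass.
    mass_by_source = defaultdict(float)
    for _doc_id, source, weight in contributions:
        mass_by_source[source] += weight
    total_mass = sum(mass_by_source.values())
    if total_mass <= 0:
        return False
    if max(mass_by_source.values()) / total_mass > MAX_SOURCE_SHARE:
        return False

    # Gate 3: the topic has shown historical variance.
    return rolling_sigma > 0


balanced = [(f"doc{i}", f"source{i % 4}", 1.0) for i in range(12)]
concentrated = [(f"doc{i}", "wire" if i < 6 else f"s{i}", 1.0) for i in range(12)]
print(passes_quality_gates(balanced, rolling_sigma=0.1))
print(passes_quality_gates(concentrated, rolling_sigma=0.1))
```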

Low-activity topics

Topics with fewer than 12 documents in a window will show a stale score with a greyed-out timestamp rather than a fresh reading. This is intentional - a thin-data score is more misleading than no score.