Generic keyword/topic monitoring engine. Polls news, web search, RSS feeds, and Twitter/X, filters results by your configured keywords, classifies risk and sentiment, stores items in SQLite, and sends alerts to Telegram or WhatsApp.
- Collect — queries Google News RSS, SearXNG (web/news), direct RSS feeds, and Twitter/X for articles matching your topic keywords
- Filter — keeps only relevant + recent items, deduplicates across sources
- Classify — assigns a tone (positif/neutre/negatif) and risk level (low/medium/high/critical) using configurable term lists
- Store — persists items in a local SQLite database and writes JSONL snapshots
- Alert — sends undelivered items to Telegram and/or WhatsApp, ordered by risk
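The Filter and Alert steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the item shape (a dict with `url` and `risk` keys), the helper names `dedup_key` and `dedupe_and_order`, and the URL normalization rule are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Risk levels named in this README, from most to least urgent.
RISK_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def dedup_key(url: str) -> str:
    """Normalize a URL so the same article found via two sources collapses.

    Assumption: scheme, "www." prefix, query string and trailing slash are ignored.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


def dedupe_and_order(items: list[dict]) -> list[dict]:
    """Keep the first item seen per normalized URL, then sort by risk."""
    seen: dict[str, dict] = {}
    for item in items:
        seen.setdefault(dedup_key(item["url"]), item)
    # Unknown risk labels sort last; sorted() is stable within a risk level.
    return sorted(
        seen.values(),
        key=lambda it: RISK_ORDER.get(it["risk"], len(RISK_ORDER)),
    )
```

Dropping the query string is a deliberate simplification here; sites that identify articles by query parameter would need a gentler rule.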
- Python 3.10+
- A running SearXNG instance (for web/news queries)
- A Telegram bot token + chat ID, and/or a Green API account (for WhatsApp)
- Optional: a twitterapi.io API key for Twitter/X search
# 1. Clone / copy the project folder
git clone <repo> topic-watch
cd topic-watch
# 2. Install dependencies
bash setup.sh
# 3. Configure secrets
cp .env.example .env
$EDITOR .env # fill in TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID, SEARX_BASE …
# 4. Create your topic config
cp config.yaml my-topic.yaml
$EDITOR my-topic.yaml # customize keywords, queries, RSS feeds …
# 5. First collection run
.venv/bin/python watch.py --config my-topic.yaml --mode collect
# 6. Send alerts for what was found
.venv/bin/python watch.py --config my-topic.yaml --mode alerts --send-telegram

All fields are optional; unset fields default to empty lists or sane defaults.
| Field | Type | Description |
|---|---|---|
| project_name | string | Human-readable name shown in alert headers |
| project_slug | string | Lowercase slug used for DB and file naming |
| standalone_triggers | list | Any single term triggers relevance (substring match) |
| anchor_terms | list | Scored terms; ≥ anchor_score_threshold hits → relevant |
| anchor_score_threshold | int | Minimum anchor score (default: 2) |
| positive_terms | list | Boost positive tone score |
| negative_terms | list | Boost negative tone score |
| risk_terms | list | Each hit raises risk; 1→medium, 3+→high |
| mobilization_signals | list | Any single hit → critical risk |
| risk_overrides | dict | "Actor name": critical forces the risk level for that actor |
| excluded_url_patterns | list | Python regex patterns matched against the URL and title; the item is skipped on a match |
| search_queries.news | list | Queries sent to Google News RSS |
| search_queries.web | list | Queries sent to SearXNG |
| search_queries.twitter | list | Queries sent to twitterapi.io search |
| search_queries.mobilization | list | Additional queries sent to SearXNG |
| hostile_rss_feeds | list | RSS feed URLs fetched directly (filtered post-fetch) |
| news_rss_feeds | list | Same, but source_type tagged as news_rss |
| twitter_accounts | list | Twitter handles monitored by timeline |
| target_actors | list | Cross-queried with event_join_terms → news + web queries |
| social_actors | list | Same, for Facebook/Instagram/TikTok site: queries |
| event_join_terms | list | Event-side terms for actor × event cross-queries |
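The relevance and risk rules described in the table can be sketched roughly as below. This is an illustration under stated assumptions, not the code in engine/classify.py: the function names are hypothetical, matching is assumed to be case-insensitive, a "hit" is assumed to mean a distinct term that appears at least once, and two risk hits are assumed to stay at medium since the table only specifies 1 and 3+.

```python
def is_relevant(text, standalone_triggers, anchor_terms, anchor_score_threshold=2):
    """Relevant if any standalone trigger matches, or enough anchor terms do."""
    lowered = text.lower()
    # A single standalone trigger is enough (substring match).
    if any(t.lower() in lowered for t in standalone_triggers):
        return True
    # Otherwise count distinct anchor terms present in the text.
    score = sum(1 for t in anchor_terms if t.lower() in lowered)
    return score >= anchor_score_threshold


def risk_level(text, risk_terms, mobilization_signals):
    """Map term hits to low / medium / high / critical."""
    lowered = text.lower()
    # Any mobilization signal forces the highest level.
    if any(s.lower() in lowered for s in mobilization_signals):
        return "critical"
    hits = sum(1 for t in risk_terms if t.lower() in lowered)
    if hits >= 3:
        return "high"
    if hits >= 1:  # assumption: 1 or 2 hits both map to medium
        return "medium"
    return "low"
```

With the example terms used later in this README, a headline mentioning only "climat" would score 1 and be dropped, while one mentioning "taxe carbone" would pass on the standalone trigger alone.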
project_name: "Climate Watch France"
project_slug: "climate-fr"
standalone_triggers:
- "loi climat"
- "taxe carbone"
anchor_terms:
- "climat"
- "co2"
- "transition énergétique"
- "cop30"
anchor_score_threshold: 2
negative_terms:
- "opposition"
- "recours"
- "blocage"
risk_terms:
- "manifestation"
- "grève"
- "blocage"
search_queries:
  news:
    - '"loi climat" OR "taxe carbone" 2026'
  web:
    - 'site:x.com ("loi climat" OR "taxe carbone")'
  twitter:
    - '"taxe carbone" lang:fr'
news_rss_feeds:
- "https://www.lemonde.fr/planete/rss_full.xml"
- "https://www.liberation.fr/arc/outboundfeeds/rss/category/terre/?outputType=xml"

python watch.py [--config FILE] --mode MODE [OPTIONS]
| Option | Default | Description |
|---|---|---|
| --config FILE | config.yaml | Path to topic YAML config |
| --mode | collect | collect, alerts, or digest |
| --window-hours N | 6 | Digest window (hours) |
| --send-telegram | off | Send alerts to Telegram |
| --send-whatsapp | off | Send alerts to WhatsApp (Green API) |
| --push-loki | off | Push items to Grafana Loki |
| --push-xo | off | Push items to local OTLP collector (port 4318) |
Each config has its own project_slug, which gives it a separate DB and output files.
# Climate alerts
python watch.py --config topics/climate.yaml --mode collect
python watch.py --config topics/climate.yaml --mode alerts --send-telegram
# Elections alerts
python watch.py --config topics/elections.yaml --mode collect
python watch.py --config topics/elections.yaml --mode alerts --send-telegram

# Every 30 minutes: collect + alert
*/30 * * * * cd /opt/topic-watch && .venv/bin/python watch.py --mode collect >> /var/log/topic-watch.log 2>&1
*/30 * * * * cd /opt/topic-watch && .venv/bin/python watch.py --mode alerts --send-telegram >> /var/log/topic-watch.log 2>&1
# Daily digest at 8:00
0 8 * * * cd /opt/topic-watch && .venv/bin/python watch.py --mode digest --window-hours 24 --send-telegram >> /var/log/topic-watch.log 2>&1

Or use the provided helper:
chmod +x run_watch.sh
./run_watch.sh --config my-topic.yaml

setup.sh installs and starts SearXNG automatically via Docker Compose on port 8889.
run_watch.sh checks at each run that SearXNG is up and restarts it if needed.
If you prefer to manage it manually:
# Start
docker compose up -d searxng
# Stop
docker compose down
# Logs
docker compose logs -f searxng

The JSON output format (required for web queries) is pre-enabled in searxng/settings.yml.
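To check that the JSON format is actually enabled, a query can be issued by hand. The sketch below builds a request against SearXNG's `/search` endpoint with `format=json`; the helper names and the `http://localhost:8889` fallback (derived from the port mentioned above) are assumptions for illustration, and engine/fetch.py may build its requests differently.

```python
import json
import os
from urllib.parse import urlencode
from urllib.request import urlopen


def searx_url(base: str, query: str, categories: str = "general") -> str:
    """Build a SearXNG search URL that asks for JSON output."""
    params = urlencode({"q": query, "format": "json", "categories": categories})
    return f"{base.rstrip('/')}/search?{params}"


def searx_search(query: str, categories: str = "general", timeout: int = 15) -> list:
    """Run one query and return the list under the "results" key."""
    # Assumption: fall back to the local Docker instance if SEARX_BASE is unset.
    base = os.environ.get("SEARX_BASE", "http://localhost:8889")
    with urlopen(searx_url(base, query, categories), timeout=timeout) as resp:
        return json.load(resp).get("results", [])
```

If the instance answers with HTTP 403 instead of JSON, the `json` format is most likely missing from the `search.formats` list in the SearXNG settings.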
| Variable | Required | Description |
|---|---|---|
| SEARX_BASE | Yes (web queries) | SearXNG base URL |
| TELEGRAM_BOT_TOKEN | For Telegram | Bot token from @BotFather |
| TELEGRAM_CHAT_ID | For Telegram | Target channel/group ID |
| GREEN_API_INSTANCE | For WhatsApp | Green API instance ID |
| GREEN_API_TOKEN | For WhatsApp | Green API token |
| GREEN_API_CHAT_ID | For WhatsApp | Target WhatsApp chat ID |
| TWITTERAPIO_KEY | For Twitter | twitterapi.io API key |
| GRAFANA_LOKI_URL | Optional | Loki push endpoint |
| GRAFANA_LOKI_USER | Optional | Loki basic-auth user |
| GRAFANA_CLOUD_TOKEN | Optional | Loki basic-auth token |
| CRON_OUTPUT_DIR | Optional | Override digest file output directory |
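Before scheduling a run, it can help to confirm that every variable a channel needs is set. The following is a small standalone check, not part of the project: `CHANNEL_VARS` and `missing_vars` are hypothetical names, and the grouping simply mirrors the "Required" column above.

```python
import os

# Variables each alert channel needs, taken from the table above.
CHANNEL_VARS = {
    "telegram": ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID"),
    "whatsapp": ("GREEN_API_INSTANCE", "GREEN_API_TOKEN", "GREEN_API_CHAT_ID"),
}


def missing_vars(channel: str, env=None) -> list[str]:
    """Return the variables for `channel` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in CHANNEL_VARS[channel] if not env.get(name)]


if __name__ == "__main__":
    for channel in CHANNEL_VARS:
        missing = missing_vars(channel)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{channel}: {status}")
```

This reads the process environment only; values kept in `.env` need to be exported or loaded first.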
topic-watch/
├── watch.py ← main CLI entry point
├── config.yaml ← example topic config (copy and customize)
├── .env.example ← env variable template (copy to .env)
├── requirements.txt
├── setup.sh ← creates .venv and installs deps
├── run_watch.sh ← collect + alert in one call
├── engine/
│ ├── config.py ← YAML config loader
│ ├── fetch.py ← HTTP fetchers (SearXNG, RSS, Twitter, DDG)
│ ├── classify.py ← relevance, tone, risk classification
│ ├── collect.py ← query builder + collection pipeline
│ ├── db.py ← SQLite persistence
│ └── notify.py ← Telegram, WhatsApp, Loki, OTLP, digest
├── data/ ← SQLite databases (auto-created, gitignored)
└── output/ ← JSONL snapshots + digest files (gitignored)
To add a new alert channel, implement a function in engine/notify.py following the same pattern as telegram_send(), then call it from alert_mode().
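As an illustration, a sender for a generic JSON webhook might look like the sketch below. Everything here is hypothetical: the signature of telegram_send() is not shown in this README, so `webhook_send`, `build_webhook_request`, the `WEBHOOK_URL` variable, and the `{"text": ...}` payload are assumptions that should be adapted to the real pattern in engine/notify.py.

```python
import json
import os
from urllib.request import Request, urlopen


def build_webhook_request(url: str, text: str) -> Request:
    """Prepare a JSON POST carrying the alert text."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def webhook_send(text: str, timeout: int = 15) -> bool:
    """Send one alert; return True on a 2xx response."""
    url = os.environ.get("WEBHOOK_URL")  # hypothetical variable, not in .env.example
    if not url:
        return False  # channel not configured, skip quietly
    with urlopen(build_webhook_request(url, text), timeout=timeout) as resp:
        return 200 <= resp.status < 300
```

Separating request construction from the network call keeps the payload easy to test without sending anything.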