Add automatic cleanup for old detections and IP blocks

Implement automated detection cleanup after 48 hours and IP unblocking after 2 hours using systemd timers and Python scripts.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: 7a657272-55ba-4a79-9a2e-f1ed9bc7a528
Replit-Commit-Checkpoint-Type: intermediate_checkpoint
Replit-Commit-Event-Id: 3809a8a0-8dd5-4b5a-9e32-9e075dab335e
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/449cf7c4-c97a-45ae-8234-e5c5b8d6a84f/7a657272-55ba-4a79-9a2e-f1ed9bc7a528/L6QSDnx
marco370 2025-11-25 10:40:44 +00:00
parent 313bdfb068
commit 791b7caa4d
7 changed files with 663 additions and 137 deletions


@ -0,0 +1,326 @@
# IDS - Automatic Detection Cleanup Guide
## 📋 Overview
Automatic system that cleans up detections and manages blocked IPs according to time-based rules:
1. **Cleanup Detections**: Deletes unblocked detections older than **48 hours**
2. **Auto-Unblock**: Unblocks IPs that have been blocked for more than **2 hours** with no new anomalies
## ⚙️ Components
### 1. Python Script: `python_ml/cleanup_detections.py`
Main script that performs the cleanup operations:
- Deletes old detections from the database
- Marks IPs as "unblocked" in the DB (it does NOT remove them from the MikroTik firewall!)
- Full logging to `/var/log/ids/cleanup.log`
### 2. Bash Wrapper: `deployment/run_cleanup.sh`
Wrapper that loads the environment variables and runs the Python script.
### 3. Systemd Service: `ids-cleanup.service`
Oneshot service that runs the cleanup once.
### 4. Systemd Timer: `ids-cleanup.timer`
Timer that runs the cleanup **every hour at XX:10** (e.g. 10:10, 11:10, 12:10...).
## 🚀 Installation
```bash
cd /opt/ids
# Run the automatic setup
sudo ./deployment/setup_cleanup_timer.sh
# Output:
# ✅ Cleanup timer installato e avviato con successo!
```
## 📊 Monitoring
### Timer Status
```bash
# Check that the timer is active
sudo systemctl status ids-cleanup.timer
# Next scheduled run
systemctl list-timers ids-cleanup.timer
```
### Logs
```bash
# Real-time log
tail -f /var/log/ids/cleanup.log
# Last 50 lines
tail -50 /var/log/ids/cleanup.log
# Full log
cat /var/log/ids/cleanup.log
```
## 🔧 Manual Use
### Immediate Run
```bash
# Via systemd (recommended)
sudo systemctl start ids-cleanup.service
# Or directly
sudo ./deployment/run_cleanup.sh
```
### Test with Verbose Output
```bash
cd /opt/ids
source .env
python3 python_ml/cleanup_detections.py
```
## 📝 Cleanup Rules
### Rule 1: Cleanup Detections (48 hours)
**SQL query**:
```sql
DELETE FROM detections
WHERE detected_at < NOW() - INTERVAL '48 hours'
  AND blocked = false
```
**Logic**:
- If an IP was detected but **not blocked**
- And the detection row is older than **48 hours** (newer detections for the same IP are separate rows and are kept)
- → Delete it from the database
**Example**:
- IP `1.2.3.4` detected on 23/11 at 10:00
- Not blocked (risk_score < 80)
- No new detection for 48 hours
- → **25/11 at 10:10** → IP deleted ✅
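The 48-hour rule can be checked by hand with a small sketch. It mirrors the `DELETE` predicate in plain Python; the function name `is_expired` is hypothetical and is not part of `cleanup_detections.py`.

```python
from datetime import datetime, timedelta


def is_expired(detected_at, blocked, now, hours=48):
    """Mirror of the DELETE predicate: not blocked and older than the cutoff."""
    cutoff = now - timedelta(hours=hours)
    return (not blocked) and detected_at < cutoff


detected = datetime(2025, 11, 23, 10, 0)

# The hourly run at 09:10 on 25/11 is still inside the 48-hour window...
print(is_expired(detected, False, datetime(2025, 11, 25, 9, 10)))   # False
# ...while the run at 10:10 is the first one past it.
print(is_expired(detected, False, datetime(2025, 11, 25, 10, 10)))  # True
# Blocked detections are never deleted by this rule.
print(is_expired(detected, True, datetime(2025, 11, 25, 10, 10)))   # False
```

This is why the example above lands on the 10:10 run: it is the first hourly run after the detection crosses the 48-hour cutoff.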
### Rule 2: Auto-Unblock (2 hours)
**SQL query**:
```sql
UPDATE detections
SET blocked = false, blocked_at = NULL
WHERE blocked = true
  AND blocked_at < NOW() - INTERVAL '2 hours'
  AND NOT EXISTS (
    SELECT 1 FROM detections d2
    WHERE d2.source_ip = detections.source_ip
      AND d2.detected_at > NOW() - INTERVAL '2 hours'
  )
```
**Logic**:
- If an IP is **blocked**
- And it has been blocked for **more than 2 hours**
- And there is **no new detection** in the last 2 hours
- → Unblock it in the DB
**⚠️ WARNING**: This unblocks the IP only in the **database**; it does NOT remove the IP from the **MikroTik firewall lists**!
**Example**:
- IP `5.6.7.8` blocked on 25/11 at 08:00
- No new detection for 2 hours
- → **25/11 at 10:10** → `blocked=false` in the DB ✅
- → **STILL in the MikroTik firewall** ❌
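The unblock rule combines two conditions: the block is old enough, and the same IP has no recent detection. The sketch below applies both to in-memory rows; the helper name `ips_to_unblock` and the sample rows are illustrative assumptions, while the dictionary keys follow the column names used in the query.

```python
from datetime import datetime, timedelta


def ips_to_unblock(rows, now, hours=2):
    """Return IPs blocked before the cutoff with no detection after it."""
    cutoff = now - timedelta(hours=hours)
    # IPs with at least one detection newer than the cutoff (the NOT EXISTS part)
    recent = {r["source_ip"] for r in rows if r["detected_at"] > cutoff}
    return sorted({
        r["source_ip"]
        for r in rows
        if r["blocked"]
        and r["blocked_at"] is not None
        and r["blocked_at"] < cutoff
        and r["source_ip"] not in recent
    })


rows = [
    # Blocked at 08:00 and silent since then: eligible at 10:10
    {"source_ip": "5.6.7.8", "blocked": True,
     "blocked_at": datetime(2025, 11, 25, 8, 0),
     "detected_at": datetime(2025, 11, 25, 8, 0)},
    # Blocked at 07:00 but detected again at 09:30: stays blocked
    {"source_ip": "9.9.9.9", "blocked": True,
     "blocked_at": datetime(2025, 11, 25, 7, 0),
     "detected_at": datetime(2025, 11, 25, 7, 0)},
    {"source_ip": "9.9.9.9", "blocked": False,
     "blocked_at": None,
     "detected_at": datetime(2025, 11, 25, 9, 30)},
]

print(ips_to_unblock(rows, now=datetime(2025, 11, 25, 10, 10)))  # ['5.6.7.8']
```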
### How to Remove an IP from MikroTik
```bash
# Via the ML backend API
curl -X POST http://localhost:8000/unblock-ip \
  -H "Content-Type: application/json" \
  -d '{"ip_address": "5.6.7.8"}'
```
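The same call can be made from Python using only the standard library. This is a minimal sketch: the URL, method, header and JSON body are taken from the `curl` command, while the helper name `build_unblock_request` is hypothetical and the shape of the endpoint's response is not documented here, so it is only printed.

```python
import json
import urllib.request


def build_unblock_request(ip_address, base_url="http://localhost:8000"):
    """Build the POST request for /unblock-ip without sending it."""
    body = json.dumps({"ip_address": ip_address}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/unblock-ip",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_unblock_request("5.6.7.8")
print(req.get_method(), req.full_url)

# To actually send it (requires the ML backend to be running):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode())
```

Keeping request construction separate from sending makes it easy to loop over the IPs reported in the cleanup log and to add error handling around `urlopen`.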
## 🛠️ Configuration
### Changing the Intervals
#### Change the cleanup threshold (e.g. 72 hours instead of 48)
Edit `python_ml/cleanup_detections.py`:
```python
# Around line 47
deleted_count = cleanup_old_detections(conn, hours=72)  # ← Change here
```
#### Change the unblock threshold (e.g. 4 hours instead of 2)
Edit `python_ml/cleanup_detections.py`:
```python
# Around line 51
unblocked_count = unblock_old_ips(conn, hours=4)  # ← Change here
```
### Changing the Run Frequency
Edit `deployment/systemd/ids-cleanup.timer`:
```ini
[Timer]
# Every 6 hours instead of every hour
OnCalendar=00/6:10:00
```
After making changes:
```bash
sudo systemctl daemon-reload
sudo systemctl restart ids-cleanup.timer
```
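To see what a schedule such as `00/6:10:00` means in practice, the sketch below computes the next nominal firing time for "every N hours at minute M". It is a simplified model for illustration, not systemd's calendar parser, and the function name `next_run` is hypothetical.

```python
from datetime import datetime, timedelta


def next_run(now, every_hours=1, minute=10):
    """Next firing time for hours 0, N, 2N, ... at the given minute."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    # Step forward one hour at a time until the slot is in the future
    # and its hour is a multiple of every_hours.
    while candidate <= now or candidate.hour % every_hours != 0:
        candidate += timedelta(hours=1)
    return candidate


now = datetime(2025, 11, 25, 10, 40)
print(next_run(now))                 # 2025-11-25 11:10:00 (hourly)
print(next_run(now, every_hours=6))  # 2025-11-25 12:10:00 (every 6 hours)
```

On the server itself, `systemd-analyze calendar "00/6:10:00"` gives the authoritative answer for any `OnCalendar=` expression.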
## 📊 Example Output
```
============================================================
CLEANUP DETECTIONS - Avvio
============================================================
✅ Connesso al database
[1/2] Cleanup detections vecchie...
Trovate 45 detections da eliminare (più vecchie di 48h)
✅ Eliminate 45 detections vecchie
[2/2] Sblocco IP vecchi...
Trovati 3 IP da sbloccare (bloccati da più di 2h)
- 1.2.3.4 (tipo: ddos, score: 85.2)
- 5.6.7.8 (tipo: port_scan, score: 82.1)
- 9.10.11.12 (tipo: brute_force, score: 90.5)
✅ Sbloccati 3 IP nel database
⚠️ ATTENZIONE: IP ancora presenti nelle firewall list MikroTik!
💡 Per rimuoverli dai router, usa: curl -X POST http://localhost:8000/unblock-ip -d '{"ip_address": "X.X.X.X"}'
============================================================
CLEANUP COMPLETATO
- Detections eliminate: 45
- IP sbloccati (DB): 3
============================================================
```
## 🔍 Troubleshooting
### Timer does not start
```bash
# Check that the timer is enabled
sudo systemctl is-enabled ids-cleanup.timer
# If disabled, enable it
sudo systemctl enable ids-cleanup.timer
sudo systemctl start ids-cleanup.timer
```
### Errors in the log
```bash
# Check for errors
grep ERROR /var/log/ids/cleanup.log
# Check the DB connection
grep "Connesso al database" /var/log/ids/cleanup.log
```
### Test the DB connection
```bash
cd /opt/ids
source .env
python3 -c "
import psycopg2
conn = psycopg2.connect(
    host='$PGHOST',
    port=$PGPORT,
    user='$PGUSER',
    password='$PGPASSWORD',
    database='$PGDATABASE'
)
print('✅ DB connected!')
conn.close()
"
```
## 📈 Metrics
### Statistics queries
```sql
-- Detections by age
SELECT
  CASE
    WHEN detected_at > NOW() - INTERVAL '2 hours' THEN '< 2h'
    WHEN detected_at > NOW() - INTERVAL '24 hours' THEN '< 24h'
    WHEN detected_at > NOW() - INTERVAL '48 hours' THEN '< 48h'
    ELSE '> 48h'
  END as age_group,
  COUNT(*) as count,
  COUNT(CASE WHEN blocked THEN 1 END) as blocked_count
FROM detections
GROUP BY age_group
ORDER BY age_group;
-- Blocked IPs by duration
SELECT
  source_ip,
  blocked_at,
  EXTRACT(EPOCH FROM (NOW() - blocked_at)) / 3600 as hours_blocked,
  anomaly_type,
  risk_score::numeric
FROM detections
WHERE blocked = true
ORDER BY blocked_at DESC;
```
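The age buckets in the first query can be reproduced in Python, which is handy when post-processing exported rows. This sketch mirrors the `CASE` expression; the function name `age_group` is an assumption that simply reuses the SQL column alias.

```python
from datetime import datetime, timedelta


def age_group(detected_at, now):
    """Same buckets as the CASE expression: first matching threshold wins."""
    age = now - detected_at
    if age < timedelta(hours=2):
        return "< 2h"
    if age < timedelta(hours=24):
        return "< 24h"
    if age < timedelta(hours=48):
        return "< 48h"
    return "> 48h"


now = datetime(2025, 11, 25, 10, 10)
print(age_group(datetime(2025, 11, 25, 9, 0), now))   # < 2h
print(age_group(datetime(2025, 11, 24, 12, 0), now))  # < 24h
print(age_group(datetime(2025, 11, 23, 12, 0), now))  # < 48h
print(age_group(datetime(2025, 11, 20, 12, 0), now))  # > 48h
```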
## ⚙️ Integration with Other Systems
### Email Notifications (optional)
Add to `python_ml/cleanup_detections.py`:
```python
import smtplib
from email.mime.text import MIMEText

if unblocked_count > 0:
    msg = MIMEText(f"Unblocked {unblocked_count} IPs")
    msg['Subject'] = 'IDS Cleanup Report'
    msg['From'] = 'ids@example.com'
    msg['To'] = 'admin@example.com'
    s = smtplib.SMTP('localhost')
    s.send_message(msg)
    s.quit()
```
### Webhook (optional)
```python
import requests

requests.post('https://hooks.slack.com/...', json={
    'text': f'IDS Cleanup: {deleted_count} detections deleted, {unblocked_count} IPs unblocked'
})
```
## 🔒 Security
- Script runs as **root** (`User=root` in the service unit)
- DB credentials loaded from `.env` (NOT hardcoded)
- Logs in `/var/log/ids/` with `644` permissions
- Service with `NoNewPrivileges=true` and `PrivateTmp=true`
## 📅 Scheduler
The timer is configured to run:
- **Frequency**: Every hour
- **Minute**: XX:10 (10 minutes past the hour)
- **Randomization**: a random delay of up to 5 minutes (`RandomizedDelaySec=300`) to spread load
- **Persistent**: Catches up on runs missed during downtime
**Example times** (nominal, before the random delay): 00:10, 01:10, 02:10, ..., 23:10
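systemd's `RandomizedDelaySec=` only ever delays a run; it never starts it early. A small sketch of the resulting window for the configured value of 300 seconds (the function name `run_window` is hypothetical):

```python
from datetime import datetime, timedelta


def run_window(nominal, randomized_delay_sec=300):
    """Earliest and latest start time for one scheduled run."""
    return nominal, nominal + timedelta(seconds=randomized_delay_sec)


earliest, latest = run_window(datetime(2025, 11, 25, 10, 10))
print(earliest.time(), "-", latest.time())  # 10:10:00 - 10:15:00
```

So a run scheduled for 10:10 starts somewhere between 10:10 and 10:15.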
## ✅ Post-Installation Checklist
- [ ] Timer installed: `systemctl status ids-cleanup.timer`
- [ ] Next run visible: `systemctl list-timers`
- [ ] Manual test OK: `sudo ./deployment/run_cleanup.sh`
- [ ] Log created: `ls -la /var/log/ids/cleanup.log`
- [ ] No errors in the log: `grep ERROR /var/log/ids/cleanup.log`
- [ ] Cleanup working: compare the detection count before and after
## 🆘 Support
For problems or questions:
1. Check the log: `tail -f /var/log/ids/cleanup.log`
2. Check the timer: `systemctl status ids-cleanup.timer`
3. Manual test: `sudo ./deployment/run_cleanup.sh`
4. Open an issue on GitHub or contact the team

deployment/run_cleanup.sh Normal file

@ -0,0 +1,48 @@
#!/bin/bash
# =========================================================
# IDS - Cleanup Detections Runner
# =========================================================
# Runs the automatic detection cleanup according to these rules:
# - Deletes unblocked detections after 48h
# - Unblocks blocked IPs after 2h if they are no longer anomalous
#
# Usage: ./run_cleanup.sh
# =========================================================
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
    set -a
    source "$PROJECT_ROOT/.env"
    set +a
else
    echo "❌ File .env non trovato in $PROJECT_ROOT"
    exit 1
fi
# Log
LOG_FILE="/var/log/ids/cleanup.log"
mkdir -p /var/log/ids
echo "=========================================" >> "$LOG_FILE"
echo "[$(date)] Cleanup automatico avviato" >> "$LOG_FILE"
echo "=========================================" >> "$LOG_FILE"
# Run the cleanup. With `set -e`, a bare failing command would abort the
# script before the exit code could be logged, so capture it with `||`.
cd "$PROJECT_ROOT"
EXIT_CODE=0
python3 python_ml/cleanup_detections.py >> "$LOG_FILE" 2>&1 || EXIT_CODE=$?
if [ $EXIT_CODE -eq 0 ]; then
    echo "[$(date)] Cleanup completato con successo" >> "$LOG_FILE"
else
    echo "[$(date)] Cleanup fallito (exit code: $EXIT_CODE)" >> "$LOG_FILE"
fi
echo "" >> "$LOG_FILE"
exit $EXIT_CODE


@ -0,0 +1,64 @@
#!/bin/bash
# =========================================================
# IDS - Setup Cleanup Timer
# =========================================================
# Installs and starts the systemd timer for automatic cleanup
#
# Usage: sudo ./deployment/setup_cleanup_timer.sh
# =========================================================
set -e
if [ "$EUID" -ne 0 ]; then
    echo "❌ Questo script deve essere eseguito come root (sudo)"
    exit 1
fi
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "🔧 Setup IDS Cleanup Timer..."
echo ""
# 1. Create the log directory
echo "[1/6] Creazione directory log..."
mkdir -p /var/log/ids
chmod 755 /var/log/ids
# 2. Make the scripts executable
echo "[2/6] Permessi esecuzione script..."
chmod +x "$SCRIPT_DIR/run_cleanup.sh"
chmod +x "$SCRIPT_DIR/../python_ml/cleanup_detections.py"
# 3. Copy the unit files
echo "[3/6] Installazione service file..."
cp "$SCRIPT_DIR/systemd/ids-cleanup.service" /etc/systemd/system/
cp "$SCRIPT_DIR/systemd/ids-cleanup.timer" /etc/systemd/system/
# 4. Reload systemd
echo "[4/6] Reload systemd daemon..."
systemctl daemon-reload
# 5. Enable the timer
echo "[5/6] Abilitazione timer..."
systemctl enable ids-cleanup.timer
# 6. Start the timer
echo "[6/6] Avvio timer..."
systemctl start ids-cleanup.timer
echo ""
echo "✅ Cleanup timer installato e avviato con successo!"
echo ""
echo "📊 Status:"
systemctl status ids-cleanup.timer --no-pager -l
echo ""
echo "📅 Prossima esecuzione:"
systemctl list-timers ids-cleanup.timer --no-pager
echo ""
echo "💡 Comandi utili:"
echo " - Test manuale: sudo ./deployment/run_cleanup.sh"
echo " - Esegui ora: sudo systemctl start ids-cleanup.service"
echo " - Stato timer: sudo systemctl status ids-cleanup.timer"
echo " - Log cleanup: tail -f /var/log/ids/cleanup.log"
echo " - Disabilita timer: sudo systemctl stop ids-cleanup.timer && sudo systemctl disable ids-cleanup.timer"
echo ""


@ -0,0 +1,26 @@
[Unit]
Description=IDS Cleanup Detections Service
Documentation=https://github.com/yourusername/ids
After=network.target postgresql.service
[Service]
Type=oneshot
User=root
WorkingDirectory=/opt/ids
EnvironmentFile=/opt/ids/.env
ExecStart=/opt/ids/deployment/run_cleanup.sh
# Logging
StandardOutput=append:/var/log/ids/cleanup.log
StandardError=append:/var/log/ids/cleanup.log
# Security
NoNewPrivileges=true
PrivateTmp=true
# Restart policy (not needed for oneshot)
# Restart=on-failure
# RestartSec=30
[Install]
WantedBy=multi-user.target


@ -0,0 +1,18 @@
[Unit]
Description=IDS Cleanup Detections Timer
Documentation=https://github.com/yourusername/ids
Requires=ids-cleanup.service
[Timer]
# Run every hour, 10 minutes past the hour (e.g. 10:10, 11:10, 12:10...)
# Multiple OnCalendar= lines are additive, so only the XX:10 schedule is kept
OnCalendar=*:10:00
# Run immediately if the system was powered off at the scheduled time
Persistent=true
# Delay each run by a random 0-300 seconds to avoid load spikes
RandomizedDelaySec=300
[Install]
WantedBy=timers.target


@ -0,0 +1,169 @@
#!/usr/bin/env python3
"""
IDS - Cleanup Detections Script
================================
Automates detection cleanup and IP unblocking according to these rules:
1. Delete unblocked detections after 48 hours
2. Unblock blocked IPs after 2 hours if they are no longer anomalous

Execution: every hour via cron/systemd timer
"""
import os
import sys
import logging
from datetime import datetime, timedelta

import psycopg2
from psycopg2.extras import RealDictCursor
from dotenv import load_dotenv

# Setup logging
logging.basicConfig(
    level=logging.INFO,
    format='[%(asctime)s] %(levelname)s: %(message)s',
    handlers=[
        logging.FileHandler('/var/log/ids/cleanup.log'),
        logging.StreamHandler(sys.stdout)
    ]
)
logger = logging.getLogger(__name__)

# Load environment
load_dotenv()


def get_db_connection():
    """Open a connection to the PostgreSQL database."""
    return psycopg2.connect(
        host=os.getenv('PGHOST', 'localhost'),
        port=int(os.getenv('PGPORT', 5432)),
        user=os.getenv('PGUSER'),
        password=os.getenv('PGPASSWORD'),
        database=os.getenv('PGDATABASE')
    )


def cleanup_old_detections(conn, hours=48):
    """
    Delete unblocked detections older than N hours.

    Logic: if an IP was detected but never blocked and the detection
    is older than the threshold, remove the row.
    """
    cursor = conn.cursor(cursor_factory=RealDictCursor)
    cutoff_time = datetime.now() - timedelta(hours=hours)

    # Count the detections to delete
    cursor.execute("""
        SELECT COUNT(*) as count
        FROM detections
        WHERE detected_at < %s
          AND blocked = false
    """, (cutoff_time,))
    count = cursor.fetchone()['count']

    if count > 0:
        logger.info(f"Trovate {count} detections da eliminare (più vecchie di {hours}h)")
        # Delete
        cursor.execute("""
            DELETE FROM detections
            WHERE detected_at < %s
              AND blocked = false
        """, (cutoff_time,))
        conn.commit()
        logger.info(f"✅ Eliminate {cursor.rowcount} detections vecchie")
    else:
        logger.info(f"Nessuna detection da eliminare (soglia: {hours}h)")

    cursor.close()
    return count


def unblock_old_ips(conn, hours=2):
    """
    Unblock IPs that have been blocked for more than N hours.

    Logic: if an IP was blocked and has had no new detection within
    the threshold, mark it as unblocked in the DB.

    NOTE: this does NOT remove the IP from the MikroTik routers'
    firewall lists. That requires calling the ML backend's
    /unblock-ip API.
    """
    cursor = conn.cursor(cursor_factory=RealDictCursor)
    cutoff_time = datetime.now() - timedelta(hours=hours)

    # Find IPs blocked for more than N hours with no new detections
    cursor.execute("""
        SELECT d.source_ip, d.blocked_at, d.anomaly_type, d.risk_score
        FROM detections d
        WHERE d.blocked = true
          AND d.blocked_at < %s
          AND NOT EXISTS (
            SELECT 1 FROM detections d2
            WHERE d2.source_ip = d.source_ip
              AND d2.detected_at > %s
          )
    """, (cutoff_time, cutoff_time))
    ips_to_unblock = cursor.fetchall()

    if ips_to_unblock:
        logger.info(f"Trovati {len(ips_to_unblock)} IP da sbloccare (bloccati da più di {hours}h)")
        for ip_data in ips_to_unblock:
            ip = ip_data['source_ip']
            logger.info(f" - {ip} (tipo: {ip_data['anomaly_type']}, score: {ip_data['risk_score']})")
            # Update the DB
            cursor.execute("""
                UPDATE detections
                SET blocked = false, blocked_at = NULL
                WHERE source_ip = %s
            """, (ip,))
        conn.commit()
        logger.info(f"✅ Sbloccati {len(ips_to_unblock)} IP nel database")
        logger.warning("⚠️ ATTENZIONE: IP ancora presenti nelle firewall list MikroTik!")
        logger.info("💡 Per rimuoverli dai router, usa: curl -X POST http://localhost:8000/unblock-ip -d '{\"ip_address\": \"X.X.X.X\"}'")
    else:
        logger.info(f"Nessun IP da sbloccare (soglia: {hours}h)")

    cursor.close()
    return len(ips_to_unblock)


def main():
    """Run the full cleanup."""
    logger.info("=" * 60)
    logger.info("CLEANUP DETECTIONS - Avvio")
    logger.info("=" * 60)

    conn = None
    try:
        conn = get_db_connection()
        logger.info("✅ Connesso al database")

        # 1. Clean up old detections (48h)
        logger.info("\n[1/2] Cleanup detections vecchie...")
        deleted_count = cleanup_old_detections(conn, hours=48)

        # 2. Unblock old IPs (2h)
        logger.info("\n[2/2] Sblocco IP vecchi...")
        unblocked_count = unblock_old_ips(conn, hours=2)

        logger.info("\n" + "=" * 60)
        logger.info("CLEANUP COMPLETATO")
        logger.info(f" - Detections eliminate: {deleted_count}")
        logger.info(f" - IP sbloccati (DB): {unblocked_count}")
        logger.info("=" * 60)
        return 0
    except Exception as e:
        logger.error(f"❌ Errore durante cleanup: {e}", exc_info=True)
        return 1
    finally:
        # Close the connection even when one of the steps fails
        if conn is not None:
            conn.close()


if __name__ == "__main__":
    sys.exit(main())

replit.md

@ -20,17 +20,18 @@ This project is a full-stack web application for an Intrusion Detection System (
- Commit message: italiano - Commit message: italiano
## System Architecture ## System Architecture
The IDS employs a React-based frontend for real-time monitoring, detection visualization, and whitelist management, built with ShadCN UI and TanStack Query. The backend consists of a Python FastAPI service dedicated to ML analysis (Isolation Forest with 25 targeted features), MikroTik API management, and a detection engine that scores anomalies from 0-100 across five risk levels. A Node.js (Express) backend handles API requests from the frontend, manages the PostgreSQL database, and coordinates service operations. The IDS employs a React-based frontend for real-time monitoring, detection visualization, and whitelist management, built with ShadCN UI and TanStack Query. The backend consists of a Python FastAPI service dedicated to ML analysis and a Node.js (Express) backend handling API requests, PostgreSQL database management, and service coordination.
**Key Architectural Decisions & Features:** **Key Architectural Decisions & Features:**
- **Log Collection & Processing**: MikroTik syslog data (UDP:514) is sent to RSyslog, parsed by `syslog_parser.py`, and stored in PostgreSQL. The parser includes auto-cleanup with a 3-day retention policy. - **Log Collection & Processing**: MikroTik syslog data (UDP:514) is parsed by `syslog_parser.py` and stored in PostgreSQL with a 3-day retention policy. The parser includes auto-reconnect and error recovery mechanisms.
- **Machine Learning**: An Isolation Forest model trained on 25 network log features performs real-time anomaly detection, assigning a risk score. - **Machine Learning**: An Isolation Forest model (sklearn.IsolationForest) trained on 25 network log features performs real-time anomaly detection, assigning a risk score (0-100 across five risk levels). A hybrid ML detector (Isolation Forest + Ensemble Classifier with weighted voting) reduces false positives. The system supports weekly automatic retraining of models.
- **Automated Blocking**: Critical IPs (score >= 80) are automatically blocked in parallel across all configured MikroTik routers via their REST API. - **Automated Blocking**: Critical IPs (score >= 80) are automatically blocked in parallel across configured MikroTik routers via their REST API.
- **Service Monitoring & Management**: A dashboard provides real-time status (green/red indicators) for the ML Backend, Database, and Syslog Parser. Service management (start/stop/restart) for Python services is available via API endpoints, secured with API key authentication and Systemd integration for production-grade control and auto-restart capabilities. - **Automatic Cleanup**: An hourly systemd timer (`cleanup_detections.py`) removes old detections (48h) and auto-unblocks IPs (2h).
- **IP Geolocation**: Integrated `ip-api.com` for enriching detection data with geographical and Autonomous System (AS) information, including intelligent caching. - **Service Monitoring & Management**: A dashboard provides real-time status (ML Backend, Database, Syslog Parser). API endpoints, secured with API key authentication and Systemd integration, allow for service management (start/stop/restart) of Python services.
- **Database Management**: PostgreSQL is used for all persistent data. An intelligent database versioning system ensures efficient SQL migrations, applying only new scripts. Dual-mode database drivers (`@neondatabase/serverless` for Replit, `pg` for AlmaLinux) ensure environment compatibility. - **IP Geolocation**: Integration with `ip-api.com` enriches detection data with geographical and AS information, utilizing intelligent caching.
- **Database Management**: PostgreSQL is used for all persistent data. An intelligent database versioning system ensures efficient SQL migrations. Dual-mode database drivers (`@neondatabase/serverless` for Replit, `pg` for AlmaLinux) ensure environment compatibility.
- **Microservices**: Clear separation of concerns between the Python ML backend and the Node.js API backend. - **Microservices**: Clear separation of concerns between the Python ML backend and the Node.js API backend.
- **UI/UX**: Utilizes ShadCN UI for a modern component library and `react-hook-form` with Zod for robust form validation. - **UI/UX**: Utilizes ShadCN UI for a modern component library and `react-hook-form` with Zod for robust form validation. Analytics dashboards provide visualizations of normal and attack traffic, including real-time and historical data.
## External Dependencies ## External Dependencies
- **React**: Frontend framework. - **React**: Frontend framework.
@ -39,7 +40,8 @@ The IDS employs a React-based frontend for real-time monitoring, detection visua
- **MikroTik API REST**: For router communication and IP blocking. - **MikroTik API REST**: For router communication and IP blocking.
- **ShadCN UI**: Frontend component library. - **ShadCN UI**: Frontend component library.
- **TanStack Query**: Data fetching for the frontend. - **TanStack Query**: Data fetching for the frontend.
- **Isolation Forest**: Machine Learning algorithm for anomaly detection. - **Isolation Forest (scikit-learn)**: Machine Learning algorithm for anomaly detection.
- **xgboost, joblib**: ML libraries used in the hybrid detector.
- **RSyslog**: Log collection daemon. - **RSyslog**: Log collection daemon.
- **Drizzle ORM**: For database schema definition in Node.js. - **Drizzle ORM**: For database schema definition in Node.js.
- **Neon Database**: Cloud-native PostgreSQL service (used in Replit). - **Neon Database**: Cloud-native PostgreSQL service (used in Replit).
@ -47,130 +49,3 @@ The IDS employs a React-based frontend for real-time monitoring, detection visua
- **psycopg2**: PostgreSQL adapter for Python. - **psycopg2**: PostgreSQL adapter for Python.
- **ip-api.com**: External API for IP geolocation data. - **ip-api.com**: External API for IP geolocation data.
- **Recharts**: Charting library for analytics visualization. - **Recharts**: Charting library for analytics visualization.
## Recent Updates (November 2025)
### 🛡️ Syslog Parser Resilience & Monitoring (25 Nov 2025 - 11:00)
- **Feature**: Resilient parser with auto-recovery and automatic monitoring
- **Problem Solved**: The parser hung periodically (most recently on the morning of 24 Nov)
- **Root Cause**: Database connection timeouts, unhandled exceptions, blocking cleanup
- **Implemented Solutions**:
  1. **Auto-Reconnect**: Automatic reconnection on DB timeout
  2. **Error Recovery**: Continue processing after exceptions (do not crash!)
  3. **Health Check**: Log line every 5 minutes `[HEALTH] Parser alive: X righe, Y salvate, Z errori`
  4. **Monitoring Script**: `deployment/check_parser_health.sh` (cron every 5 min)
  5. **Auto-Restart**: If the last log line is more than 5 min old → automatic restart
- **Files Modified**:
  - `python_ml/syslog_parser.py` - `reconnect_db()` method + nested try/catch
  - `deployment/check_parser_health.sh` - health check with auto-restart
  - `deployment/setup_parser_monitoring.sh` - cron job setup
  - `deployment/TROUBLESHOOTING_SYSLOG_PARSER.md` - complete guide
- **Detection Timestamps Clarified**:
  - `first_seen/last_seen`: timestamps of the network_logs entries (e.g. 18:46:21)
  - `detected_at`: when the ML backend detects the anomaly (e.g. 19:45 - 1 hour later!)
  - The delay is normal: the ML backend runs batch analysis every hour
- **Deploy**: `./update_from_git.sh` → `sudo systemctl restart ids-syslog-parser` → `sudo ./deployment/setup_parser_monitoring.sh`
- **Monitoring**: `tail -f /var/log/ids/parser-health.log`
### 🔧 Analytics Aggregator Fix - Data Consistency (24 Nov 2025 - 17:00)
- **CRITICAL BUG FIX**: Resolved the data mismatch in the Live Dashboard
- **Problem**: The traffic distribution showed 262k attacks but the breakdown only 19
- **ROOT CAUSE**: The aggregator counted **occurrences** instead of **packets** in `attacks_by_type` and `attacks_by_country`
- **Solution**:
  1. Moved the counting from the detections loop → the packets loop
  2. `attacks_by_type[tipo] += packets` (not +1!)
  3. `attacks_by_country[paese] += packets` (not +1!)
  4. Fallback "unknown"/"Unknown" for missing data (type/geo)
  5. Validation logging: checks that breakdown_total == attack_packets
- **Mathematical invariant**: `Σ(attacks_by_type) == Σ(attacks_by_country) == attack_packets`
- **Files modified**: `python_ml/analytics_aggregator.py`
- **Deploy**: Restart the ML backend + run the aggregator manually to test
- **Validation**: The log shows `match: True` and no mismatch warning
### 📊 Network Analytics & Dashboard System (24 Nov 2025 - 11:30)
- **Complete Feature**: Analytics system covering normal traffic + attacks, advanced charts, permanent data
- **Components**:
  1. **Database**: `network_analytics` table with permanent hourly/daily aggregations
  2. **Python Aggregator**: `analytics_aggregator.py` classifies traffic every hour
  3. **Systemd Timer**: Automatic run every hour (at minute :05)
  4. **API**: `/api/analytics/recent` and `/api/analytics/range`
  5. **Frontend**: Live Dashboard (real-time, 3 days) + Historical Analytics (permanent)
- **Charts**: Area Chart, Pie Chart, Bar Chart, Line Chart, Real-time Stream
- **Flag Emoji**: 🇮🇹🇺🇸🇷🇺🇨🇳 for immediate identification of the country of origin
- **Deploy**: Migration 005 + `./deployment/setup_analytics_timer.sh`
- **Security Fix**: Removed the hardcoded path, implemented the safe wrapper script `run_analytics.sh` for manual runs
- **Production-grade**: Credentials managed via systemd EnvironmentFile (automatic) or wrapper script (manual)
- **Frontend Fix**: Analytics History now uses hourly data (`hourly: true`) until daily aggregation is scheduled
### 🌍 IP Geolocation Integration (22 Nov 2025 - 13:00)
- **Feature**: Complete geographical information (country, city, organization, AS) for every IP
- **API**: ip-api.com with batch async lookup (100 IPs in ~1.5s instead of 150s!)
- **Performance**: Intelligent caching + robust fallback
- **Display**: Globe/Building/MapPin icons on the Detections page
- **Deploy**: Migration 004 + ML backend restart
### 🤖 Hybrid ML Detector - False Positive Reduction System (24 Nov 2025)
- **Goal**: Reduce false positives by 80-90% while keeping detection accuracy high
- **Architecture**:
  1. **Isolation Forest (sklearn)**: n_estimators=250, contamination=0.03 (tuned values)
  2. **Feature Selection**: Chi-Square test reduces 25→18 most relevant features
  3. **Ensemble Classifier**: DT + RF + XGBoost with weighted voting (1:2:2)
  4. **Confidence Scoring**: 3-tier system (High≥95%, Medium≥70%, Low<70%)
  5. **Validation Framework**: CICIDS2017 dataset with Precision/Recall/F1/FPR metrics
- **Components**:
  - `python_ml/ml_hybrid_detector.py` - Core detector with IF + ensemble + feature selection
  - `python_ml/dataset_loader.py` - CICIDS2017 loader with 80→25 feature mapping
  - `python_ml/validation_metrics.py` - Production-grade metrics calculator
  - `python_ml/train_hybrid.py` - CLI training script (test/train/validate)
- **ML Dependencies**: xgboost==2.0.3, joblib==1.3.2, scikit-learn==1.3.2
- **Backward Compatibility**: USE_HYBRID_DETECTOR env var (default=true)
- **Target Metrics**: Precision≥90%, Recall≥80%, FPR≤5%, F1≥85%
- **Deploy**: See `deployment/CHECKLIST_ML_HYBRID.md`
#### 🎯 Architectural Decision - sklearn.IsolationForest (24 Nov 2025 - 22:00)
- **Deploy Problem**: eif==2.0.2 is incompatible with Python 3.11 (requires the removed distutils, obsolete Cython APIs, unmaintained since 2021)
- **Failed attempts** (blocked for 1+ hour): Build isolation flags, Cython pre-install, PIP_NO_BUILD_ISOLATION, Python downgrade consideration
- **Architect Analysis**:
  - Extended IF (eif) does NOT support Python ≥3.11 (fundamental C++/Cython incompatibility)
  - Downgrading to Python 3.10 = recreating the venv + 50 dependencies (risk of regressions, EOL 2026)
  - PyOD does NOT have Extended IF (only a standard IF wrapper around sklearn - verified source)
  - **The code ALREADY had a working fallback** to `sklearn.ensemble.IsolationForest`!
- **FINAL DECISION**: Use sklearn.IsolationForest (pre-existing fallback)
  - ✅ Compatible with Python 3.11+ (pre-built wheels, zero compilation)
  - ✅ **ZERO code changes** (fallback already implemented with the EIF_AVAILABLE flag)
  - ✅ Target metrics reachable with standard IF + ensemble + feature selection
  - ✅ Production-grade, scikit-learn is a maintained and stable library
  - ✅ Simplified installation: `pip install xgboost joblib` (2 steps instead of 4!)
- **Files modified**:
  - `requirements.txt`: Removed `eif==2.0.2` and `Cython==3.0.5` (no longer needed)
  - `deployment/install_ml_deps.sh`: Simplified from 4 to 2 steps, no compilation
  - `deployment/CHECKLIST_ML_HYBRID.md`: Updated with the new simplified instructions
### 🔄 Database Schema Adaptation & Auto-Training (24 Nov 2025 - 23:30)
- **Database Schema Fix**: Adapted the ML detector to the real `network_logs` schema
  - Corrected SQL query: `destination_ip` (not `dest_ip`), `destination_port` (not `dest_port`)
  - Feature extraction: support for `packet_length` instead of separate `packets`/`bytes`
  - Backward compatible: works with both the MikroTik schema and the CICIDS2017 dataset
- **Weekly Automatic Training**:
  - Wrapper script: `deployment/run_ml_training.sh` (loads credentials from .env)
  - Systemd service: `ids-ml-training.service`
  - Systemd timer: `ids-ml-training.timer` (every Monday at 03:00 AM)
  - Automatic setup: `./deployment/setup_ml_training_timer.sh`
  - Persistent logs: `/var/log/ids/ml-training.log`
- **Complete Workflow**:
  1. The systemd timer runs the weekly training automatically
  2. The script loads the last 7 days of traffic from the database (234M+ records)
  3. Hybrid ML training (IF + Ensemble + Feature Selection)
  4. Models saved in `python_ml/models/`
  5. The ML backend loads them automatically at the next restart
- **Files created**:
  - `deployment/run_ml_training.sh` - Safe wrapper for training
  - `deployment/train_hybrid_production.sh` - Complete manual training script
  - `deployment/systemd/ids-ml-training.service` - Systemd service
  - `deployment/systemd/ids-ml-training.timer` - Weekly timer
  - `deployment/setup_ml_training_timer.sh` - Automatic setup
- **Files modified**:
  - `python_ml/train_hybrid.py` - SQL query adapted to the real DB schema
  - `python_ml/ml_hybrid_detector.py` - `packet_length` support, backward compatible
  - `python_ml/dataset_loader.py` - Fix for the missing timestamp in the synthetic dataset
- **Impact**: The system will automatically use sklearn IF via the fallback; all 8 fail-fast checkpoints work identically