ids.alfacom.it/python_ml
marco370 3425521215 Update list fetching to handle new Spamhaus format and IP matching
Update Spamhaus parser to support NDJSON format and fix IP matching errors by ensuring database migrations are applied.

2026-01-02 11:48:33 +00:00
list_fetcher Update list fetching to handle new Spamhaus format and IP matching 2026-01-02 11:48:33 +00:00
.env.example Add database storage for network data and router management 2025-11-15 11:12:44 +00:00
analytics_aggregator.py Fix analytics data inconsistency on live dashboard 2025-11-24 15:28:22 +00:00
auto_block.py Add automatic IP blocking system to enhance security 2025-11-25 11:52:13 +00:00
cleanup_detections.py Automate removal of old blocked IPs and update timer 2025-11-25 10:42:52 +00:00
compare_models.py Update detection results to use correct key names for scores 2025-11-25 08:51:27 +00:00
cron_detect.sh Improve log processing and add automated tasks 2025-11-17 18:11:49 +00:00
cron_train.sh Improve log processing and add automated tasks 2025-11-17 18:11:49 +00:00
dataset_loader.py Add timestamp to synthetic data for accurate model testing 2025-11-24 17:52:16 +00:00
ip_geolocation.py Add IP geolocation and AS information to detection records 2025-11-22 10:59:50 +00:00
main.py Allow more flexible time range for detection analysis 2025-11-25 09:37:19 +00:00
merge_logic.py Add full CIDR support for IP address matching in lists 2025-11-26 09:54:57 +00:00
mikrotik_manager.py Improve MikroTik connection by supporting legacy SSL protocols 2025-11-25 17:58:02 +00:00
ml_analyzer.py Add database storage for network data and router management 2025-11-15 11:12:44 +00:00
ml_hybrid_detector.py Adapt ML model to new database schema and automate training 2025-11-24 18:14:43 +00:00
README.md Add database storage for network data and router management 2025-11-15 11:12:44 +00:00
requirements.txt Simplify ML dependency to use standard Isolation Forest 2025-11-24 17:44:11 +00:00
syslog_parser.py Improve syslog parser reliability and add monitoring 2025-11-25 09:09:21 +00:00
test_mikrotik_connection.py Improve error reporting and add a simple connection test script 2025-11-25 18:00:33 +00:00
test_mikrotik_simple.py Improve error reporting and add a simple connection test script 2025-11-25 18:00:33 +00:00
train_hybrid.py Add historical training data logging for hybrid models 2025-11-25 08:08:45 +00:00
validation_metrics.py Add dataset loader and validation metrics modules 2025-11-24 15:55:30 +00:00

IDS - Intrusion Detection System

Machine-learning-based intrusion detection system for MikroTik routers.

🎯 Features

  • Simplified ML: 25 targeted features instead of 150+, for better performance
  • Real-time detection: fast, accurate analysis
  • Multi-router: parallel management of 10+ MikroTik routers via the REST API
  • Auto-block: automatic blocking of anomalous IPs with a configurable timeout
  • Dashboard: real-time monitoring through a web interface

📋 Requirements

  • Python 3.9+
  • PostgreSQL database (already configured)
  • MikroTik router with the REST API enabled

🚀 Setup

1. Install Python dependencies

cd python_ml
pip install -r requirements.txt

2. Environment configuration

The variables are already configured automatically by Replit:

  • PGHOST: PostgreSQL database host
  • PGPORT: database port
  • PGDATABASE: database name
  • PGUSER: database username
  • PGPASSWORD: database password
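
As a minimal sketch of how a script could consume these variables, the helper below assembles a libpq-style connection string and fails early if any variable is missing. The function name `build_dsn` is hypothetical and is not taken from the project code.

```python
import os


def build_dsn(env=None):
    """Assemble a libpq keyword/value connection string from the PG* variables.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    Values containing spaces or quotes would need escaping, which this sketch omits.
    """
    env = os.environ if env is None else env
    required = ["PGHOST", "PGPORT", "PGDATABASE", "PGUSER", "PGPASSWORD"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return (
        f"host={env['PGHOST']} port={env['PGPORT']} dbname={env['PGDATABASE']} "
        f"user={env['PGUSER']} password={env['PGPASSWORD']}"
    )
```

The resulting string can be handed to a PostgreSQL driver that accepts libpq connection strings.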

3. Start the FastAPI backend

python main.py

The server listens on http://0.0.0.0:8000 (all interfaces, port 8000).

📚 API Endpoints

Health Check

GET /health

Model Training

POST /train
{
  "max_records": 10000,
  "hours_back": 24,
  "contamination": 0.01
}

Anomaly Detection

POST /detect
{
  "max_records": 5000,
  "hours_back": 1,
  "risk_threshold": 60.0,
  "auto_block": false
}

Manual IP Block

POST /block-ip
{
  "ip_address": "10.0.0.100",
  "list_name": "ddos_blocked",
  "comment": "Manual block",
  "timeout_duration": "1h"
}

IP Unblock

POST /unblock-ip
{
  "ip_address": "10.0.0.100",
  "list_name": "ddos_blocked"
}

System Statistics

GET /stats
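
The endpoints above can be called from Python with the standard library alone. The sketch below assumes the backend runs on localhost:8000 and returns JSON bodies; the helper names `build_request` and `call` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: backend reachable locally


def build_request(path, payload=None):
    """Build a GET request, or a JSON POST request when a payload is given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "GET" if payload is None else "POST"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(path, payload=None, timeout=30):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (requires the backend to be running):
# call("/health")
# call("/detect", {"max_records": 5000, "hours_back": 1,
#                  "risk_threshold": 60.0, "auto_block": False})
```

Keeping request construction separate from sending makes it possible to inspect the exact payload without touching the network.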

🔧 MikroTik Router Configuration

1. Enable the REST API

On the MikroTik router:

/ip service
set api-ssl disabled=no
set www-ssl disabled=no

2. Create an API user (recommended)

/user add name=ids_api group=full password=SecurePassword

3. Add the router to the database

Use the web interface or insert the row directly into the database:

INSERT INTO routers (name, ip_address, username, password, api_port, enabled)
VALUES ('Router 1', '192.168.1.1', 'ids_api', 'SecurePassword', 443, true);

📊 How It Works

1. Log collection

Logs arrive via Syslog from the MikroTik routers and are stored in the network_logs table.

2. Model training

# The system extracts 25 targeted features:
# - Volume: bytes/sec, packets, connections
# - Temporal: bursts, intervals, hourly patterns
# - Protocols: diversity, entropy, TCP/UDP ratio
# - Port scanning: unique ports, sequential ports
# - Behavioral: size variance, blocked actions
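
To make the feature categories more concrete, here is a minimal sketch that computes a handful of them for the log records of a single source IP. The record keys (`bytes`, `dst_port`, `protocol`) and the function name are assumptions for illustration; the project's actual 25 features and column names may differ.

```python
import math
from collections import Counter


def extract_features(records, window_seconds):
    """Compute a few illustrative features from one source IP's log records.

    Each record is a dict with hypothetical keys: bytes, dst_port, protocol.
    """
    if not records or window_seconds <= 0:
        raise ValueError("need at least one record and a positive window")
    n = len(records)
    total_bytes = sum(r["bytes"] for r in records)
    ports = sorted({r["dst_port"] for r in records})
    proto_counts = Counter(r["protocol"] for r in records)
    # Shannon entropy (in bits) of the protocol distribution.
    entropy = sum(-(c / n) * math.log2(c / n) for c in proto_counts.values())
    # Count pairs of adjacent port numbers, a rough hint of sequential scanning.
    sequential = sum(1 for a, b in zip(ports, ports[1:]) if b - a == 1)
    return {
        "bytes_per_sec": total_bytes / window_seconds,
        "connections": n,
        "unique_ports": len(ports),
        "sequential_ports": sequential,
        "protocol_entropy": entropy,
    }
```

A source that touches many consecutive ports in a short window yields high `unique_ports` and `sequential_ports`, which is the kind of signal the port-scanning group is meant to capture.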

3. Detection

The Isolation Forest model detects anomalies and assigns:

  • Risk Score (0-100): how dangerous the activity is
  • Confidence (0-100): how certain the detection is
  • Anomaly Type: ddos, port_scan, brute_force, botnet, suspicious

4. Auto-Block

IPs with risk_score >= 80 are blocked automatically on all routers via the REST API, with a 1h timeout.
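
The fan-out to all routers can be sketched with a thread pool. This is an illustration only: `block_fn` stands in for whatever function actually talks to a router (it is injected, not part of this sketch), and the router dicts are assumed to carry a `name` key.

```python
from concurrent.futures import ThreadPoolExecutor


def block_on_all_routers(routers, ip_address, block_fn, timeout_duration="1h", max_workers=10):
    """Call block_fn(router, ip_address, timeout_duration) on every router in parallel.

    Returns {router_name: bool}. A failure on one router is recorded as False
    and does not prevent the other routers from being updated.
    """
    def safe_block(router):
        try:
            return bool(block_fn(router, ip_address, timeout_duration))
        except Exception:
            return False

    if not routers:
        return {}
    with ThreadPoolExecutor(max_workers=min(max_workers, len(routers))) as pool:
        results = list(pool.map(safe_block, routers))
    return {router["name"]: ok for router, ok in zip(routers, results)}
```

Threads fit here because each call mostly waits on network I/O, so the total time is close to that of the slowest router rather than the sum over all of them.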

🎚️ Risk Levels

Score    Level         Action
85-100   CRITICAL 🔴   Immediate block
70-84    HIGH 🟠       Block + monitoring
60-69    MEDIUM 🟡     Monitoring
40-59    LOW 🔵        Logging
0-39     NORMAL 🟢     No action
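
The score bands in the table above translate directly into a lookup. The sketch below is one possible encoding; the names `RISK_LEVELS` and `classify_risk` are hypothetical, and fractional scores are assigned to the band whose lower bound they reach.

```python
# (lower bound, level, action), ordered from highest band to lowest.
RISK_LEVELS = [
    (85, "CRITICAL", "Immediate block"),
    (70, "HIGH", "Block + monitoring"),
    (60, "MEDIUM", "Monitoring"),
    (40, "LOW", "Logging"),
    (0, "NORMAL", "No action"),
]


def classify_risk(score):
    """Map a risk score in [0, 100] to its (level, action) pair."""
    if not 0 <= score <= 100:
        raise ValueError("risk score must be between 0 and 100")
    for lower, level, action in RISK_LEVELS:
        if score >= lower:
            return level, action
    raise AssertionError("unreachable: the last band starts at 0")
```

For example, `classify_risk(72)` falls in the 70-84 band and returns the HIGH level with its block-and-monitor action.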

🧪 Testing

Test ML Analyzer

python ml_analyzer.py

Test MikroTik Manager

# Edit the router details in mikrotik_manager.py
python mikrotik_manager.py

📈 Recommended Workflow

Initial Setup

  1. Configure the routers in the database
  2. Let logs accumulate for 24h
  3. Run the first training: POST /train

Operation

  1. Automatic training: every 12h via cron

    0 */12 * * * curl -X POST http://localhost:8000/train
    
  2. Continuous detection: every 5 minutes

    */5 * * * * curl -X POST http://localhost:8000/detect -H "Content-Type: application/json" -d '{"auto_block": true, "risk_threshold": 75}'
    

🔍 Troubleshooting

Problem: Too many false positives

Solution: Raise risk_threshold (e.g. from 60 to 75)

Problem: Attacks are not detected

Solution:

  • Increase contamination in training (e.g. from 0.01 to 0.02)
  • Lower risk_threshold (e.g. from 75 to 60)

Problem: Router connection failed

Solution:

  • Check that the REST API is enabled: /ip service print
  • Check the firewall: port 443 (HTTPS) must be open
  • Test: curl -u admin:password https://ROUTER_IP/rest/system/identity
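
The same connectivity check as the curl command can be sketched in Python. The function names are hypothetical, the response is assumed to be JSON, and the `verify_tls` switch is an addition for routers that use a self-signed certificate.

```python
import base64
import json
import ssl
import urllib.request


def build_identity_request(router_ip, username, password):
    """Build a GET request for /rest/system/identity with HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://{router_ip}/rest/system/identity",
        headers={"Authorization": f"Basic {token}"},
    )


def check_router(router_ip, username, password, timeout=5.0, verify_tls=True):
    """Return the decoded response, or raise if the router is unreachable."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Only appropriate for self-signed certificates on a trusted network.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_identity_request(router_ip, username, password)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A timeout or connection-refused error here points to the service or firewall checks above, while an HTTP 401 points to the credentials.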

📝 Important Notes

  • Whitelist: whitelisted IPs are never blocked
  • Timeout: blocks have a timeout (default 1h) and then expire automatically
  • Parallel: the system blocks on all routers simultaneously (fast)
  • Performance: analyzes 10K logs in <2 seconds
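
A whitelist guard of the kind described in the first note can be sketched with the standard `ipaddress` module. The function names are hypothetical, and accepting CIDR ranges as whitelist entries is an assumption of this sketch.

```python
import ipaddress


def is_whitelisted(ip, whitelist):
    """Return True if ip matches any whitelist entry (single address or CIDR range)."""
    addr = ipaddress.ip_address(ip)
    for entry in whitelist:
        # strict=False accepts entries such as "10.0.0.5/24" with host bits set.
        network = ipaddress.ip_network(entry, strict=False)
        if addr.version == network.version and addr in network:
            return True
    return False


def filter_blockable(candidates, whitelist):
    """Drop whitelisted addresses from a list of block candidates."""
    return [ip for ip in candidates if not is_whitelisted(ip, whitelist)]
```

Applying the filter before any block request is sent keeps trusted addresses safe even when the model assigns them a high risk score.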

🆚 Advantages vs the Old System

Aspect                 Old System    New IDS
ML features            150+          25 (targeted)
Training speed         ~5 min        ~10 sec
Detection speed        Slow          <2 sec
Router communication   SSH (slow)    REST API (fast)
False negatives        High          Low
Multi-router           Sequential    Parallel

🔐 Security

  • Router passwords are NOT stored in plaintext in the code
  • Automatic timeout on blocks
  • Whitelist for trusted IPs
  • Full logging of all actions