
# IDS - Intrusion Detection System

Machine-learning-based intrusion detection system for MikroTik routers.

## Project

**Type**: Full-stack web application + Python ML backend
**Stack**: React + FastAPI + PostgreSQL + MikroTik REST API

## Architecture

### Frontend (React)
- Real-time monitoring dashboard
- Views for detections and routers
- Whitelist management
- shadcn/ui components
- TanStack Query for data fetching

### Python Backend (FastAPI)
- **ML Analyzer**: Isolation Forest with 25 targeted features
- **MikroTik Manager**: Parallel REST API communication with 10+ routers
- **Detection Engine**: 0-100 scoring with 5 risk levels
- Endpoints: `/train`, `/detect`, `/block-ip`, `/unblock-ip`, `/stats`

### Node.js Backend (Express)
- REST API for the frontend
- PostgreSQL database management
- Routes: routers, detections, logs, whitelist, training-history

### Database (PostgreSQL)
- `routers`: MikroTik router configuration
- `network_logs`: Syslog entries received from the routers
- `detections`: Anomalies detected by the ML model
- `whitelist`: Trusted IPs
- `training_history`: Model training history

## Workflow

1. **Log Collection**: Router → Syslog (UDP:514) → RSyslog → syslog_parser.py → PostgreSQL `network_logs`
2. **Training**: The Python ML backend extracts 25 features → Isolation Forest
3. **Detection**: Real-time analysis → 0-100 scoring → Classification
4. **Auto-Block**: Critical IP (>=80) → REST API → All routers (in parallel)
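
The parallel auto-block step can be sketched with a thread pool. This is only an illustration: `block_on_all_routers` and the `block_fn` callback are hypothetical names standing in for the real REST call in `mikrotik_manager.py`, whose actual interface is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def block_on_all_routers(
    ip: str,
    routers: Iterable[str],
    block_fn: Callable[[str, str], bool],
    max_workers: int = 10,
) -> Dict[str, bool]:
    """Call block_fn(router, ip) for every router concurrently.

    block_fn is a hypothetical stand-in for the real REST call and should
    return True on success. A router that raises is recorded as False, so
    one unreachable device does not stop the others.
    """
    router_list = list(routers)
    results: Dict[str, bool] = {}
    if not router_list:
        return results

    workers = min(max_workers, len(router_list))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to the router it was submitted for.
        futures = {pool.submit(block_fn, r, ip): r for r in router_list}
        for fut in as_completed(futures):
            router = futures[fut]
            try:
                results[router] = bool(fut.result())
            except Exception:
                results[router] = False
    return results
```

Returning a per-router result map makes partial failures visible, which matters when a block must reach every device.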

## Recent Fixes (November 2025)

### ✅ System Fully Operational (17 Nov 2025 - 19:30)
- **Python FastAPI Backend**: ✅ Port 8000, ML model loaded, `/stats` endpoint working
- **PostgreSQL Database**: ✅ 5 tables (network_logs, detections, routers, whitelist, training_history)
- **Syslog Parser**: ✅ Working, logs saved continuously
- **Regex Patterns**: ✅ 99.9% match rate on real MikroTik logs
- **ML Detection**: ✅ Isolation Forest model trained, ready for automatic detection
- **Deployment**: ✅ Automated Git workflow with `push-gitlab.sh` and `update_from_git.sh --db`

### Backend FastAPI Fix (17 Nov 2025 - 19:30)
- **Problem**: The `/stats` endpoint failed with a 500 error
- **Cause 1**: Column `logged_at` does not exist (correct name: `timestamp`)
- **Cause 2**: Table `routers` was missing
- **Cause 3**: Queries did not handle `None` results
- **Solution**:
  - Renamed the column from `logged_at` to `timestamp` in `/stats`
  - Created the SQL script `database-schema/create_routers.sql`
  - Added `None` handling to all queries
- **Result**: `/stats` endpoint working, full API operational
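
The `None` handling mentioned above usually has to cover two cases: the cursor returns no row at all, or the row holds a NULL (for example `MAX()` over an empty table). A minimal sketch follows; `scalar_or_default` and `demo_empty_table` are hypothetical helpers, and SQLite is used only to keep the example self-contained (the project itself uses PostgreSQL).

```python
import sqlite3
from typing import Any, Dict, Optional, Sequence


def scalar_or_default(row: Optional[Sequence[Any]], default: Any = 0) -> Any:
    """Return the first column of a fetched row, or default.

    Handles both a missing row (fetchone() returned None) and a row whose
    first column is NULL.
    """
    if row is None or len(row) == 0 or row[0] is None:
        return default
    return row[0]


def demo_empty_table() -> Dict[str, Any]:
    """Run typical stats queries against an empty table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE network_logs (timestamp TEXT)")
    cur = conn.cursor()
    total = scalar_or_default(
        cur.execute("SELECT COUNT(*) FROM network_logs").fetchone()
    )
    latest = scalar_or_default(
        cur.execute("SELECT MAX(timestamp) FROM network_logs").fetchone(),
        default=None,
    )
    first = scalar_or_default(
        cur.execute("SELECT timestamp FROM network_logs LIMIT 1").fetchone(),
        default=None,
    )
    conn.close()
    return {"total": total, "latest": latest, "first": first}
```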

### Crontab Automation Fix (18 Nov 2025 - 09:30)
- **Problem 1**: The training/detection cron jobs failed with `ModuleNotFoundError: No module named 'requests'`
- **Problem 2**: The check_backend/frontend scripts failed with `Permission denied` on `/var/run/ids/`
- **Cause 1**: The crontab entries ran inline Python that imported the `requests` module, which was not installed
- **Cause 2**: The `ids` user has no write permission on `/var/run/ids/`
- **Solution**:
  - Created dedicated shell scripts `cron_train.sh` and `cron_detect.sh` (they use `curl` instead of Python)
  - Updated the monitoring scripts `check_backend.sh` and `check_frontend.sh` (they use `/var/log/ids/` instead of `/var/run/ids/`)
  - Updated `setup_crontab.sh` to use the new scripts
- **Result**: Crontab automation fully working with no external Python dependencies

### Schema Database Fix (17 Nov 2025)
- **Problem**: Table `network_logs` was missing and the TypeScript schema was out of sync with the Python code
- **Solution**: Schema updated with the correct fields (router_name, destination_ip/port, packet_length, raw_message)
- **SQL script**: `database-schema/create_network_logs.sql` creates the table
- **Automatic update**: `./update_from_git.sh --db` applies every SQL script in `database-schema/`

### Pattern Regex Fix (17 Nov 2025)
- **Problem**: The regex patterns did not match the real MikroTik log format
- **Old format (expected by the patterns)**: `src-address=IP:PORT dst-address=IP:PORT proto=UDP` ❌
- **Real format**: `proto UDP, IP:PORT->IP:PORT, len 1280` ✅
- **Result**: 99.9% match rate, ~670K logs saved correctly
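
A pattern for the real format can be sketched as follows. This is not the regex used in `syslog_parser.py`. It is an illustrative pattern for the `proto UDP, IP:PORT->IP:PORT, len 1280` shape only, and the returned keys (`source_ip`, `source_port`, and so on) are assumed names.

```python
import re
from typing import Dict, Optional, Union

_IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Illustrative pattern: protocol, then src->dst endpoints, then length.
LOG_PATTERN = re.compile(
    r"proto (?P<protocol>\w+).*?, "
    rf"(?P<src_ip>{_IPV4}):(?P<src_port>\d+)->"
    rf"(?P<dst_ip>{_IPV4}):(?P<dst_port>\d+)"
    r".*?len (?P<length>\d+)"
)


def parse_log_line(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Extract connection fields from one log line, or None if no match."""
    m = LOG_PATTERN.search(line)
    if m is None:
        return None
    return {
        "protocol": m.group("protocol"),
        "source_ip": m.group("src_ip"),
        "source_port": int(m.group("src_port")),
        "destination_ip": m.group("dst_ip"),
        "destination_port": int(m.group("dst_port")),
        "packet_length": int(m.group("length")),
    }
```

Lines written in the old `src-address=... proto=UDP` form return `None`, so unmatched input can be counted to track the match rate.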

### PostgreSQL Authentication Fix
- **Problem**: Password authentication failed (SCRAM-SHA-256 vs MD5 mismatch)
- **Solution**: `deployment/fix_postgresql_auth.sh` configures SCRAM-SHA-256 in `pg_hba.conf`
- **Password encryption**: `ALTER SYSTEM SET password_encryption = 'scram-sha-256'`
- **User recreated**: DROP + CREATE so the password is stored in the correct SCRAM format

### IPv4 Force Fix
- **Problem**: `syslog_parser` connected to `::1` (IPv6) instead of `127.0.0.1` (IPv4)
- **Solution**: `PGHOST=127.0.0.1` in `.env` (do NOT use `localhost`)
- **Parser**: `load_dotenv()` loads `.env` automatically
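
On dual-stack hosts `localhost` can resolve to `::1` before `127.0.0.1`, which is why the explicit IPv4 address is needed. A defensive guard could look like the sketch below. `pg_connection_params` is a hypothetical helper, and rewriting `localhost` in code is an assumption rather than something the parser is known to do. `PGHOST` and `PGPORT` are the standard libpq variable names.

```python
import os
from typing import Dict, Mapping, Optional


def pg_connection_params(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build host/port settings, forcing IPv4 loopback for local connections."""
    source = os.environ if env is None else env
    host = source.get("PGHOST", "127.0.0.1")
    if host == "localhost":
        # Avoid name resolution picking ::1 (IPv6) first.
        host = "127.0.0.1"
    return {"host": host, "port": source.get("PGPORT", "5432")}
```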

### Git Ownership Fix
- **Problem**: Git "dubious ownership" error in `/opt/ids`
- **Solution**: `deployment/fix_git_ownership.sh` adds `safe.directory`
- **Update script**: `deployment/update_from_git.sh` now runs git as the `ids` user

## Important Files

### Python ML Backend
- `python_ml/ml_analyzer.py`: ML core (25 features, Isolation Forest)
- `python_ml/mikrotik_manager.py`: Router management via REST API
- `python_ml/main.py`: FastAPI server
- `python_ml/requirements.txt`: Python dependencies

### Frontend
- `client/src/pages/Dashboard.tsx`: Main dashboard
- `client/src/pages/Detections.tsx`: Detection list
- `client/src/pages/Routers.tsx`: Router management
- `client/src/App.tsx`: App root with sidebar

### Backend Node
- `server/routes.ts`: API endpoints
- `server/storage.ts`: Database operations
- `server/db.ts`: PostgreSQL connection
- `shared/schema.ts`: Drizzle ORM schema

## Deployment and Updates

### FIRST DEPLOYMENT (Bootstrap) - AlmaLinux Server
**Documentation**: `deployment/BOOTSTRAP_PRIMO_DEPLOYMENT.md`

```bash
# Clone into a separate directory (preserves the existing .env)
cd /opt
sudo -u ids git clone https://[CREDENTIALS]@git.alfacom.it/marco/ids.git ids_git

# Copy the existing .env
sudo -u ids cp /opt/ids/.env /opt/ids_git/.env

# Swap the directories (the old tree is kept as /opt/ids_legacy)
mv /opt/ids /opt/ids_legacy
mv /opt/ids_git /opt/ids

# Install dependencies, then restart the services
cd /opt/ids
sudo -u ids npm install
cd python_ml && sudo -u ids pip3.11 install -r requirements.txt
```

### Future Updates (After Bootstrap)
```bash
# Standard update (code + dependencies)
cd /opt/ids
./update_from_git.sh

# Update with database schema synchronization
./update_from_git.sh --db
```

**IMPORTANT**: `update_from_git.sh` automatically backs up `.env` and `git.env` before pulling.

### Database Schema Export (Structure Only)
```bash
# On the production server, export the schema so it can be committed to git
cd /opt/ids/deployment
./export_db_schema.sh

# Result: database-schema/schema.sql (NO data, DDL ONLY)
```

### Push to Git (From Replit)
```bash
# Export schema + commit + push
cd /opt/ids
./push-gitlab.sh          # Patch version (1.0.0 → 1.0.1)
./push-gitlab.sh minor    # Minor version (1.0.5 → 1.1.0)
./push-gitlab.sh major    # Major version (1.1.5 → 2.0.0)
```

## Useful Commands

### Start Python Backend
```bash
cd python_ml
pip install -r requirements.txt
python main.py
```

### API Calls
```bash
# Training
curl -X POST http://localhost:8000/train \
  -H "Content-Type: application/json" \
  -d '{"max_records": 10000, "hours_back": 24}'

# Detection
curl -X POST http://localhost:8000/detect \
  -H "Content-Type: application/json" \
  -d '{"max_records": 5000, "auto_block": true, "risk_threshold": 75}'

# Stats
curl http://localhost:8000/stats
```
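
The same calls can be made from Python with the standard library only, which avoids the `requests` dependency. A minimal sketch: `build_post` and `send` are hypothetical helpers, and `send` assumes the endpoints reply with a JSON body.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"


def build_post(path: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a JSON POST request for one of the backend endpoints."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as the curl detection example.
detect_request = build_post(
    "/detect",
    {"max_records": 5000, "auto_block": True, "risk_threshold": 75},
)
```

Calling `send(detect_request)` performs the request. It is kept out of the sketch so the block can be loaded without a running backend.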

### Database
```bash
npm run db:push  # Sync schema to PostgreSQL
```

## MikroTik Router Configuration

### Enable the REST API
```
/ip service
set api-ssl disabled=no
set www-ssl disabled=no
```

### Add a Router
Via the web dashboard or SQL:
```sql
INSERT INTO routers (name, ip_address, username, password, api_port, enabled)
VALUES ('Router 1', '192.168.1.1', 'admin', 'password', 443, true);
```

## ML Features (25 total)

### Volume (5)
- total_packets, total_bytes, conn_count
- avg_packet_size, bytes_per_second

### Temporal (8)
- time_span_seconds, conn_per_second
- hour_of_day, day_of_week
- max_burst, avg_burst, burst_variance, avg_interval

### Protocol Diversity (6)
- unique_protocols, unique_dest_ports, unique_dest_ips
- protocol_entropy, tcp_ratio, udp_ratio

### Port Scanning (3)
- unique_ports_contacted, port_scan_score, sequential_ports

### Behavioral (3)
- packets_per_conn, packet_size_variance, blocked_ratio
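
To make a few of these features concrete, the sketch below computes a subset from log rows for a single source IP. The definitions are assumptions, not the formulas in `ml_analyzer.py`: each row is treated as one packet, entropy is Shannon entropy in bits, and the input keys (`protocol`, `destination_port`, `packet_length`) are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Mapping, Sequence


def basic_features(logs: Sequence[Mapping[str, object]]) -> Dict[str, float]:
    """Compute a subset of the volume and protocol features for one source IP."""
    names = (
        "total_packets", "total_bytes", "avg_packet_size",
        "unique_protocols", "unique_dest_ports",
        "protocol_entropy", "tcp_ratio", "udp_ratio",
    )
    total = len(logs)
    if total == 0:
        return {name: 0.0 for name in names}

    protocols = Counter(str(row["protocol"]).upper() for row in logs)
    total_bytes = sum(int(row["packet_length"]) for row in logs)
    # Shannon entropy: sum of p * log2(1/p) over the observed protocols.
    entropy = sum(
        (count / total) * math.log2(total / count)
        for count in protocols.values()
    )
    return {
        "total_packets": float(total),
        "total_bytes": float(total_bytes),
        "avg_packet_size": total_bytes / total,
        "unique_protocols": float(len(protocols)),
        "unique_dest_ports": float(len({row["destination_port"] for row in logs})),
        "protocol_entropy": entropy,
        "tcp_ratio": protocols.get("TCP", 0) / total,
        "udp_ratio": protocols.get("UDP", 0) / total,
    }
```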

## Risk Levels

- 🔴 CRITICAL (85-100): Immediate block
- 🟠 HIGH (70-84): Block + monitoring
- 🟡 MEDIUM (60-69): Monitoring
- 🔵 LOW (40-59): Logging
- 🟢 NORMAL (0-39): No action
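
The score bands map naturally onto a threshold lookup. A minimal sketch: `classify_risk` is a hypothetical helper, the returned labels are illustrative strings, and fractional scores are assigned by lower bound (84.5 falls into the 70-84 band).

```python
from typing import List, Tuple

# (lower bound, label), checked from the highest band down.
RISK_LEVELS: List[Tuple[int, str]] = [
    (85, "CRITICAL"),
    (70, "HIGH"),
    (60, "MEDIUM"),
    (40, "LOW"),
    (0, "NORMAL"),
]


def classify_risk(score: float) -> str:
    """Map a 0-100 score to its risk level label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    for lower_bound, label in RISK_LEVELS:
        if score >= lower_bound:
            return label
    return "NORMAL"  # not reachable for valid scores; kept as a safe default
```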

## Advantages vs Previous System

- **Features**: 150+ → 25 (targeted)
- **Training**: ~5 min → ~10 sec
- **Detection**: Slow → <2 sec
- **Router Comm**: SSH → REST API
- **Multi-Router**: Sequential → Parallel
- **Database**: MySQL → PostgreSQL
- **False Negatives**: High → Low

## Notes

- Whitelist: IPs protected from automatic blocking
- Timeout: Blocks expire after 1h (configurable)
- Parallel Blocking: All routers are updated simultaneously
- Auto-Training: Configurable via cron (every 12h recommended)
- Auto-Detection: Configurable via cron (every 5 min recommended)

## Security

- Router passwords are managed in the database (not in code)
- REST API is more secure than SSH
- Automatic block timeout
- Full logging of operations
- PostgreSQL over a secure connection

## Development

- Frontend: Workflow "Start application" (auto-reload)
- Python Backend: `python python_ml/main.py`
- API Docs: http://localhost:8000/docs
- Database: PostgreSQL via Neon (environment variables configured automatically)

## User Preferences

### Git and Deployment Operations
- **IMPORTANT**: The agent must NOT run git commands (`push-gitlab.sh`) because Replit blocks git operations
- **Correct workflow**:
  1. The user reports errors/problems from the AlmaLinux server
  2. The agent fixes the problems and edits files on Replit
  3. **The user runs manually**: `./push-gitlab.sh` to commit and push
  4. **The user runs on the server**: `./update_from_git.sh` or `./update_from_git.sh --db`
  5. The user tests and reports the results to the agent
  6. Repeat until everything works

### Language
- All agent responses must be in **Italian**
- Code and technical documentation: English
- Commit messages: Italian