# IDS - Intrusion Detection System

Machine Learning based intrusion detection system for MikroTik routers.

## Project

**Type**: Full-stack Web Application + Python ML Backend  
**Stack**: React + FastAPI + PostgreSQL + MikroTik REST API

## Architecture

### Frontend (React)
- Real-time monitoring dashboard
- Views for detections and routers
- Whitelist management
- ShadCN UI components
- TanStack Query for data fetching

### Backend Python (FastAPI)
- **ML Analyzer**: Isolation Forest with 25 targeted features
- **MikroTik Manager**: Parallel REST API communication with 10+ routers
- **Detection Engine**: 0-100 scoring with 5 risk levels
- Endpoints: `/train`, `/detect`, `/block-ip`, `/unblock-ip`, `/stats`

### Backend Node.js (Express)
- REST API for the frontend
- PostgreSQL database management
- Routes: routers, detections, logs, whitelist, training-history

### Database (PostgreSQL)
- `routers`: MikroTik router configuration
- `network_logs`: Syslog entries received from the routers
- `detections`: Anomalies detected by the ML model
- `whitelist`: Trusted IPs
- `training_history`: Model training history

## Workflow

1. **Log Collection**: Router → Syslog (UDP:514) → RSyslog → syslog_parser.py → PostgreSQL `network_logs`
2. **Training**: Python ML extracts 25 features → Isolation Forest
3. **Detection**: Real-time analysis → 0-100 scoring → Classification
4. **Auto-Block**: Critical IP (score >= 80) → REST API → All routers (in parallel)
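
The parallel fan-out in step 4 can be sketched with a thread pool. This is a minimal illustration, not the code in `mikrotik_manager.py`: `block_on_all_routers` and `fake_block` are hypothetical names, and the real REST call to each router is replaced by a stub so the example runs without any hardware.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def block_on_all_routers(
    ip: str,
    routers: List[str],
    block_fn: Callable[[str, str], bool],
    max_workers: int = 10,
) -> Dict[str, bool]:
    """Call block_fn(router, ip) for every router concurrently.

    Returns a mapping router -> success flag. A router whose call raises
    is recorded as False, so one failure does not stop the others.
    """
    results: Dict[str, bool] = {}
    if not routers:
        return results
    workers = min(max_workers, len(routers))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(block_fn, router, ip): router for router in routers}
        for future in as_completed(futures):
            router = futures[future]
            try:
                results[router] = bool(future.result())
            except Exception:
                results[router] = False
    return results


def fake_block(router: str, ip: str) -> bool:
    # Hypothetical stand-in for the real REST call; one router is "unreachable".
    if router == "10.0.0.3":
        raise ConnectionError("router unreachable")
    return True


if __name__ == "__main__":
    print(block_on_all_routers("203.0.113.7", ["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_block))
```

Collecting a per-router result rather than raising on the first error keeps a single unreachable router from hiding the outcome on the others.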

## Recent Fixes (November 2025)

### PostgreSQL Authentication Fix
- **Problem**: Password authentication failed (SCRAM-SHA-256 vs MD5 mismatch)
- **Solution**: `deployment/fix_postgresql_auth.sh` configures SCRAM-SHA-256 in pg_hba.conf
- **Password encryption**: `ALTER SYSTEM SET password_encryption = 'scram-sha-256'`
- **User recreated**: DROP + CREATE so the password is stored in the correct SCRAM format

### IPv4 Force Fix
- **Problem**: syslog_parser connected to ::1 (IPv6) instead of 127.0.0.1 (IPv4)
- **Solution**: PGHOST=127.0.0.1 in .env (do NOT use localhost)
- **Parser**: load_dotenv() loads .env automatically

### Git Ownership Fix
- **Problem**: dubious ownership error in /opt/ids
- **Solution**: `deployment/fix_git_ownership.sh` adds a safe.directory entry
- **Update script**: `deployment/update_from_git.sh` now runs git as the ids user
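
A defensive way to apply the IPv4 fix in code is to normalize the host before connecting. The sketch below is an assumption about how this could be done, not the actual `syslog_parser.py` logic: `pg_settings` is a hypothetical helper, and `PGPORT`, `PGDATABASE` and `PGUSER` are the standard libpq variable names, which the source does not mention explicitly.

```python
import os
from typing import Dict, Mapping, Optional


def pg_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build connection settings from PG* variables, pinning loopback to IPv4."""
    env = os.environ if env is None else env
    host = env.get("PGHOST", "127.0.0.1")
    # "localhost" may resolve to ::1 first; force the IPv4 loopback address.
    if host in ("localhost", "::1"):
        host = "127.0.0.1"
    return {
        "host": host,
        "port": env.get("PGPORT", "5432"),
        "dbname": env.get("PGDATABASE", ""),
        "user": env.get("PGUSER", ""),
    }


if __name__ == "__main__":
    print(pg_settings({"PGHOST": "localhost", "PGDATABASE": "ids"}))
```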

## Important Files

### Python ML Backend
- `python_ml/ml_analyzer.py`: ML core (25 features, Isolation Forest)
- `python_ml/mikrotik_manager.py`: Router management via REST API
- `python_ml/main.py`: FastAPI server
- `python_ml/requirements.txt`: Python dependencies

### Frontend
- `client/src/pages/Dashboard.tsx`: Main dashboard
- `client/src/pages/Detections.tsx`: Detection list
- `client/src/pages/Routers.tsx`: Router management
- `client/src/App.tsx`: App root with sidebar

### Backend Node
- `server/routes.ts`: API endpoints
- `server/storage.ts`: Database operations
- `server/db.ts`: PostgreSQL connection
- `shared/schema.ts`: Drizzle ORM schema

## Deployment and Updates

### Updating from Git (AlmaLinux Server)
```bash
# Standard update (code + dependencies)
cd /opt/ids
./update_from_git.sh

# Update with database schema synchronization
./update_from_git.sh --db
```

### Database Schema Export (Structure Only)
```bash
# On the production server, export the schema so it can be committed to git
cd /opt/ids/deployment
./export_db_schema.sh

# Result: database-schema/schema.sql (NO data, DDL ONLY)
```

## Useful Commands

### Start Python Backend
```bash
cd python_ml
pip install -r requirements.txt
python main.py
```

### API Calls
```bash
# Training
curl -X POST http://localhost:8000/train \
  -H "Content-Type: application/json" \
  -d '{"max_records": 10000, "hours_back": 24}'

# Detection
curl -X POST http://localhost:8000/detect \
  -H "Content-Type: application/json" \
  -d '{"max_records": 5000, "auto_block": true, "risk_threshold": 75}'

# Stats
curl http://localhost:8000/stats
```
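
The same calls can be issued from Python. This is a minimal standard-library sketch: `build_post` and `call` are hypothetical helper names, and it assumes the endpoints answer with JSON, which the source does not state.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request for one of the backend endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payloads as the curl examples above.
train_request = build_post("/train", {"max_records": 10000, "hours_back": 24})
detect_request = build_post(
    "/detect", {"max_records": 5000, "auto_block": True, "risk_threshold": 75}
)

if __name__ == "__main__":
    # Requires the FastAPI server to be running on localhost:8000.
    print(call(train_request))
    print(call(detect_request))
```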

### Database
```bash
npm run db:push  # Sync schema to PostgreSQL
```

## MikroTik Router Configuration

### Enable the REST API
```
/ip service
set api-ssl disabled=no
set www-ssl disabled=no
```

### Add a Router
Via the web dashboard or SQL:
```sql
INSERT INTO routers (name, ip_address, username, password, api_port, enabled)
VALUES ('Router 1', '192.168.1.1', 'admin', 'password', 443, true);
```

## ML Features (25 total)

### Volume (5)
- total_packets, total_bytes, conn_count
- avg_packet_size, bytes_per_second

### Temporal (8)
- time_span_seconds, conn_per_second
- hour_of_day, day_of_week
- max_burst, avg_burst, burst_variance, avg_interval

### Protocol Diversity (6)
- unique_protocols, unique_dest_ports, unique_dest_ips
- protocol_entropy, tcp_ratio, udp_ratio

### Port Scanning (3)
- unique_ports_contacted, port_scan_score, sequential_ports

### Behavioral (3)
- packets_per_conn, packet_size_variance, blocked_ratio
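
To make a few of these features concrete, the sketch below derives some of the protocol-diversity values from a list of log records. It is an illustration under stated assumptions, not the logic of `ml_analyzer.py`: the record keys `protocol` and `dest_port` are hypothetical, and `protocol_entropy` is assumed to be Shannon entropy in bits.

```python
import math
from collections import Counter
from typing import Dict, List


def protocol_features(records: List[dict]) -> Dict[str, float]:
    """Compute a subset of the protocol-diversity features for one source IP."""
    total = len(records)
    if total == 0:
        return {
            "unique_protocols": 0.0,
            "unique_dest_ports": 0.0,
            "protocol_entropy": 0.0,
            "tcp_ratio": 0.0,
            "udp_ratio": 0.0,
        }
    counts = Counter(r["protocol"].lower() for r in records)
    # Shannon entropy: sum of p * log2(1/p) over the observed protocols.
    entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
    return {
        "unique_protocols": float(len(counts)),
        "unique_dest_ports": float(len({r["dest_port"] for r in records})),
        "protocol_entropy": entropy,
        "tcp_ratio": counts.get("tcp", 0) / total,
        "udp_ratio": counts.get("udp", 0) / total,
    }


if __name__ == "__main__":
    sample = [
        {"protocol": "tcp", "dest_port": 80},
        {"protocol": "TCP", "dest_port": 443},
        {"protocol": "udp", "dest_port": 53},
        {"protocol": "udp", "dest_port": 53},
    ]
    print(protocol_features(sample))
```

A source that uses a single protocol yields an entropy of 0, while an even split across two protocols yields 1 bit.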

## Risk Levels

- 🔴 CRITICAL (85-100): Immediate block
- 🟠 HIGH (70-84): Block + monitoring
- 🟡 MEDIUM (60-69): Monitoring
- 🔵 LOW (40-59): Logging
- 🟢 NORMAL (0-39): No action
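
The bands above translate directly into a threshold lookup. In this minimal sketch the function name `risk_level` and the English label strings are illustrative choices; only the numeric boundaries come from the list.

```python
def risk_level(score: float) -> str:
    """Map a 0-100 risk score to one of the five levels listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 85:
        return "CRITICAL"
    if score >= 70:
        return "HIGH"
    if score >= 60:
        return "MEDIUM"
    if score >= 40:
        return "LOW"
    return "NORMAL"


if __name__ == "__main__":
    for value in (92, 75, 61, 45, 10):
        print(value, risk_level(value))
```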

## Advantages over the Previous System

- **Features**: 150+ → 25 (targeted)
- **Training**: ~5 min → ~10 sec
- **Detection**: Slow → <2 sec
- **Router Comm**: SSH → REST API
- **Multi-Router**: Sequential → Parallel
- **Database**: MySQL → PostgreSQL
- **False Negatives**: High → Low

## Notes

- Whitelist: IPs protected from automatic blocking
- Timeout: Blocks expire after 1h (configurable)
- Parallel Blocking: All routers are updated simultaneously
- Auto-Training: Configurable via cron (recommended every 12h)
- Auto-Detection: Configurable via cron (recommended every 5 min)
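
The whitelist and timeout rules can be expressed as two small checks. This is a sketch under assumptions: `ips_to_block` and `is_expired` are hypothetical names, and the one-hour default mirrors the note above rather than any confirmed configuration key.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Set

# Default block duration from the notes above; configurable in the real system.
BLOCK_TIMEOUT = timedelta(hours=1)


def ips_to_block(candidates: Iterable[str], whitelist: Set[str]) -> List[str]:
    """Drop whitelisted IPs from the list of candidates for auto-blocking."""
    return [ip for ip in candidates if ip not in whitelist]


def is_expired(
    blocked_at: datetime, now: datetime, timeout: timedelta = BLOCK_TIMEOUT
) -> bool:
    """Return True once a block has been active for at least the timeout."""
    return now - blocked_at >= timeout


if __name__ == "__main__":
    start = datetime(2025, 11, 1, 12, 0, tzinfo=timezone.utc)
    print(ips_to_block(["203.0.113.7", "192.168.1.10"], {"192.168.1.10"}))
    print(is_expired(start, start + timedelta(minutes=30)))
    print(is_expired(start, start + timedelta(hours=1)))
```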

## Security

- Router passwords are stored in the database (not in code)
- REST API over HTTPS used in place of SSH for router communication
- Automatic block timeout
- Full logging of operations
- PostgreSQL over a secure connection

## Development

- Frontend: Workflow "Start application" (auto-reload)
- Python Backend: `python python_ml/main.py`
- API Docs: http://localhost:8000/docs
- Database: PostgreSQL via Neon (environment variables configured automatically)

## User Preferences

### Git Operations
- **IMPORTANT**: All git operations (commit, push) are performed **manually by the user** through the Replit shell
- The agent **must NEVER** run git commands automatically
- The user prefers to keep full control over commits and versioning
- Workflow: Agent edits files → User runs git commit/push manually