python train_hybrid.py --test [WARNING] Extended Isolation Forest not available, using standard IF ====================================================================== IDS HYBRID ML TEST - SYNTHETIC DATA ====================================================================== INFO:dataset_loader:Creating sample dataset (10000 samples)... INFO:dataset_loader:Sample dataset created: 10000 rows INFO:dataset_loader:Attack distribution: attack_type normal 8981 brute_force 273 suspicious 258 ddos 257 port_scan 231 Name: count, dtype: int64 [TEST] Created synthetic dataset: 10000 samples Normal: 8,981 (89.8%) Attacks: 1,019 (10.2%) [TEST] Training on 6,281 normal samples... [HYBRID] Training hybrid model on 6281 logs... [HYBRID] Extracted features for 100 unique IPs [HYBRID] Pre-training Isolation Forest for feature selection... [HYBRID] Generated 3 pseudo-anomalies from pre-training IF [HYBRID] Feature selection: 25 → 18 features [HYBRID] Selected features: total_packets, conn_count, time_span_seconds, conn_per_second, hour_of_day... (+13 more) [HYBRID] Normalizing features... [HYBRID] Training Extended Isolation Forest (contamination=0.03)... /opt/ids/python_ml/venv/lib64/python3.11/site-packages/sklearn/ensemble/_iforest.py:307: UserWarning: max_samples (256) is greater than the total number of samples (100). max_samples will be set to n_samples for estimation. warn( [HYBRID] Generating pseudo-labels from Isolation Forest... [HYBRID] ⚠ IF found only 3 anomalies (need 10) [HYBRID] Applying ADAPTIVE percentile fallback... [HYBRID] Trying 5% percentile → 5 anomalies [HYBRID] Trying 10% percentile → 10 anomalies [HYBRID] ✅ Success with 10% percentile [HYBRID] Pseudo-labels: 10 anomalies, 90 normal [HYBRID] Training ensemble classifier (DT + RF + XGBoost)... [HYBRID] Class distribution OK: [0 1] (counts: [90 10]) [HYBRID] Ensemble .fit() completed successfully [HYBRID] ✅ Ensemble verified: produces 2 class probabilities [HYBRID] Ensemble training completed and verified! [HYBRID] Models saved to models [HYBRID] Ensemble classifier included [HYBRID] ✅ Training completed successfully! 10/100 IPs flagged as anomalies [HYBRID] ✅ Ensemble classifier verified and ready for production [DETECT] Ensemble classifier available - computing hybrid score... [DETECT] IF scores: min=0.0, max=100.0, mean=57.6 [DETECT] Ensemble scores: min=86.9, max=97.2, mean=92.1 [DETECT] Combined scores: min=54.3, max=93.1, mean=78.3 [DETECT] ✅ Hybrid scoring active: 40% IF + 60% Ensemble [TEST] Detection results: Total detections: 100 High confidence: 0 Medium confidence: 85 Low confidence: 15 [TEST] Top 5 detections: 1. 192.168.0.24: risk=93.1, type=suspicious, confidence=medium 2. 192.168.0.27: risk=92.7, type=suspicious, confidence=medium 3. 192.168.0.88: risk=92.5, type=suspicious, confidence=medium 4. 192.168.0.70: risk=92.3, type=suspicious, confidence=medium 5. 192.168.0.4: risk=91.4, type=suspicious, confidence=medium ❌ Error: index 7000 is out of bounds for axis 0 with size 3000 Traceback (most recent call last): File "/opt/ids/python_ml/train_hybrid.py", line 361, in main test_on_synthetic(args) File "/opt/ids/python_ml/train_hybrid.py", line 283, in test_on_synthetic y_pred[i] = 1 ~~~~~~^^^ IndexError: index 7000 is out of bounds for axis 0 with size 3000