hill climbing

Published 2025-05-15

A weighted ensemble of models; a linear reduction of stacking. The process, very generally: start with your best single model, then repeatedly try adding each of the other models (repeats allowed) and keep the one that yields the greatest improvement, stopping when nothing helps. The final prediction is just the average of everything selected, so a model chosen k times out of m selections gets weight k/m. It’s conceptually pretty straightforward but the score improvements are remarkable.

Hill climbers benefit from a diversity of models — frequently, an individually low-scoring model will greatly improve a hill climber if it differs significantly from the existing stack.
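One quick way to gauge that kind of diversity (not part of the climber itself, just a sanity check I find useful) is to look at how correlated the models’ out-of-fold predictions are with each other; a model whose predictions are only weakly correlated with the rest is a promising candidate even if its own score is mediocre. A minimal sketch, assuming oof_preds is a list of per-model OOF prediction arrays as in the code below:

import numpy as np

# oof_preds: list of per-model out-of-fold prediction arrays (assumed to exist)
oof_mat = np.column_stack(oof_preds)
corr = np.corrcoef(oof_mat, rowvar=False)  # model-by-model correlation matrix
# low off-diagonal values flag models that bring something different to the blend
print(np.round(corr, 3))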

Sample code:

import numpy as np

def rmsle(y_true, y_pred):
    # root mean squared log error, the metric the climber optimises
    return np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))

# assumed already in scope: names (list of model names), oof_preds and
# test_preds (lists of per-model prediction arrays), y_valid (validation targets)
n = len(names)
oof_mat = np.clip(np.column_stack(oof_preds), 0, None)
test_mat = np.clip(np.column_stack(test_preds), 0, None)

# initialise with the single best model
best_idx = np.argmin([rmsle(y_valid, oof_mat[:, i]) for i in range(n)])
ensemble = [best_idx]
ens_sum_oof = oof_mat[:, best_idx].copy()
ens_sum_tst = test_mat[:, best_idx].copy()
best_score = rmsle(y_valid, ens_sum_oof)
print(f"start: {names[best_idx]}, RMSLE={best_score:.5f}")

# greedy loop: each pass tries appending every model (repeats allowed)
# and keeps the one that lowers the OOF score the most
for _ in range(100):  # a `while True` works too
    found_better = False
    for j in range(n):
        cand_sum = ens_sum_oof + oof_mat[:, j]
        cand_pred = cand_sum / (len(ensemble) + 1)
        score = rmsle(y_valid, cand_pred)
        if score < best_score - 1e-6:  # tiny tolerance to ignore noise-level gains
            best_score = score
            best_cand = j
            found_better = True
    if not found_better:
        break
    # accept the best candidate from this pass
    ensemble.append(best_cand)
    ens_sum_oof += oof_mat[:, best_cand]
    ens_sum_tst += test_mat[:, best_cand]
    print(f"  + {names[best_cand]:<15} → RMSLE {best_score:.5f}")

# final blended prediction: a plain average of the selected models,
# so a model picked k times gets weight k / len(ensemble)
final_pred_test = ens_sum_tst / len(ensemble)
weights = {names[i]: ensemble.count(i) / len(ensemble) for i in set(ensemble)}
print("\nEnsemble weights:")
for k, w in weights.items():
    print(f"  {k:<15}: {w:.3f}")

It’s also worth considering that, in cases where adding any single model doesn’t improve the score, adding two or more models at once might. I’m therefore now using a hill climber that ‘looks 2 models ahead’; it sometimes performs better than the one-look-ahead climber.
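A minimal sketch of the idea (not the exact implementation), reusing oof_mat, test_mat, ens_sum_oof, ens_sum_tst, ensemble, best_score, y_valid and rmsle from the snippet above; the pair search is quadratic in the number of models, so it’s only worth running once the one-step search stalls:

from itertools import combinations_with_replacement

def best_pair(ens_sum_oof, ensemble, best_score):
    # try appending two models at once; returns (pair, new_score) or (None, best_score)
    best, m, n = None, len(ensemble), oof_mat.shape[1]
    for j, k in combinations_with_replacement(range(n), 2):
        cand_pred = (ens_sum_oof + oof_mat[:, j] + oof_mat[:, k]) / (m + 2)
        score = rmsle(y_valid, cand_pred)
        if score < best_score - 1e-6:
            best_score, best = score, (j, k)
    return best, best_score

# inside the greedy loop, when no single addition improves the score:
# pair, best_score = best_pair(ens_sum_oof, ensemble, best_score)
# if pair is None:
#     break
# for idx in pair:
#     ensemble.append(idx)
#     ens_sum_oof += oof_mat[:, idx]
#     ens_sum_tst += test_mat[:, idx]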