PID controllers are ubiquitous in robotics for their simplicity and effectiveness. While Bayesian Optimization (BO) and Differential Evolution (DE) automate gain selection, the influence of initial PID gains and the exploration–exploitation (E–E) balance remains poorly understood. We propose a unified framework that systematically crosses multiple initial-state hypotheses with three E–E strategies — balanced, exploration‑heavy, and exploitation‑heavy — and evaluates them on two mobile‑robot platforms. Results demonstrate that a balanced E–E policy seeded with well‑chosen initial gains produces the fastest convergence (≤1.1 s settling) and 100 % reliability, while DE offers unmatched robustness across all scenarios.
The framework consists of two cooperating modules: a configuration stage that crosses the initial-gain states with the three E–E strategies (6 unique configurations per optimizer), and a tuning executor that runs the loop below.

    // Inputs: config, optimizer_type (BO / DE), ST_threshold, constraints
    initialize optimizer(config.initial_state)
    repeat
        (Kp, Ki, Kd) ← optimizer.suggest()
        (ST, RT, OS) ← run_rotation(Kp, Ki, Kd)   // execute a 90° turn
        if (RT, OS) within constraints then
            optimizer.update(ST)                   // reward
        else
            optimizer.penalize()                   // discourage
    until ST ≤ ST_threshold or max_iter reached
    return best (Kp, Ki, Kd), ST
The executor stops early when the settling‑time (ST) objective is achieved, drastically reducing iteration count for promising configurations.
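As a concrete illustration, a minimal Python sketch of this executor loop, including the early stop, follows. The interface names (suggest, update, penalize, run_rotation) mirror the pseudocode above but are assumptions rather than the actual implementation; the optimizer object is assumed to remember its most recent suggestion.

```python
# Minimal sketch of the tuning executor (assumed interface, not the authors' code).
def tune_pid(optimizer, run_rotation, st_threshold, rt_max, os_max, max_iter=50):
    best_gains, best_st = None, float("inf")
    for _ in range(max_iter):
        kp, ki, kd = optimizer.suggest()          # candidate PID gains
        st, rt, os_ = run_rotation(kp, ki, kd)    # execute one 90° turn
        if rt <= rt_max and os_ <= os_max:        # rise-time / overshoot constraints
            optimizer.update(st)                  # reward: lower settling time is better
            if st < best_st:
                best_gains, best_st = (kp, ki, kd), st
        else:
            optimizer.penalize()                  # discourage infeasible gains
        if best_st <= st_threshold:               # early stop once the ST objective is met
            break
    return best_gains, best_st
```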
We ran 12 distinct trials (2 initial states × 3 E–E levels × 2 optimizers) on each robot, 24 in total across the two platforms, repeating each trial 10 times for statistical significance. Table 1 summarises the best-run metrics; Figures 3 and 4 visualise the settling-time distributions.
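For illustration, the per-robot trial grid can be enumerated as below; the initial-state labels and the mapping of the three E–E levels to optimizer hyperparameters (UCB kappa for BO, mutation/crossover F and CR for DE) are plausible assumptions, not the exact settings used in the study.

```python
from itertools import product

# Hypothetical per-robot trial grid; labels and hyperparameter values are illustrative.
INITIAL_STATES = ["informed-gains", "naive-gains"]
EE_LEVELS = {
    "exploration-heavy":  {"bo_kappa": 5.0, "de_F": 0.9, "de_CR": 0.3},
    "balanced":           {"bo_kappa": 2.5, "de_F": 0.6, "de_CR": 0.7},
    "exploitation-heavy": {"bo_kappa": 0.5, "de_F": 0.4, "de_CR": 0.9},
}
OPTIMIZERS = ["BO", "DE"]
REPEATS = 10  # repetitions per trial for statistical significance

trials = list(product(INITIAL_STATES, EE_LEVELS, OPTIMIZERS))
assert len(trials) == 12  # 12 distinct trials per robot, 24 across both platforms
```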
Table 1: Best-run metrics per configuration (ST = settling time, RT = rise time, OS = overshoot).

| Robot | E–E Config | Optimizer | ST (ms) | Conv. (%) | RT (ms) | OS (%) | Iterations |
|---|---|---|---|---|---|---|---|
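The metrics in Table 1 follow standard step-response definitions. The sketch below shows one plausible way to extract them from a logged heading trace; the 10–90 % rise time and the ±2 % settling band are conventional choices assumed here, not values stated in the text.

```python
import numpy as np

# Illustrative extraction of ST, RT, and OS from a logged 90° turn; the 2% settling
# band and 10-90% rise time are conventional assumptions, not values from the paper.
def step_metrics(t, theta, target=90.0, settle_band=0.02):
    t = np.asarray(t, dtype=float)          # timestamps (s)
    theta = np.asarray(theta, dtype=float)  # heading angle (deg)

    # Overshoot: peak excursion beyond the target, as a percentage of the target.
    overshoot = max(0.0, (theta.max() - target) / target * 100.0)

    # Rise time: first crossing of 10% to first crossing of 90% of the target.
    rise_time = t[np.argmax(theta >= 0.9 * target)] - t[np.argmax(theta >= 0.1 * target)]

    # Settling time: last instant the response lies outside the ±2% band.
    outside = np.abs(theta - target) > settle_band * target
    settling_time = t[np.where(outside)[0][-1]] if outside.any() else t[0]

    return settling_time, rise_time, overshoot
```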
This study underscores the intertwined roles of initial PID gains and exploration‑exploitation strategy in auto‑tuning. A deliberately balanced search seeded with informed gains yields rapid, reliable performance, while DE remains a fail‑safe optimiser under tough conditions. Future work will fuse BO’s sample‑efficiency with DE’s robustness and extend the framework to multi‑objective tasks.
This work was supported in part by the National Science Foundation (NSF) CRII: CPS program under grant number 2347294.