Systematic Evaluation of Initial States and Exploration–Exploitation Strategies in PID Auto-Tuning:
A Framework‑Driven Approach Applied to Mobile Robots

University of Michigan–Dearborn

Abstract

PID controllers are ubiquitous in robotics for their simplicity and effectiveness. While Bayesian Optimization (BO) and Differential Evolution (DE) automate gain selection, the influence of initial PID gains and the exploration–exploitation (E–E) balance remains poorly understood. We propose a unified framework that systematically crosses multiple initial-state hypotheses with three E–E strategies — balanced, exploration‑heavy, and exploitation‑heavy — and evaluates them on two mobile‑robot platforms. Results demonstrate that a balanced E–E policy seeded with well‑chosen initial gains produces the fastest convergence (≤1.1 s settling) and 100 % reliability, while DE offers unmatched robustness across all scenarios.


Framework Overview

Figure 1. Block diagram of the proposed PID auto‑tuning framework, illustrating how the Configurations Generator pairs initial PID states with exploration–exploitation levels before the Trials Executer tunes them with BO / DE.

The framework consists of two cooperating modules:

  1. Configurations Generator — forms the Cartesian product of predefined initial PID states and E–E levels, yielding 6 unique configurations per optimizer (a minimal sketch follows this list).
  2. Trials Executer — sequentially runs each configuration on the target robot, closes the optimization loop, and logs performance metrics.
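For concreteness, the sketch below shows one way the Configurations Generator could form these pairings in Python. The gain values and level names are illustrative assumptions, not the values used in the experiments.

# Sketch of the Configurations Generator: cross the predefined initial PID
# states with the three E-E levels (the values below are assumed examples).
from itertools import product

INITIAL_STATES = {
    "state_1": {"Kp": 2.0, "Ki": 0.1, "Kd": 0.05},   # assumed: high P, low I/D
    "state_2": {"Kp": 0.5, "Ki": 0.5, "Kd": 0.50},   # assumed: moderate gains
}
EE_LEVELS = ["balanced", "exploration_heavy", "exploitation_heavy"]

def generate_configurations(optimizer_type):
    """Return the 6 (initial state x E-E level) configurations for one optimizer."""
    return [
        {"optimizer": optimizer_type, "initial_state": name, "ee_level": level}
        for name, level in product(INITIAL_STATES, EE_LEVELS)
    ]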

Methodology

Figure 2. Trials Executer workflow for one configuration (trial).

Algorithm 1 – Trials Executer Pseudocode

// Inputs: config, optimizer_type (BO / DE), ST_threshold, constraints
initialize optimizer(config.initial_state, config.ee_level)
best_ST ← ∞
repeat
    (Kp, Ki, Kd) ← optimizer.suggest()
    (ST, RT, OS) ← run_rotation(Kp, Ki, Kd)   // Execute 90° turn
    if (RT, OS) within constraints then
        optimizer.update(ST)                  // reward: settling time is the cost
        if ST < best_ST then
            best ← (Kp, Ki, Kd); best_ST ← ST
    else
        optimizer.penalize()                  // discourage constraint violations
until (ST ≤ ST_threshold and (RT, OS) within constraints) or max_iter reached
return best, best_ST

The executor stops early when the settling‑time (ST) objective is achieved, drastically reducing iteration count for promising configurations.
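As a concrete illustration, the sketch below implements the same loop in Python for the BO case using scikit-optimize's ask/tell interface. Here run_rotation is a toy stand‑in for the real 90° turn, and the search bounds, constraint limits, threshold, and penalty value are assumptions rather than the settings used in the study.

from skopt import Optimizer

BOUNDS = [(0.0, 10.0), (0.0, 2.0), (0.0, 2.0)]  # (Kp, Ki, Kd) search space (assumed)
ST_THRESHOLD = 1.1                              # target settling time in seconds (assumed)
MAX_ITER = 50
PENALTY = 10.0                                  # large cost that discourages violations

def run_rotation(kp, ki, kd):
    """Toy stand-in for executing a 90-degree turn; returns (ST, RT, OS)."""
    st = 1.0 + 0.2 * abs(kp - 3.0)              # illustrative surrogate response only
    return st, 0.5 * st, 5.0 * ki

def within_constraints(rt, os_):
    return rt < 2.0 and os_ < 20.0              # assumed rise-time / overshoot limits

def execute_trial():
    opt = Optimizer(dimensions=BOUNDS, base_estimator="GP", acq_func="EI")
    best_gains, best_st = None, float("inf")
    for _ in range(MAX_ITER):
        kp, ki, kd = opt.ask()                  # optimizer.suggest()
        st, rt, os_ = run_rotation(kp, ki, kd)
        feasible = within_constraints(rt, os_)
        opt.tell([kp, ki, kd], st if feasible else PENALTY)  # reward or penalize
        if feasible and st < best_st:
            best_gains, best_st = (kp, ki, kd), st
        if feasible and st <= ST_THRESHOLD:
            break                               # early stop: ST objective achieved
    return best_gains, best_st

The DE case follows the same pattern, with the differential-evolution search proposing candidate gains in place of the Gaussian-process surrogate.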


Experimental Results

We ran 12 distinct trial configurations (2 initial states × 3 E–E levels × 2 optimizers) per robot, 24 across the two platforms, repeating each 10 times to support statistical comparison. Table 1 summarises the best‑run metrics; Figures 3 and 4 visualise the settling‑time distributions.

Table 1. Best‑run metrics for each configuration: robot, E–E configuration, optimizer, settling time ST (ms), convergence rate (%), rise time RT (ms), overshoot OS (%), and iteration count.
Figure 3. Settling‑time KDEs for the omnidirectional robot across E–E levels (BO in blue, DE in red).
Figure 4. Settling‑time KDEs for the differential‑drive robot across E–E levels.

Discussion

RQ1 – Exploration vs. Exploitation

  • A balanced policy consistently delivered the fastest settling times and 100 % convergence on both platforms.
  • Exploration‑heavy settings occasionally discovered lower‑cost regions but required extra iterations to refine.
  • Exploitation‑heavy settings favoured DE, which capitalised on good initial gains without global search overhead (one plausible mapping of the three E–E levels onto optimizer hyperparameters is sketched below).
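One plausible way to realize the three E–E levels is through the optimizers' standard hyperparameters, e.g. the Expected Improvement parameter xi in scikit-optimize and the mutation/recombination rates in SciPy's differential evolution. The mapping below is an assumption for illustration; the study's exact parameterization is not reproduced here.

# Hedged sketch: one plausible mapping from E-E level to optimizer hyperparameters.
EE_SETTINGS = {
    "BO": {   # larger xi biases Expected Improvement toward exploration
        "exploration_heavy":  {"acq_func_kwargs": {"xi": 0.10}},
        "balanced":           {"acq_func_kwargs": {"xi": 0.01}},
        "exploitation_heavy": {"acq_func_kwargs": {"xi": 0.001}},
    },
    "DE": {   # higher mutation explores more; higher recombination converges faster
        "exploration_heavy":  {"mutation": (0.8, 1.5), "recombination": 0.3},
        "balanced":           {"mutation": (0.5, 1.0), "recombination": 0.7},
        "exploitation_heavy": {"mutation": (0.2, 0.5), "recombination": 0.9},
    },
}

These settings would be passed to skopt.Optimizer(..., acq_func="EI", acq_func_kwargs=...) and scipy.optimize.differential_evolution(..., mutation=..., recombination=...), respectively.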

RQ2 – Impact of Initial Gains

  • Initial State 1 (high P, low I/D) accelerated convergence on the differential‑drive robot by ~3 % relative to State 2.
  • The omnidirectional platform proved less sensitive to initial gains, owing to its holonomic kinematics; a sketch of how an initial state can seed each optimizer follows below.
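Algorithm 1 initializes the optimizer from config.initial_state without detailing how. One plausible realization is sketched below; the helper names, population size, and spread are assumptions.

# Hedged sketch: seeding BO and DE with a chosen initial PID state.
import numpy as np
from scipy.optimize import differential_evolution

def seed_bo(opt, initial_state, evaluate):
    """Evaluate the initial gains once and report the result to a skopt Optimizer,
    so its surrogate model starts from that point rather than from a blank prior."""
    x0 = [initial_state["Kp"], initial_state["Ki"], initial_state["Kd"]]
    opt.tell(x0, evaluate(*x0))

def seeded_de(objective, bounds, initial_state, pop_size=15, spread=0.1):
    """Run SciPy's DE with an initial population sampled around the initial gains."""
    x0 = np.array([initial_state["Kp"], initial_state["Ki"], initial_state["Kd"]])
    lo, hi = np.array(bounds, dtype=float).T
    init = np.clip(x0 + spread * (hi - lo) * np.random.randn(pop_size, 3), lo, hi)
    return differential_evolution(objective, bounds, init=init)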

RQ3 – BO vs. DE

  • BO reached sub‑1.2 s settling in as few as three iterations but exhibited occasional non‑convergence when exploration was restricted.
  • DE achieved perfect reliability across 240 runs, albeit sometimes requiring >50 iterations to equal BO’s best times.

Conclusion

This study underscores the intertwined roles of initial PID gains and exploration–exploitation strategy in auto‑tuning. A deliberately balanced search seeded with informed gains yields rapid, reliable performance, while DE remains a fail‑safe optimizer under tough conditions. Future work will fuse BO's sample‑efficiency with DE's robustness and extend the framework to multi‑objective tasks.


Acknowledgements

This work was supported in part by the National Science Foundation (NSF) CRII: CPS program under grant number 2347294.