An energy and area efficient, all digital entropy source compatible with modern standards based on jitter pipelining

This paper proposes an energy and area efficient entropy source, suitable for true random number generation, accompanied by a stochastic model, in a 28 nm CMOS technology. The design uses a jitter pipelining architecture together with an increased timing resolution to achieve a maximal throughput of 298 Mbit/s and a best energy efficiency of 1.46 pJ/bit at a supply of 0.8 V. The generated random bits pass the NIST SP 800-90B IID tests with a min entropy rate of 0.933 bit/bit, which is more than required by the AIS-31 standard. The all digital design allows for effortless transfer to other technology nodes, taking advantage of all benefits related to further technology scaling.


Introduction
Modern cryptographic systems require a substantial amount of true random data (e.g. key material, masks, initialisation, etc.). This demand for cryptographic grade randomness only tends to increase in the near future with the emergence of post quantum secure cryptographic algorithms. By providing high quality randomness, True Random Number Generators (TRNGs) form a solid foundation that allows for the implementation of higher level algorithms and protocols.
Validating the performance of an Entropy Source (ES) (entropy generating component of a TRNG) is often done by solely assessing the quality of the generated random data [TRA21,KLK17]. Following an iterative approach (Fig. 1, top), the verifier generates a certain amount (determined by the selected test) of random data and applies that data to a series of statistical tests. The tests determine if the data can be regarded as "random" with a predefined significance level. If the data fails the tests, certain design parameters of the ES circuit are fine-tuned and the tests are run again, until the ES output passes. Previous examples [Dic03] have shown that this approach can lead to TRNGs that are prone to prediction attacks. Statistical tests that only work with the output of the ES/TRNG cannot differentiate between sequences generated by deterministic algorithms, TRNGs, or a combination of both [BBF09].
To overcome this concern, a new approach (Fig. 1, bottom) was proposed by international standardisation bodies [TBK+18, KS11, ISO19]. The new workflow is centered on the existence of a stochastic model characterising the entropy extraction process taking place in the ES circuitry. Entropy requirements (by the standard and/or the application), model assumptions (e.g. the existence of a certain type of noise source), and platform parameters (e.g. the intrinsic gate delay) form the input to this model. From the model, an optimisation procedure can be determined to select the value of the ES design parameters.
• The design is implemented in a 28 nm CMOS technology and measurement results are available.

ES architecture
This section will provide a high-level description of the ES architecture and jitter pipeline principle, before heading to a more detailed mathematical analysis of the design in Sect. 3.

Jitter pipeline
The proposed ES architecture is depicted in Fig. 2. Three components can be differentiated: a DC, a TDC and a digitisation block. Both the DC and the TDC consist of two ROs: DC0, DC1 and TDC0, TDC1 respectively. Timing jitter will naturally accumulate in all four ROs, when left running for a specified accumulation time interval. The TDC ROs are used to resolve the timing jitter generated by the DC ROs with a resolution related to the period difference of the two TDC ROs. The resolvement action leads to a digital representation of the timing difference created by the DC. This representation is then used to construct a random output bit. To minimise idle time, the DC can already be restarted to accumulate jitter for the following output bit, during the resolvement phase of the current output bit. Timing jitter further accumulates during the resolvement phase, as the TDC contains free running ROs as well. Both phases therefore provide independent contributions to the output bit entropy, effectively creating a jitter pipeline, where jitter is being generated in a first (DC) stage, before being handed over to a second (TDC) stage, where it accumulates further. The pipelining principle is indicated by the shaded boxes in Fig. 3.
The concept of jitter pipelining is not limited to the jitter accumulation and resolvement using a DC and a TDC structure as showcased in this work, but should be regarded as a broader concept, that might be exploited in other ES architectures as well.

Architecture timing description
A start edge is applied to both DC ROs. Each DC RO consists of a chain of four delay-configurable inverters. The Edge To Level (E2L) blocks in Fig. 2 react to the $n$-th positive edge generated by the DC ROs by disabling the DC ROs and outputting a positive edge ($DC_0$ and $DC_1$). The time it takes for the start edge to propagate through both DC ROs for $n$ cycles, to the output of the E2L block, is indicated as $T_0^n$ and $T_1^n$, for DC0 and DC1 respectively, in the timing diagram in Fig. 3. Random timing jitter variations make the timing difference $T_\Delta^n = T_0^n - T_1^n$ a random variable over multiple evaluations. The E2L outputs ($DC_0$ and $DC_1$ in Fig. 2) enable the TDC ROs to start oscillating ($TDC_0$ and $TDC_1$).
Both TDC ROs are configured to have a slightly different oscillation period, which defines the TDC resolution: $res = |P_{TDC0} - P_{TDC1}|$. The ROs start with an initial phase difference determined by $T_\Delta^n$ and keep oscillating until the phase difference is either 0° or 180° ($\pi$ radians). The digitisation circuitry detects this phase synchronisation as the bottom RO (TDC1) will start to sample a different logic value from the top RO (TDC0) by means of an XOR gate. A T flip-flop will determine if, during the phase synchronisation, TDC1 experienced an odd or an even number of cycles. The output of this T flip-flop is used as the random output bit.

Stochastic model
This section elaborates a mathematical characterisation of the proposed circuit in Sect. 2 and quantifies the amount of entropy being extracted from the available timing jitter. The entropy estimation presented in this work, will be solely based on the existence of unmanipulatable thermal noise. Other noise sources will inevitably also be present. As we assume thermal noise is independent from all other sources of noise, the coexistence of these other noise sources will not lead to an entropy reduction and the estimation provided here is certainly a lower bound.

Model assumptions
To start off, four main assumptions made in the model are listed below:
• Thermal noise is unmanipulatable and independent from other noise sources.
• DC and TDC ROs are all mutually independent oscillators, affected by thermal noise.
• RO phase affected by thermal noise behaves as a Wiener process with drift.

Notation
In this text, random variables and their realisations are denoted as uppercase and lowercase characters respectively. A stochastic process through time $t \ge 0$ is represented as an uppercase function (e.g. $\{X(t)\}_{t \ge 0}$) and a realisation as a lowercase function (e.g. $\{x(t)\}_{t \ge 0}$). The Probability Mass Function (PMF) or Probability Density Function (PDF) for a discrete or continuous random variable $Y$, respectively, is denoted as $f_Y(\cdot)$. The Cumulative Distribution Function (CDF) of a random variable $Y$ is represented as $F_Y(\cdot)$. The expected value and variance of a random variable $Y$ are denoted as $E[Y]$ and $\mathrm{Var}[Y]$. The probability of an event $E$ is noted as $P[E]$. A conditioned random variable $X$, given the knowledge of another random variable $Y$, is denoted as $X|Y$. A random variable $Y$ following a distribution $D$ with parameters $p_i$ is represented as $Y \sim D(p_1, p_2, \cdots)$. The distributions being used in this text are:
• Gaussian distribution: $N(a, b^2)$, with mean $a$ and variance $b^2$.
• Inverse Gaussian distribution: $IG(a, b^2)$, with mean $a$ and variance $a^3/b^2$.
The PDF and CDF of a standard normal distributed variable (Gaussian distributed with mean 0 and variance 1) are denoted as $\phi_{norm}(\cdot)$ and $\Phi_{norm}(\cdot)$ respectively.

Description of a noisy oscillating signal
Some prerequisite knowledge on timing jitter in free running ROs is given.

Noiseless
The phase of an oscillating noiseless signal is a continuous linear function through time $t$:
$$\phi(t) = \mu t + \phi_0,$$
with $\mu$ defining the oscillation speed or angular frequency and $\phi_0$ determining the phase at time zero. The phase of an oscillator cannot be explicitly observed. The observable waveform $e(\phi)$ (current flow or node voltage) is defined as a function of the implicit phase; examples are sinusoidal, square and sawtooth waveforms with amplitude $A$. The operator $x \bmod a$ is shorthand notation for $x - \lfloor x/a \rfloor a$, or the positive remainder after division by $a$. The waveform can then be described as a composite function of time: $w(t) = (e \circ \phi)(t)$. Each of these waveforms has a period $P_w = 2\pi/\mu$, meaning that $w(t + P_w) = w(t)$ for any $t$. The waveform frequency is the inverse of the period: $F_w = \mu/(2\pi)$.

Noisy
In this work, we assume the phase of an oscillator affected by thermal noise, $\{\Phi(t)\}_{t \ge 0}$, to behave as a Wiener process with drift:
$$\Phi(t) = \phi_0 + \mu t + \sigma W(t),$$
with $\phi_0$ again the phase at time zero, $\mu$ the drift and $\sigma^2$ the infinitesimal variance. $\{W(t)\}_{t \ge 0}$ represents a Wiener process without drift. The oscillator is assumed to start at time zero, as the Wiener process is undefined for negative time.
The assumption is related to the fact that a Wiener process with drift describes the integration of currents with a white (thermal) noise component onto a load capacitor, as was explained by [Abi06]. Some example instances of this phase process are given in Fig. 4. At any moment in time $t_a$, the value of the phase is Gaussian distributed:
$$\Phi(t_a) \sim N(\phi_0 + \mu t_a, \sigma^2 t_a).$$

Noisy Ring Oscillator
An RO is modelled using the square waveform with amplitude equal to one as the observable waveform phase function ($e(\phi)$); the phase is modelled by a Wiener process with drift and zero initial phase: $\Phi(t) = \mu t + \sigma W(t)$, as shown in Fig. 5. The half-period duration of the $k$-th half period is represented by the random variable $X_k$. Due to the independent increment property of the Wiener process, the half-period durations of the RO output are Independent and Identically Distributed (IID) and can be represented by a single random variable $X$. This time duration is equal to the time it takes for the oscillator phase to reach a multiple of $\pi$. Again due to the independent increment property, we only look at the time required to reach a phase of $\pi$, starting from phase zero. All other half periods have identical distribution. The time required for a Wiener process with drift to hit a certain level $\alpha$ for the first time, $T_\alpha$, is inverse-Gaussian distributed:
$$T_\alpha \sim IG(\alpha/\mu, \alpha^2/\sigma^2).$$
The half-period duration distribution is then given as:
$$X \sim IG(\pi/\mu, \pi^2/\sigma^2).$$
From this, the expected value and variance for $X$ can be calculated: $E[X] = \pi/\mu$ and $\mathrm{Var}[X] = \pi\sigma^2/\mu^3$. The jitter strength, controlling the rate at which jitter accumulates in the RO, is then equal to:
$$F_{noise} = \frac{\mathrm{Var}[X]}{E[X]} = \frac{\sigma^2}{\mu^2},$$
with units of time. In practical applications, this quantity is in the order of femtoseconds [YRG+17].
Note that an assumption was made: for positive drift $\mu$ and a phase starting at zero, the phase does not return and cross zero into negative values. Zero, however, is also a multiple of $\pi$ and will therefore produce an edge at the output when crossed. This is related to the fact that the inverse-Gaussian distribution only describes the first passage time. For small values of drift relative to the infinitesimal variance ($\mu/\sigma \ll 1$), the phase could pass a certain level multiple times, with each passage creating an edge at the output.
The assumptions made will therefore only hold when $\mu/\sigma \gg 1$, which is true in most applications ($F_{noise} \ll 1$ s). The probability of the phase returning to its starting value and crossing it diminishes rapidly in time when $F_{noise} \ll 1$ s. For very small time instances ($t \approx (\sigma/\mu)^2$ or lower), the assumption will also not hold, as the RO output waveform cannot be seen as an ideal digital signal anymore.
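As a numerical sanity check of the half-period statistics, the following Python sketch evaluates $E[X] = \pi/\mu$ and $\mathrm{Var}[X] = \pi\sigma^2/\mu^3$ and takes the jitter strength to be their ratio, $\sigma^2/\mu^2$. That ratio is an assumption made for this illustration (it has units of time and depends only on $\mu$ and $\sigma$). The function names and the 1 GHz / 10 fs operating point are hypothetical and are not measured values.

```python
import math


def half_period_moments(mu, sigma):
    """Mean and variance of the RO half period X (first passage of the phase at pi)."""
    mean = math.pi / mu
    var = math.pi * sigma ** 2 / mu ** 3
    return mean, var


def jitter_strength(mu, sigma):
    """Assumed jitter strength Var[X] / E[X] = sigma^2 / mu^2, in seconds."""
    mean, var = half_period_moments(mu, sigma)
    return var / mean


def sigma_from_jitter_strength(mu, f_noise):
    """Invert the assumed relation F_noise = sigma^2 / mu^2."""
    return mu * math.sqrt(f_noise)


# Hypothetical operating point: 1 GHz oscillator with 10 fs jitter strength.
MU = 2.0 * math.pi * 1e9
SIGMA = sigma_from_jitter_strength(MU, 10e-15)
MEAN_X, VAR_X = half_period_moments(MU, SIGMA)
print(f"E[X] = {MEAN_X:.3e} s, std[X] = {math.sqrt(VAR_X):.3e} s")
```

The condition $\mu/\sigma \gg 1$ corresponds here to a half-period standard deviation that is small compared to the half period itself.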

Delay Chain time difference distribution
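The DC consists of two noisy free running ROs. The RO phase is described as a stochastic process:
$$\Phi_{DC0}(t) = \mu_{DC0} t + \sigma_{DC0} W_{DC0}(t), \qquad \Phi_{DC1}(t) = \mu_{DC1} t + \sigma_{DC1} W_{DC1}(t).$$
The DCs start at time zero with a phase equal to zero. Both DCs run a prescribed number of periods $n$ and cause the E2Ls to activate at times $T_0^n$ and $T_1^n$ respectively, the first times at which the respective phase reaches $2\pi n$. The first passage time at a level $2\pi n$ of a Wiener process with drift is, as before, described by the inverse-Gaussian distribution:
$$T_0^n \sim IG\left(\frac{2\pi n}{\mu_{DC0}}, \frac{(2\pi n)^2}{\sigma_{DC0}^2}\right), \qquad T_1^n \sim IG\left(\frac{2\pi n}{\mu_{DC1}}, \frac{(2\pi n)^2}{\sigma_{DC1}^2}\right). \tag{12}$$
The DC time difference $T_\Delta^n$ after $n$ periods is equal to:
$$T_\Delta^n = T_0^n - T_1^n,$$
which is defined by a subtraction of two independent random variables. Its CDF can be calculated by integrating the PDFs of $T_0^n$ and $T_1^n$.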
The PDF for $T_\Delta^n$ is then equal to:
$$f_{T_\Delta^n}(t) = \int_{\max(0,-t)}^{\infty} f_{T_0^n}(t + \tau) f_{T_1^n}(\tau) \, d\tau. \tag{15}$$
Note that in this model, $T_0^n$ and $T_1^n$ are assumed to be independent. Effort was made in the design and layout of all four ROs to make sure coupling is minimised by introducing separate supply networks and placing each RO in its own N-well. If a dependency would still be present, this leads to a reduced jitter strength estimate in Sect. 4 and therefore reduces the entropy claim made by this model.
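The distribution of the DC time difference can also be explored by sampling. The sketch below draws both first passage times from an inverse-Gaussian distribution and subtracts them. It assumes the (mean, shape) parametrisation of the inverse-Gaussian distribution, the relation $\sigma^2 = F_{noise}\mu^2$, and hypothetical oscillator settings; the sampler is the well-known transformation method of Michael, Schucany and Haas, not something taken from the paper.

```python
import math
import random
import statistics


def sample_inverse_gaussian(mean, shape, rng):
    """Draw one IG(mean, shape) sample (Michael-Schucany-Haas transformation)."""
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4.0 * mean * shape * y + (mean * y) ** 2)
    x = mean + (mean * mean * y - mean * root) / (2.0 * shape)
    # Keep the smaller root with probability mean / (mean + x).
    if rng.random() <= mean / (mean + x):
        return x
    return mean * mean / x


def sample_dc_time(n, mu, f_noise, rng):
    """First passage time of a DC phase at level 2*pi*n."""
    level = 2.0 * math.pi * n
    sigma_sq = f_noise * mu * mu  # assumption: F_noise = sigma^2 / mu^2
    return sample_inverse_gaussian(level / mu, level * level / sigma_sq, rng)


def sample_time_differences(n, mu0, mu1, f_noise, count, seed=1):
    """Samples of T_delta = T_0 - T_1 for two independent delay chains."""
    rng = random.Random(seed)
    return [
        sample_dc_time(n, mu0, f_noise, rng) - sample_dc_time(n, mu1, f_noise, rng)
        for _ in range(count)
    ]


# Hypothetical settings: 10 periods, 1.00 GHz and 1.01 GHz, 10 fs jitter strength.
N_PERIODS = 10
MU0 = 2.0 * math.pi * 1.00e9
MU1 = 2.0 * math.pi * 1.01e9
F_NOISE = 1e-14
LEVEL = 2.0 * math.pi * N_PERIODS

diffs = sample_time_differences(N_PERIODS, MU0, MU1, F_NOISE, 20000)
expected_mean = LEVEL / MU0 - LEVEL / MU1
expected_var = F_NOISE * (LEVEL / MU0 + LEVEL / MU1)  # variances of independent terms add
print(statistics.fmean(diffs), expected_mean)
print(statistics.variance(diffs), expected_var)
```

Because the two delay chains are modelled as independent, the sample variance of the difference should approach the sum of the two individual variances.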

Time to Digital Converter run time distribution
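The TDC oscillators start oscillating when the respective DC has finished running $n$ cycles (times $T_0^n$ and $T_1^n$ respectively). Each TDC is a free running RO and the phases can be represented as a stochastic process:
$$\Phi_{TDC0}(t) = \mu_{TDC0}(t - T_0^n) + \sigma_{TDC0} W_{TDC0}(t - T_0^n), \quad t \ge T_0^n,$$
$$\Phi_{TDC1}(t) = \mu_{TDC1}(t - T_1^n) + \sigma_{TDC1} W_{TDC1}(t - T_1^n), \quad t \ge T_1^n. \tag{16}$$
Note that both $T_0^n$ and $T_1^n$ are random variables, following the distributions from Eq. 12. $\Phi_{TDC0}(t)$ and $\Phi_{TDC1}(t)$ therefore represent Wiener processes with drift and a random starting time instance. TDC1 samples TDC0 by using a Data Flip Flop (DFF). Figure 6 shows the relation between the TDC phases and the sampling time instances. Whenever the sampled value (DFF output) toggles, the TDCs stop oscillating and the number of TDC1 periods is output. From this figure, it can be seen that the toggling of the DFF output happens whenever the two TDC phases have diverged by a value of more than $\pi$. The TDCs stop at the next positive edge of TDC1.

The TDC phase difference $\Phi_\Delta(t)$ is defined only for time instances after the second TDC has started, $t \ge \max(T_0^n, T_1^n)$:
$$\Phi_\Delta(t) = \Phi_{TDC0}(t) - \Phi_{TDC1}(t) = \mu_{TDC0}(t - T_0^n) - \mu_{TDC1}(t - T_1^n) + N\left(0, \sigma_{TDC0}^2 (t - T_0^n) + \sigma_{TDC1}^2 (t - T_1^n)\right). \tag{17}$$
The notation $\cdot + N(a, b^2)$ in Eq. 17 indicates the addition of a normal distributed variable $X$, such that $X \sim N(a, b^2)$. This normal distributed variable follows from the properties of Wiener processes and addition of normal variables: $\sigma W(t) \sim N(0, \sigma^2 t)$ and $N(a, b^2) + N(c, d^2) = N(a + c, b^2 + d^2)$, for two independent normal distributed random variables. In what follows, $\mu_\Delta = \mu_{TDC0} - \mu_{TDC1}$ and $\sigma_\Delta^2 = \sigma_{TDC0}^2 + \sigma_{TDC1}^2$. Here, three cases can be distinguished:
1. $T_\Delta^n > 0$ or $T_0^n > T_1^n$: TDC0 starts running last. With a time shift $t' = t - T_0^n$, Eq. 17 can be simplified into:
$$\Phi_\Delta(t) = \mu_\Delta t' + \sigma_\Delta W_{TDC}(t') - \Phi_{TDC1}(T_0^n). \tag{19}$$
Equation 19 shows that the TDC phase difference $\Phi_\Delta(t)$, for $t \ge T_0^n$, can be written as a new Wiener process with drift, reduced by the phase accumulated in TDC1 ($\Phi_{TDC1}(T_0^n)$) from $T_1^n$ to $T_0^n$. This accumulated phase in TDC1 is independent of the Wiener process $W_{TDC}(t')$, as this process only starts at time $t' = 0$ or $t = T_0^n$ and, due to the nature of Wiener processes, phase accumulated over non-overlapping time intervals is independent.
2. $T_\Delta^n < 0$ or $T_0^n < T_1^n$: the clock TDC (TDC1) starts running last. The reasoning from the case $T_\Delta^n > 0$ can be repeated with a time shift $t' = t - T_1^n$. This will produce:
$$\Phi_\Delta(t) = \mu_\Delta t' + \sigma_\Delta W_{TDC}(t') + \Phi_{TDC0}(T_1^n).$$
Again, the phase accumulated in TDC0 ($\Phi_{TDC0}(T_1^n)$) from $T_0^n$ to $T_1^n$ is independent of the Wiener process $W_{TDC}(t')$, starting at time $t' = 0$ or $t = T_1^n$.
3. $T_\Delta^n = 0$ or $T_0^n = T_1^n$: both TDCs start at exactly the same time. Note that this is only a theoretical case, as $T_\Delta^n$ is described by a continuous probability density function and the probability of this case happening is effectively equal to zero. To be complete however, the TDC phase difference is now equal to (with a time shift $t' = t - T_0^n = t - T_1^n$):
$$\Phi_\Delta(t) = \mu_\Delta t' + \sigma_\Delta W_{TDC}(t').$$
In this case, the subtraction with the phase accumulated in one of the TDCs disappears, as no TDC had been running before the second one starts.

The shifted TDC phase difference is now introduced as a function of the shifted time $t' = t - \max(T_0^n, T_1^n)$:
$$\Phi_\Delta(t') = \Phi_\Delta^0 + \mu_\Delta t' + \sigma_\Delta W_{TDC}(t'), \quad t' \ge 0.$$
The TDC phase difference starts off at a random variable, determined by the DC time difference $T_\Delta^n$:
$$\Phi_\Delta^0 = \begin{cases} -\Phi_{TDC1}(T_0^n) & T_\Delta^n > 0 \\ \Phi_{TDC0}(T_1^n) & T_\Delta^n < 0 \\ 0 & T_\Delta^n = 0 \end{cases} \tag{23}$$
From the description of the random processes in Eq. 16, $\Phi_\Delta^0$, conditioned to $T_\Delta^n = t$, is distributed as follows:
$$\Phi_\Delta^0 \,|\, T_\Delta^n = t \sim \begin{cases} N(-\mu_{TDC1} t, \sigma_{TDC1}^2 t) & t > 0 \\ N(-\mu_{TDC0} t, -\sigma_{TDC0}^2 t) & t < 0 \\ 0 & t = 0 \end{cases} \tag{24}$$
with $X \sim 0$ indicating that the variable $X$ follows a degenerate distribution centred at zero, with PDF equal to the Dirac delta function $\delta(\cdot)$. From the start on, the TDC phase difference $\Phi_\Delta(t')$ will behave as a Wiener process with drift, added to this initial phase difference. The TDCs will stop oscillating whenever the value of $\Phi_\Delta(t')$ crosses a multiple of $\pi$ for the first time, at time $T_\pi$:
$$T_\pi = \inf\{t' \ge 0 : \Phi_\Delta(t') \in \pi\mathbb{Z}\}.$$
An example phase instance of the two TDCs is shown in Fig. 7. Because we are only interested in the first passage time of $\Phi_\Delta(t')$ at a multiple of $\pi$, the initial phase difference can be reduced modulo $\pi$:
$$\bar{\Phi}_\Delta(t') = (\Phi_\Delta^0 \bmod \pi) + \mu_\Delta t' + \sigma_\Delta W_{TDC}(t').$$
Both $\bar{\Phi}_\Delta(t')$ and $\Phi_\Delta(t')$ will have equal first passage times at a multiple of $\pi$: $T_\pi$. Note that for the reduced phase difference $\bar{\Phi}_\Delta(t')$ the first passage level at a multiple of $\pi$ will be either zero or $\pi$: $\bar{\Phi}_\Delta(T_\pi) = 0$ or $\pi$.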
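Depending on the sign of the phase drift difference ($\mu_\Delta$), the reduced phase difference $\bar{\Phi}_\Delta(t)$ will drift towards one of the two boundaries: $\pi$ for $\mu_\Delta > 0$ or 0 for $\mu_\Delta < 0$, as shown in Fig. 7. When started too close to the opposite boundary (0 in case $\mu_\Delta > 0$ and $\pi$ in case $\mu_\Delta < 0$), the phase difference could hit this boundary, prematurely ending the oscillations. We can now determine the CDF for $T_\pi$, conditioned to $\Phi_\Delta^0 = \phi$:
$$F_{T_\pi|\Phi_\Delta^0}(t|\phi) = 1 - P\left[0 < \bar{\Phi}_\Delta(t) < \pi \,\middle|\, \Phi_\Delta^0 = \phi\right]. \tag{28}$$
The condition for the oscillations to continue can be written more explicitly:
$$0 < (\phi \bmod \pi) + \mu_\Delta t + \sigma_\Delta W_{TDC}(t) < \pi.$$
Moving all deterministic parts to the outside:
$$\frac{-(\phi \bmod \pi) - \mu_\Delta t}{\sigma_\Delta} < W_{TDC}(t) < \frac{\pi - (\phi \bmod \pi) - \mu_\Delta t}{\sigma_\Delta}. \tag{30}$$
From the property of Wiener processes, $W(a) \sim \sqrt{a} N(0, 1)$, we can rewrite the boundaries from Eq. 30:
$$\frac{-(\phi \bmod \pi) - \mu_\Delta t}{\sigma_\Delta \sqrt{t}} < X < \frac{\pi - (\phi \bmod \pi) - \mu_\Delta t}{\sigma_\Delta \sqrt{t}},$$
with $X \sim N(0, 1)$ (standard normal distributed). Substituting this result into Eq. 28 gives:
$$F_{T_\pi|\Phi_\Delta^0}(t|\phi) = 1 - \Phi_{norm}\left(\frac{\pi - (\phi \bmod \pi) - \mu_\Delta t}{\sigma_\Delta \sqrt{t}}\right) + \Phi_{norm}\left(\frac{-(\phi \bmod \pi) - \mu_\Delta t}{\sigma_\Delta \sqrt{t}}\right).$$
Figure 8 depicts how these boundaries will evolve through time for different drift difference ($\mu_\Delta$), infinitesimal difference variance ($\sigma_\Delta$), and initial phase difference condition ($\Phi_\Delta^0 = \phi$). The conditional PDF for $T_\pi$ can then be obtained by differentiating the CDF.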
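The boundary argument can be illustrated numerically. The sketch below is one possible reading of the steps above: it evaluates, for a single time instant, the probability that the reduced phase difference lies strictly between 0 and $\pi$, and uses one minus that value as the conditional CDF of $T_\pi$. Treating the condition pointwise in time is an assumption of this sketch, and all names and numbers are hypothetical.

```python
import math


def norm_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def stay_probability(phi0, mu_delta, sigma_delta, t):
    """P[0 < phi + mu_delta*t + sigma_delta*W(t) < pi], with phi = phi0 mod pi."""
    phi = phi0 % math.pi  # Python's % yields a value in [0, pi) for a positive divisor
    scale = sigma_delta * math.sqrt(t)
    upper = (math.pi - phi - mu_delta * t) / scale
    lower = (-phi - mu_delta * t) / scale
    return norm_cdf(upper) - norm_cdf(lower)


def stop_time_cdf(phi0, mu_delta, sigma_delta, t):
    """Approximate conditional CDF of the stopping time T_pi at time t > 0."""
    return 1.0 - stay_probability(phi0, mu_delta, sigma_delta, t)


# Hypothetical values: 10 MHz frequency difference, sigma_delta of 888 rad/sqrt(s).
MU_DELTA = 2.0 * math.pi * 1e7
SIGMA_DELTA = 888.0
for t in (1e-12, 1e-8, 2.5e-8, 1e-7):
    print(t, stop_time_cdf(math.pi / 2.0, MU_DELTA, SIGMA_DELTA, t))
```

With a positive drift difference and a start in the middle of the interval, the value stays near zero for short times and approaches one once the drift has carried the phase difference past $\pi$.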

Output bit probability distribution
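The TDCs will stop oscillating at the first positive edge of TDC1 after time $T_\pi$. The number of cycles of TDC1 will then be used to construct the output random bit. The number of completed cycles, $R$, is equal to:
$$R = \left\lceil \frac{\Phi_{TDC1}\left(T_\pi + \max(T_0^n, T_1^n)\right)}{2\pi} \right\rceil. \tag{33}$$
The term $\max(T_0^n, T_1^n)$ is added to $T_\pi$, as $T_\pi$ was defined for the shifted phase difference. This term represents the accumulated phase in case TDC1 was started first. There exists a dependency between $T_\pi$ and the Wiener process determining $\Phi_{TDC1}(\cdot)$. This makes deriving an analytical expression for the distribution of $R$ not straightforward. To circumvent this issue, the noise contributing to $\Phi_{TDC1}(\cdot)$ is neglected: $\Phi_{TDC1}(t) \approx \mu_{TDC1}(t - T_1^n)$. This simplification is justified by the fact that the jitter strength in the phase difference signal is much larger than in a single TDC: $\sigma_\Delta^2/\mu_\Delta^2 \gg \sigma_{TDC1}^2/\mu_{TDC1}^2$. This is true for decently matched TDCs ($\mu_{TDC0} \approx \mu_{TDC1}$ and $|\mu_\Delta| \ll \mu_{TDC1}$). Simulation results shown in Fig. 9 further justify the simplification, as the introduced error in the phase $\Phi_{TDC1}$ is low. The relative error (average deviation for $R$) is 0.2 %. Using this simplification, Eq. 33 is transformed to:
$$R \approx \left\lceil \frac{\mu_{TDC1}\left(T_\pi + \max(T_0^n, T_1^n) - T_1^n\right)}{2\pi} \right\rceil.$$
Depending on the sign of $T_\Delta^n$, we have:
$$R \approx \begin{cases} \left\lceil \mu_{TDC1}(T_\pi + T_\Delta^n) / (2\pi) \right\rceil & T_\Delta^n > 0 \\ \left\lceil \mu_{TDC1} T_\pi / (2\pi) \right\rceil & T_\Delta^n \le 0 \end{cases}$$
The term $\mu_{TDC1} T_\Delta^n$ can be more accurately replaced by $-\Phi_\Delta^0$ for $T_\Delta^n > 0$ from Eq. 23, as this term represents the accumulated phase in TDC1 before TDC0 was started. The conditional CDF for $R$, $F_{R|\Phi_\Delta^0, T_\Delta^n}(r|\phi, t)$, can now be calculated from the conditional CDF of $T_\pi$. $R$ is a discrete random variable, so its conditional PMF follows as:
$$f_{R|\Phi_\Delta^0, T_\Delta^n}(r|\phi, t) = F_{R|\Phi_\Delta^0, T_\Delta^n}(r|\phi, t) - F_{R|\Phi_\Delta^0, T_\Delta^n}(r - 1|\phi, t).$$
Removing the conditionals, to obtain the joint distribution:
$$f_{R, \Phi_\Delta^0, T_\Delta^n}(r, \phi, t) = f_{R|\Phi_\Delta^0, T_\Delta^n}(r|\phi, t) \, f_{\Phi_\Delta^0|T_\Delta^n}(\phi|t) \, f_{T_\Delta^n}(t),$$
with $f_{\Phi_\Delta^0|T_\Delta^n}(\phi|t)$ and $f_{T_\Delta^n}(t)$ obtained from Eqs. 24 and 15 respectively. The random variables $\Phi_\Delta^0$ and $T_\Delta^n$ are integrated out, to obtain the distribution for $R$:
$$f_R(r) = \int \int f_{R, \Phi_\Delta^0, T_\Delta^n}(r, \phi, t) \, d\phi \, dt.$$
The produced random bit $B$ is now equal to the least significant bit of $R$. From this, the bit probability can be calculated:
$$P[B = 1] = \sum_{k} f_R(2k + 1).$$
As the entire system does not contain a state that is transferred from one bit generation to another, individual bits are IID by design.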
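The last step, going from the distribution of $R$ to the bit probability and the per-bit entropy, can be sketched in a few lines of Python. The PMF below is a made-up example and does not come from the model or the measurements; the helper names are hypothetical.

```python
import math


def bit_probability(pmf):
    """P[B = 1], where B is the least significant bit of the counter value R."""
    return sum(p for r, p in pmf.items() if r % 2 == 1)


def min_entropy(p_one):
    """Min entropy of a single bit with P[B = 1] = p_one, in bit."""
    return -math.log2(max(p_one, 1.0 - p_one))


# Made-up PMF of R (maps counter value to probability), for illustration only.
EXAMPLE_PMF = {9: 0.12, 10: 0.36, 11: 0.40, 12: 0.12}
P_ONE = bit_probability(EXAMPLE_PMF)
print(f"P[B = 1] = {P_ONE:.2f}, min entropy = {min_entropy(P_ONE):.3f} bit")
```

A PMF that spreads its mass evenly over odd and even counter values gives a bit probability close to one half and therefore a min entropy close to one bit.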

Jitter strength measurement
The entropy estimate provided by the model in Sect. 3 is highly influenced by the platform dependent parameter: jitter strength (F noise ). This parameter determines the rate at which timing jitter will naturally accumulate in a free running RO. In contrast to model parameters (e.g. RO frequency), the value for the jitter strength cannot be measured out directly. As was proposed by [YRG + 17], jitter measurement should happen on-chip and preferably with a differential measurement setup, to minimise external (non-thermal noise) influences that might lead to an overestimation of the available timing jitter. It is important not to overestimate the jitter strength parameter, and use a conservative method here for the following two reasons: firstly, due to the nature of the measurement, measurement errors (e.g. external noise sources other than thermal noise that might be manipulable) will always manifest themselves as a positive bias. Intuitively, this can be explained by the fact that the jitter strength parameter will be determined based on observed measurement variance. If external, independent sources of error exist, they will always lead to an increase of observed measurement variance (adding two independent random variables will increase variance) and lead to an overestimation. Secondly, based on the estimated output entropy, a security claim will be made. If the available jitter strength was overestimated, this will lead to an overestimation of the produced output entropy, and therefore to an invalid security claim.
To get accurate results, the jitter measurement experiment was repeated on five separate devices (chips). The most conservative estimate will further be used to estimate entropy for all devices tested.

On-chip measurement setup
The proposed ES architecture in Sect. 2 allows for on-chip differential jitter strength measurement as well. A circuit diagram, showing only the relevant parts for the jitter measurement, together with a timing diagram are shown in Figs. 10 and 11. By configuring TDC0 and TDC1 to have a long and short oscillation period respectively ($P_{TDC0} > 2P_{TDC1}$), it can be ensured that a positive edge of TDC1 will occur each half-period of TDC0. When each half period of TDC0 is sampled, the TDCs will stop oscillating as soon as both DCs have finished propagating. DC0 and DC1 are configured such that DC1 has a shorter propagation delay than DC0 ($T_0^n > T_1^n$). Therefore, TDC1 will oscillate during the time interval when DC1 finished propagating, but DC0 did not. A counter, counting the number of oscillations of TDC1 during this time interval, therefore produces an output proportional to the propagation delay difference between DC0 and DC1. By observing the counter output variance over multiple evaluations, an estimation for the differential DC propagation variance and, therefore, also for the available jitter strength in DC0 and DC1 is obtained.

Theoretical jitter analysis
Based on the stochastic model from Sect. 3, an estimate for the observed counter output variance can be made dependent on the value of $F_{noise}$. In this work, we choose the highest value for $F_{noise}$ that will still lead to an underestimation of the observed variance as the final jitter strength estimate. The timing jitter accumulated by TDC1 will also influence the counter output variance. The model from Sect. 3 is extended here to get an estimate for the counter output variance. The delay chain timing difference distribution ($T_\Delta^n$) is given by Eq. 14. Due to a hardware constraint, the TDCs are only allowed to stop oscillating after both have gone through two full periods. The jitter accumulation time interval $T_A^n$ therefore depends on both $T_\Delta^n$ and $T_{2TDC0}$, a random variable describing the time required for TDC0 to oscillate for two full periods. From Sect. 3, $T_{2TDC0}$ is IG distributed, as the first passage time of the TDC0 phase at level $4\pi$. TDC1 will oscillate during this accumulation time interval $T_A^n$. The phase of TDC1 at the end of this interval (assuming it started with zero phase), conditioned on the accumulation time interval length, is Gaussian distributed:
$$\Phi_{TDC1}(T_A^n) \,|\, T_A^n = t \sim N(\mu_{TDC1} t, \sigma_{TDC1}^2 t).$$
The condition on the accumulation time interval length can be removed similarly as was done in Sect. 3:
$$f_{\Phi_{TDC1}}(\phi) = \int_0^\infty f_{\Phi_{TDC1}|T_A^n}(\phi|t) \, f_{T_A^n}(t) \, dt.$$
The TDC oscillations will only stop after a positive edge has occurred in TDC1. All phases of TDC1 in the interval $(2\pi(r - 1), 2\pi r]$ will therefore produce the same counter output: $r$. The probability of the counter output $R$ being equal to a value $r$ (PMF) is then given by:
$$f_R(r) = P\left[2\pi(r - 1) < \Phi_{TDC1}(T_A^n) \le 2\pi r\right] = F_{\Phi_{TDC1}}(2\pi r) - F_{\Phi_{TDC1}}(2\pi(r - 1)).$$
From this result, the variance $\mathrm{Var}[R]$ can be calculated and compared with the measurements.
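The binning of the TDC1 phase into counter values can be sketched as follows. For simplicity the sketch conditions on one fixed accumulation time instead of integrating over its distribution, so it only illustrates the Gaussian step and the interval $(2\pi(r - 1), 2\pi r]$. The helper names and all numeric values are hypothetical, and $\sigma^2 = F_{noise}\mu^2$ is assumed.

```python
import math


def norm_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def counter_pmf_given_time(mu, sigma, t, r_max):
    """PMF of the counter value R for a fixed accumulation time t.

    The phase is N(mu*t, sigma^2*t); all phases in (2*pi*(r-1), 2*pi*r] map to r.
    """
    mean = mu * t
    std = sigma * math.sqrt(t)
    two_pi = 2.0 * math.pi
    return {
        r: norm_cdf((two_pi * r - mean) / std) - norm_cdf((two_pi * (r - 1) - mean) / std)
        for r in range(1, r_max + 1)
    }


def pmf_variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    total = sum(pmf.values())
    mean = sum(r * p for r, p in pmf.items()) / total
    return sum((r - mean) ** 2 * p for r, p in pmf.items()) / total


# Hypothetical values: 1 GHz TDC1, 10 fs jitter strength, 10 ns accumulation time.
MU_TDC1 = 2.0 * math.pi * 1e9
SIGMA_TDC1 = MU_TDC1 * math.sqrt(1e-14)
PMF = counter_pmf_given_time(MU_TDC1, SIGMA_TDC1, 10e-9, 30)
print(PMF[10], PMF[11], pmf_variance(PMF))
```

In this example the expected phase sits on the boundary between two counter values, so the counter toggles between 10 and 11; for other accumulation times the variance is smaller, which is why the full model averages over the accumulation time distribution.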

Design parameter selection criteria
The proposed ES design has four design parameters that can be freely chosen by the designer: $\mu_{DC0}$, $\mu_{DC1}$, $\mu_{TDC0}$ and $\mu_{TDC1}$. The infinitesimal variances ($\sigma_X$) are related to the phase drifts by the obtained jitter strength and Eq. 8. This section provides a selection strategy for these four parameters.

Pipeline balance
As for all pipelined architectures, balancing the propagation times for both stages is necessary. The propagation delay of the DC stage, $d_{DC}$, is set by the slowest DC, which determines when the TDCs can start resolving the timing difference. The maximal TDC resolving time is determined by the TDC resolution ($res$), defined as $res = |P_{TDC0} - P_{TDC1}|$. Each period of TDC0, the TDC1 positive edge will have shifted by an amount $res$ compared to the positive edge of TDC0. The TDCs will stop oscillating as soon as TDC1 samples a different value from TDC0. This means at most $P_{TDC0}/(2\,res)$ cycles of TDC1 are required, which determines the maximal TDC resolving time $d_{TDC}$. To make sure the TDCs finish resolving before the DCs finish accumulating jitter for the next output bit, the TDC resolving time should be smaller than the maximal DC propagation delay: $d_{TDC} < d_{DC}$. This constraint imposes an upper bound to the TDC resolution (Eq. 49).
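The balance condition can be checked with a few lines of arithmetic. The sketch below assumes that the maximal resolving time equals the maximal number of TDC1 cycles multiplied by the TDC1 period; that product, the function names and the example periods are assumptions made for illustration, not values or formulas confirmed by the design.

```python
import math


def tdc_resolution(p_tdc0, p_tdc1):
    """TDC resolution: absolute period difference of the two TDC ROs."""
    return abs(p_tdc0 - p_tdc1)


def max_resolving_cycles(p_tdc0, p_tdc1):
    """At most P_TDC0 / (2 * res) cycles of TDC1 are needed."""
    return p_tdc0 / (2.0 * tdc_resolution(p_tdc0, p_tdc1))


def max_resolving_time(p_tdc0, p_tdc1):
    """Assumed resolving time: number of TDC1 cycles times the TDC1 period."""
    return max_resolving_cycles(p_tdc0, p_tdc1) * p_tdc1


def is_balanced(p_tdc0, p_tdc1, d_dc):
    """True when the TDC stage finishes before the next DC result arrives."""
    return max_resolving_time(p_tdc0, p_tdc1) < d_dc


# Hypothetical periods: 1.00 ns and 0.99 ns, giving a 10 ps resolution.
P0, P1 = 1.00e-9, 0.99e-9
print(tdc_resolution(P0, P1), max_resolving_cycles(P0, P1), max_resolving_time(P0, P1))
print(is_balanced(P0, P1, 60e-9), is_balanced(P0, P1, 40e-9))
```

Under these assumptions a finer resolution lengthens the resolving phase, so the DC accumulation time and the TDC resolution cannot be chosen independently.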

Entropy density
According to [KS11], a minimal Shannon entropy density of 0.997 bit/bit at the output is required. The stochastic model from Sect. 3 is used to determine the theoretical entropy density at the output. Timing jitter accumulates proportionally to the square root of the accumulation time (addition of independent variances). A maximal TDC resolution size is required to be able to extract the required entropy from the accumulated DC timing jitter. This observation leads to an upper bound on the required TDC resolution (Eq. 50), with $\alpha$ a constant related to the required entropy density and the shape of the accumulated timing jitter distribution. The value of $\alpha$ can now be determined by evaluating the model from Sect. 3 for multiple values of DC accumulation time and searching for the upper bound on the required resolution. Figure 13 shows the model results. As can be observed, the obtained $\alpha$ is not perfectly constant. A lower bound (horizontal line in Fig. 13) is selected, such that Eq. 50 will always produce a valid upper bound for the TDC resolution. The value for $\alpha$ used in the remainder of this work equals 1.94.

ES Throughput
The ES throughput is related to the DC accumulation time.

Delay control circuit
To enable the throughput optimisation procedure, a fine control over the DC accumulation time ($T_0^n$ and $T_1^n$) and the TDC resolution is required. Figure 15 shows a circuit breakdown for both the DC and TDC ROs. Each DC RO consists of four stages and each stage contains five inverters of decreasing effective length.
Some inverters (indicated in Fig. 15) can be switched on/off by controlling a configuration input as shown at the right of Fig. 15. When an inverter is switched on, its output current is used to accelerate (dis)charge of the load capacitance, reducing the stage propagation delay. When the inverter is switched off, it still contributes to the load capacitance seen by the previous stage, further increasing the propagation delay.
Both TDC ROs consist of two stages and each stage contains eight identical minimal sized inverters that have the same on/off control. The inverters do not require analog voltages to be configured, they are either turned fully on or fully off. Having an all digital design removes the need for additional circuitry to generate analog voltages on chip.
Both DC and TDC ROs have 16 configuration bits each, driven by a controller circuit external to the device. Given the architecture in Fig. 15, for all devices tested, a configuration in the optimal region could always be found.
Nominal conditions are a 20 °C environment temperature and a 0.9 V supply voltage. For each device, the DC and TDC ROs are configured to obtain an operating point inside the optimal region, as explained in Sect. 5. This was achieved by scanning all DC and TDC RO frequencies and selecting an optimal combination.

IID claim verification
As claimed in Sect. 3.5, the output bits are by design IID. Two experiments are performed to verify this claim: a correlation analysis of the generated counter ($R$) values, and the NIST SP 800-90B IID test [TBK+18].

Correlation analysis
The sample correlation coefficient of 4096 consecutively generated counter samples (realisations of $R$) from chip 0 is calculated for sample lags ranging from 1 to 1024. The sample correlation coefficient is calculated from $r_i$, the $i$-th generated sample, and $\bar{r}$, the sample mean. The results in Fig. 16 show no significant sample correlation, further strengthening the IID assumption.
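A lag-based correlation analysis of this kind can be sketched as below. The estimator shown, lagged products of mean-removed samples normalised by the total sum of squares, is one common definition of the sample autocorrelation and is an assumption here, as are the function name and the test sequence.

```python
def sample_autocorrelation(samples, lag):
    """Sample correlation coefficient of a sequence with itself at the given lag."""
    n = len(samples)
    mean = sum(samples) / n
    denominator = sum((x - mean) ** 2 for x in samples)
    numerator = sum(
        (samples[i] - mean) * (samples[i + lag] - mean) for i in range(n - lag)
    )
    return numerator / denominator


# A strictly alternating sequence is strongly anti-correlated at lag 1.
ALTERNATING = [0, 1] * 50
print(sample_autocorrelation(ALTERNATING, 1))
print(sample_autocorrelation(ALTERNATING, 2))
```

For IID counter values, the coefficients at all non-zero lags are expected to scatter around zero with a spread on the order of one over the square root of the number of samples.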

NIST SP 800-90B IID test
All five devices pass the NIST SP 800-90B IID test, using 1 Mbit of consecutively generated bits.

Entropy validation
As minimally required by [KS11], the estimated output bit min entropy should be larger than 0.91 bit per bit (equivalent to 0.997 bit of Shannon entropy per bit). In Sect. 5, the ES design parameters have been selected to output at least 0.91 bit of min entropy; higher entropy levels are possible at the cost of reduced throughput. Table 1 provides an overview of the output min entropy estimates for all five devices tested, using 1 Mbit of consecutive data at nominal conditions. Each of the devices reaches the required min entropy level. The entropy estimate obtained from the NIST SP 800-90B tests with 1 Mbit of data is even more conservative than the one obtained from the stochastic model, which is expected as these tests tend to underestimate the available entropy [Saa21]. The counter output ($R$) could be used as a health metric, indicating a possible entropy reduction.
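The correspondence between the two entropy thresholds can be verified for a single biased bit. The sketch below converts a per-bit min entropy into the probability of the most likely value and then evaluates the binary Shannon entropy; the function names are hypothetical.

```python
import math


def max_probability_from_min_entropy(h_min):
    """Probability of the most likely bit value for a given min entropy (bit)."""
    return 2.0 ** (-h_min)


def shannon_entropy_from_min_entropy(h_min):
    """Shannon entropy (bit) of a single bit with the given min entropy."""
    p = max_probability_from_min_entropy(h_min)
    if p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


print(f"{shannon_entropy_from_min_entropy(0.91):.3f}")
```

A min entropy of 0.91 bit corresponds to a most likely value with probability of about 0.532, whose Shannon entropy rounds to 0.997 bit.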

Power and throughput
All five devices tested achieve a throughput of over 250 Mbit/s at nominal conditions, as can be seen in the left graph of Fig. 17. Process variations in the DC/TDC ROs can lead to some devices having better/worse performance. One device (chip 1) has been extensively tested at different voltage conditions. The experimental results in the middle graph of Fig. 17 show that for all supply voltage levels tested, the output bit entropy remained above the 0.91 bit/bit threshold. The right graph of Fig. 17 shows the power consumption breakdown and energy efficiency per generated bit. Best energy efficiency is achieved at 0.8 V supply: 1.46 pJ/bit, which is lower than previously reported. The power breakdown shows that the Core, DC and TDC consume 54.2 %, 8.2 % and 37.6 % of the total power consumption respectively at nominal conditions. The core module contains the digitisation and synchronisation circuitry.

Comparison
Compared to previous work in Table 2, the proposed design achieves best energy and second best area efficiency (throughput generated per unit of normalised area). The jitter pipelining architecture together with high TDC time resolutions allows for high throughput at a modest area and power requirement. A chip photo is depicted in Fig. 18. The ES circuitry (DC, TDC and core) occupies 750.7 µm². Additional configuration flip-flops, to store the DC/TDC configuration (Conf), and interfacing logic (Send) are added to measure out the devices.

Conclusion
The proposed ES architecture was designed and verified following an approach compatible with modern standards. Thanks to the digital nature of the circuits used, this design gains all benefits related to digital CMOS, such as scaling and design integration. A stochastic model capable of estimating the output bit entropy is presented, together with an on-chip jitter measurement methodology to quantify the jitter strength platform parameter. An optimisation scheme is presented to guide the design parameter selection process and to ensure maximal throughput is obtained for the given platform parameters. The jitter pipelining structure allows for efficient (both in terms of area and energy usage) on-chip entropy generation.