ABE Squared: Accurately Benchmarking Efficiency of ABE

Measuring efficiency is difficult. In the last decades, several works have contributed in the quest to successfully determine and compare the efficiency of pairing-based attribute-based encryption (ABE) schemes. However, many of these works are limited: they use little to no optimizations, or use underlying pairing-friendly elliptic curves that do not provide sufficient security anymore. Hence, using these works to benchmark ABE schemes does not yield accurate results. Furthermore, most ABE design papers focus on the efficiency of one important aspect. For instance, a new scheme may aim to have a fast decryption algorithm. Upon realizing this goal, the designer compares the new scheme with existing ones, demonstrating its dominance in this particular aspect. Although this approach is intuitive and might seem fair, the way in which this comparison is done might be biased. For instance, the schemes that are compared with the new scheme may be optimized with respect to another aspect, and appear in the comparison consequently inferior. In this work, we present a framework for accurately benchmarking efficiency of ABE: ABE Squared. In particular, we focus on uncovering the multiple layers of optimization that are relevant to the implementation of ABE schemes. Moreover, we focus on making any comparison fairer by considering the influence of the potential design goals on any optimizations. On the lowest layer, we consider the available optimized arithmetic provided by state-of-the-art cryptographic libraries. On the higher layers, we consider the choice of elliptic curve, the order of the computations, and importantly, the instantiation of the scheme on the chosen curves. Additionally, we show that especially the higher-level optimizations are dependent on the goal of the designer, e.g. optimization of the decryption algorithm.
To compare schemes more transparently, we develop this framework, in which ABE schemes can be justifiably optimized and compared by taking into account the possible goals of a designer. To meet these goals, we also introduce manual, heuristic type-conversion techniques where existing techniques fall short. Finally, to illustrate the effectiveness of ABE Squared, we implement several schemes and provide all relevant benchmarks. These show that the design goal influences the optimization approaches, which in turn influence the overall efficiency of the implementations. Importantly, these demonstrate that the schemes also compare differently than existing works previously suggested.


Introduction
Since attribute-based encryption (ABE) was introduced in 2005 by Sahai and Waters [SW05], much progress has been made in the development of pairing-based ABE schemes. As is common in the field of cryptography, whenever a new scheme is presented, its efficiency is compared to that of other state-of-the-art schemes. For ABE, the Charm framework [AGM+13] is used in many cases [RW13,RW15,AC17a,ABGW17], which simplifies the prototyping of new pairing-based schemes and provides benchmarking tools. However, because Charm mainly aims at usability in this endeavor, it uses several abstraction layers between the schemes and the necessary arithmetic. As a result, not all available optimizations can be used in benchmarking efforts, even though these might be significant in any comparisons. Furthermore, by default, the Charm framework builds on the PBC library [Lyn13], which only supports outdated elliptic curves that have been proven not to provide 128 bits of security [KB16,BD19]. Consequently, many implementations and efficiency comparisons use these outdated curves. By extension, those implementations do not provide realistic estimates of computational costs in practice. When implemented for practice, curves that currently provide 128 bits of security should be used. Because these might provide different trade-offs in efficiency, the implementations may incur different computational costs than the curves used in the old benchmarks [Ara17,CDS20].
Oftentimes, works that do not use Charm in their efficiency analyses have similar issues. For instance, they may not use all, if any, optimized arithmetic or other lower-level optimization techniques [Zeu20, AHM+16, TKN20, PRMV21]. Such techniques allow for faster computations of exponentiations, such as fixed-base exponentiations or multiple-base exponentiations [Sco11,Möl01]. These are often used when elliptic-curve schemes are deployed in practice, and provide a significant computational advantage over regular variable-base exponentiations. Other implementations may be targeted for specific platforms such as certain embedded devices [SR13, WZSI14, MTP+21]. Hence, they are difficult to use in future efficiency comparisons without implementing the schemes for those specific devices. On the other hand, software implementations that do optimize the arithmetic used in the schemes [ZPM+15] have implemented all underlying arithmetic for some specific elliptic curve, and are therefore difficult to adapt to other, more up-to-date, curves. This is problematic, since this particular curve, e.g., the BN254 curve [BN05], may turn out to provide e.g., only 100 bits of security [SKSW20].
Another common denominator of these implementations is the absence of a clearly defined design rationale when the schemes are instantiated in pairing-friendly elliptic-curve groups. For instance, choosing a suitable pairing-friendly group providing 128 bits of security [Gui20b] is important for the overall efficiency [Sco11,CDS20]. However, not every curve may be a good choice for every scheme. Moreover, at the protocol level, schemes are often designed in the symmetric, type-I setting [Sco11]. That means that the used pairing operation $\hat{e} : G \times H \to G_T$, which maps two source groups G and H to a target group $G_T$, is assumed to be symmetric, i.e., the two source groups are isomorphic, and thus, $G = H$. On the other hand, in practice, it is better to use asymmetric, type-III pairings due to their efficiency [GPS08] and security [Gal14], such that no efficiently computable isomorphism exists between the two source groups, i.e., $e : G \times H \to G_T$ with $G \neq H$. While such schemes can be converted from the type-I to the type-III setting [RCS12], existing works that facilitate this [AGH13, AGOT14, AGH15, AHO16] are often not used in the implementation of ABE schemes.
Nevertheless, such a design rationale determines the groups in which the computations are performed during key generation, encryption and decryption, and is therefore crucial when analyzing the efficiency of the scheme. This rationale heavily influences the choice of pairing-friendly groups and the type conversion, and by extension, the efficiency of the scheme. For instance, operations in G are generally more efficient than those in H [Ara17, AGM+, CDS20]. Consequently, if a designer places all ciphertext components in G, then the encryption efficiency is optimized at the expense of the key generation efficiency. Another designer might want to optimize the key generation efficiency, and therefore places all key components in G, and the ciphertext components in H. Because not all implementations take into account and specify these considerations, they cannot be effectively and meaningfully compared [VAH21]. In fact, a somewhat unethical cryptographer who, for instance, wants to promote their new scheme's fast encryption algorithm could place all ciphertext components of their own scheme in G, while they place the compared scheme's ciphertext components in H. As a result, their own scheme might outperform the other scheme, even though the other scheme would have outperformed the new scheme if its ciphertext components had also been placed in G. In summary, for various efficiency goals, a different distribution of the key and ciphertext components over the two source groups may be optimal.
In this work, we aim to resolve the aforementioned issues. In particular, we provide a framework for benchmarking and comparing efficiency of ABE schemes that takes into account important features such as optimized arithmetic and conversion techniques. Along the way, we introduce novel conversion techniques to obtain e.g., a type conversion with an optimized decryption algorithm. We also show how this framework can be applied to existing schemes by implementing and benchmarking them. Lastly, we illustrate how these benchmarks can be compared fairly, by comparing the variants that are optimized in the same way, e.g., the variants with an optimized decryption algorithm.

Our contribution
We set up ABE Squared, a general framework for accurately benchmarking efficiency of ABE. This framework describes optimization approaches in the implementation of ABE schemes based on various design goals, by unifying multiple established areas in optimization. By choosing one design goal, multiple schemes can be optimized in a uniform way, and thus be fairly compared. Concretely,
• we identify four optimization layers that are important in the implementation of the schemes:
  - the used arithmetic and group operations;
  - the choice of pairing-friendly curve;
  - the order of computations;
  - the type-conversion techniques;
• we formulate various optimization approaches for several clearly defined design goals:
  - optimized key generation;
  - optimized encryption;
  - optimized decryption;
  - balanced (in any combination of the algorithms, e.g., balanced key generation/encryption, balanced encryption/decryption);
• as part of the optimization approaches, we introduce new heuristic, manual conversion techniques from the type-I to the type-III setting, which take into account the other optimization layers. This is especially important for optimizing the decryption algorithm, for which the existing frameworks fall short.

Background
We provide further background information and motivate our choices and some of the features of the new framework.
Positioning our framework. Our framework aims to bridge a gap in the benchmarking of ABE schemes, as described above. Notable software implementations that are built on libraries such as RELIC, that provide benchmarking utilities and that are still maintained are Charm and OpenABE. However, their goals are arguably different from ours (see Figure 1). Charm and OpenABE are both focused on usability, albeit in different ways. Charm aims to be usable in the prototyping and benchmarking of schemes, so that cryptographers can implement new schemes without having to know implementation details. OpenABE aims to be usable for practical applications, providing ready-to-use ABE implementations for practitioners. As a result, neither of their implementations uses all available optimized arithmetic. In contrast, our framework, ABE Squared, focuses on optimization rather than usability, such that a more accurate view of the efficiency of ABE schemes can be obtained. Although the implementations can be used by any cryptographer to benchmark and compare the schemes, they are not immediately suitable for practical applications like OpenABE. Furthermore, our implementations do not aim to provide a platform to readily implement new schemes, like Charm does, since this requires significant engineering expertise and some familiarity with RELIC.

[Figure 1: diagram relating OpenABE, Charm and ABE Squared to the labels Benchmarking, Optimized and Usability.]

Notation
We use the following notations. If an element is chosen uniformly at random from some finite set S, then we denote this as $x \in_R S$. We denote $[a, b] = \{a, a + 1, ..., b - 1, b\}$, and $[b] = [1, b]$. We use boldfaced variables $\mathbf{A}$ and $\mathbf{v}$ for matrices and vectors, respectively.

Access structures
In CP-ABE, the ciphertexts are associated with access policies, e.g., Boolean formulas consisting of the operators "AND" and "OR". To ensure that only authorized users can decrypt, the policies are converted into some suitable access structures.

Linear secret sharing schemes
For the definitions of the schemes, we represent policies $\mathbb{A}$ by linear secret sharing scheme (LSSS) matrices [GPSW06b], i.e., $\mathbb{A} = (\mathbf{A}, \rho)$ is such that $\mathbf{A} \in \mathbb{Z}_p^{n_1 \times n_2}$ is a matrix, and $\rho$ maps its rows to attributes. Then, for some random vector $\mathbf{v} = (s, v_2, ..., v_{n_2})$, the i-th share of secret s generated by this matrix is $\lambda_i = \mathbf{A}_i \mathbf{v}^{\top}$, where $\mathbf{A}_i$ denotes the i-th row of $\mathbf{A}$. Let S be an attribute set, and define $\Upsilon = \{i \in [n_1] \mid \rho(i) \in S\}$. If S satisfies $\mathbb{A}$, then there exist $\varepsilon_i \in \mathbb{Z}_p$ for all $i \in \Upsilon$ such that $\sum_{i \in \Upsilon} \varepsilon_i \mathbf{A}_i = (1, 0, ..., 0)$, and thus $\sum_{i \in \Upsilon} \varepsilon_i \lambda_i = s$.
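The sharing and reconstruction equations can be made concrete with a small sketch. The code below is illustrative only and is not taken from the implementations discussed in this work: the modulus `P`, the example matrix for the policy "a AND b", and the helper names `share` and `reconstruct` are assumptions chosen for the example.

```python
import secrets

P = 2**127 - 1  # toy prime modulus standing in for the group order p (assumption)

# LSSS matrix for the policy "a AND b": row 0 -> attribute a, row 1 -> attribute b
A = [[1, 1],
     [0, -1]]
RHO = ["a", "b"]


def share(matrix, secret, p=P):
    """Return the shares lambda_i = A_i * v^T for v = (secret, v_2, ..., v_n2)."""
    n2 = len(matrix[0])
    v = [secret % p] + [secrets.randbelow(p) for _ in range(n2 - 1)]
    return [sum(a * x for a, x in zip(row, v)) % p for row in matrix]


def reconstruct(shares, eps, p=P):
    """Recombine shares with coefficients eps = {row index: epsilon_i}."""
    return sum(e * shares[i] for i, e in eps.items()) % p


shares = share(A, secret=42)
# Rows 0 and 1 sum to (1, 0), so epsilon_0 = epsilon_1 = 1 recovers the secret.
recovered = reconstruct(shares, {0: 1, 1: 1})
```

A single share on its own is masked by the random value $v_2$, which is why both attributes are needed to recover s in this example.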

Implementing access structures
In our implementations, we represent the access structures as access trees [GPSW06a]. An access policy represented as a string is converted into a tree. The leaves correspond to the attributes and the nodes to OR, AND or (t, n)-threshold gates. In the full version [dlPVA22], we recap the algorithms to convert policies to access trees and LSSS matrices, and an algorithm to convert policies to more efficient LSSS matrices [LW10], which, as we show in Section 4.6.3, are more efficient than the access trees used by OpenABE.
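As an illustration of converting a policy into an LSSS matrix, the following sketch follows the labeling procedure commonly attributed to [LW10] for Boolean formulas with AND and OR gates. It is a simplified sketch under stated assumptions and not the code of our implementations: policies are given as nested tuples rather than strings, threshold gates are not handled, and the function name `policy_to_lsss` is hypothetical.

```python
def policy_to_lsss(policy):
    """Convert a Boolean formula into an LSSS matrix and a row labeling.

    policy is either an attribute name (str) or a tuple (gate, left, right)
    with gate in {"AND", "OR"}. Returns (matrix, rho), where rho[i] is the
    attribute associated with row i.
    """
    labelled = []              # (attribute, vector) for every leaf
    counter = 1                # current vector length c
    stack = [(policy, [1])]    # the root is labelled with the vector (1)
    while stack:
        node, vec = stack.pop()
        if isinstance(node, str):
            labelled.append((node, vec))
            continue
        gate, left, right = node
        if gate == "OR":
            # Both children inherit the vector of the parent.
            stack.append((right, vec))
            stack.append((left, vec))
        elif gate == "AND":
            # Pad the parent vector to length c; the children get v|1 and 0|-1.
            padded = vec + [0] * (counter - len(vec))
            stack.append((right, [0] * counter + [-1]))
            stack.append((left, padded + [1]))
            counter += 1
        else:
            raise ValueError(f"unsupported gate: {gate}")
    # Pad all leaf vectors with zeros to the same length.
    matrix = [vec + [0] * (counter - len(vec)) for _, vec in labelled]
    rho = [attr for attr, _ in labelled]
    return matrix, rho


matrix, rho = policy_to_lsss(("AND", "a", ("OR", "b", "c")))
```

For the policy "a AND (b OR c)", the rows for a and b (or a and c) sum to (1, 0), so all reconstruction coefficients are 0 or 1.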
• KeyGen(S, MSK) → $\mathrm{SK}_S$: The key generation takes as input a set of attributes S and the master secret key MSK, and outputs a secret key $\mathrm{SK}_S$.
• Encrypt(M, $\mathbb{A}$, MPK) → $\mathrm{CT}_{\mathbb{A}}$: The encryption takes as input a message M, an access policy $\mathbb{A}$ and the master public key MPK. It outputs a ciphertext $\mathrm{CT}_{\mathbb{A}}$.
• Decrypt($\mathrm{CT}_{\mathbb{A}}$, $\mathrm{SK}_S$) → $M'$: The decryption takes as input the ciphertext $\mathrm{CT}_{\mathbb{A}}$ with an access policy $\mathbb{A}$, and a secret key $\mathrm{SK}_S$ for a set of attributes S. It succeeds and outputs the message $M'$ if S satisfies $\mathbb{A}$. Otherwise, it aborts.
A scheme is called correct if $M' = M$.
Large-universe ABE. The universe of attributes U can be small or large [SW05]. If in the Setup, a public key is generated for each attribute, the universe is small. Conversely, if the size of the master public key does not depend on the size of the universe, the universe is large. In some large-universe schemes, the public key associated with some attribute is generated with e.g., a full-domain hash (FDH) [PTMW10,GPSW06a].
Multi-use ABE. The policies used during encryption may be restricted in the number of times that one attribute may occur. If an attribute may only occur once, we call the scheme one use. If it allows unlimited occurrences of one attribute, we call it multi use.

Pairings (or bilinear maps)
We define a pairing to be an efficiently computable map e on three groups G, H and $G_T$ of prime order p, i.e., $e : G \times H \to G_T$, with generators $g \in G$, $h \in H$ such that for all $a, b \in \mathbb{Z}_p$, it holds that $e(g^a, h^b) = e(g, h)^{ab}$ (bilinearity), and for $g^a \neq 1_G$, $h^b \neq 1_H$, it holds that $e(g^a, h^b) \neq 1_{G_T}$, where $1_G$ denotes the identity of the associated group G (non-degeneracy). We refer to G and H as the two source groups, and $G_T$ as the target group.
If an isomorphism exists between G and H, i.e., $G = H$, we call the pairing symmetric or of type I. If $G \neq H$, we call the pairing asymmetric. Specifically, if an efficiently computable homomorphism exists from H to G (but not from G to H), we call the pairing of type II, and if there exists no such efficiently computable homomorphism, we call the pairing of type III. With respect to the efficiency and security, type-III pairing groups are preferred [GPS08,Gal14]. However, most ABE schemes are designed in the type-I setting; hence, they need to be converted to the type-III setting [AGOT14, AGH15, AHO16]. We use $\hat{e}$ for pairings in general, and e for type-III pairings.

Pairing-based ABE
Most ABE schemes follow the same structure, and can be captured in the pair encodings framework [Att14,Att16,AC17b], which considers only "what happens in the exponent" of the keys and ciphertexts, and clearly indicates in which group each component resides. Schemes that fit in this framework have a master public key, secret keys and ciphertexts that exist almost entirely in the two source groups. The only exception is one target group element in the master public key and ciphertexts, which is used to mask the message, e.g., $M \cdot e(g, h)^{\alpha s}$. To decrypt, $e(g, h)^{\alpha s}$ must be recovered by pairing and possibly exponentiating the appropriate ciphertext and secret key components. Note that, because the secret keys and ciphertexts consist mostly of components in the two source groups, no pairing operations are required during key generation and encryption. We also use a shorter notation derived from this framework. For example, the master public key component $B_{\mathrm{att}} = g^{b_{\mathrm{att}}}$ is denoted as $[b_{\mathrm{att}}]_G$. Similarly, the secret key component $K = h^{\alpha - rb}$ is denoted as $[\alpha - rb]_H$ and the ciphertext component $C = M \cdot e(g, h)^{\alpha s}$ is denoted as $[m + \alpha s]_{G_T}$, where we assume that $M = e(g, h)^m$ for some $m \in \mathbb{Z}_p$. We symbolize the pairing operation as expected, e.g., $e([s]_G, [r]_H) = e(g, h)^{sr}$, and indicate an exponentiation with integer r as $[b]_G^r$, which then evaluates to $[rb]_G$. As a rule, we use variables involving the letter b for the master public key, variables involving the letter r for the secret keys and variables involving the letter s for the ciphertext.
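The bracket notation can be mimicked by a toy model that tracks only the exponent and the group of each element. The sketch below is purely illustrative and insecure by construction (it stores discrete logarithms in the clear); the class `Elem`, the function `pair` and the modulus `P` are hypothetical names introduced for this example and do not correspond to any library API.

```python
from dataclasses import dataclass

P = 2**127 - 1  # toy prime group order (assumption)


@dataclass(frozen=True)
class Elem:
    """[exp]_group: a group element represented only by its exponent."""
    group: str  # "G", "H" or "GT"
    exp: int    # exponent with respect to g, h or e(g, h)

    def __mul__(self, other):
        # Multiplying group elements adds their exponents.
        if self.group != other.group:
            raise TypeError("cannot multiply elements of different groups")
        return Elem(self.group, (self.exp + other.exp) % P)

    def __pow__(self, r):
        # [b]^r evaluates to [r * b] in the same group.
        return Elem(self.group, (self.exp * r) % P)


def pair(a, b):
    """Type-III pairing e : G x H -> GT in the exponent model."""
    if (a.group, b.group) != ("G", "H"):
        raise TypeError("a type-III pairing needs one argument in G and one in H")
    return Elem("GT", (a.exp * b.exp) % P)


s, r, b = 11, 13, 17
paired = pair(Elem("G", s), Elem("H", r))  # e([s]_G, [r]_H) = [sr]_GT
powered = Elem("G", b) ** r                # [b]_G^r = [rb]_G
```

The type check in `pair` reflects the constraint that, in the type-III setting, two components can only be paired if they live in different source groups.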

Our framework: ABE Squared
We introduce our framework, ABE Squared, in this section. Concretely, we identify several layers of optimization: the arithmetic and group operations, the pairing-friendly groups that are used, the order of the computations and the type conversion (see Figure 2). The goal of our framework is to optimize the theoretical description of the scheme. As such, we want to obtain a description of the scheme that directly yields the most efficient implementation. In this process, the design goal associated with a practical application is crucial: some applications may require an optimized encryption while others require an optimized decryption algorithm. In order to achieve this goal, these four layers need to be optimized. We do this by devising optimization approaches based on these design goals.
These optimization approaches consist of several steps. In particular, we first analyze the efficiency of the arithmetic and group operations used in the schemes by benchmarking their efficiency in the pairing-friendly groups that can be used. Subsequently, we show how the order of computations can be optimized, given the efficiency of the available algorithms for arithmetic in the chosen pairing-friendly groups. (Note, however, that the optimal order may depend on the choice of pairing-friendly groups and the distribution of the key and ciphertext components, i.e., in which groups these live. Because type conversion, which determines this distribution, is the next step, we may need to adjust the order at a later stage in the optimization approach.) Finally, we show how the schemes can be instantiated in these groups to obtain the best possible efficiency, given the design goal. To this end, we devise new manual and heuristic techniques to convert the scheme from the type-I to the type-III setting. Possibly, the choices that are made during this type conversion might require that we circle back to the choice of pairing-friendly groups or the order of computations. For instance, it may not be clear what the best choice of pairing-friendly group is without simply benchmarking the schemes for all of them.

Optimized arithmetic and group operations
We analyze the efficiency of the arithmetic that may be required to perform the algorithms of a scheme. Many efficient algorithms exist to perform arithmetic in groups G, H and $G_T$, including optimized algorithms for certain combinations of arithmetic. Furthermore, depending on the fixed use of certain variables, the use of precomputation may significantly speed up the computations.
• Variable-base exponentiation (VBE): an exponentiation of the form $g^x$, in which the variable base g varies in each execution of the algorithm;
• Fixed-base exponentiation (FBE): an exponentiation of the form $g^x$, in which the base g is fixed after the setup and is the same in each execution of the algorithm;
• Multiple-base exponentiation (MBE): a product of multiple exponentiations [Möl01], typically of the form $\prod_{i \in I} g_i^{x_i}$ such that $g_i$ are bases and $x_i$ are exponents for each $i \in I$ with $|I| \geq 2$. Note that RELIC refers to these as simultaneous exponentiations instead, and has two functions for this algorithm: _mul_sim, a two-base variant and _mul_sim_lot, a multi-base variant;
• Multi-pairing: a product of pairing operations can be executed more efficiently [GS06]. In general, a pairing computation consists of a Miller loop [Mil04] and a final exponentiation. In a pairing product, the final exponentiation can be shared, i.e., it only needs to be performed once. In this way, only the Miller loop needs to be executed for each additional pairing operation in the product;
• Fixed-argument pairing: a pairing operation can be computed more efficiently if the first argument is fixed. For instance, [CS10] speeds up the Miller loop by 37%. RELIC does not support fixed-argument pairings, however;
• Hashing into the group: a mapping from the set of arbitrary-length strings $\{0, 1\}^*$ to a group. RELIC supports these, including a more optimized variant for the BLS12-381 curve [WB19].

Figure 2: Overview of the ABE Squared framework and its relationship with ABE applications. In particular, the diagram describes the steps needed between the theoretical descriptions and the implementations of (possibly multiple) ABE schemes. Instead of moving from a design goal to the implementation of a scheme directly, we first optimize the theoretical description of the scheme for the chosen design goal.
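To give some intuition for why fixed-base and multiple-base exponentiations can be cheaper than repeated variable-base exponentiations, the sketch below implements two textbook techniques in the multiplicative group modulo a prime, used here as a stand-in for an elliptic-curve group. This is an assumption-laden illustration: the function names are hypothetical, the techniques (a table of repeated squarings and a two-base simultaneous square-and-multiply) are simple variants, and nothing here is claimed to match the algorithms that RELIC uses internally.

```python
P = 2**127 - 1  # toy prime modulus (assumption); real schemes use curve groups


def fixed_base_table(g, bits, p=P):
    """Precompute g^(2^i) for i < bits; done once, after the setup."""
    table = [g % p]
    for _ in range(bits - 1):
        table.append(table[-1] * table[-1] % p)
    return table


def fixed_base_exp(table, x, p=P):
    """Compute g^x from the table using multiplications only (no squarings)."""
    if x < 0 or x.bit_length() > len(table):
        raise ValueError("exponent does not fit the precomputed table")
    acc = 1
    for i, entry in enumerate(table):
        if (x >> i) & 1:
            acc = acc * entry % p
    return acc


def simultaneous_exp(g1, x1, g2, x2, p=P):
    """Compute g1^x1 * g2^x2 with one shared chain of squarings."""
    lookup = {(1, 0): g1 % p, (0, 1): g2 % p, (1, 1): g1 * g2 % p}
    acc = 1
    for i in reversed(range(max(x1.bit_length(), x2.bit_length()))):
        acc = acc * acc % p  # one squaring serves both exponents
        bits = ((x1 >> i) & 1, (x2 >> i) & 1)
        if bits != (0, 0):
            acc = acc * lookup[bits] % p
    return acc


table = fixed_base_table(5, bits=128)
fbe = fixed_base_exp(table, 123456789)
mbe = simultaneous_exp(5, 1234, 7, 5678)
```

The fixed-base variant trades memory for time, while the simultaneous variant roughly halves the number of squarings compared with two separate exponentiations.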

Optimal choices of pairing-friendly groups
Another aspect that influences the efficiency of the algorithms is the choice of the pairing-friendly group [GPS08,Ara17]. In general, many pairing-friendly groups exist that provide 128 bits of security [Gui20a], currently the recommended minimum security level for cryptography [Bar20]. These groups typically consist of elliptic-curve groups, such as the BLS [BLS02] and BN [BN05] curves. Some of the curves listed in [Gui20a] provide more than 128 bits of security, and therefore, they will still likely yield sufficient security if the most novel attacks are slightly improved [KB16,BD19,Gui20a]. In contrast, other curves provide slightly fewer than 128 bits of security and may not provide sufficient security if these attacks are improved. RELIC [AGM+] supports three curves with security levels in the [129, 135]-range, i.e., BN446, BLS12-446 and BLS12-455, and two curves in the [125, 128]-range, i.e., BLS12-381 and BN382. On the one hand, curves with a higher security level provide less efficient arithmetic [GPS08]. On the other hand, these curves provide more than 128 bits of security. This might also be beneficial, because most ABE schemes decrease a few bits in security as some of the parameters, e.g., the size of the access policies, grow [Wat11, RW13,AC17b]. If curves with a security level in the [125, 128]-range are used, the implementations of these schemes provide even fewer than 128 bits of security. For instance, BLS12-381 currently provides roughly 126 bits of security [GMT20], and the schemes that we have selected lose an additional 4 bits for the maximum policy sizes that we will use (and may even lose an additional 7 bits, see Section 4.1.4). Therefore, the implementations provide 122 bits of security. In contrast, BLS12-446 and BN446 provide 132 bits of security [GS19], and thus, the implementations provide 128 bits of security.

Benchmarks of the group operations on various curves
To choose a suitable curve, it is important to know how efficiently the group operations perform. Table 1 lists the performances of various algorithms on the elliptic curves that we will use in our benchmarks in Section 4. The table shows that, at the same security level, BLS12 curves outperform the BN curves in almost all algorithms except hashing and multiple-base exponentiations with large numbers of bases. Therefore, we expect that for most, if not all, schemes, the BLS12 curves are better choices than the BN curves. The table also shows that the arithmetic in G is generally faster than the arithmetic in H, which in turn is faster than the arithmetic in $G_T$. Furthermore, performing an additional pairing operation, whose costs are slightly lower than the costs incurred by a Miller loop, is more costly than exponentiating in G and H, while it is less costly than exponentiating in $G_T$.

Optimizing the order of computations
The order of the computations can also be optimized. The most notable example of an optimized order of computations is to share a pairing operation when several components share an argument on the other side of the pairing [PTMW10]. From this point forward, we will refer to this kind of product as a shared-argument pairing product. For instance, rather than computing $\prod_{i \in \Upsilon} \hat{e}(K, C_i)$, one can compute $\hat{e}(K, \prod_{i \in \Upsilon} C_i)$, which only requires one pairing operation and $|\Upsilon|$ multiplications in one of the source groups instead of one $|\Upsilon|$-multi-pairing. Similar optimizations can be done by allowing the key generation authority to generate components such as $h^{r_{\mathrm{att}}(b_1 \mathrm{att} + b_0) + rb}$ by first computing $r_{\mathrm{att}}(b_1 \mathrm{att} + b_0) + rb$ in $\mathbb{Z}_p$ and then exponentiating h with the result, rather than computing this as $h^{r_{\mathrm{att}}(b_1 \mathrm{att} + b_0) + rb} = h^{r_{\mathrm{att}} b_1 \mathrm{att}} h^{r_{\mathrm{att}} b_0} h^{rb}$. While the former only costs one exponentiation and three multiplications in $\mathbb{Z}_p$, the latter requires a three-base exponentiation, which is generally much less efficient. In optimizing the order of computations, it is important to know the efficiency of the operations in the various groups. For instance, $\prod_{i \in \Upsilon} (\hat{e}(K_{1,i}, C_{1,i}) \cdot \hat{e}(K_{2,i}, C_{2,i}))^{\varepsilon_i}$ is often optimized to $\prod_{i \in \Upsilon} \hat{e}(K_{1,i}^{\varepsilon_i}, C_{1,i}) \cdot \hat{e}(K_{2,i}^{\varepsilon_i}, C_{2,i})$, because two exponentiations in G are more efficient than one exponentiation in $G_T$. While this is the case for the curves considered in this work, it might be the case that, for some curves, it is more efficient to do one exponentiation in $G_T$ instead of two in G. Furthermore, we show in Section 3.6 that the optimal order may depend on the distribution of the key and ciphertext components over the two source groups (e.g., Remark 3).
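The shared-argument pairing product can be checked in a toy model in which every group element is represented by its exponent, so that a pairing multiplies exponents and a group multiplication adds them. The sketch below is only meant to show that both orders of computation give the same result while using a different number of pairings; the modulus `P`, the function names and the operation counters are assumptions made for this illustration.

```python
import secrets
from collections import Counter

P = 2**127 - 1  # toy prime group order (assumption)


def naive_product(k, cs):
    """prod_i e(K, C_i): one pairing per ciphertext component."""
    ops = Counter()
    acc = 0  # exponent of the identity in the target group
    for c in cs:
        acc = (acc + k * c) % P  # pair, then multiply in the target group
        ops["pairing"] += 1
    return acc, ops


def shared_argument_product(k, cs):
    """e(K, prod_i C_i): multiply in the source group first, then pair once."""
    ops = Counter()
    combined = 0  # exponent of the identity in the source group
    for c in cs:
        combined = (combined + c) % P
        ops["source_mul"] += 1
    ops["pairing"] += 1
    return (k * combined) % P, ops


k = secrets.randbelow(P)
cs = [secrets.randbelow(P) for _ in range(5)]
naive_value, naive_ops = naive_product(k, cs)
shared_value, shared_ops = shared_argument_product(k, cs)
```

With five ciphertext components, the naive order uses five pairings, whereas the shared-argument order uses a single pairing and five multiplications in a source group.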

Our optimization approaches for specific design goals
In optimizing the ABE schemes, we consider various approaches based on specific real-world design goals. In particular, this influences the conversion of the schemes to the type-III setting, but possibly also the choice of an elliptic curve. For pairing-based schemes in general, such conversions from the type-I to the type-III setting were previously considered in [AGH13, AGOT14, AGH15, AHO16], which all automate this effort and which focus mostly on other predicate encryption primitives such as identity-based encryption [Sha84]. However, these frameworks only optimize the parameter sizes, and not necessarily the computational costs of the algorithms. While e.g., optimizing the ciphertext size also results in an optimized efficiency of the encryption algorithm, such approaches might not necessarily lead to an optimized decryption algorithm. Furthermore, depending on the application in which ABE is going to be deployed, a practitioner may prefer a more balanced approach, in which the total costs of e.g., the encryption and decryption algorithms are optimized rather than either one of them. To this end, we define the following optimization approaches based on design goals.
• Optimized key generation (OK): optimize the efficiency of the key generation algorithm;
• Optimized encryption (OE): optimize the efficiency of the encryption algorithm;
• Optimized decryption (OD): optimize the efficiency of the decryption algorithm;
• Balanced key generation/encryption (BKE): optimize the average costs of the key generation and encryption algorithms;
• Balanced encryption/decryption (BED): optimize the average costs of the encryption and decryption algorithms.
A practitioner can also devise optimization approaches for other design goals, e.g., "balanced key generation/decryption". In general, a practitioner can specify any goal in which the average/total costs of any subset of algorithms are minimized. Even though these may be useful in practice, the conversion techniques used in these approaches are likely similar to those used in the aforementioned approaches.
The importance of computational-efficiency focused approaches. In contrast to most conversion frameworks [AGOT14, AGH15, AHO16], we do not necessarily describe our approaches in terms of the sizes of the public keys, secret keys or ciphertexts, but rather in terms of the computational costs of the algorithms like [AGH13]. However, due to the fact that the smallest group G also provides the most efficient arithmetic, we estimate that our optimized encryption and key generation approaches coincide with the optimized ciphertext and secret key size approaches in [AGH15]. The other three design goals, on the other hand, do not seem to match with any of the approaches in these conversion frameworks [AGOT14, AGH15, AHO16], even though these may be of interest to practitioners. For instance, optimizing either the key generation or encryption algorithm may result in a heavy performance penalty on the other algorithms, while a balanced approach would ensure that an algorithm can perform efficiently while only requiring minimal sacrifice in efficiency on the other algorithms. Furthermore, we show that an optimal key or ciphertext size does not necessarily imply an optimal decryption cost, but requires a more intricate, in-depth analysis of the decryption algorithm and the arithmetic provided by the chosen groups and pairing. At a high level, our approach is thus closer to [AGH13], which focuses on optimizing the computational efficiency of the scheme, albeit in an automated way. Another common denominator between [AGH13] and our work is that the converted scheme is not automatically secure, though we argue that the converted schemes are secure nonetheless. An advantage of our type-conversion techniques over [AGH13] is that we take into account the costs of the arithmetic and group operations in our optimizations.

Our type-conversion methods
We describe our type-conversion methods, which can be used to convert a scheme from the type-I to the type-III setting given some specific design goal (as discussed in Section 3.5). We assume that the scheme to be converted is given in the type-I setting, and that it can be somewhat freely converted to the type-III setting without breaking its security. This is often the case: schemes are predominantly designed in the type-I setting [Sco11], but the symmetry of the pairing in many cases is not needed for their security [AGH13]. The symmetry is, on the other hand, to some extent important for the correctness. Specifically, the key and ciphertext components that are paired during decryption need to live in different source groups. If these two paired components live in the same group, then this yields incorrectness of decryption, for the simple reason that they cannot be paired. In addition, when full-domain hashes are used, we are slightly more limited, since the components involving these need to be placed in the same source groups (see Remark 1). In sum, while we have much freedom in how we convert from the type-I to the type-III setting, we are bounded by the correctness of the scheme.
Furthermore, as we mentioned, conversion from the type-I to the type-III setting is not trivial, as any conversion heavily influences the efficiency of a scheme. Hence, we ideally want to apply this conversion in the optimal way considering the optimization approach (associated with the chosen design goal) and the correctness of the decryption algorithm. For instance, if we want to convert some scheme to the type-III setting such that it has the most efficient encryption algorithm (i.e., as in the OE approach), then we attempt to place as many ciphertext components in the first group G as possible. This consequently means that the key components that are paired with these ciphertext components need to be placed in the second group H. For the other approaches, the conversion is often more intricate, and requires knowledge of the computational costs in the groups G, H and $G_T$.
For each optimization approach, we follow the same steps: (1) We first list the secret key and ciphertext components, and order them in such a way, that it is clear which components are paired during decryption such that we can maintain correctness of decryption; (2) We specify for each key-ciphertext component pair whether they need to be exponentiated and whether they occur in a product during decryption; (3) We determine the computational costs, for each key and ciphertext component, of the key generation and encryption algorithm; (4) To determine the computational costs of the decryption algorithm, the order of the computations needs to be optimized (Section 3.4), which depends on the curve, and possibly the distribution of the components; (5) Based on this information, we can determine the best possible distribution of the key and ciphertext components over the two source groups for a specific optimization approach. We describe how this can be done below.
Remark 1 (Full-domain hashes). A full-domain hash (FDH) is a mapping $\mathcal{H}_1 : \{0, 1\}^* \to$ G that maps arbitrarily long bit strings into the group. Because the key and ciphertext components that involve the FDH must be computed from the same hash output, we need to place these components in the same group. As a consequence, we have less flexibility in optimizing schemes that use FDHs to achieve the large-universe property.

Optimized encryption
Given the list of paired key-ciphertext components, the strategy is simple: we place as many ciphertext components in the first source group as possible. Because we need to place the ciphertext components involving the FDH in G, we also place the key components involving an FDH in G, and thus place the ciphertext components paired with these key components in H.

Optimized key generation
Similarly, given the list of paired key-ciphertext components, the strategy is simple: we place as many key components in the first source group as possible. As in the optimized encryption approach, we always place ciphertext components involving an FDH in G, and thus place the key components paired with these ciphertext components in H.
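The optimized encryption and optimized key generation strategies can be sketched as a small placement routine. This is only an illustrative sketch: the `Pair` type, the `place` function and the component names are hypothetical and do not come from any ABE library. The FDH flags encode the rule of Remark 1 that components involving the hash live in G.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """A key component and the ciphertext component it is paired with."""
    key: str
    ct: str
    key_uses_fdh: bool = False
    ct_uses_fdh: bool = False


def place(pairs, goal):
    """Assign each component to 'G' or 'H' for goal 'OE' or 'OK'."""
    if goal not in ("OE", "OK"):
        raise ValueError("goal must be 'OE' or 'OK'")
    placement = {}
    for p in pairs:
        if p.key_uses_fdh and p.ct_uses_fdh:
            # Both would have to live in G, so they could not be paired.
            raise ValueError(f"{p.key} and {p.ct} cannot both use the FDH")
        if p.ct_uses_fdh:
            ct_group = "G"   # FDH outputs live in G
        elif p.key_uses_fdh:
            ct_group = "H"   # the key takes G, so the ciphertext takes H
        else:
            ct_group = "G" if goal == "OE" else "H"
        key_group = "H" if ct_group == "G" else "G"
        for name, group in ((p.key, key_group), (p.ct, ct_group)):
            if placement.setdefault(name, group) != group:
                raise ValueError(f"conflicting placement for {name}")
    return placement


# Hypothetical component names; the flags mark which components use the FDH.
small_universe = [Pair("K1", "C1"), Pair("K2", "C2"), Pair("K3", "C3")]
large_universe = [Pair("K1", "C1"),
                  Pair("K2", "C2", ct_uses_fdh=True),
                  Pair("K3", "C3", key_uses_fdh=True)]

for goal in ("OE", "OK"):
    print(goal, place(small_universe, goal), place(large_universe, goal))
```

Without an FDH, the goal alone decides every pair. With an FDH, only the pair that does not involve the hash remains free, which is why the two goals then differ in a single pair.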

Optimized decryption
To optimize decryption, we need to take a more careful approach. First, we need to consider whether group elements need to be exponentiated during decryption because they occur in a shared-argument pairing product (see Section 3.4), e.g., $\prod_j \hat{e}(K', C_j)^{\varepsilon_j} = \hat{e}(K', \prod_j C_j^{\varepsilon_j})$. In this case, our conversion consists of placing the shared argument in H and the other, which needs to be exponentiated, in G. For all key-ciphertext component pairs that do not occur in a shared-argument pairing product, it does not matter whether the key or ciphertext component is placed in G, as long as any potential exponentiation happens in the first source group. In these cases, we will place, by default, the ciphertext component in G, as it is oftentimes more important to optimize the encryption algorithm than the key generation algorithm. If the application allows the use of precomputation tables for all key components, we may also choose to place the key component in G, and perform the exponentiations with a fixed-base exponentiation.
Remark 2 (Shared exponentiations in G_T). During decryption, the combination of a pairing operation and an exponentiation, e.g., $\hat{e}(K_j, C_j)^{\varepsilon_j}$, may be part of a larger pairing product, in which multiple key-ciphertext component pairs are exponentiated with the same value, e.g., $\hat{e}(K_{1,j}^{\varepsilon_j}, C_{1,j}) \cdot \hat{e}(K_{2,j}^{\varepsilon_j}, C_{2,j}) \cdot \hat{e}(K_{3,j}^{\varepsilon_j}, C_{3,j}) = (\hat{e}(K_{1,j}, C_{1,j}) \cdot \hat{e}(K_{2,j}, C_{2,j}) \cdot \hat{e}(K_{3,j}, C_{3,j}))^{\varepsilon_j}$. If we compute it in the first way, then we require a 3-multi-pairing and three exponentiations in G. If we compute it in the second way, then we require three pairing operations and one exponentiation in G_T, which may be more efficient, depending on the curve. Furthermore, the second ordering of the pairing operations may allow the use of fixed-argument pairings [CS10,Sco11], in which case the key components need to be placed in G. In the first ordering, we cannot use a fixed-argument pairing operation without requiring that the exponentiation is placed in H, which likely negatively affects the computational costs (see also Remark 4). In conclusion, this illustrates that optimizing a scheme to attain the most efficient decryption is more intricate than previous conversion techniques suggested.
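The identity in Remark 2 can be checked in a toy model in which every group element is represented by its discrete logarithm, so that a pairing becomes a multiplication of exponents. The sketch below is not a real pairing implementation: the modulus `P`, the helper names and the operation counter are assumptions made purely for illustration, and the counter only tallies operations without reflecting their true costs.

```python
import random
from collections import Counter

P = 2**61 - 1    # toy prime group order (real curves use much larger orders)
ops = Counter()  # tallies how often each kind of operation is used


def exp_src(x, e):
    """Exponentiation in a source group, on discrete logs."""
    ops["exp_source"] += 1
    return (x * e) % P


def exp_gt(x, e):
    """Exponentiation in the target group, on discrete logs."""
    ops["exp_GT"] += 1
    return (x * e) % P


def pair(k, c):
    """Toy pairing: e(g^k, h^c) = g_T^(k*c)."""
    ops["pairing"] += 1
    return (k * c) % P


def mul_gt(values):
    """Product in the target group is addition of discrete logs."""
    return sum(values) % P


def first_ordering(keys, cts, eps):
    # Exponentiate each key component, then take the product of pairings.
    return mul_gt(pair(exp_src(k, eps), c) for k, c in zip(keys, cts))


def second_ordering(keys, cts, eps):
    # Take the product of pairings, then exponentiate once in G_T.
    return exp_gt(mul_gt(pair(k, c) for k, c in zip(keys, cts)), eps)


keys = [random.randrange(1, P) for _ in range(3)]
cts = [random.randrange(1, P) for _ in range(3)]
eps = random.randrange(1, P)
assert first_ordering(keys, cts, eps) == second_ordering(keys, cts, eps)
```

In a real implementation, the three pairings of the first ordering would be computed as one 3-multi-pairing, which the counter does not distinguish from three separate pairings.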

Balanced key generation/encryption
For a balanced efficiency of the key generation and encryption algorithms, we optimize the total key generation and encryption costs. We do this by considering the computational costs for each key-ciphertext component pair. For each pair, we place the component with the highest computational costs in G, and the other in H. For instance, if the pair (K, C) like in our example is such that the computation of K only requires a fixed-base exponentiation, and the computation of C requires a multi-base exponentiation, then we place C in G and K in H. In this approach, it is also important to consider whether the pairs occur in a shared-argument pairing product during decryption. In this case, we place the shared argument in H and the other components in G. Therefore, the shared argument incurs only a constant cost in H in the computation (during key generation or encryption), while it incurs a linear cost in G (during encryption or key generation), subsequently optimizing the total costs of these computations.
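The per-pair rule of the balanced key generation/encryption approach can be sketched as follows. The `CostedPair` type, the unit costs and the tie-breaking rule (the ciphertext component goes in G when costs are equal) are assumptions for illustration; a real analysis would use measured costs for the chosen curve.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CostedPair:
    key: str
    ct: str
    key_cost: float               # hypothetical cost of computing the key component
    ct_cost: float                # hypothetical cost of computing the ciphertext component
    shared: Optional[str] = None  # 'key' or 'ct' if that side is a shared argument


def place_balanced(pairs):
    """Place the costlier side of each pair in G; shared arguments go in H."""
    placement = {}
    for p in pairs:
        if p.shared not in (None, "key", "ct"):
            raise ValueError("shared must be None, 'key' or 'ct'")
        if p.shared == "key":
            ct_in_g = True                     # shared key argument goes in H
        elif p.shared == "ct":
            ct_in_g = False                    # shared ciphertext argument goes in H
        else:
            ct_in_g = p.ct_cost >= p.key_cost  # ties favor the ciphertext
        placement[p.key] = "H" if ct_in_g else "G"
        placement[p.ct] = "G" if ct_in_g else "H"
    return placement


# K needs one fixed-base exponentiation (cost 1), C a multi-base one (cost 2).
example = [CostedPair("K", "C", key_cost=1.0, ct_cost=2.0)]
print(place_balanced(example))
```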

Balanced encryption/decryption
Similarly, for a balanced efficiency of the encryption and decryption algorithms, we optimize the total encryption and decryption costs. This may be a slightly more complicated endeavor than the balanced key generation/encryption approach due to the more complicated nature of the optimized decryption strategy. As in that strategy, we need to take into account whether a key-ciphertext component pair occurs in a shared-argument pairing product or not. If it does, it is beneficial for the decryption costs to place the shared argument in H and the other components in G. However, the negative effect on the encryption costs may outweigh the positive effect on the decryption costs. For instance, suppose that the coefficients $\varepsilon_j$ are small, e.g., $\varepsilon_j \in \{0, 1\}$ like in [LW10]. Then, $\prod_j \hat{e}(C', K_{\rho(j)})^{\varepsilon_j}$ can be computed as $\hat{e}(\prod_j K_{\rho(j)}^{\varepsilon_j}, C')$ to minimize the decryption costs, requiring a linear number of multiplications in G. However, this requires that C' is in H, and computing it therefore likely costs at least one exponentiation in H instead of G (depending on the computational costs of C'). If the expected average costs incurred by the multiplications needed during decryption are lower than the costs incurred by computing C', we might want to place C' in G and place $K_{\rho(j)}$ in H.
Remark 3 (Optimizing decryption for the OE and OK approaches). The key-ciphertext component distribution that follows from applying the OE and OK approaches may not be optimal for the order of computations performed in the decryption. For instance, consider the case that several shared-argument pairings have shared exponentiations with the coefficients $\varepsilon_j$, as in Remark 2. Then, due to the distribution of the key and ciphertext components, the ordering that exponentiates in the source groups before pairing, which is the optimized order for the curves considered in this work, may be less efficient to compute than the ordering that exponentiates in G_T after pairing, for some curves. For instance, doing a (2 + n)-multi-pairing operation, two n-multiple-base exponentiations in H and n exponentiations in G may be more costly for some n than doing 3n pairing operations and n exponentiations in G_T. This is not the case for the schemes and curves in this work. In any case, it does illustrate that, after the type conversion has finished, we may have to circle back to the optimized order of computations to verify whether it is still optimized, given the distribution of components.
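The comparison of the two orderings can be made concrete with a simple cost model. The sketch below is a rough illustration under stated assumptions: the unit costs are hypothetical placeholders, not measurements. A k-multi-pairing is modeled as k Miller loops plus one final exponentiation, and an n-multiple-base exponentiation is bounded from above by n single exponentiations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    """Hypothetical unit costs; replace with measurements for a given curve."""
    miller_loop: float
    final_exp: float
    exp_G: float
    exp_H: float
    exp_GT: float


def multi_pairing(k, c):
    # A product of k pairings shares a single final exponentiation.
    return k * c.miller_loop + c.final_exp


def source_group_ordering(n, c):
    # One (2 + n)-multi-pairing, two n-multiple-base exponentiations in H
    # (upper-bounded by 2n exponentiations) and n exponentiations in G.
    return multi_pairing(2 + n, c) + 2 * n * c.exp_H + n * c.exp_G


def target_group_ordering(n, c):
    # 3n separate pairing operations and n exponentiations in G_T.
    return 3 * n * multi_pairing(1, c) + n * c.exp_GT


cheap_H = Costs(miller_loop=1.0, final_exp=1.0, exp_G=0.2, exp_H=0.5, exp_GT=1.0)
costly_H = Costs(miller_loop=1.0, final_exp=1.0, exp_G=0.2, exp_H=10.0, exp_GT=1.0)
for name, c in (("cheap_H", cheap_H), ("costly_H", costly_H)):
    print(name, source_group_ordering(10, c), target_group_ordering(10, c))
```

With the first set of placeholder costs the source-group ordering is cheaper, while an artificially expensive group H reverses the comparison. This is why the order of computations should be checked again after the components have been distributed.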

Example: type-converting Wat11
We explain our type-conversion techniques through an example: by converting the CP-ABE scheme by Waters (Wat11) [Wat11] from the type-I to the type-III setting. We first show how to convert the small-universe version of Wat11, and then argue how these conversions translate to the large-universe version of Wat11.

Wat11-I: the small-universe variant
In the type-I setting, the Wat11-I scheme [Wat11] is defined as follows.
Definition 2 (The Wat11-I-SYM scheme [Wat11]). The small-universe CP-ABE scheme by Waters is defined in the type-I (or: symmetric) setting as follows.
• Setup(λ): Taking as input the security parameter λ, the setup generates two groups G, G_T of prime order p with generator g ∈ G, and chooses a pairing $\hat{e}$ : G × G → G_T. The universe of attributes is U. The setup also generates random integers $\alpha, b, b_{att} \in_R \mathbb{Z}_p$ for all att ∈ U. It outputs MSK = $(\alpha, b, \{b_{att}\}_{att \in U})$ as its master secret key and publishes the master public key as
• KeyGen(MSK, S): On input a set of attributes S, the algorithm generates a random integer $r \in_R \mathbb{Z}_p$ and computes the secret key as
.., n_1} → U by generating random integers $s, s_i, v_j \in_R \mathbb{Z}_p$ for all $i \in [n_1]$ and $j \in [2, n_2]$, and computing the ciphertext as

Listing the key and ciphertext components
To convert the scheme to the type-III setting, we first consider which key components need to be paired with which ciphertext components (see Table 2). In this way, we can ensure that each pair has exactly one component in each source group.

Optimized encryption and key generation
For the optimized encryption and key generation approaches, it is clear in which source groups the components need to be placed. Because the scheme does not involve hashing into the group, we have much freedom. For the optimized encryption approach, we can simply place all ciphertext components in G, and the key components in H. Conversely, for the optimized key generation approach, we can place all key components in G and the ciphertext components in H.

Balanced key generation/encryption
For a more balanced approach in the efficiency of key generation and encryption, we take into account the number of components on the "other side of the pairing" during decryption.
For instance, if one places K' in G, then all $C_{1,j}$ need to be placed in H, blowing up the encryption costs considerably. Hence, we place K' in H and $C_{1,j}$ in G to make the key generation and encryption costs more balanced. For the ($K_{att}$, $C_{2,j}$) key-ciphertext component pair, there is no such trade-off, as both cost one fixed-base exponentiation.
In this case, we favor the encryption algorithm (as mentioned in Section 3.6), as it is probably run more often than the key generation algorithm. (Note, however, that one may want to take a different approach, and favor the key generation over the encryption algorithm instead.) For this reason, we place $C_{2,j}$ in G and $K_{att}$ in H. Thus, the optimized encryption and the balanced key generation/encryption efficiency approaches yield the same constructions, since all key components are placed in H.

Optimized decryption
To optimize the decryption algorithm, we need to consider the best order of the operations performed during decryption. Because a pairing operation is usually one of the most expensive operations, we want to minimize the number of pairings. Consequently, we use a shared-argument pairing and place the exponentiations in G (Section 3.4). To ensure this, it is therefore better to put $C_{1,j}$ in G and K' in H. For the other product of pairing operations, i.e., $\prod_{j \in \Upsilon} \hat{e}(C_{2,j}^{\varepsilon_j}, K_{\rho(j)})$, it does not matter in which groups $K_{\rho(j)}$ and $C_{2,j}$ live, as we can exponentiate in G, regardless of whether $K_{\rho(j)}$ or $C_{2,j}$ is in it. If, on the other hand, one is willing to use precomputation tables for all key components $K_{att}$, then we can speed up decryption by placing $K_{att}$ in G. Because this may require a large amount of precomputation space, this may, however, not be desirable in practice. Hence, we do not use precomputation, and, as mentioned in Section 3.6, we choose to favor the encryption efficiency over the key generation efficiency. We thus place the ciphertext component $C_{2,j}$ in G and $K_{att}$ in H.
Table 3. The distributions of the key and ciphertext components of Wat11-I over the groups G and H, for each optimization approach, i.e., optimized encryption (OE), optimized key generation (OK), optimized decryption (OD), balanced key generation/encryption (BKE) and balanced encryption/decryption (BED).
Remark 4 (Fixed-argument pairings). We can hardly speed up the decryption algorithm, if at all, by using a fixed-argument pairing operation, which decreases the Miller loop costs by 37% [CS10]. This would, however, require us to place the key components in the first source group, and in the case of $\hat{e}(\prod_{j \in \Upsilon} C_{1,j}^{\varepsilon_j}, K')$, which would thus be computed as $e(K', \prod_{j \in \Upsilon} C_{1,j}^{\varepsilon_j})$ in the type-III setting, this may slow down the computation, as we would require |Υ| exponentiations in H instead of in G. Depending on the expected number of exponentiations, the decrease in the computational costs of the pairing operation may be outweighed by the additional costs incurred by the exponentiations. For the pairings involving $C_{2,j}$ and $K_{\rho(j)}$, we might be able to benefit from using a fixed-argument pairing, on the condition that we can do the exponentiation in G as well. Otherwise, the speed-up that is obtained from using a fixed-argument pairing may be outweighed by the additional overhead that the exponentiation in H incurs over an exponentiation in G. However, it is unclear if it is possible to adjust existing algorithms [CS10] to facilitate both using a fixed-argument pairing and doing an exponentiation in G. Furthermore, because precomputation needs to be done for each $K_{att}$, this requires the storage of possibly thousands of points.

Balanced encryption/decryption
Because the type conversion is the same for the optimized encryption and decryption approaches, it is, by extension, also the same for the balanced encryption/decryption approach. This is because, for each key-ciphertext component pair, we chose the best distribution of the two source groups with respect to the encryption and decryption efficiency. This therefore also yields the best efficiency trade-offs for the two.

Overview of the distributions for each optimization approach
In Table 3, we summarize the distributions of the key and ciphertext components for each approach. In particular, it shows that the distributions are the same for the optimized encryption, optimized decryption, balanced key generation/encryption and balanced encryption/decryption approaches.

Wat11-IV: the large-universe variant
The large-universe variant of the Waters scheme (Wat11-IV) [Wat08] replaces the generator $g^{b_{att}}$ by the output of a hash function, i.e., $\mathcal{H}(att)$, where $\mathcal{H} : \{0, 1\}^* \to$ G denotes a hash function that maps arbitrary strings randomly into the group G. The advantage of this is that the scheme can support any arbitrary string as attribute without requiring the master public key to be updated. Compared to the original, small-universe variant of the scheme, little needs to change. However, the use of a hash into the group gives us a little less freedom in the conversion from the type-I to the type-III setting. By Remark 1, we necessarily place the key and ciphertext components involving the hash in the same source group. For the optimized encryption and optimized key generation approaches, it is evident that these therefore need to be placed in the first source group (see Table 1). For optimized decryption, it follows from the (K', $C_{1,j}$) key-ciphertext component pair, which occurs in a shared-argument pairing product, that the components involving the hash need to be placed in the first source group. That is, K' needs to be placed in H because it is the shared argument in the shared-argument pairing product, while $C_{1,j}$, which involves the hash, needs to be placed in G. By extension, this requires $K_{att}$ to be placed in G as well, because it involves a hash. The distribution of the key and ciphertext components is therefore almost entirely fixed for all optimization approaches. We can only choose the distribution of components K and C'. Because these incur a constant cost, the key generation, encryption and decryption costs are almost entirely fixed as well. Table 4 describes the distributions of the components over the two source groups of Wat11-IV. As it shows, the distributions are the same for the pairs (K', $C_{1,j}$) and ($K_{att}$, $C_{2,j}$). For the pair (K, C'), the distribution is the same as for Wat11-I.

Selecting the best elliptic curve for a specific goal
To fully optimize a scheme, it is important that the best curve is selected for each scheme and for each design goal. In general, this may not be the same curve for each scheme and each design goal, as the different choices of curves provide different trade-offs in efficiency. For instance, BN382 provides efficient hashing in the two source groups, while BLS12-381 provides efficient exponentiations and pairing operations [Ara17]. It may therefore be the case that ABE schemes that require many hashing operations are more efficient on the BN382 curve, while schemes that do not require these perform better on BLS12-381 curves. More generally, curves exist that provide more efficient arithmetic in G [CDS20] or that provide more efficient products of pairings [GF16]. However, these are unfortunately not supported by RELIC.
To determine the optimal curve for each scheme and each design goal, we compare the computational costs of several ABE schemes on the curves supported by RELIC that provide the same level of security.

Benchmarking
We show how our framework can be applied to several existing ABE schemes.

The schemes
In this work, we analyze and implement several selectively secure ciphertext-policy ABE schemes. We have motivated our choice to implement CP-ABE schemes in Section 1.2, and we will motivate the choice to implement selectively secure schemes below. The schemes that we implement are the Waters schemes (the previously considered small and large universe variants called Wat11-I and Wat11-IV) [Wat11,Wat08], the Rouselakis-Waters large-universe scheme without random oracles (RW13) [RW13] and the Agrawal-Chase multi-use scheme (AC17) [AC17b].

The Wat11 schemes
The Wat11 schemes [Wat11,Wat08] are the ciphertext-policy variants of the first ABE schemes [SW05,GPSW06a] and the selectively secure and more efficient variants of their fully secure counterpart [LOS+10]. In general, the structure of the scheme is important, as it provides the structure for many follow-up schemes, e.g., [Att14,Wee14,KW19], which provide better security guarantees than Wat11. Furthermore, Wat11-IV, i.e., the large-universe variant using an FDH, is also implemented in OpenABE [Zeu20]. We have chosen to analyze this scheme mainly for its popularity.

The RW13 scheme
The RW13 scheme [RW13] is the selectively secure and ciphertext-policy counterpart of the fully secure KP-ABE scheme by Lewko and Waters [LW11b]. Like Wat11-IV, it supports large universes. However, instead of using an FDH to generate a public key for each attribute string, it uses a special type of hash, first introduced in identity-based encryption by Boneh and Boyen [BB04]. In particular, this hash does not need to be modeled as a random oracle [BR93] in the security proof. Much like Wat11, it is an important scheme due to its many follow-up schemes, e.g., [Att14, HW14, CGKW18, KW19, Att19]. Yet, despite its theoretical popularity, it is not often considered in efficiency comparisons with other schemes. This is one of the main reasons why we analyze its efficiency.

The AC17 scheme
The AC17 scheme [AC17b] is the multi-use variant of the second CP-ABE scheme by Waters in [Wat11] (Wat11-II) in the selective-security setting. In the full-security setting, AC17 is the multi-use variant of the one-use scheme by Attrapadung [Att14]. A somewhat related scheme is FAME [AC17a], which is single use, supports large universes and is derived from the small-universe CP-ABE scheme by Chen, Gay and Wee [CGW15]. The advantage of these single-use schemes is that they require fewer pairing operations during decryption, making decryption more efficient than e.g., Wat11-I. This is one of the main reasons why standardization institutes such as ETSI (European Telecommunications Standards Institute) [ETS18] have expressed interest in this scheme. However, the drawback of these single-use schemes is that each attribute may only occur once in each access structure. To support both multi-use access structures and to benefit from an efficient decryption algorithm, AC17 combines the techniques of Wat11-I and Wat11-II. In this way, AC17 is more efficient than Wat11-I and more flexible than Wat11-II and FAME. Much like for the Wat11-I scheme, we also consider the large-universe variant of AC17 using an FDH (which subsequently yields security in the random oracle model [BR93]) in our analysis. We have chosen to analyze this scheme because of its flexibility and efficiency.

On the security of these schemes
We briefly discuss the security of the implemented schemes.
Selective versus full security. As mentioned, we consider the selectively secure variants of Wat11, RW13 and AC17, because this yields a cleaner comparison of the schemes on a structural level. In contrast, many fully secure variants of these schemes exist, which are similar to Wat11, RW13 and AC17 on a structural level, but these differ in the underlying groups. For instance, LOSTW10 [LOS+10] is a fully secure variant of Wat11 and is instantiated in composite-order groups. For the same security level, LOSTW10 performs one to two orders of magnitude worse than Wat11, which can be instantiated in a prime-order group [Gui13]. Other fully secure variants of Wat11 [Att19, KW19], which allow for instantiation in prime-order groups, might simply use different underlying group structures, which may affect the efficiency as well. However, this difference in efficiency might then be (partially) attributed to the choice of underlying groups, and not necessarily the different structures of the schemes. For a fair comparison of two structurally different schemes, one could first compare the efficiency of the selectively secure variants, instantiated in prime-order groups. Then, one can extrapolate the comparison to the full-security setting by considering the efficiency of the chosen underlying groups, which can be chosen the same if all the compared schemes have a fully secure counterpart in the same framework, e.g., [Att14,Wee14,CGW15,Att16,Att19]. Note that this is the case for Wat11, RW13 and AC17, which all have instantiations in the pair encodings framework [AC17b,Att19].
Security in idealized models. The security of the implemented schemes depends on idealized models, such as the random oracle model (ROM) [BR93] and the generic group model (GGM) [Sho97]. In particular, the large-universe variants of Wat11 and AC17 model the FDH as a random oracle in the proofs. Furthermore, the security of all three schemes depends on a q-type assumption [BBG05,Boy08]. These are assumptions that are parametrized by some parameter q, which in turn depends on one or more system parameters of the scheme. Typically, the q-type assumptions that are frequently used in selective-security proofs grow stronger as q increases. Specifically, Cheon [Che06] has shown that the security strength decreases by roughly $\log_2(\sqrt{n_2})$ bits, where $n_2$ denotes the maximum number of columns in an LSSS access structure, used during encryption. For instance, if the maximum policy size is 100 (like in this work), then we lose at least 4 bits (due to Cheon's attack) and at most 11 bits of security (in GGM, due to Boneh, Boyen and Goh's asymptotic lower bound [BBG05,Boy08]).
On the security of the type-converted schemes. An additional advantage of considering the selectively secure variants of the schemes is that their security proofs carry over to their type conversions as well. That is, in all of their selective security proofs [Wat11, RW13,AC17b], the inputs to a q-type assumption are embedded in the key and ciphertext components, such that the q-type assumption can be broken if the scheme's security can be broken. This q-type assumption is typically shown to be generically secure [BBG05,Boy08]. By extension, any type-converted variant of this q-type assumption with the same number of parameters is also generically secure with the same security loss in the GGM. In the worst case, they have twice as many parameters, and thus lose at most one additional bit of security in the GGM [Boy08].
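As a quick numerical check of the estimate for Cheon's attack given above, the loss of roughly $\log_2(\sqrt{n_2})$ bits can be evaluated directly. The sketch assumes that a maximum policy size of 100 corresponds to $n_2 = 100$, and that the quoted figure of 4 bits is obtained by rounding up; both are assumptions of this illustration rather than statements from the analysis.

```python
import math


def cheon_loss_bits(n2):
    """Approximate security loss in bits, log2(sqrt(n2))."""
    return math.log2(math.sqrt(n2))


loss = cheon_loss_bits(100)  # assumes n2 = 100
print(round(loss, 2), math.ceil(loss))
```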

The optimized type-converted constructions in short notation
Because we have five optimization approaches and five schemes (two small-universe and three large-universe schemes), we potentially obtain twenty-five different constructions to analyze. To effectively highlight the differences, we use the representation of ABE as introduced in Section 2.5, which only consists of the main differences: the master public key, the secret keys, the ciphertexts and decryption. For each scheme, we use $\lambda_j = A_j v$ as the j-th share of the secret s, where A is the $n_1 \times n_2$ LSSS matrix and $v = (s, v_2, \ldots, v_{n_2}) \in_R \mathbb{Z}_p^{n_2}$ is a vector with random entries, both used during encryption. During decryption, we use $\Upsilon = \{j \in [1, n_1] \mid \rho(j) \in S\}$, where S is the set of attributes associated with the key, and $\{\varepsilon_j\}_{j \in \Upsilon}$ such that $\sum_{j \in \Upsilon} \varepsilon_j \lambda_j = s$ (Section 2.2.1). We also distinguish between variables that the authority (which generates the keys) does and does not know. By placing a bar above a variable, e.g., $\bar{b}_{att}$, we indicate that the authority does not know $b_{att}$. For instance, by using an FDH to generate $b_{att}$, the implicit exponent $b_{att}$ in $\mathcal{H}(att) = g^{b_{att}}$ is unknown, and is thus represented as $[\bar{b}_{att}]_G$.
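The share and reconstruction notation can be illustrated with a small numerical sketch. The modulus `P`, the helper names and the example matrix (a common LSSS encoding of the policy "attribute 1 AND attribute 2") are assumptions chosen for illustration. Rows are indexed from 0 in the code, whereas the text indexes them from 1.

```python
import random

P = 2**61 - 1  # toy prime modulus standing in for the group order p


def shares(A, v):
    """Compute lambda_j = A_j * v (mod P) for every row A_j of the matrix."""
    return [sum(a * x for a, x in zip(row, v)) % P for row in A]


def reconstruct(eps, lam):
    """Compute the sum of eps_j * lambda_j over the rows j in Upsilon."""
    return sum(e * lam[j] for j, e in eps.items()) % P


# LSSS matrix for "attribute 1 AND attribute 2" (n1 = 2 rows, n2 = 2 columns).
A = [[1, 1],
     [0, -1]]

s = random.randrange(P)        # the secret
v = [s, random.randrange(P)]   # v = (s, v_2)
lam = shares(A, v)

# Both attributes are in the key, so Upsilon = {0, 1} with eps_0 = eps_1 = 1.
eps = {0: 1, 1: 1}
assert reconstruct(eps, lam) == s
```

Here the first share is $s + v_2$ and the second is $-v_2$, so their sum is s. A key that holds only one of the two attributes sees a single share, which on its own is masked by the random value $v_2$.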
In this section, we use the following naming convention. For schemes that already have an implementation, we append the name of the scheme with the suffix "CP", i.e., ciphertext-policy, to distinguish the schemes from their potential key-policy and other counterparts. For our type conversions, we use as suffix the appropriate acronym associated with the applied optimization approach. For example, Wat11-IV-OE is the name of the variant of Wat11-IV that is optimized with respect to the encryption algorithm. For conciseness, we do not use "CP" in the suffix of the names of our optimizations.

Wat11-IV
We obtain two different type-converted constructions of Wat11-IV, which is the large-universe variant of Wat11. As we mentioned in Section 3.7.8, the type conversion is essentially fixed for almost all variables, except for the pair ([α − rb], [s]).

Wat11-IV-OE.
We define Wat11-IV-OE as the type-converted variant of Wat11-IV with the most optimized encryption and decryption algorithms, and the most balanced key generation/encryption and encryption/decryption algorithms as follows.
• Master public key:

Wat11-IV-OK.
We define Wat11-IV-OK as the type-converted variant of Wat11-IV with the most optimized key generation algorithm as follows.

RW13
We obtain two different type-converted constructions of RW13. In general, owing to the lack of an FDH to obtain the large-universe property, RW13 is more flexible to convert. For example, for an optimized encryption (resp. key generation) algorithm, all ciphertexts (resp. keys) should be placed in G. For an optimized decryption, we need to observe the order of computations and the efficiency of the group operations to determine the best conversion.

RW13-OK.
We define RW13-OK as the type-converted variant of RW13 with the most optimized key generation algorithm as follows.

RW13-OE.
We define RW13-OE as the type-converted variant of RW13 with the most optimized encryption algorithm as follows. We show that it also has the most efficient decryption algorithm, as well as the most balanced key generation/encryption and encryption/decryption.
• Master public key: Note that the decryption algorithm is already optimized. Specifically, e( j∈Υ [λ j b + Moreover, this variant is also the most balanced. Because this variant is already optimized with respect to its encryption and decryption efficiency, it logically also optimizes the total encryption and decryption costs, and thus also provides the variant with the most balanced encryption-decryption efficiency. Furthermore, it has the most balanced key generation-encryption efficiency, because each key component can be generated with a single fixed-base exponentiation. In contrast, the associated ciphertext component costs at least a single fixed-base exponentiation, and is thus at least as expensive.

AC17
We obtain three different type-converted constructions of the small-universe variant of AC17, and two type-converted constructions of the large-universe variant of AC17. During encryption, we also include an additional function $\tau : [n_1] \to [\mu]$, where µ denotes the maximum number of uses of each attribute, such that τ is a mapping where for all $j, j' \in [n_1]$ with $j \neq j'$ for which $\rho(j) = \rho(j')$, it holds that $\tau(j) \neq \tau(j')$, i.e., each occurrence of the same attribute is mapped to a different integer in [µ]. For the small-universe variants, we also include the universe of attributes U in the setup. Note that, in the conversions of the small-universe variants, we have much freedom in how we convert the schemes. In contrast, in the large-universe variant, the FDH almost entirely fixes the conversion, much like in the Wat11-IV scheme.
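One simple way to obtain such a function τ is to number the occurrences of each attribute in the order in which they appear in the policy. The sketch below assumes that ρ is given as a list of attribute strings, one per row. The function names are hypothetical, and numbering in order of appearance is only one of many valid choices.

```python
from collections import defaultdict


def build_tau(rho):
    """Map row j to the index of this occurrence of rho[j], counting from 1."""
    seen = defaultdict(int)
    tau = []
    for att in rho:
        seen[att] += 1
        tau.append(seen[att])
    return tau


def required_mu(rho):
    """Smallest bound mu on the number of uses that this policy needs."""
    return max(build_tau(rho), default=0)


rho = ["doctor", "nurse", "doctor", "doctor"]
tau = build_tau(rho)

# Rows that carry the same attribute always receive different values.
for j in range(len(rho)):
    for k in range(j + 1, len(rho)):
        if rho[j] == rho[k]:
            assert tau[j] != tau[k]
```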

AC17-OE.
We define AC17-OE as the type-converted variant of the small-universe variant of AC17 with the most optimized encryption algorithm as follows. Much like in the Wat11-I scheme, we place all ciphertext components in G.
• Master public key:

AC17-OK.
We define AC17-OK as the type-converted variant of the small-universe variant of AC17 with the most optimized key generation algorithm as follows. Much like in the Wat11-I scheme, we place all key components in G.
• Master public key:

AC17-OD.
We define AC17-OD as the type-converted variant of the small-universe variant of AC17 with the most optimized decryption algorithm as follows. We also show that this is the variant with the most balanced key generation/encryption and encryption/decryption efficiency. To optimize the decryption efficiency of AC17, we first consider the order of computations in the decryption algorithm of AC17-OK, i.e.,
• Master public key:
Note that this variant also has the most balanced key generation-encryption, and most balanced encryption-decryption efficiency. For the BKE variant, this follows quite simply.

For the BED variant, this follows simply for the pairs ([α − rb], [s]) and ([r], [λ
, which have the same distributions in the optimized encryption and optimized decryption variants, and are thus also optimized in the total costs. For the pairs ($[rb_{\rho(j)}]$, $[s_l]$), we consider the associated encryption and decryption costs. For encryption, we require µ fixed-base exponentiations. The decryption costs can be upper bounded by |Υ| exponentiations, and lower bounded by one |Υ|-base exponentiation (for µ = 1), in the group in which the key component lives. Because, generally, we assume that |Υ| > µ, the costs of the |Υ|-base exponentiation dominate those of the µ fixed-base exponentiations. To minimize the total costs, we thus place $[rb_{\rho(j)}]$ in G and $[s_l]$ in H.

AC17-LU-OE.
We define AC17-LU-OE as the type-converted variant of the large-universe variant of AC17 with the most optimized encryption as follows. This variant also has the most efficient decryption algorithm, and the most balanced key generation/encryption and encryption/decryption efficiency. Much like for Wat11-IV (Section 3.7.8), the distribution of AC17-LU is almost entirely fixed because of the FDH.
• Master public key:
Note that, in general, the only non-fixed pair is ([α − rb], [s]). For the optimized encryption variant, we place the ciphertext component [s] in G, and for the optimized key generation variant, we place the key component in G. For the decryption efficiency, the distribution does not matter, and therefore, we place the ciphertext component in G for the optimized decryption approach, but also for the most balanced encryption-decryption approach. Similarly, the total costs of the key generation and encryption algorithms are the same for the two possible distributions, and thus, we also place the ciphertext component in G for the balanced key generation-encryption approach.

AC17-LU-OK.
We define AC17-LU-OK as the type-converted variant of the large-universe variant of AC17 with the most optimized key generation as follows.
• Master public key:

Comparing our type conversions with existing implementations
We can now compare all the various optimization approaches described above with the implementations in the literature. The type-converted schemes considered in this work have been previously implemented in Charm [AGM + 13] and OpenABE [Zeu20]. Specifically, Wat11-I and RW13 have been implemented in Charm, and Wat11-IV has been implemented in OpenABE. Furthermore, FAME [AC17a]-which is related to the AC17 scheme-was previously implemented in Charm as well. We briefly compare their type conversions with ours, such that we can determine with respect to which optimization approach these implementations could be interpreted to be optimized.

Wat11-I-CP.
The small-universe variant of Wat11 as presented in Charm is defined as follows.
• Master public key:
Note that this construction does not match any of our type-converted constructions of Wat11-I. (It is, however, similar to Wat11-IV-OE and Wat11-IV-OK without the use of a hash to achieve large-universeness.) The designers [AC17a] explain that their type conversions aim to balance the total work fairly between encryption and key generation. This is indeed the case, as the total costs of the key generation and encryption algorithms are the same as the total costs of our variant with a balanced key generation-encryption efficiency. The difference between the two conversions can be attributed to our "deterministic" decisions in the conversion: for each key-ciphertext component pair that incurs equal costs, we place the ciphertext component in G. In general, for such pairs, the distribution does not affect the total costs of the key generation and encryption algorithms as they are equal for the two choices.

Wat11-IV-CP.
The large-universe variant of Wat11 as presented in OpenABE is defined as follows.
• Ciphertexts:
Note that this construction is exactly the same as Wat11-IV-OE. Therefore, the implementation of OpenABE is optimized with respect to its encryption and decryption efficiency rather than its key generation efficiency.

RW13-CP.
The variant of RW13 as presented in Charm is defined as follows.
• Master public key:

AC17-CP.
The small-universe variant of AC17 with a similar key-ciphertext component distribution as FAME [AC17a] in Charm is defined as follows.
• Master public key:
• Secret keys:
• Ciphertexts:
• Decryption:
Note that this construction is similar to our construction AC17-OD. Therefore, this construction is optimized with respect to the decryption algorithm.

AC17-LU-CP.
The large-universe variant of AC17 with a similar key-ciphertext component distribution as FAME [AC17a] in Charm is defined as follows.
• Master public key:
Note that this construction is similar to our construction AC17-LU-OK. Therefore, this construction is optimized with respect to the key generation algorithm.

Variants of Wat11-I with bad distributions of components
We also include two variants of Wat11-I, Wat11-I-BAD-I and Wat11-I-BAD-II, with a bad distribution of components in our analysis. This will illustrate that the choice of conversion technique does matter in any analysis, even if the arithmetic and group operations are optimized. The two implementations of this type conversion differ in whether they use optimized arithmetic or not. In particular, Wat11-I-BAD-II uses optimized arithmetic, while Wat11-I-BAD-I does not. This will illustrate that the use of optimized arithmetic speeds up the algorithms significantly. The converted scheme is defined as follows.
• Master public key:
Note that the distribution of this scheme is bad, because it maximizes the number of operations in H. In particular, all algorithms require a linear number of operations in H. In contrast, Wat11-I-OE and Wat11-I-OK require a linear number of operations in G in the encryption and decryption, and key generation and decryption, respectively.

Implementation details
We rely on the RELIC toolkit [AGM + ] version 0.5 to implement the schemes described in Section 4.2. Multiple versions of the library were generated for the x86-64 architecture, i.e., for curves BN256, BN382, BN446, BLS12-381, and BLS12-446. We use the gcc compiler version 10.3.0 on a Linux distribution with kernel version 5.11.0-18 and the following flags: -O3 -funroll-loops -fomit-frame-pointer -finline-small-functions -march=native -mtune=native.
RELIC is a cryptographic library that provides efficient arithmetic implementations of prime and binary fields, bilinear maps and extension fields. Since it implements architecture-dependent code, it aims at optimizing speed. We chose RELIC over the alternatives because of its continuing contributions from academia. Examples are the recent integration of the fast constant-time GCD algorithm of Bernstein and Yang [BY19] and the records RELIC has set [AFK + 12]. In addition, of all libraries, RELIC supports the most curves providing at least 128 bits of security [SKSW20]. Therefore, it allows us to compare the efficiency of ABE schemes with regard to various curves. The default compilation options of RELIC for field arithmetic consist of the integrated modular addition, multiplication, squaring, Montgomery reduction and sliding window modular exponentiation. The extension field arithmetic options are based on the integrated lazy-reduced extension field arithmetic of RELIC. Finally, the compilation flags for the bilinear pairing implementation select the optimal Ate pairing [Ver10]. We use the library to perform, for instance, the optimal Ate pairing of two group elements in a parametrized elliptic curve of embedding degree 12, and to perform multi-pairing operations and simultaneous exponentiations of group elements.
To optimize the number of clock cycles of our implementations, we rely on the multipairing operations of RELIC (pp_map_sim_oatep), fixed-base exponentiation via precomputation tables (_mul_fix), simultaneous exponentiation of multiple elements from the same group (_mul_sim_lot) and simultaneous exponentiation of two elements of the same group (_mul_sim). Note that, in the default configuration of RELIC, most of these algorithms do not run in constant time.
Our implementations contain a correctness check at the end of the decryption operation. We measure the number of clock cycles for each algorithm, i.e., the setup, key generation, encryption, and decryption. We also measured the number of clock cycles of the necessary arithmetic, for which we provided benchmarks in Section 3.3. We calculate the average of the number of clock cycles in each case over 10,000 iterations per operation.
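The averaging procedure described above can be sketched as follows. This is a minimal illustration, not the harness used for the reported benchmarks: the function name `average_cost_ns` is hypothetical, and wall-clock nanoseconds are used as a stand-in for the clock cycles that the actual measurements count.

```python
import time


def average_cost_ns(operation, iterations=10_000):
    """Return the average running time of `operation` in nanoseconds.

    Assumption: nanoseconds from time.perf_counter_ns() stand in for the
    clock-cycle counts used in the paper's measurements.
    """
    start = time.perf_counter_ns()
    for _ in range(iterations):
        operation()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # Placeholder workload; a real run would call e.g. an encryption routine.
    print(average_cost_ns(lambda: pow(3, 65537, 2**127 - 1)))
```

Averaging over many iterations smooths out noise from caching and scheduling, which is why a single timing of each algorithm would not be a reliable benchmark.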
We use the implementation of the access structures based on access trees (Section 2.2) provided by OpenABE [Zeu20]. We replace it in Section 4.6.3 by precomputed LSSS matrices (see the description of the more efficient LSSS matrices in the full version [dlPVA22]). Finally, we benchmark the implementations for an increasing number of attributes, i.e., 1, 5, 10, 20, 30, ..., 100. For encryption and decryption, we use an AND policy whose length increases linearly with the number of attributes. We measure our implementations on an AMD Ryzen 7 PRO 4750 processor, using a single core with power management disabled and the frequency set to its maximum (4.1 GHz).

A note on the _mul_sim_lot function
In the decryption algorithm, we use the _mul_sim_lot function. Since the coefficients used are typically very close to the group order, their encodings in RELIC are very small, and thus, the efficiency of this function in our implementations is not the same as depicted in Table 1. Instead, it is much faster. Furthermore, for the BLS and BN curves, the GLV [GLV01] recoding (used in the _mul_sim_lot function) of the exponent is different, and yields more efficient encodings for BN curves than for BLS curves. The costs of _mul_sim_lot for our specific inputs are listed in Table 5. In future implementations, it may be better to replace the _mul_sim_lot by a custom function, or replace the access trees by the more efficient LSSS matrices as considered in Section 4.6.3. Converting Boolean formulas into these matrices yields exponentiations with ε j ∈ {0, 1}.
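To illustrate why coefficients ε_j ∈ {0, 1} are attractive, the sketch below compares a generic simultaneous exponentiation with its special case for binary exponents. It is a toy model under stated assumptions: the multiplicative group of integers modulo a prime replaces the elliptic-curve groups, and the function names are hypothetical rather than RELIC functions.

```python
def multi_exp(bases, exponents, modulus):
    """Generic product of bases[j] ** exponents[j] in a toy group Z_p^*."""
    acc = 1
    for base, exponent in zip(bases, exponents):
        acc = acc * pow(base, exponent, modulus) % modulus
    return acc


def multi_exp_binary(bases, exponents, modulus):
    """Same product when every exponent is 0 or 1.

    No exponentiation is needed: each selected base costs one group
    multiplication, and bases with exponent 0 are skipped entirely.
    """
    acc = 1
    for base, exponent in zip(bases, exponents):
        if exponent not in (0, 1):
            raise ValueError("exponents must be 0 or 1")
        if exponent:
            acc = acc * base % modulus
    return acc
```

With binary coefficients, the cost of this step is bounded by the number of selected components rather than by the bit length of the exponents.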

Memory footprint
We analyze the memory footprint of the data structures in our implementation of the setup, key generation, encryption and decryption algorithms by providing upper bounds on their memory consumption. This analysis illustrates that, for computationally powerful devices such as computers and smartphones, the memory footprint is reasonable. Roughly, our implementations use two kinds of data structures that are loaded in working memory: precomputation tables, and regular storage costs (such as the keys and ciphertexts, and policies and sets of attributes). In addition, some of our computations use temporary data structures to store intermediate results, but the overhead incurred by these is small, i.e., less than one KiB. The maximum overhead incurred by the RELIC functions depends mostly on the regular storage costs, and is thus upper-bounded by those costs.

Regular storage costs
A large part of the memory consumption is attributed to the regular storage costs, i.e., the sets, access policies, keys and ciphertexts. The algorithms use the following structures: • Setup: MPK, MSK; • Key generation: MPK, MSK, SK, S; • Encryption: MPK, CT, A; • Decryption: MPK, SK, S, CT, A.
The sizes of the keys and ciphertexts can be observed directly from their descriptions in Section 4.2 and the used elliptic curve. The sizes of the sets and policies depend on the representation of the attributes. Furthermore, we analyze the size of the access structures (i.e., trees or matrices) associated with the policies in more detail below.

Access trees.
For almost all schemes, we use the OpenABE access tree structure. In OpenABE, one node in the tree contains the following information: node type (i.e., whether it is a leaf or a gate), the threshold value, the number of children, pointers to the children (we assume in this case the number of children is always two, each pointer being 8 bytes in a 64-bit architecture), node identifiers (i.e., prefix, label and index) and two auxiliary variables. Let |node| ≈ 48 bytes denote the size of the node, then the total costs incurred by the access tree are roughly (2 · |A| − 1) · |node|, because each operator (e.g., AND) and each attribute in the policy are assigned to a node.

LSSS matrices.
The size of the LSSS matrix in RW13-OE-LSSS (Section 4.6.3) depends on the number of attributes and operators in the policy. To obtain a realistic upper bound on the matrix size, we assume that all operators are AND, which yields the largest matrices, and that each entry is represented as an element in Z_p. The maximum size is thus |A|^2 · |Z_p|, where |Z_p| denotes the length of an element in Z_p.
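The two bounds above are easy to evaluate for concrete policy lengths. The following sketch does so; the function names are hypothetical, and the 32-byte encoding of a Z_p element used in the usage example is an assumption (roughly a 255-bit group order), not a figure taken from the analysis.

```python
NODE_BYTES = 48  # approximate size of one access-tree node, as stated above


def access_tree_bytes(policy_length, node_bytes=NODE_BYTES):
    """Upper bound (2 * |A| - 1) * |node| for a binary access tree."""
    return (2 * policy_length - 1) * node_bytes


def lsss_matrix_bytes(policy_length, zp_bytes):
    """Upper bound |A|^2 * |Z_p| for an AND-only policy."""
    return policy_length ** 2 * zp_bytes


if __name__ == "__main__":
    # Assumption: 32 bytes per Z_p element.
    for n in (1, 10, 100):
        print(n, access_tree_bytes(n), lsss_matrix_bytes(n, zp_bytes=32))
```

The tree bound grows linearly in the policy length while the matrix bound grows quadratically, so the matrix representation dominates the regular storage costs only for long policies.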

Precomputation tables
Our implementations use precomputation tables for the generators of the source groups G and H to speed up the exponentiations. For the RW13 implementations, we also generate precomputation tables for the (four) other public keys. In RELIC, the default configuration for fixed-base exponentiation uses the single-table comb method [LL94], and yields precomputation tables of 16 points (in affine coordinates) for all curves. Table 6 summarizes our analysis of the memory footprint. In particular, Table 6a lists the storage costs for each implemented scheme. Table 6b provides upper bounds on the total memory consumption for the BLS12-381 and BN446 curves, of which the latter yields the highest memory consumption. Note that all implementations fit easily in RAM of computationally powerful devices such as computers and smartphones.
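As an illustration of how a 16-entry table speeds up fixed-base exponentiation, the sketch below implements a single-table comb method of width 4 in a toy setting. It is not RELIC's implementation: it works in the multiplicative group of integers modulo a prime instead of an elliptic-curve group, the function names are hypothetical, and the width of 4 is an assumption chosen because it gives 2^4 = 16 table entries.

```python
def comb_table(base, modulus, exponent_bits, width=4):
    """Precompute the 2**width entries of a single-table comb.

    The exponent is viewed as `width` rows of `rows_len` bits each;
    entry i combines the row generators selected by the bits of i.
    """
    rows_len = -(-exponent_bits // width)  # ceiling division
    row_generators = [pow(base, 1 << (j * rows_len), modulus) for j in range(width)]
    table = [1] * (1 << width)
    for i in range(1, 1 << width):
        low = i & -i                      # lowest set bit of i
        j = low.bit_length() - 1          # index of that bit
        table[i] = table[i ^ low] * row_generators[j] % modulus
    return table, rows_len


def comb_exp(table, rows_len, exponent, modulus, width=4):
    """Compute base ** exponent using one squaring and one table lookup per column."""
    acc = 1
    for k in range(rows_len - 1, -1, -1):
        acc = acc * acc % modulus
        index = 0
        for j in range(width):
            index |= ((exponent >> (j * rows_len + k)) & 1) << j
        acc = acc * table[index] % modulus
    return acc
```

Compared with plain square-and-multiply, the number of squarings drops by roughly a factor equal to the width, at the price of storing the table for each fixed base.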

Performance analysis of our implementations
We analyze the performance of the implementations in our framework and compare it with those in existing frameworks such as Charm [AGM + 13] and OpenABE [Zeu20]. Our goal with this comparison is not necessarily to illustrate that our implementations are faster than those of Charm and OpenABE. Rather, we want to show that the choice of optimization approach as well as the use of all available optimized arithmetic influences this analysis. Not consistently and systematically using these may result in an unfair comparison of two schemes.
There are two main differences between our implementations and those in Charm and OpenABE. First, in contrast to Charm and OpenABE, we have removed all abstractions between the scheme and the used arithmetic. Therefore, we can use all available optimizations to accelerate the scheme as much as possible. This also includes constructing and evaluating the access structures, which we have optimized as well. Note that this is not a concern for our comparisons in this section, as the additional overhead incurred by the construction and evaluation of access structures is the same for each scheme, while it considerably simplifies the implementation of the schemes. In this way, we evaluate the efficiency of the schemes on a structural level. The additional overhead incurred by the construction and evaluation of the access structures can then be measured separately to obtain a more complete understanding of the implementation's efficiency in practice. Second, another difference is that our implementations follow a clearly articulated design rationale: we have optimized each scheme with respect to several optimization approaches. This may heavily influence the efficiency of a scheme, and thus the subsequent comparison.

Linear costs of the schemes
We show that the computational costs of all the schemes are linear in the number of attributes (see Figure 3). Hence, for any efficiency analysis, it suffices to compare the computational costs for small sets of attributes, e.g., 1 and 10, and for a large set of attributes, e.g., 100. This makes any analysis more compact, as we can place the results in tables, rather than in graphs. Furthermore, because the setup is only performed once, its costs do not matter much in the overall efficiency comparison. Therefore, we do not consider them in the following comparisons.

Table 6: The analysis of the memory footprint for all implemented schemes and optimization approaches (OA). Table 6a lists the general storage costs of the master public key MPK, the master secret key MSK, the secret key SK, the set S, and the ciphertext CT, where |att| denotes the length of a single attribute (in our implementations: att ∈ Z_p), |A| denotes the policy length, |S| denotes the set size, and |G| denotes the length of a single element in group G. Based on this and the rest of our analysis, Table 6b provides upper bounds on the memory consumption for the BLS12-381 and BN446 curves, where we distinguish the regular costs (R) from the precomputation tables (P) and the total costs (T).

Comparing our framework with Charm and OpenABE
One of the main goals of our framework is to fully optimize the efficiency of all the schemes with respect to the same design goal. To show how effective this is compared to existing works, we compare our implementations with those of Charm and OpenABE. Table 7 shows that our implementations greatly improve on the implementations of Charm, the costs being at least one order of magnitude lower. For decryption, our implementations are even a factor 100-300 faster than the Charm implementation. This is a substantial speed-up that can be noticed in practice. For instance, Table 7b shows that Charm takes several seconds to execute decryption for large policies with RW13, which increases even further if more up-to-date curves such as BLS12-381 are used. In contrast, our implementations never require more than 15 milliseconds to execute. Compared to the OpenABE implementation of Wat11-IV, our implementations are roughly equally efficient in the key generation, a factor 1.6 faster in the encryption algorithm, and a factor 4 faster in the decryption algorithm. In addition, Table 7 illustrates that comparing optimized implementations of the schemes yields a different comparison of two schemes. For instance, the Charm implementations of Wat11-I and RW13 show that Wat11-I outperforms RW13 in all algorithms: its key generation is 154% faster, its encryption is 13% faster and its decryption is 222% faster. In contrast, our implementations of the same schemes, optimized with respect to the encryption algorithm, compare differently.
These implementations also illustrate that Wat11-I outperforms RW13, but the differences in costs are distinct: key generation is only 103% faster, encryption is even 33% faster, and decryption is only 91% faster. This means that, in contrast to what the Charm implementations suggest, Wat11-I encryption is actually even faster than RW13, and RW13 decryption does not perform as badly compared to Wat11-I. However, note that the schemes also have different properties, the most notable difference being that Wat11-I is a small-universe construction, while RW13 is a large-universe construction. We show in Section 4.7 that Wat11-IV, the large-universe variant of Wat11-I, is actually slower than Wat11-I, and is even outperformed by RW13 in the key generation and encryption algorithms for the OK and OE approaches, respectively. This difference in results illustrates that it is important to compare two implementations of schemes with the same properties, which are subsequently optimized with respect to the same goals. If this is not done, then one might unjustifiably draw the conclusion that Wat11-IV is a more efficient scheme (given any design goal) than RW13.

Table 7: Comparison of the computational costs of the Charm and OpenABE implementations of Wat11 and RW13, and our implementations, on the BN254 curve. For the costs with 100 attributes, we also provide the factor (×) by which the costs are lower than the least efficient variant of the scheme. For each scheme, the lowest costs are typeset in bold.

Comparing Wat11-I optimized approaches with bad approaches
We also compare the computational costs of our optimizations of Wat11-I with the badly optimized variants Wat11-I-BAD-I and Wat11-I-BAD-II (Section 4.2.5). By investigating the difference between Wat11-I-BAD-II and our optimizations, we can analyze the advantage of strategically placing the key and ciphertext components in the two source groups in the type conversion. By comparing Wat11-I-BAD-I and Wat11-I-BAD-II, we can determine the advantage of using all optimized arithmetic and group operations. Table 8 shows that, indeed, our optimizations are generally better than Wat11-I-BAD-II. That is, the key generation and decryption algorithms of the Wat11-I-OE variant perform comparably to Wat11-I-BAD-II, while its encryption is much faster (i.e., by 116%). Furthermore, encryption of Wat11-I-OK is slightly slower (i.e., by 14%) than Wat11-I-BAD-II, but its key generation is faster by 140%. In addition, the table shows that the use of optimized arithmetic matters greatly, as the computational costs of Wat11-I-BAD-I increase compared to Wat11-I-BAD-II: key generation by 33%, encryption by 2% and decryption by 369%.
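For readers who prefer speed-up factors over percentages, the conversion can be sketched as follows. This rests on an assumption about how the percentages are defined: "p% faster" is read here as the slower variant costing (1 + p/100) times as much as the faster one, which matches the phrasing "costs increase by 33%". The function names are hypothetical.

```python
def speedup_factor(percent_faster):
    """Cost ratio slow/fast, assuming 'p% faster' means slow = (1 + p/100) * fast."""
    return 1 + percent_faster / 100


def percent_faster(slow_cost, fast_cost):
    """Inverse conversion: relative cost difference with the faster variant as baseline."""
    return (slow_cost / fast_cost - 1) * 100


if __name__ == "__main__":
    # Percentages quoted in the comparison above.
    for p in (116, 140, 369):
        print(f"{p}% faster corresponds to a factor {speedup_factor(p):.2f}")
```

Under this reading, a 116% improvement corresponds to slightly more than a factor 2, and 369% to a factor of almost 4.7.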

Comparing different optimizations
For each of the five schemes, we list the computational costs on four curves: two curves in the [125, 128]-bit security range, BLS12-381 and BN382, and two curves in the [129, 135]-bit security range, BLS12-446 and BN446. By doing this, we illustrate that the efficiency of the scheme depends heavily on the chosen optimization approaches. Furthermore, this allows us to determine the best choice of curve, depending on the optimization approach. The chosen approaches should thus be clearly specified and explained in any benchmarking efforts.

Comparing the schemes on different curves
First, we find the best curves for each scheme and optimization approach. In Table 9, we compare the efficiency for each optimization of each scheme on two curves: the BLS12-381 and BN382 curves, which are two curves with roughly the same security level.  decryption is the most efficient on BN446, while the other algorithms are more efficient on BLS12-446 (with the exception of key generation of AC17-LU for 100 attributes).

Comparing schemes for different optimizations
In Section 3.5, we explained that the chosen design goal influences the optimization approach, including the conversion strategy from the type-I to the type-III setting. To this end, we have converted each scheme with respect to the different design goals. We illustrate the trade-offs incurred by the conversions in Table 10. It shows that, indeed, the variant of a scheme that is optimized with respect to a specific algorithm also outperforms the other variants in this algorithm. Note that, as expected, for the schemes without an FDH, these differences are much more pronounced than the schemes with an FDH.

Comparing access trees with LSSS matrices
We also analyze the computational costs of RW13 for two variants of the scheme: one using the access trees like in the rest of this work, and one using more efficient LSSS matrices as mentioned in Section 2.2 (see the full version [dlPVA22] for a description). These LSSS matrices were also used in the Charm implementation of FAME [AC17a]. However, that work does not compare them with the access trees used in OpenABE and our other implementations. Therefore, we investigate the computational advantage of using LSSS matrices. In particular, our comparison suggests that the choice of access structure does matter in the performance analysis as well. As Table 11 shows, the use of LSSS matrices barely has an effect on the key generation and encryption efficiency. Nevertheless, the decryption costs, which are typically the highest, can be decreased considerably by using LSSS matrices. That is, decryption of RW13 with access trees is up to 45% more costly than decryption of RW13 with LSSS matrices. Hence, we recommend that LSSS matrices are used in practice.
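To give an impression of what such matrices look like, the sketch below converts a Boolean formula into an LSSS matrix using a Lewko-Waters-style recursive labeling. This is an assumption for illustration: the exact conversion used in the implementation is the one described in the full version [dlPVA22], and the policy encoding as nested tuples and the function name `lsss_matrix` are hypothetical.

```python
def lsss_matrix(policy):
    """Convert a Boolean formula into labeled LSSS rows.

    `policy` is either an attribute name (str) or a tuple
    ("and" | "or", left, right). Returns a list of (attribute, row).
    """
    rows = []
    counter = 1  # current number of columns

    def visit(node, vector):
        nonlocal counter
        if isinstance(node, str):
            rows.append((node, vector))
            return
        operator, left, right = node
        if operator == "or":
            # Both children inherit the parent's vector.
            visit(left, vector)
            visit(right, vector)
        elif operator == "and":
            # Pad to the current width, then split over a fresh column.
            padded = vector + [0] * (counter - len(vector))
            left_vector = padded + [1]
            right_vector = [0] * counter + [-1]
            counter += 1
            visit(left, left_vector)
            visit(right, right_vector)
        else:
            raise ValueError(f"unknown operator: {operator}")

    visit(policy, [1])
    return [(attr, vec + [0] * (counter - len(vec))) for attr, vec in rows]


if __name__ == "__main__":
    for attribute, row in lsss_matrix(("and", ("and", "a", "b"), "c")):
        print(attribute, row)
```

For an AND-only policy over n attributes this yields an n × n matrix whose rows sum to (1, 0, ..., 0), so all reconstruction coefficients equal 1 and decryption needs no exponentiation by large coefficients in this step.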

Proof of concept: comparison of different schemes
We also show how the benchmarks can be used in the comparison of different schemes. We do this by comparing the computational costs of the schemes for the same optimization approaches. Specifically, we compare the large-universe schemes, Wat11-IV, AC17-LU and RW13, with one another to investigate which of the three performs best with respect to some chosen optimization approach. We also compare the large-universe variants Wat11-IV and AC17-LU with their small-universe counterparts, Wat11-I and AC17, to investigate the sacrifice in efficiency that the large-universe property requires.

Comparing the large-universe schemes
In Table 12, we compare the computational costs of the large-universe schemes, i.e., Wat11-IV, RW13 and AC17-LU. It shows that AC17-LU outperforms the other two in almost all optimizations and the subsequent implementations of the algorithms. The only exception is the optimized key generation approach, where RW13 provides the most efficient key generation algorithm, outperforming the other two schemes by a factor 2. It therefore seems that, currently, RW13 is the best choice when the design goal is to have an optimized key generation algorithm. For the other approaches, it is best to use AC17-LU. Notably, RW13 outperforms Wat11-IV in the OE, OK and BKE approaches, and thus constitutes not only a scheme that is interesting for its theoretical properties, but also for its efficiency.

Comparing small-universe schemes with their large-universe variant
In Table 13, we compare the large-universe variants of Wat11 and AC17 with their small-universe counterparts. This illustrates the sacrifice in efficiency incurred by the FDH that is instantiated to achieve the large-universe property. Concretely, the table shows that, for each optimization approach, the optimized algorithms of the small-universe variant outperform the large-universe variant. For the optimized key generation and encryption approaches, the small-universe variants perform overwhelmingly better: at least a factor 1.7, and at most a factor 4. For the optimized decryption approaches, the decryption costs are similar for the small-universe and large-universe variants. The reason why the trade-off in efficiency is so high is the FDH: not only does it limit us in the conversion from the type-I to the type-III setting, but we also need to perform a hash operation and use a variable-base exponentiation. This is also why RW13 outperforms Wat11-IV in Table 12.

Future work
This work provides the basis for further research at various levels.

Automating our framework
Our type-conversion methods are heuristic and manual. They are heuristic because the conversion methods are inextricably intertwined with the efficiency of the arithmetic, which depends not only on the chosen curve, but also on the order of the computations and the implementation of the arithmetic (which might in turn depend on the architecture of the processor). Currently, automating our conversion techniques is not trivial. Due to the heuristic nature of our methods (which requires us to circle back to earlier design choices to see if these need to be adjusted), they may be more difficult to automate than the methods in [AGOT14, AGH15, AHO16]. Instead, one could take a different approach. First, one could make a theoretical estimation of the computational costs of the arithmetic and group operations, for all possible pairing-friendly curves at the desired security level. Second, one could make a list of all possible distributions of the key and ciphertext components over the source groups. For each distribution and pairing-friendly curve, one can then determine the most efficient algorithms for the group operations and optimize the order of computations (which is not a trivial effort either). Given some optimization goal, the most efficient distribution can then be selected to be the optimal type conversion.
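The exhaustive approach outlined above can be sketched in a few lines. This is a strongly simplified illustration, not a proposal for the actual automation: the relative costs in `COST` are hypothetical placeholders rather than measured values, only exponentiation counts in key generation and encryption are modeled (decryption and the order of computations are ignored), and all names are hypothetical. Ties are broken by placing the ciphertext component in G, mirroring the deterministic choice made in our conversions.

```python
from itertools import product

# Hypothetical relative costs of one exponentiation in each source group.
COST = {"G": 1.0, "H": 2.5}

# Objective per design goal: OK = key generation, OE = encryption,
# BKE = total of key generation and encryption.
GOALS = {
    "OK": lambda keygen, enc: keygen,
    "OE": lambda keygen, enc: enc,
    "BKE": lambda keygen, enc: keygen + enc,
}


def costs(pairs, key_groups):
    """Costs (keygen, enc) of one distribution.

    `pairs[i]` = (exponentiations for key component i,
                  exponentiations for ciphertext component i);
    the ciphertext component lives in the group opposite to the key's.
    """
    keygen = enc = 0.0
    for (key_exps, ct_exps), key_group in zip(pairs, key_groups):
        ct_group = "H" if key_group == "G" else "G"
        keygen += key_exps * COST[key_group]
        enc += ct_exps * COST[ct_group]
    return keygen, enc


def best_distribution(pairs, goal):
    """Enumerate all 2^n distributions and return the cheapest for `goal`."""
    objective = GOALS[goal]
    # Listing "H" first makes min() prefer the ciphertext in G on ties.
    candidates = product(("H", "G"), repeat=len(pairs))
    return min(candidates, key=lambda c: objective(*costs(pairs, c)))


if __name__ == "__main__":
    example = [(1, 1), (10, 1)]  # hypothetical exponentiation counts
    for goal in GOALS:
        print(goal, best_distribution(example, goal))
```

The number of distributions grows exponentially in the number of key-ciphertext pairs, but for the schemes considered here this number is small, so plain enumeration would be feasible once reliable cost estimates are available.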

More pairing-friendly curves
In our analysis, we have only considered two curves in the [125, 128]-bit security range and two curves in the [129, 135]-bit security range. As we mentioned in the introduction, many pairing-friendly groups exist that provide at least 128 bits of security [Gui20a]. Notably, the KSS16 curves [KSS08] provide efficient arithmetic in the first source group and efficient multi-pairing operations [GF16,Ara17,CDS20]. These might be especially beneficial for schemes such as RW13, which provide much freedom with respect to their type conversion. In order to improve the benchmarks in this framework, more curves need to be supported by RELIC. Alternatively, a framework or library can be set up with the estimated efficiency of frequently-used arithmetic of the curves providing 128 bits of security listed at [Gui20a]. This would also help in any automated efforts.

Improving usability, validity and verifiability
To simplify the accurate comparison of schemes even further, it is important to make the framework more usable for ABE designers. As a result, cryptographers can easily compare their new scheme with existing ones in a transparent way without requiring a deep understanding of cryptographic engineering. One could make the framework more usable by providing a functionality that allows designers to specify e.g., an encoding of a scheme (rather than a full-fledged description). In this way, it also becomes easier to analyze the schemes with respect to other metrics, such as validity and verifiability, e.g., either manually [VA21] or even automatically [ABGW17].

Implementing fully secure ABE
We have implemented only the selectively secure variants of Wat11, RW13 and AC17. While this provides a reliable comparison of the structure of the schemes, in practice, we require the use of a fully secure scheme. Since several frameworks exist that provide efficient generic conversions to the full-security setting [Wee14,Att14], it would be important to benchmark the underlying groups used in such conversions. In this way, the most efficient security-conversion technique can be selected. Note that those generic conversions are also compatible with the aforementioned encodings (Section 5.3).

Using other platforms
We have implemented and benchmarked the chosen schemes on an x64-based platform with a single core and significant computational resources (e.g., with a large RAM and high clock frequency). Possibly, other platforms allow for faster implementations, while more resource-constrained devices may perform slower, e.g., due to their lower clock frequencies, or simply cannot even store the scheme's parameters in memory. Additionally, using single instruction multiple data (SIMD) extensions has been shown to significantly speed up elliptic-curve and pairing-based cryptography, e.g., with NEON for ARM-based [BS12, SR13, SLG + 14] platforms and AVX2 for x64-based platforms [FL15, FLD19, CGT + 20]. Furthermore, using multiple cores to parallelize the implementations has yielded improvements as well [FSV07,FSV08,GGP08], and may be especially useful in the implementation of the key generation and encryption algorithms owing to their already parallelized nature.

Using other algorithms for group operations
We have used RELIC for the implementations of the group operations used in the ABE schemes. As mentioned in Section 3.1, RELIC does not support all available algorithms, e.g., it does not support fixed-argument pairings. Furthermore, using precomputation tables in multiple-base exponentiations may significantly speed up the encryption algorithm [Möl01]. Conversely, implementing ABE in resource-constrained devices may require the use of different optimizations [FA17].

Expanding to other pairing-based ABE, and related primitives
Our methods are mostly targeted at optimizing pairing-based ABE of the specific structure described in Section 2.5. While this covers many ABE schemes, some schemes exist that do not have this exact structure, e.g., [LW11a,RW15]. Furthermore, our methods are also applicable to other pairing-based primitives that satisfy the targeted structure [AC17b, Att19], e.g., identity-based encryption [Sha84,BF01] and identity-based broadcast encryption [Del07]. Possibly, our framework can be expanded to cover pairing-based cryptography for other structures (and primitives) as well.

Conclusion
We have presented ABE Squared, a framework for accurately benchmarking efficiency of attribute-based encryption. Concretely, this framework aims to optimize the theoretical descriptions of ABE schemes for some chosen design goal by considering four optimization layers. These layers consider the arithmetic and group operations, the chosen pairing-friendly group, the order of the computations and the conversion techniques. By taking into account all layers during the optimization of a theoretical description, we are able to attain more efficient implementations. More specifically, we have devised several optimization approaches that aim to accomplish some chosen design goal, e.g., optimized key generation, encryption or decryption. By optimizing multiple schemes with respect to the same goal, they can be compared more fairly. Because existing conversion techniques did not allow us to, e.g., optimize the decryption algorithm, we have given new heuristic and manual techniques that facilitate this. Unlike other existing works, these conversion techniques take into account the other three optimization layers.
To show the effectiveness of our framework, we have optimized and implemented five schemes: Wat11-I, Wat11-IV, RW13, AC17 and AC17-LU. These implementations show that, indeed, the efficiency of the schemes depends heavily on the design goals and subsequent optimization approaches. For example, Charm shows that Wat11-I is generally faster than RW13. This may result in the idea that Wat11-IV, the large-universe variant of Wat11-I, is also faster than RW13, because it is similar. In contrast, we have shown that RW13 outperforms Wat11-IV with respect to the optimized encryption and optimized key generation approaches. This illustrates clearly that taking into account any such design goals in the implementation and benchmarks is crucial in the comparisons as well. Therefore, the ABE Squared framework provides an instrumental contribution in the benchmarking-and eventually, in the deployment-of ABE.