Memory-based side-channel attacks can use information such as data-access patterns to deduce the contents of RAM, even when an enclave solution such as Intel SGX is in place. Previous work has suggested Oblivious RAM (ORAM) as a mechanism for making memory access patterns indistinguishable to an observer, and has demonstrated success in improving the security of such memory systems.
However, ORAM-backed solutions suffer from high performance overhead and scalability problems because of the additional data structures that must be maintained for the oblivious mapping. TrustOre instead proposes a hybrid CPU-FPGA solution that places in-memory storage inside an external trusted component (an FPGA). The FPGA retains its own memory-related units, which are physically isolated from untrusted software, preventing such software from observing the hardware directly. Evaluation of this approach showed significant performance gains at larger block sizes compared with related work, and demonstrated that TrustOre is a practical solution that can be adapted to existing architectures.
This work[1] makes the following contributions:
Memory access patterns can be observed through a memory-based side channel and used to deduce the contents held in memory, even when encryption is used, as in the case of Intel SGX. This nullifies any confidentiality gained from using such an enclave scheme to store data securely. Oblivious RAM (ORAM) addresses this problem by placing an interface between the CPU and memory that mediates all memory I/O operations so that access patterns are indistinguishable to an observer of the memory hardware. ORAM has been applied successfully to general data structures such as arrays and to subsystems such as the file system. While it improves security and resistance to memory-based side-channel attacks, ORAM suffers from poor performance and scalability because of the extra data structures it must manage.
TrustOre presents a hybrid CPU-FPGA approach that transfers memory-management functions to an external FPGA that is physically isolated from untrusted sources. The external unit is considered secure against memory-based side-channel attacks because it lacks commonly exploited features such as caches and branch predictors, and instead maintains its own memory-related units that only it can access. The solution has two components: TrustLib, a driver in host memory, and TrustMod, which is loaded as a bitstream onto the FPGA fabric.
TrustMod is a secure storage service that runs on the FPGA fabric. Attestation is supported by a public-key scheme in which the attestation private key is embedded in a manufacturer-signed TrustMod bitstream that is loaded onto the FPGA. TrustLib can then verify the loaded bitstream using the provided public key, since the corresponding private key is embedded in the signed bitstream and cannot be forged by the FPGA itself. TrustMod also contains on-chip memory for storing data blocks, managed through an On-Chip Memory Allocation Table (OCMAT) of (block-id, enclave-id, size, base) tuples. This ensures that only the originating enclave can perform operations (e.g. dealloc) on memory stored on the FPGA, because the enclave ID (EID) is derived from a dedicated MMIO address unique to each enclave.
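The ownership check implied by the OCMAT can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `OcmatEntry`, `Ocmat`, `alloc`, and `dealloc` are hypothetical, the bump allocator is a simplification, and the enclave ID is passed as a plain argument here, whereas in TrustOre it is derived from the MMIO address the request arrives on.

```python
from dataclasses import dataclass


@dataclass
class OcmatEntry:
    """One (block-id, enclave-id, size, base) tuple."""
    block_id: int
    enclave_id: int
    size: int
    base: int


class Ocmat:
    """Toy allocation table for a fixed-size on-chip memory (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}        # block_id -> OcmatEntry
        self.next_block_id = 0
        self.next_base = 0       # bump allocator: freed space is not reused here

    def alloc(self, enclave_id, size):
        if size <= 0 or self.next_base + size > self.capacity:
            raise MemoryError("on-chip memory exhausted")
        block_id = self.next_block_id
        self.entries[block_id] = OcmatEntry(block_id, enclave_id, size, self.next_base)
        self.next_block_id += 1
        self.next_base += size
        return block_id

    def _owned_entry(self, enclave_id, block_id):
        # Reject unknown blocks and blocks that belong to another enclave.
        entry = self.entries.get(block_id)
        if entry is None or entry.enclave_id != enclave_id:
            raise PermissionError("block is not owned by this enclave")
        return entry

    def dealloc(self, enclave_id, block_id):
        self._owned_entry(enclave_id, block_id)
        del self.entries[block_id]
```

In this model, a `dealloc` issued with an enclave ID other than the one recorded at allocation time raises `PermissionError` and leaves the entry in place.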
TrustLib is, for the most part, considered part of the host rather than the TCB. It exists to 1) verify the integrity of the TrustMod component at boot time and run time via attestation, and 2) mediate between user code and TrustMod by exporting a simple POSIX-style interface. TrustLib establishes secure communication with TrustMod using a Diffie-Hellman key exchange to derive a shared session key.
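The key-exchange step can be sketched as follows. This is a minimal illustration of textbook Diffie-Hellman, not TrustOre's actual protocol: the summary does not state which group, key-derivation function, or authentication step the authors use, so the toy prime `P`, generator `G`, SHA-256 derivation, and the function names `keypair` and `session_key` are all assumptions made for the example.

```python
import hashlib
import secrets

# Toy group parameters, assumed for illustration only.
P = 2**127 - 1   # a Mersenne prime; far too small for real use
G = 3


def keypair():
    """Return (private, public) for one party."""
    private = secrets.randbelow(P - 3) + 2    # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def session_key(own_private, peer_public):
    """Derive a 32-byte session key from the shared Diffie-Hellman secret."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Each side generates a key pair and sends only its public value.
lib_private, lib_public = keypair()    # stands in for TrustLib
mod_private, mod_public = keypair()    # stands in for TrustMod

lib_key = session_key(lib_private, mod_public)
mod_key = session_key(mod_private, lib_public)
```

Both sides arrive at the same key without ever transmitting it. On its own the exchange does not authenticate either party, which is presumably why it is paired with the attestation step described earlier.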
TrustOre prevents data-access-pattern side channels by performing all data-transfer operations on dedicated, fixed MMIO addresses. Regardless of execution context, the enclave always accesses the secure storage in the same way, so a side channel that monitors memory access patterns cannot discern any information about the data transferred.
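A small simulation can make the argument concrete by comparing what an address-level observer would see in each case. This is a sketch under assumptions: `DirectStore`, `FixedPortStore`, `address_trace`, and the address `FIXED_PORT` are hypothetical names invented for the example, and the model records addresses only (it does not capture timing or transfer sizes).

```python
FIXED_PORT = 0x1000   # hypothetical MMIO address, assumed for illustration


class DirectStore:
    """Conventional layout: the address touched depends on the block id."""

    def __init__(self, base, block_size):
        self.base = base
        self.block_size = block_size
        self.trace = []

    def read(self, block_id):
        self.trace.append(self.base + block_id * self.block_size)


class FixedPortStore:
    """Fixed-port layout: every request goes through the same address."""

    def __init__(self):
        self.trace = []

    def read(self, block_id):
        # The block id would travel inside the request payload,
        # so it never influences which address is touched.
        self.trace.append(FIXED_PORT)


def address_trace(store, block_ids):
    """Return the sequence of addresses an observer sees for these reads."""
    store.trace.clear()
    for block_id in block_ids:
        store.read(block_id)
    return list(store.trace)
```

With `DirectStore`, reading blocks `[0, 1, 2]` and `[7, 7, 7]` produces different traces, so the observer learns something about which blocks were requested. With `FixedPortStore`, both sequences produce the same trace of three accesses to `FIXED_PORT`.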
The authors provide a comparative performance evaluation based on experiments. They evaluated their workloads against three previous studies:
Across all experiments, TrustOre was slower than native implementations. Compared with related work, however, it showed comparable (but slower) performance at smaller block sizes (approximately < 2 MB) and better performance at larger block sizes. In terms of scaling, TrustOre showed constant-time performance, whereas related work often showed exponential growth in the same experiments.
The difference at smaller block sizes presumably reflects the overhead of PCIe management at the start of an I/O operation in TrustOre. The improved performance at larger block sizes demonstrates the value of the TrustOre service: memory can still be accessed and sent sequentially, unlike ORAM approaches, which must still traverse an ever-growing tree data structure.
[1] Paper link: https://dl.acm.org/doi/10.1145/3372297.3417265