Gigacage

JavaScript engines have long been a preferred target for attackers. In this article I will introduce Gigacage, an implementation of a heap isolation technique in JavaScriptCore, WebKit’s JavaScript engine.

Some JavaScript objects can easily be turned into very powerful read and write primitives. TypedArrays are one example: they are data structures that give the user precise control over the memory of their underlying storage buffer. If an attacker can exploit a bug to overwrite the buffer pointer of a TypedArray, they can easily upgrade that primitive into a more powerful one that allows arbitrary reads and writes, faking objects and leaking memory addresses. That is exactly what Gigacage tries to mitigate.

Gigacage divides objects into classes called HeapKinds, where each kind has a separate heap. Memory accesses to objects in these heaps are verified and adjusted to ensure that cross-heap access is not possible.

As of writing this article, there are 3 HeapKinds:

  1. Primary heap, representing regular allocations that are not protected by Gigacage
  2. PrimitiveGigacage for primitive contiguous memory arrays
  3. JSValueGigacage for Butterflies

During WebKit initialization, Gigacage::ensureGigacage() is called, which takes care of allocating the heaps. It calls tryVMAllocate(), which calls mmap(2) internally to create maskable memory regions for every HeapKind. This way, the mapped addresses can be used as base addresses for their heap allocations. The address of every allocated heap is stored in a global structure called g_gigacageBasePtrs, which allows quick access to the base address of every heap.

Adjacent to every heap lies a memory range of 32GB, called gigacageRunway. This memory region is given no access permissions by calling mprotect(2) with PROT_NONE. Therefore, any attempt to access this memory region will cause the kernel to generate a SIGSEGV signal and crash the process.
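The reserve-then-protect pattern described above can be sketched with plain POSIX calls. This is a minimal illustration and not bmalloc's code: the names Reservation, reserveCageWithRunway and demoReservation are hypothetical, the sizes are scaled down from gigabytes to 1 MiB so the sketch runs anywhere, and it skips the alignment that a maskable region would need.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// ASSUMPTION: scaled-down sizes for illustration; the article describes a 32GB runway.
constexpr std::size_t kCageSize = std::size_t{1} << 20;   // 1 MiB stand-in for a cage
constexpr std::size_t kRunwaySize = std::size_t{1} << 20; // 1 MiB stand-in for the runway

// Hypothetical helper type, not part of WebKit.
struct Reservation {
    void* base;   // start of the readable/writable cage
    void* runway; // start of the inaccessible runway that follows it
};

// Map cage + runway as one region, then remove all permissions from the runway.
inline bool reserveCageWithRunway(Reservation& out)
{
    void* base = mmap(nullptr, kCageSize + kRunwaySize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return false;

    void* runway = static_cast<char*>(base) + kCageSize;
    if (mprotect(runway, kRunwaySize, PROT_NONE) != 0) {
        munmap(base, kCageSize + kRunwaySize);
        return false;
    }

    out = Reservation{ base, runway };
    return true;
}

// Usage: the cage is writable, while touching the runway would raise SIGSEGV.
inline void demoReservation()
{
    Reservation r{};
    bool ok = reserveCageWithRunway(r);
    assert(ok);
    (void)ok;
    static_cast<char*>(r.base)[0] = 42;      // fine: inside the cage
    // static_cast<char*>(r.runway)[0] = 42; // would crash: PROT_NONE
    munmap(r.base, kCageSize + kRunwaySize);
}
```

Mapping both ranges in one call keeps the runway directly adjacent to the cage, which is the property the mitigation relies on.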

The rationale behind the runway is that JavaScriptCore uses unsigned 32-bit integers as indices into objects that support indexing, and the maximum size of each indexed element is 8 bytes (2^32 * 8 = 32GB). Therefore, even if an out-of-bounds access on a gigacaged object is achieved, it will land within the cage or the runway.
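The runway size follows directly from that arithmetic. The short sketch below spells it out; the constant and function names are made up for illustration and do not come from WebKit.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct values an unsigned 32-bit index can take.
constexpr std::uint64_t kIndexCount = std::uint64_t{1} << 32;
// Largest element size, in bytes, as stated in the article.
constexpr std::uint64_t kMaxElementSize = 8;
constexpr std::uint64_t kGiB = std::uint64_t{1} << 30;

// Upper bound, in bytes, on how far an indexed access can reach from the start of its storage.
constexpr std::uint64_t maxReach()
{
    return kIndexCount * kMaxElementSize;
}

// 2^32 indices * 8 bytes per element = 2^35 bytes = 32GB.
static_assert(maxReach() == 32 * kGiB, "runway must cover the farthest indexed access");

inline void demoReach()
{
    assert(maxReach() == 0x800000000ull);
}
```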

Since runways are intended to mitigate cross-heap accesses, it only makes sense to place them between heaps, and since only two HeapKinds are protected by Gigacage, there is only one runway. If we look at g_gigacageBasePtrs, we can observe the base addresses of the allocated heaps. Note that the following example was taken on an x86-64 platform; sizes may vary on other platforms (e.g. ARM64, if you’re debugging on iOS).

(lldb) p/x (*(Gigacage::BasePtrs*)&g_gigacageBasePtrs)
(Gigacage::BasePtrs) $2 = (
    reservedForFlags = 0x0000000000000001, 
    primitive = 0x0000000800000000, 
    jsValue = 0x0000001800000000
)

Later, when gigacaged objects are created, they are allocated with a special allocator that uses the previously allocated heaps, so each address can be treated as relative to its heap base address.

When used, the address of a gigacaged object is treated as an offset from the base address of the HeapKind it belongs to. This is done by masking off the higher bits of the address and adding the result to the matching base address from g_gigacageBasePtrs.

If a pointer to a gigacaged object has been corrupted and replaced with an address that does not belong to the same heap, a memory access through that pointer will affect an address in the original heap, or land in the runway and crash the process.

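template<typename T>
BINLINE T* caged(Kind kind, T* ptr)
{
    BASSERT(ptr);
    void* gigacageBasePtr = basePtr(kind);
    // No base pointer for this kind: return the pointer unchanged.
    if (!gigacageBasePtr)
        return ptr;
    // Keep only the offset bits of ptr and rebase them onto the cage.
    return reinterpret_cast<T*>(
        reinterpret_cast<uintptr_t>(gigacageBasePtr) + (
            reinterpret_cast<uintptr_t>(ptr) & mask(kind)));
}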
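To see the effect of this computation without dereferencing anything, here is a small stand-alone sketch that works on plain integers. The base is the primitive value from the lldb output above. The 32GB cage size and the "size - 1" mask are assumptions made for this illustration, and cagedAddress, isInsidePrimitiveCage and demoCaging are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Base of PrimitiveGigacage, taken from the lldb output above.
constexpr std::uint64_t kPrimitiveBase = 0x0000000800000000ull;
// ASSUMPTION: a 32GB cage whose mask is simply size - 1.
constexpr std::uint64_t kPrimitiveSize = 0x0000000800000000ull;
constexpr std::uint64_t kPrimitiveMask = kPrimitiveSize - 1;

// Same arithmetic as caged(), applied to an integer address.
constexpr std::uint64_t cagedAddress(std::uint64_t address)
{
    return kPrimitiveBase + (address & kPrimitiveMask);
}

constexpr bool isInsidePrimitiveCage(std::uint64_t address)
{
    return address >= kPrimitiveBase && address - kPrimitiveBase < kPrimitiveSize;
}

inline void demoCaging()
{
    // An address that already lives in the cage is returned unchanged.
    assert(cagedAddress(kPrimitiveBase + 0x1000) == kPrimitiveBase + 0x1000);

    // A made-up foreign address (stack-like) is forced back into the cage.
    constexpr std::uint64_t foreign = 0x00007fff5fbff000ull;
    assert(!isInsidePrimitiveCage(foreign));
    assert(cagedAddress(foreign) == 0x0000000f5fbff000ull);
    assert(isInsidePrimitiveCage(cagedAddress(foreign)));
}
```

The pass-through case only holds because the base is a multiple of the cage size, which is why the regions have to be maskable.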

Any class that wants to protect one of its data members using Gigacage should use the CagedPtr template, with the chosen HeapKind in its definition, and have that data member allocated from the heap of that HeapKind.

As an example, we can look at |o|, a JSObject that is backed by a Butterfly, and |ua|, a Uint8Array that is backed by a vector.

>>> describe(o)
Object: 0x1088bc040 with butterfly 0x18b48fe1e8 (Structure 0x1088b47e0:[Object, {}, NonArrayWithContiguous, Proto:0x1088c0000, Leaf]), StructureID: 12678

(lldb) p/x ((JSC::JSObject*)0x1088bc040)->m_butterfly
(JSC::AuxiliaryBarrier<JSC::Butterfly *>) $3 = (m_value = 0x00000018b48fe1e8)

>>> describe(ua)
Object: 0x1088e83a0 with butterfly 0x0 (Structure 0x1088b4a10:[Uint8Array, {}, NonArray, Proto:0x1088c01d0, Leaf]), StructureID: 17809

(lldb) p/x ((JSC::JSArrayBufferView*)0x1088e83a0)->m_vector
(JSC::JSArrayBufferView::VectorPtr) $4 = {
  m_barrier = {
    m_value = (m_ptr = 0x0000000825cfc000)
  }
}

By comparing the addresses of m_butterfly and m_vector, the gigacaged backing objects of |o| and |ua|, to the corresponding gigacage base addresses (shown previously in the g_gigacageBasePtrs global structure), we can see that o->m_butterfly has been allocated from JSValueGigacage and that ua->m_vector has been allocated from PrimitiveGigacage.

                                         Gigacage::BasePtrs                  
                                        /                                    
                    +------------------+                                     
                    | reservedForFlags |                                     
                    +------------------+                                     
                    |    primitive     |-------->+--------------------------+
                    +------------------+         |           ...            |
                    |     jsValue      |----+    +--------------------------+
                    +------------------+    |    |       ua->m_vector       |
                                            |    +--------------------------+
                                            |    |           ...            |
                                            |    +--------------------------+
                                            |    |                          |
                                            |    |                          |
                                            |    |          Runway          |
                                            |    |                          |
                                            |    |                          |
                                            +--->+--------------------------+
                                                 |           ...            |
                                                 +--------------------------+
                                                 |      o->m_butterfly      |
                                                 +--------------------------+
                                                 |           ...            |
                                                 +--------------------------+
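The comparison can also be done mechanically. The sketch below classifies the addresses seen in this article against the layout in the diagram. The two base addresses come from the lldb output and the 32GB runway from the text. The 32GB PrimitiveGigacage size is an assumption that is consistent with those values, and the 16GB JSValueGigacage size is an assumption that nothing in the output confirms. Region and classify are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

enum class Region { Outside, PrimitiveGigacage, Runway, JSValueGigacage };

// Base addresses from the g_gigacageBasePtrs dump.
constexpr std::uint64_t kPrimitiveBase = 0x0000000800000000ull;
constexpr std::uint64_t kJSValueBase   = 0x0000001800000000ull;
// Runway size as stated in the article.
constexpr std::uint64_t kRunwaySize    = 0x0000000800000000ull; // 32GB
// ASSUMPTION: cage sizes chosen for illustration.
constexpr std::uint64_t kPrimitiveSize = 0x0000000800000000ull; // 32GB
constexpr std::uint64_t kJSValueSize   = 0x0000000400000000ull; // 16GB

constexpr std::uint64_t kRunwayBase = kPrimitiveBase + kPrimitiveSize;

// With these sizes the runway ends exactly where JSValueGigacage begins.
static_assert(kRunwayBase + kRunwaySize == kJSValueBase, "layout matches the dump");

inline Region classify(std::uint64_t address)
{
    if (address >= kPrimitiveBase && address < kRunwayBase)
        return Region::PrimitiveGigacage;
    if (address >= kRunwayBase && address < kJSValueBase)
        return Region::Runway;
    if (address >= kJSValueBase && address < kJSValueBase + kJSValueSize)
        return Region::JSValueGigacage;
    return Region::Outside;
}

inline void demoClassify()
{
    assert(classify(0x00000018b48fe1e8ull) == Region::JSValueGigacage);   // o->m_butterfly
    assert(classify(0x0000000825cfc000ull) == Region::PrimitiveGigacage); // ua->m_vector
    assert(classify(0x00000001088bc040ull) == Region::Outside);           // the JSObject cell of |o|
}
```

The object cells themselves (0x1088bc040 and 0x1088e83a0) fall outside both cages, which matches the point that only the backing stores are gigacaged.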

The objects that were chosen to be protected by Gigacage are considered highly valuable to attackers. Therefore, when PAC (Pointer Authentication) was introduced in Apple A12 processors, it made a lot of sense to use Gigacage’s infrastructure to sign and authenticate pointers, making forgery of gigacaged pointers even harder.