fastrpc_munmap/remote_mem_unmap does not seem to remove DSP-side mapping immediately #137

@haozixu

Dear developers,

I'm trying to implement some sort of dynamic swapping-like mechanism when using Hexagon cDSP.
There are several large rpcmem buffers (dma_buf) allocated ahead of time whose total size exceeds 2GiB. Since the Q6 cDSP only has a 32-bit virtual address space, only a limited number of shared buffers can be mapped to the cDSP at one time (fortunately, that's enough for the actual offloaded computation tasks). To manage this, I'm using the fastrpc_mmap/fastrpc_munmap interfaces to maintain an active working set of less than 2GiB.

But things didn't go as I expected. Although I manually keep (sizeof(mapped buffers) - sizeof(unmapped buffers)) at no more than 1GiB, fastrpc_mmap fails to map more buffers once the total size of all previously mapped buffers reaches about 2GiB. I also found that after I release some buffers with rpcmem_free, fastrpc_mmap is able to map more buffers again.

Here's a snippet of code that emulates my program's behavior. The sliding window holds at most 1GiB of buffers (4 x 256MiB). When fastrpc_mmap fails, I free the oldest remaining buffer and try again.

void my_test() {
  const int N = 16;
  const int WINDOW = 4;
  const size_t size = 256UL * 1024 * 1024;

  static int fds[N];
  static void *ptrs[N] = {};

  for (int i = 0; i < N; ++i) {
    void *p = rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, size);
    int fd = rpcmem_to_fd(p);

    if (!p || fd < 0) {
      printf("init failed, ptr %p, fd %d\n", p, fd);
      return;  // the test is meaningless without a valid buffer and fd
    }
    fds[i] = fd, ptrs[i] = p;
  }

  int n_active = 0;
  int recycle_index = 0;
  for (int i = 0; i < N; ++i) {
    if (i >= WINDOW) {
      int j = i - WINDOW;
      int e = fastrpc_munmap(CDSP_DOMAIN_ID, fds[j], ptrs[j], size);
      printf("unmap buffer %d, ret %d\n", j, e);
    }

    int e;
    do {
      e = fastrpc_mmap(CDSP_DOMAIN_ID, fds[i], ptrs[i], 0, size, FASTRPC_MAP_FD);
      printf("map buffer %d, ret %d\n", i, e);
      if (e) {
        int k = recycle_index++;
        rpcmem_free(ptrs[k]);
        --n_active;

        printf("free buffer %d, n_active = %d\n", k, n_active);
      }
    } while (e != 0);
    ++n_active;
  }
}

On my Snapdragon 8 Gen 2 device, this code produces the following output:

map buffer 0, ret 0
map buffer 1, ret 0
map buffer 2, ret 0
map buffer 3, ret 0
unmap buffer 0, ret 0
map buffer 4, ret 0
unmap buffer 1, ret 0
map buffer 5, ret 0
unmap buffer 2, ret 0
map buffer 6, ret 0
unmap buffer 3, ret 0
map buffer 7, ret 1
free buffer 0, n_active = 6
map buffer 7, ret 0
unmap buffer 4, ret 0
map buffer 8, ret 1
free buffer 1, n_active = 6
map buffer 8, ret 0
unmap buffer 5, ret 0
map buffer 9, ret 1
free buffer 2, n_active = 6
map buffer 9, ret 0
unmap buffer 6, ret 0
map buffer 10, ret 1
free buffer 3, n_active = 6
map buffer 10, ret 0
unmap buffer 7, ret 0
map buffer 11, ret 1
free buffer 4, n_active = 6
map buffer 11, ret 0
unmap buffer 8, ret 0
map buffer 12, ret 1
free buffer 5, n_active = 6
map buffer 12, ret 0
unmap buffer 9, ret 0
map buffer 13, ret 1
free buffer 6, n_active = 6
map buffer 13, ret 0
unmap buffer 10, ret 0
map buffer 14, ret 1
free buffer 7, n_active = 6
map buffer 14, ret 0
unmap buffer 11, ret 0
map buffer 15, ret 1
free buffer 8, n_active = 6
map buffer 15, ret 0

Does this indicate that fastrpc_munmap does not immediately destroy the DSP-side virtual memory mapping? Or are there any hidden deferred/lazy semantics in the implementation?

I also tried remote_mem_map/remote_mem_unmap, which gave similar results. I skimmed through the code and ended up at src/fastrpc_ioctl.c, where ioctl is used with two different sets of commands, but I can't tell what the real difference is between MEM_MAP/MEM_UNMAP and MMAP/MUNMAP.

Is this unmapping issue potentially related to the DSP-side MMU implementation, or would I need to dive into the kernel code to investigate further? Are there any workarounds for this problem?

Thank you!

device: OnePlus Ace 3 (SD 8gen2, kernel version 5.15), Hexagon SDK version 6.0.0.2
