How to allocate memory#

Linux and all remote cores must agree on how memory is divided between cores.

Default memory carveouts#

Linux memory carveouts#

The Linux remoteproc driver requires at least two sections of system memory to be carved out for each remote processor core: the first is used for IPC with the Linux core, and the second is reserved for the remote processor’s code and data. The default memory carveouts (DMA pools) are shown below.

See the M4F devicetree bindings documentation in the Linux Processor SDK for more details about the memory carveouts: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml?h=ti-linux-6.1.y

+------------------+--------------------+---------+----------------------------+
| Memory Section   | Physical Address   | Size    | Description                |
+==================+====================+=========+============================+
| M4F Pool         | 0x9cb00000         | 1MB     | IPC (Virtio/Vring buffers) |
+------------------+--------------------+---------+----------------------------+
| M4F Pool         | 0x9cc00000         | 14MB    | M4F external code/data mem |
+------------------+--------------------+---------+----------------------------+
| DM R5F Pool      | 0x9da00000         | 1MB     | IPC (Virtio/Vring buffers) |
+------------------+--------------------+---------+----------------------------+
| DM R5F Pool      | 0x9db00000         | 12MB    | R5F external code/data mem |
+------------------+--------------------+---------+----------------------------+

root@am62xx-evm:~# dmesg | grep 'Reserved'
[    0.000000] Reserved memory: created CMA memory pool at 0x00000000f7600000, size 128 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009c800000, size 3 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009cb00000, size 1 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009cc00000, size 14 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009da00000, size 1 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009db00000, size 12 MiB

By default, the 1MB IPC pool holds the Virtio and Vring buffers used to communicate with the remote processor core, and the “external code/data mem” pool holds the remote core’s external memory (program code, data, etc.).

Note

The resource table (which passes RPMsg initialization information between Linux and the remote cores) must be placed at the beginning of the “external code/data mem” memory section.

MCU+ Memory Carveouts#

MCU+ projects define the memory used by the MCU+ core in the linker command file. The memory allocations in the Linux devicetree file must align with the memory allocations in the MCU+ linker files.

For more information about the linker command file, reference MCU Academy section mcu_linker_cmd_file.

Note

The MCU+ linker command files only define memory regions that are directly used by the MCU+ project. However, the Linux devicetree file defines all memory regions that are in DDR or on-chip SRAM (OCSRAM), regardless of whether Linux actually uses that memory region. Why is that?

The MCU+ code will never try to use memory that was not specifically given to it in the project settings. However, the Linux operating system assumes that it can use any part of DDR or SRAM that is not specifically reserved. Thus, we must define all DDR or SRAM regions that are used by the remote cores in the Linux devicetree. This prevents Linux from overwriting the data in those memory regions.

Changing the default memory map#

As an example, let’s select a core, move its “external code/data mem” section up by 0x10_0000, and reduce that section’s size to 0x40_0000.

The address and size of each DMA memory carveout need to match the corresponding external memory sections in the remote core’s linker command file.
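That correspondence can be sanity-checked with simple arithmetic. A minimal sketch, using values that mirror the default M4F carveouts above:

```python
# Check that a linker MEMORY region agrees with the Linux devicetree
# carveout it must live in. Values mirror the default AM62x M4F carveouts.

def check_carveout(dt_base, dt_size, linker_origin, linker_length):
    """The linker region must start at the carveout base and fit inside it."""
    assert linker_origin == dt_base, "resource table must sit at the carveout start"
    assert linker_origin + linker_length <= dt_base + dt_size, \
        "linker region overflows the carveout"

# "external code/data mem": 14MB carveout at 0x9cc00000;
# linker DDR_0 region: 0x1000 bytes at 0x9CC00000 (resource table only)
check_carveout(0x9cc00000, 0xE00000, 0x9CC00000, 0x1000)
print("linker region fits inside the Linux carveout")
```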

Changing the Linux memory map#

Take the AM62x Starter Kit EVM as an example:

linux/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi

First, locate the memory carveouts.

As an example, let’s take a look specifically at the memory allocation set aside for the M4F (mcu_m4fss). The DDR memory allocations are listed under the “reserved-memory” node:

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;
...
          mcu_m4fss_dma_memory_region: m4f-dma-memory@9cb00000 {
                  compatible = "shared-dma-pool";
                  reg = <0x00 0x9cb00000 0x00 0x100000>;
                  no-map;
          };

          mcu_m4fss_memory_region: m4f-memory@9cc00000 {
                  compatible = "shared-dma-pool";
                  reg = <0x00 0x9cc00000 0x00 0xe00000>;
                  no-map;
          };

The memory regions are assigned to specific cores further down in the devicetree file:

&mcu_m4fss {
        mboxes = <&mailbox0_cluster0 &mbox_m4_0>;
        memory-region = <&mcu_m4fss_dma_memory_region>,
                        <&mcu_m4fss_memory_region>;
};

Here, <core_name>_dma_memory_region contains the RPMsg Virtio and Vring buffers, while <core_name>_memory_region contains the resource table, followed by any remote core external memory (program code, data, etc.).

Now let’s move the “external code/data mem” section up by 0x10_0000 and reduce its size to 0x40_0000:

          mcu_m4fss_memory_region: m4f-memory@9cc00000 {
                  compatible = "shared-dma-pool";
/* previous values */
/*                reg = <0x00 0x9cc00000 0x00 0xe00000>; */
                  reg = <0x00 0x9cd00000 0x00 0x400000>;
                  no-map;
          };
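The new values are just base-address arithmetic; a quick sketch to double-check them:

```python
# Double-check the modified M4F "external code/data mem" carveout.
old_base, old_size = 0x9cc00000, 0xE00000  # original: 14MB at 0x9cc00000
shift, new_size = 0x100000, 0x400000       # move by 0x10_0000, shrink to 0x40_0000

new_base = old_base + shift
assert new_base == 0x9cd00000  # matches the updated reg property

# The moved, smaller region still lies inside the original carveout, so no
# neighbouring reserved-memory node needs to change.
assert new_base + new_size <= old_base + old_size
print(f"new reg = <0x00 {new_base:#x} 0x00 {new_size:#x}>")
```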

Changing the remote core memory map#

The project’s linker command file defines which data goes into which memory regions. Let’s modify the ipc_rpmsg_echo_linux example so that its linker.cmd file aligns with the updated Linux devicetree file:

examples/drivers/ipc/ipc_rpmsg_echo_linux/<board>/<core>/ti-arm-clang/linker.cmd

Section Default memory carveouts noted that the first memory carveout is reserved for Linux IPC, while the second holds the remote core external memory (program code, data, etc.). The resource table must be placed at the very beginning of the second carveout.

The memory carveout that is reserved for Linux IPC is not listed in the linker.cmd file; instead, Linux passes that information to the remote core through the resource table. So we will only be looking for the memory carveout that holds the “external code/data mem” section.

Let’s take a look at examples/drivers/ipc/ipc_rpmsg_echo_linux/am62x-sk/m4fss0-0_freertos/ti-arm-clang/linker.cmd. In the original Linux devicetree, the “external code/data mem” was at 0x9cc00000. So the linker.cmd file should place the resource table at DDR address 0x9cc00000:

MEMORY
{
    ...
    /* When using multi-core applications (i.e. more than one R5F/M4F active), make sure
     * this memory does not overlap with the other cores' regions
     */
    /* Resource table must be placed at the start of DDR_0 when M4 core is early booting with Linux */
    DDR_0       : ORIGIN = 0x9CC00000 , LENGTH = 0x1000
    ...
}

We can see that .resource_table is placed in DDR_0, which is at ORIGIN = 0x9CC00000. And .resource_table is the only data section placed in DDR_0. That ensures that .resource_table is always exactly at the beginning of the “external code/data mem” memory section.

SECTIONS
{
    ...
    GROUP {
        /* This is the resource table used by linux to know where the IPC "VRINGs" are located */
        .resource_table: {} palign(4096)
    } > DDR_0
    ...
}

Now that we have verified that the existing linker.cmd file matches the existing Linux devicetree file, let’s make the same change here: move the “external code/data mem” section up by 0x10_0000 and reduce its size to 0x40_0000.

The current AM62x project does not place any data in DDR. But if we wanted to place some data in DDR, we could define DDR_1 to fill the rest of the memory that Linux has allocated for the remote core:

MEMORY
{
    ...
    /* When using multi-core applications (i.e. more than one R5F/M4F active), make sure
     * this memory does not overlap with the other cores' regions
     */
    /* Resource table must be placed at the start of DDR_0 when M4 core is early booting with Linux */
/* previous value */
/*  DDR_0       : ORIGIN = 0x9CC00000 , LENGTH = 0x1000 */
    DDR_0       : ORIGIN = 0x9CD00000 , LENGTH = 0x1000
    DDR_1       : ORIGIN = 0x9CD01000 , LENGTH = 0x3FF000
    ...
}
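A quick check (values from the MEMORY block above) confirms that the two regions tile the new 0x40_0000 carveout exactly:

```python
# Verify that DDR_0 and DDR_1 exactly tile the 0x40_0000 Linux carveout
# that now starts at 0x9CD00000.
carveout_base, carveout_size = 0x9CD00000, 0x400000

ddr0_origin, ddr0_length = 0x9CD00000, 0x1000    # resource table
ddr1_origin, ddr1_length = 0x9CD01000, 0x3FF000  # remaining code/data space

assert ddr0_origin == carveout_base                # resource table at the start
assert ddr1_origin == ddr0_origin + ddr0_length    # DDR_1 begins where DDR_0 ends
assert ddr0_length + ddr1_length == carveout_size  # together they fill the carveout
print("DDR_0 + DDR_1 cover the carveout exactly")
```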

Adding new memory allocations#

Additional memory allocations can be added if a core needs more memory, if a shared memory region is used to pass large amounts of data between cores, and so on.

For an example of how to add a shared memory region to pass large amounts of data between cores, reference https://git.ti.com/cgit/rpmsg/rpmsg_char_zerocopy/
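As a sketch, a new carveout is just another child of the reserved-memory node. The node name, address, and size below are hypothetical; pick a range that does not overlap any existing carveout, and add the new region to the core’s memory-region property if the remote core needs access to it:

```dts
        /* Hypothetical shared-memory carveout -- the node name, address,
         * and size here are illustrative only */
        shared_memory_region: shared-memory@9e700000 {
                compatible = "shared-dma-pool";
                reg = <0x00 0x9e700000 0x00 0x100000>; /* 1MB */
                no-map;
        };
```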

Optimizing Memory Usage#

Optimizing DDR Usage#

Reduce the size of existing memory allocations#

A .map file is generated when building an MCU+ project, as described in the MCU+ SDK docs, section Build a Hello World example.

This .map file can be used to look at all of the allocated memory sections, and check to see how much of the memory allocation is actually used.

For example, let’s say we are using a remote core with a default Linux DDR memory allocation of 1MB at 0x9cb00000 (for the IPC Virtio/Vring buffers) and 14MB at 0x9cc00000 (for the external code/data mem), and that the remote core uses RPMsg IPC to communicate with Linux at runtime.

Now, let’s say we built the MCU+ project, and the top of our .map file looks like this:

******************************************************************************
            TI ARM Clang Linker Unix v2.1.3
******************************************************************************
>> Linked Tue Oct  3 11:25:21 2023

OUTPUT FILE NAME:   <ipc_rpmsg_echo_linux.release.out>
ENTRY POINT SYMBOL: "_c_int00"  address: 00008a15


MEMORY CONFIGURATION

         name            origin    length      used     unused   attr    fill
----------------------  --------  ---------  --------  --------  ----  --------
  M4F_VECS              00000000   00000200  00000140  000000c0  RWIX
  M4F_IRAM              00000200   0002fe00  00015450  0001a9b0  RWIX
  M4F_DRAM              00030000   00010000  0000f120  00000ee0  RWIX
  IPC_VRING_MEM         9c800000   00300000  00080000  00280000  RWIX
  DDR_0                 9cc00000   00001000  00001000  00000000  RWIX

This tells us that we are only using 0x1000 bytes out of the 14MB “external code/data mem” section!
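The used/unused figures can be pulled out of the MEMORY CONFIGURATION table with a short script (a sketch, assuming the .map layout shown above):

```python
# Parse rows of a TI linker .map "MEMORY CONFIGURATION" table and report
# how much of each region is actually used. Rows mirror the listing above.
map_rows = """\
  M4F_VECS              00000000   00000200  00000140  000000c0  RWIX
  M4F_IRAM              00000200   0002fe00  00015450  0001a9b0  RWIX
  M4F_DRAM              00030000   00010000  0000f120  00000ee0  RWIX
  IPC_VRING_MEM         9c800000   00300000  00080000  00280000  RWIX
  DDR_0                 9cc00000   00001000  00001000  00000000  RWIX
"""

usage = {}
for line in map_rows.splitlines():
    name, _origin, length, used, _unused = line.split()[:5]
    usage[name] = (int(used, 16), int(length, 16))
    print(f"{name:14s} {usage[name][0]:#8x} / {usage[name][1]:#x} bytes used")

# DDR_0 is completely full (it only holds the 0x1000-byte resource table),
# while the other regions have headroom.
```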

And if we check the linker.cmd file, we see that the remote core firmware does not even define the rest of the “external code/data mem” section as usable memory:

MEMORY
{
    M4F_VECS : ORIGIN = 0x00000000 , LENGTH = 0x00000200
    M4F_IRAM : ORIGIN = 0x00000200 , LENGTH = 0x0002FE00
    M4F_DRAM : ORIGIN = 0x00030000 , LENGTH = 0x00010000

    /* When using multi-core applications (i.e. more than one R5F/M4F active), make sure
     * this memory does not overlap with the other cores' regions
     */
    /* Resource table must be placed at the start of DDR_0 when M4 core is early booting with Linux */
    DDR_0       : ORIGIN = 0x9CC00000 , LENGTH = 0x1000


    IPC_VRING_MEM: ORIGIN = 0x9C800000, LENGTH = 0x00300000
}

In this case, we can reduce the Linux allocation at 0x9cc00000 to just 0x1000, enough to hold the resource table.
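Against the default AM62x SK devicetree shown earlier, the trimmed carveout would look something like this (a sketch; the resource table still sits at the start of the region):

```dts
          mcu_m4fss_memory_region: m4f-memory@9cc00000 {
                  compatible = "shared-dma-pool";
/* previous value */
/*                reg = <0x00 0x9cc00000 0x00 0xe00000>; */
                  reg = <0x00 0x9cc00000 0x00 0x1000>;
                  no-map;
          };
```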

Optimizing remote core local memory#

The Linux RPMsg IPC modules can take up multiple kilobytes of space in a remote core’s instruction RAM (IRAM) and data RAM (DRAM).

If the remote core project does not actually use RPMsg with Linux, then RPMsg IPC can be disabled in the remote core firmware’s sysconfig settings in order to free up memory.

However, additional steps are required if the remote core needs to be loaded by Linux. Linux requires a resource table to exist before it will initialize a remote core. For information about how to manually add a resource table to a remote core project, reference section How to create remote core firmware that can be initialized by Linux.