How to allocate memory#

Linux and all remote cores must agree on how memory is divided between cores.

DDR: Default memory carveouts#

Linux memory carveouts#

Carveouts per remote core#

If the Linux remoteproc driver initializes a remote processor core, or if RPMsg IPC is used to communicate between Linux and the remote core, then the Linux remoteproc driver requires at least two sections of system memory to be carved out for each remote processor core. The first section contains the memory buffers used by RPMsg IPC. The second section holds the resource table, along with any remote core code or data that needs to be stored in DDR.

For more information on creating an MCU+ SDK project that can be initialized by the Linux remoteproc driver, refer to Application Development on Remote Cores.

For more information on IPC, refer to How to Develop with RPMsg IPC.

The default DDR memory carveouts for each remote core are shown below. Memory carveouts are called “DMA memory pools” in the Linux devicetree file.

For more details about the R5F memory carveouts, see the R5F devicetree bindings documentation at https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/Documentation/devicetree/bindings/remoteproc/ti,k3-r5f-rproc.yaml?h=ti-linux-6.1.y.

For more details about the C7x memory carveouts, see the C7x devicetree bindings documentation at https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/Documentation/devicetree/bindings/remoteproc/ti,k3-dsp-rproc.yaml?h=ti-linux-6.1.y.

+------------------+--------------------+---------+----------------------------+
| Memory Section   | Physical Address   | Size    | Description                |
+==================+====================+=========+============================+
| C7x Pool         | 0x99800000         | 1MB     | IPC (Virtio/Vring buffers) |
+------------------+--------------------+---------+----------------------------+
| C7x Pool         | 0x99900000         | 30MB    | C7x external code/data mem |
+------------------+--------------------+---------+----------------------------+
| MCU R5F Pool     | 0x9b800000         | 1MB     | IPC (Virtio/Vring buffers) |
+------------------+--------------------+---------+----------------------------+
| MCU R5F Pool     | 0x9b900000         | 15MB    | R5F external code/data mem |
+------------------+--------------------+---------+----------------------------+
| DM R5F Pool      | 0x9c800000         | 1MB     | IPC (Virtio/Vring buffers) |
+------------------+--------------------+---------+----------------------------+
| DM R5F Pool      | 0x9c900000         | 30MB    | R5F external code/data mem |
+------------------+--------------------+---------+----------------------------+

root@am62axx-evm:~# dmesg | grep 'Reserved'
[    0.000000] Reserved memory: created CMA memory pool at 0x00000000c0000000, size 576 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x0000000099800000, size 1 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x0000000099900000, size 30 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009b800000, size 1 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009b900000, size 15 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009c800000, size 1 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x000000009c900000, size 30 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x00000000a1000000, size 32 MiB
[    0.000000] Reserved memory: created DMA memory pool at 0x00000000ae000000, size 288 MiB

By default, the 1MB IPC pool holds the Virtio and Vring buffers used to communicate with the remote processor core. The “external code/data mem” pool holds the remote core external memory (program code, data, etc.).

Note

The resource table (which is used to pass RPMsg initialization information between Linux and the remote cores) must be placed at the very beginning of the “external code/data mem” memory section.

Other memory carveouts#

Several other memory carveouts are also included by default in TI’s devicetree files.

The “edgeai_<name>” memory carveouts enable the AM62Ax EdgeAI example.

MCU+ Memory Carveouts#

MCU+ projects define the memory used by the MCU+ core in the linker command file. The memory allocations in the Linux devicetree file must align with the memory allocation in the MCU+ linker files.

For more information about the linker command file, refer to the MCU Academy section Linker command file.

Note

The MCU+ linker command files only define memory regions that are directly used by the MCU+ project. However, the Linux devicetree file defines all memory regions that are in DDR or on-chip SRAM (OCSRAM), regardless of whether Linux actually uses that memory region. Why is that?

The MCU+ code will never try to use memory that was not specifically given to it in the project settings. However, the Linux operating system assumes that it can use any part of DDR or SRAM that is not specifically reserved. Thus, we must define all DDR or SRAM regions that are used by the remote cores in the Linux devicetree. This prevents Linux from overwriting the data in those memory regions.

Changing the default memory map#

As an example, let’s select a core, move the “external code/data mem” section by 0x10_0000, and reduce the size of the “external code/data mem” section to 0x40_0000.

The addresses and sizes of the DMA memory carveouts must match the external memory sections in the remote core’s linker command file.
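Before touching any files, it can help to sanity-check the new numbers. Here is a quick sketch (plain Python, using the addresses from the table above) that computes the shifted base address and confirms the shrunken region still fits in front of the next carveout:

```python
# Sketch: sanity-check the proposed carveout change for mcu_r5fss0_core0
# before editing the devicetree and linker.cmd. All addresses come from
# the default AM62Ax memory map shown in this section.

OLD_BASE, OLD_SIZE = 0x9B900000, 0x0F00000   # original "external code/data mem"
SHIFT, NEW_SIZE    = 0x0100000, 0x0400000    # move by 0x10_0000, shrink to 0x40_0000
NEXT_CARVEOUT      = 0x9C800000              # DM R5F IPC pool starts here

new_base = OLD_BASE + SHIFT
new_end  = new_base + NEW_SIZE

# The shifted, shrunken region must not run into the next carveout.
assert new_base == 0x9BA00000
assert new_end <= NEXT_CARVEOUT, hex(new_end)

print(f"new reg = <0x00 0x{new_base:08x} 0x00 0x{NEW_SIZE:07x}>")
# prints: new reg = <0x00 0x9ba00000 0x00 0x0400000>
```

The printed `reg` value is exactly what goes into the devicetree node below.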

Changing the Linux memory map#

Take the AM62Ax Starter Kit EVM as an example:

linux/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts

First, locate the memory carveouts.

As an example, let’s take a look specifically at the memory allocation set aside for the MCU domain R5F core (mcu_r5fss0_core0). The DDR memory allocations are listed under the “reserved-memory” node:

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;
...
                mcu_r5fss0_core0_dma_memory_region: r5f-dma-memory@9b800000 {
                        compatible = "shared-dma-pool";
                        reg = <0x00 0x9b800000 0x00 0x100000>;
                        no-map;
                };

                mcu_r5fss0_core0_memory_region: r5f-dma-memory@9b900000 {
                        compatible = "shared-dma-pool";
                        reg = <0x00 0x9b900000 0x00 0x0f00000>;
                        no-map;
                };

The memory regions are assigned to specific cores further down in the devicetree file:

&mcu_r5fss0_core0 {
        mboxes = <&mailbox0_cluster2 &mbox_mcu_r5_0>;
        memory-region = <&mcu_r5fss0_core0_dma_memory_region>,
                        <&mcu_r5fss0_core0_memory_region>;
};

Here, <core_name>_dma_memory_region contains the RPMsg Virtio and Vring buffers, while <core_name>_memory_region contains the resource table, as well as any remote core external memory (program code, data, etc.).

Now let’s move the location of the “external code/data mem” section by 0x10_0000, and reduce the size of the “external code/data mem” section to 0x40_0000:

                mcu_r5fss0_core0_memory_region: r5f-dma-memory@9b900000 {
                        compatible = "shared-dma-pool";
/* previous values */
/*                      reg = <0x00 0x9b900000 0x00 0x0f00000>; */
                        reg = <0x00 0x9ba00000 0x00 0x0400000>;
                        no-map;
                };

Changing the remote core memory map#

The project’s linker command file defines what data goes into which memory regions. Let’s modify the ipc_rpmsg_echo_linux example so that the linker.cmd file aligns with the updated Linux devicetree file.

examples/drivers/ipc/ipc_rpmsg_echo_linux/<board>/<core>/ti-arm-clang/linker.cmd.

Section DDR: Default memory carveouts said that the first memory carveout was reserved for Linux IPC, while the second memory carveout was used for the remote core external memory (program code, data, etc). The resource table needs to be placed at the very beginning of the second memory carveout.

The memory carveout that is reserved for Linux IPC is not listed in the linker.cmd file; instead, Linux passes that information to the remote core through the resource table. So we only need to look for the “external code/data mem” carveout.

Let’s take a look at

examples/drivers/ipc/ipc_rpmsg_echo_linux/am62ax-sk/mcu-r5fss0-0_freertos/ti-arm-clang/linker.cmd.

In the original Linux devicetree, the “external code/data mem” was at 0x9b900000. So the linker.cmd file should place the resource table at DDR address 0x9b900000:

MEMORY
{
...
    DDR_IPC_RESOURCE_TABLE_LINUX     : ORIGIN = 0x9B900000 LENGTH = 0x400      /* For resource table   */
    DDR_CODE_DATA                    : ORIGIN = 0x9BA00000 LENGTH = 0xE00000   /* For code and data    */
    DDR_IPC_TRACE_LINUX              : ORIGIN = 0x9B900400 LENGTH = 0xFFC00    /* IPC trace buffer     */

We can see that .resource_table is placed in DDR_IPC_RESOURCE_TABLE_LINUX at ORIGIN = 0x9b900000. And .resource_table is the only data section placed in DDR_IPC_RESOURCE_TABLE_LINUX. That ensures that .resource_table is always exactly at the beginning of the “external code/data mem” memory section.

GROUP {
    /* This is the resource table used by linux to know where the IPC "VRINGs" are located */
    .resource_table: {} palign(1024)
} > DDR_IPC_RESOURCE_TABLE_LINUX

Now that we have verified that the existing linker.cmd file matches the existing Linux devicetree file, let’s move the location of the “external code/data mem” section by 0x10_0000, and reduce the size of the “external code/data mem” section to 0x40_0000:

MEMORY
{
...
 /* previous values */
 // DDR_IPC_RESOURCE_TABLE_LINUX     : ORIGIN = 0x9B900000 LENGTH = 0x400      /* For resource table   */
 // DDR_CODE_DATA                    : ORIGIN = 0x9BA00000 LENGTH = 0xE00000   /* For code and data    */
 // DDR_IPC_TRACE_LINUX              : ORIGIN = 0x9B900400 LENGTH = 0xFFC00    /* IPC trace buffer     */
 /* new values */
    DDR_IPC_RESOURCE_TABLE_LINUX     : ORIGIN = 0x9BA00000 LENGTH = 0x400      /* For resource table   */
    DDR_CODE_DATA                    : ORIGIN = 0x9BB00000 LENGTH = 0x300000   /* For code and data    */
    DDR_IPC_TRACE_LINUX              : ORIGIN = 0x9BA00400 LENGTH = 0xFFC00    /* IPC trace buffer     */
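As a quick sanity check, the three regions above should tile the new 0x40_0000 carveout exactly, with the resource table first. A short sketch (plain Python, using the values from the updated linker.cmd) can confirm there are no gaps or overlaps:

```python
# Sketch: verify the new linker MEMORY regions tile the 0x40_0000
# devicetree carveout at 0x9BA00000 with no gaps or overlaps.
# Values are taken from the updated linker.cmd above.

regions = [  # (name, origin, length)
    ("DDR_IPC_RESOURCE_TABLE_LINUX", 0x9BA00000, 0x400),
    ("DDR_IPC_TRACE_LINUX",          0x9BA00400, 0xFFC00),
    ("DDR_CODE_DATA",                0x9BB00000, 0x300000),
]

carveout_base, carveout_size = 0x9BA00000, 0x0400000

cursor = carveout_base
for name, origin, length in sorted(regions, key=lambda r: r[1]):
    assert origin == cursor, f"{name} leaves a gap or overlaps at 0x{origin:x}"
    cursor = origin + length

# The last region ends exactly at the top of the carveout.
assert cursor == carveout_base + carveout_size
print("linker regions fit the 0x%07x carveout exactly" % carveout_size)
```

If a future edit resizes one region without adjusting its neighbors, the assertions fail and point at the offending region.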

Adding new memory allocations#

Additional memory allocations can be added if a core needs more memory, if a shared memory region is needed to pass large amounts of data between cores, and so on.

For an example of how to add a shared memory region to pass large amounts of data between cores, reference https://git.ti.com/cgit/rpmsg/rpmsg_char_zerocopy/

Optimizing Memory Usage#

Optimizing DDR Usage#

Reduce the size of existing memory allocations#

A .map file is generated when building an MCU+ project, as described in the MCU+ SDK docs, section Build a Hello World example.

This .map file can be used to look at all of the allocated memory sections, and check to see how much of the memory allocation is actually used.

For example, let’s say we are using a remote core with a default Linux DDR memory allocation of 1MB at 0x9cb00000 (for the IPC Virtio/Vring buffers) and 14MB at 0x9cc00000 (for the external code/data mem), and that we are using RPMsg IPC to allow the remote core to communicate with Linux during runtime.

Now, let’s say we built the MCU+ project, and the top of our .map file looks like this:

******************************************************************************
            TI ARM Clang Linker Unix v2.1.3
******************************************************************************
>> Linked Tue Oct  3 11:25:21 2023

OUTPUT FILE NAME:   <ipc_rpmsg_echo_linux.release.out>
ENTRY POINT SYMBOL: "_c_int00"  address: 00008a15


MEMORY CONFIGURATION

         name            origin    length      used     unused   attr    fill
----------------------  --------  ---------  --------  --------  ----  --------
  M4F_VECS              00000000   00000200  00000140  000000c0  RWIX
  M4F_IRAM              00000200   0002fe00  00015450  0001a9b0  RWIX
  M4F_DRAM              00030000   00010000  0000f120  00000ee0  RWIX
  IPC_VRING_MEM         9c800000   00300000  00080000  00280000  RWIX
  DDR_0                 9cc00000   00001000  00001000  00000000  RWIX

This tells us that we are only using 0x1000 of memory out of the 14MB “external code/data mem” section!
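To automate this kind of inspection, the MEMORY CONFIGURATION table can be parsed directly. The sketch below (plain Python) hard-codes the excerpt above as sample input; in practice you would read the text from your generated .map file instead:

```python
# Sketch: parse the "MEMORY CONFIGURATION" table from a TI linker .map
# file and report how much of each memory region is actually used.
# The sample below is the excerpt shown above.

sample = """\
  M4F_VECS              00000000   00000200  00000140  000000c0  RWIX
  M4F_IRAM              00000200   0002fe00  00015450  0001a9b0  RWIX
  M4F_DRAM              00030000   00010000  0000f120  00000ee0  RWIX
  IPC_VRING_MEM         9c800000   00300000  00080000  00280000  RWIX
  DDR_0                 9cc00000   00001000  00001000  00000000  RWIX
"""

usage = {}
for line in sample.splitlines():
    name, origin, length, used, unused, attr = line.split()
    usage[name] = (int(used, 16), int(length, 16))

for name, (used, length) in usage.items():
    print(f"{name:<16} {used:#10x} / {length:#x} ({100 * used / length:.1f}%)")

# DDR_0 is fully used, but it is only 0x1000 bytes of the 14MB carveout.
assert usage["DDR_0"] == (0x1000, 0x1000)
```

Running this over each firmware's .map file makes it easy to spot regions (and the Linux carveouts backing them) that are far larger than they need to be.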

And if we check the linker.cmd file, we see that the remote core firmware does not even define the rest of the “external code/data mem” section for usage:

MEMORY
{
    M4F_VECS : ORIGIN = 0x00000000 , LENGTH = 0x00000200
    M4F_IRAM : ORIGIN = 0x00000200 , LENGTH = 0x0002FE00
    M4F_DRAM : ORIGIN = 0x00030000 , LENGTH = 0x00010000

    /* When using multi-core applications, i.e. more than one R5F/M4F active, make sure
     * this memory does not overlap with the R5F's
     */
    /* Resource table must be placed at the start of DDR_0 when M4 core is early booting with Linux */
    DDR_0       : ORIGIN = 0x9CC00000 , LENGTH = 0x1000


    IPC_VRING_MEM: ORIGIN = 0x9C800000, LENGTH = 0x00300000
}

In this case, we can reduce the Linux allocation at 0x9cc00000 to just 0x1000, which is enough to contain the resource table.
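Following the same pattern as the earlier devicetree edit, the change might look like the sketch below. The node name and label here are illustrative placeholders; use the actual region node from your board's .dts file:

```dts
                <core_name>_memory_region: m4f-dma-memory@9cc00000 {
                        compatible = "shared-dma-pool";
/* previous value */
/*                      reg = <0x00 0x9cc00000 0x00 0x0e00000>; */
                        reg = <0x00 0x9cc00000 0x00 0x1000>;
                        no-map;
                };
```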

Optimizing remote core local memory#

The RPMsg IPC modules used to communicate with Linux can take up multiple kilobytes of space in a remote core’s instruction RAM (IRAM) and data RAM (DRAM).

If the remote core project does not actually use RPMsg with Linux, then RPMsg IPC can be disabled in the remote core firmware’s sysconfig settings in order to free up memory.

However, additional steps are required if the remote core needs to be loaded by Linux. Linux requires a resource table to exist before it will initialize a remote core. For information about how to manually add a resource table to a remote core project, reference section How to create remote core firmware that can be initialized by Linux.