### Introduction
<br>

This guide walks through how to benchmark an application using the Cycle Counter API provided in the SDK. Under the hood, this API uses either the Performance Monitoring Unit (PMU) of the R5F core or the Data Watchpoint and Trace Unit (DWT) of the M4F core to return precise cycle counts.

In this exercise we will benchmark the execution time of a 1024-point Complex Fast Fourier Transform (CFFT) on the R5F core. We will start from the Empty example project and add the CFFT and the Cycle Counter to it.

<hr>
<br>

### Step 1: Import the Empty Project

We will be using the Empty project for the R5F0-0 core as the starting point.
<br>
<hr>

#### a. Launch CCS Desktop
<hr style="width:50%; margin: auto;" />

#### b. In CCS, go to *View &rarr; Resource Explorer* to open Resource Explorer
<hr style="width:50%; margin: auto;" />

#### c. In Resource Explorer, navigate to *MCU+ SDK &rarr; Examples &rarr; Development Tools &rarr; {Board} &rarr; empty &rarr; r5fss0-0_nortos &rarr; empty*
<hr style="width:50%; margin: auto;" />

#### d. Click "Import" to import the project into your workspace
<hr>
<br>

### Step 2: Add the CFFT code

We will be using the CFFT functions from the Arm CMSIS DSP library, found in the SDK at `${MCU_PLUS_SDK_PATH}/source/cmsis/DSP`.
<br>
<hr>

#### a. First, right-click on the project and rename it to `CFFT_Benchmark`

Renaming the project lets you re-import the Empty project into the same workspace later if needed.
<hr style="width:50%; margin: auto;" />

#### b. Rename `empty.c` to `cfft.c` and `empty_main()` to `cfft_main()`
<hr style="width:50%; margin: auto;" />

#### c. `#include` the header files for the CFFT

Add the following `#include` directives to your `cfft.c` file:

```c
#include "arm_math.h"
#include "arm_const_structs.h"
```
<hr style="width:50%; margin: auto;" />

#### d. Add the search paths to the header files

In the project's properties window, go to *Build &rarr; Arm Compiler &rarr; Include Options* and add the following paths:

`${MCU_PLUS_SDK_PATH}/source/cmsis/DSP/Include`

`${MCU_PLUS_SDK_PATH}/source/cmsis/Core/Include`
<hr style="width:50%; margin: auto;" />

#### e. Initialize the CFFT input data

```c
/* CFFT input array: 1024 complex points stored as interleaved real/imaginary pairs */
float32_t cfftInData[2048];

uint16_t i;

/* initialize FFT complex array */
for (i = 0; i < 2048; i += 2)
{
    cfftInData[i] = arm_sin_f32((float) i * 7.5); /* real part */
    cfftInData[i + 1] = 0;                        /* imaginary part */
}
```
<hr style="width:50%; margin: auto;" />

#### f. Add a call to the CFFT module

This is the function we will be benchmarking.

```c
void cfft_main(void *args)
{
    . . .
    /* Process the data through the CFFT/CIFFT module:
     * ifftFlag = 0 (forward transform), bitReverseFlag = 1 (output in normal order) */
    arm_cfft_f32(&arm_cfft_sR_f32_len1024, cfftInData, 0, 1);
```
<hr>
<br>

### Step 3: Add the Cycle Counter

Now we need to add the Cycle Counter to benchmark the execution time of the CFFT.
<br>
<hr>

#### a. `#include` the Cycle Counter header file

```c
#include <kernel/dpl/CycleCounterP.h>
```
<hr style="width:50%; margin: auto;" />

#### b. Reset the Cycle Counter

`CycleCounterP_reset()` enables and resets the Cycle Counter.

```c
void cfft_main(void *args)
{
    . . .
    CycleCounterP_reset();
```
<hr style="width:50%; margin: auto;" />

#### c. Calculate the Cycle Counter overhead

Measure the cost of reading the Cycle Counter itself, using two back-to-back reads, so that we can subtract it from the benchmark measurement.

```c
void cfft_main(void *args)
{
    . . .
    uint32_t start, end, overhead;

    /* Calculate overhead */
    CycleCounterP_reset();
    start = CycleCounterP_getCount32();
    end = CycleCounterP_getCount32();
    overhead = end - start;
    DebugP_log("Overhead: %d Cycles\r\n", overhead);
```
<hr style="width:50%; margin: auto;" />

#### d. Benchmark the CFFT function

Wrap the CFFT function in `CycleCounterP_getCount32()` calls to measure its execution time.

```c
. . .
CycleCounterP_reset();
start = CycleCounterP_getCount32();

/* Process the data through the CFFT/CIFFT module */
arm_cfft_f32(&arm_cfft_sR_f32_len1024, cfftInData, 0, 1);

end = CycleCounterP_getCount32();

DebugP_log("Start: %d\r\n", start);
DebugP_log("End: %d\r\n", end);
/* At 800MHz there are 800 cycles per microsecond; integer division truncates */
DebugP_log("Total: %d Cycles = %d microseconds @ 800MHz\r\n", end - start - overhead, (end - start - overhead) / 800);
```
<hr style="width:50%; margin: auto;" />

#### e. Your `cfft.c` should now look like this

```c
#include <stdio.h>
#include <kernel/dpl/DebugP.h>
#include "ti_drivers_config.h"
#include "ti_drivers_open_close.h"
#include "ti_board_open_close.h"
#include "arm_math.h"
#include "arm_const_structs.h"
#include <kernel/dpl/CycleCounterP.h>

/* 1024 complex points stored as interleaved real/imaginary pairs */
float32_t cfftInData[2048];

void cfft_main(void *args)
{
    /* Open drivers to open the UART driver for console */
    Drivers_open();
    Board_driversOpen();

    uint32_t start, end, overhead;
    uint16_t i;

    /* initialize FFT complex array */
    for (i = 0; i < 2048; i += 2)
    {
        cfftInData[i] = arm_sin_f32((float) i * 7.5);
        cfftInData[i + 1] = 0;
    }

    /* Calculate overhead */
    CycleCounterP_reset();
    start = CycleCounterP_getCount32();
    end = CycleCounterP_getCount32();
    overhead = end - start;
    DebugP_log("Overhead: %d Cycles\r\n", overhead);

    CycleCounterP_reset();
    start = CycleCounterP_getCount32();

    /* Process the data through the CFFT/CIFFT module */
    arm_cfft_f32(&arm_cfft_sR_f32_len1024, cfftInData, 0, 1);

    end = CycleCounterP_getCount32();

    DebugP_log("Start: %d\r\n", start);
    DebugP_log("End: %d\r\n", end);
    DebugP_log("Total: %d Cycles = %d microseconds @ 800MHz\r\n", end - start - overhead, (end - start - overhead) / 800);

    Board_driversClose();
    Drivers_close();
}
```
<hr>
<br>

### Step 4: Build and Run the Benchmark Project

Now that we have created the benchmark application, let's build and run it.
<br>
<hr>

#### a. Click **Debug** to build and load the application onto the R5F0-0 core

![Debug Icon](./resources/debug_icon.png)
<hr style="width:50%; margin: auto;" />

#### b. Click **Resume** (F8) to run the program

You should see the benchmark output in the CCS Console:

```txt
Overhead: 9 Cycles
Start: 7
End: 77552
Total: 77536 Cycles = 96 microseconds @ 800MHz
```
<hr>

### Cycle Counter Max Duration
<br>

`CycleCounterP_getCount32()` returns a 32-bit count, so the largest value the counter can hold before it rolls over is `2^32 - 1`, or `0xFFFFFFFF`. For an R5F core running at 800MHz, the Cycle Counter therefore rolls over after `2^32 / 800MHz`, or about 5.37 seconds.

The application can add logic to handle overflow. For example, here is a snippet from the SDK User Guide that shows how to handle one overflow condition:

```c
uint32_t cycleCountBefore, cycleCountAfter, cpuCycles;

/* enable and reset CPU cycle counter */
CycleCounterP_reset();
cycleCountBefore = CycleCounterP_getCount32();

/* call functions to profile */

cycleCountAfter = CycleCounterP_getCount32();

/* Check for overflow and wrap around.
 *
 * This logic will only work for one overflow.
 * If multiple overflows happen during the profile period,
 * then CPU cycles count will be wrong.
 */
if (cycleCountAfter > cycleCountBefore)
{
    cpuCycles = cycleCountAfter - cycleCountBefore;
}
else
{
    cpuCycles = (0xFFFFFFFFU - cycleCountBefore) + cycleCountAfter;
}
```
<br>

If you need to measure longer durations, use the `ClockP_getTimeUsec()` API from the Clock module instead.

<hr>

### Congratulations!
<br>

You have just learned how to use the Cycle Counter to benchmark an MCU+ SDK application. For another example of using the Cycle Counter, take a look at the Benchmark Demo in the SDK at `examples/motor_control/benchmark_demo`.
<br/>
<hr>

### []{ } Knowledge Check
<br>

**1. True or False:** `CycleCounterP_reset()` should be called before using the Cycle Counter.

[quiz]
v True --> Correct!
x False --> Incorrect
[quiz]
<br>

**2. True or False:** The Cycle Counter driver automatically handles counter rollover.
[quiz]
x True --> Incorrect
v False --> Correct!
[quiz]

<hr>

### Additional Reading
<br>

##### MCU+ SDK User Guide: [Cycle Counter](https://software-dl.ti.com/mcu-plus-sdk/esd/AM243X/latest/exports/docs/api_guide_am243x/KERNEL_DPL_CYCLE_COUNTER_PAGE.html)
<br>
<hr>

{{r> [Back to Home](../overview.html)}}

{{r **Was this helpful? Let us know here:** [mcu_plus_academy_feedback@list.ti.com](mailto:mcu_plus_academy_feedback@list.ti.com)}}

<div align="center" style="margin-top: 4em; font-size: smaller;">
<a rel="license" href="https://creativecommons.org/licenses/by-nc-nd/4.0/"><img alt="Creative Commons License" style="border-width:0" src="..//web_support/cc_license_icon.png" /></a><br />This work is licensed under a <a rel="license" href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License</a>.</div>
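
<hr>

As a supplement to the Cycle Counter Max Duration section, the sketch below shows the rollover arithmetic in plain C, without any SDK calls. It relies on the fact that unsigned 32-bit subtraction is performed modulo 2^32, so `after - before` is already exact across a single wrap. The explicit `else` branch in the quoted snippet, by comparison, comes out one cycle short, because the step from `0xFFFFFFFF` back to `0` is not counted. The helper names `elapsed_cycles()` and `cycles_to_usec()` are hypothetical and are not part of the MCU+ SDK. The counter values would come from `CycleCounterP_getCount32()` in a real application.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an SDK API): cycles elapsed between two reads of a
 * 32-bit counter. Unsigned subtraction wraps modulo 2^32, so the result is
 * exact as long as the counter rolled over at most once between the reads. */
static inline uint32_t elapsed_cycles(uint32_t before, uint32_t after)
{
    return (uint32_t)(after - before);
}

/* Hypothetical helper (not an SDK API): convert a cycle count to whole
 * microseconds. A core clocked at N MHz executes N cycles per microsecond;
 * the integer division truncates any fractional microsecond. */
static inline uint32_t cycles_to_usec(uint32_t cycles, uint32_t cpu_mhz)
{
    assert(cpu_mhz != 0U);
    return cycles / cpu_mhz;
}
```

With the values from the console output in Step 4, `elapsed_cycles(7, 77552) - 9` gives 77536 cycles, and `cycles_to_usec(77536, 800)` gives 96 microseconds. As with the SDK snippet, this only holds when at most one rollover occurs during the measured interval.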