ESP32 Media Streaming

with a Twist of Multicore

Gamal Labib

Issue 76, November 2023

This article includes additional downloadable resources.

The ESP32 MCU has a dual-core CPU, yet most published applications use only one of the cores. Let's see how both cores can empower multimedia applications on the ESP32.

The ESP32 is a powerful MCU family from Espressif [1], famous for its wireless capabilities, namely WiFi and Bluetooth connectivity. However, the potential of the device goes far beyond connectivity, thanks to its CPU architecture (single or dual cores), clock speed, memory size, and ADC and DAC interfaces [2].

The ESP32-CAM [3], on the other hand, is an Ai-Thinker product based on the ESP32-S chip, an MCU capable of multitasking thanks to its two cores, identified as Core-0 and Core-1. The ESP32-CAM adds many features to the standard ESP32 devices, such as extended memory, an integrated SD card, and the OV2640 camera. These features open the ESP32-CAM to a wide spectrum of applications, including surveillance and AI.

Unfortunately, the conventional sketches we write and find for the ESP32 use Core-1 by default: the setup() and loop() functions in Arduino-C form a single task that runs on that core. Core-0, meanwhile, runs no user code for the whole runtime of the MCU, so up to 50% of the microcontroller's processing power goes unused.

In this article, I shall demonstrate how to run several tasks simultaneously on both cores and measure the speed-up achieved compared to regular single-task applications. First, though, we need to review some processing terminology that is pertinent to the main theme of this article.

Multitasking

Multitasking means running several tasks on a CPU seemingly at the same time. In reality, on a single-core CPU, a privileged task assigns a fixed time slot to each task in a queue of user tasks, and keeps cycling through those tasks until they finish processing.

The privileged task, called the scheduler, is capable of interrupting a user task while being served by the CPU to allow other queued tasks to take their share of CPU time. This technique of multitasking is called preemptive.

Another technique delegates CPU switching to the user tasks themselves. This is called cooperative multitasking: user tasks are written to give up the CPU voluntarily whenever they lack a resource or require user intervention. This technique is not as fair as the preemptive one, and it does not give the user the same sense of parallel execution that preemption does.

Multiprocessing is another term that indicates running tasks on different CPUs, or cores, simultaneously. Luckily, there is a Real-Time Operating System (RTOS) for MCUs that supports both modes of operation, i.e. multitasking and multiprocessing: FreeRTOS [4] is the library normally installed on the Arduino IDE for that purpose.

When working with ESP32 MCUs, a version of FreeRTOS is installed automatically with the ESP32 hardware platform, so your sketches need no further installation or library inclusion. Adding this URL to the boards manager of the Arduino IDE and installing the ESP32 platform will do the job:

https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json

In order to illustrate how tasks run simultaneously on the two cores of the ESP32-CAM, I built a main task, Task0 running on the long-forgotten Core-0, for capturing images using the built-in camera and for streaming them using the WiFi connectivity to a web client, namely an internet web browser. The task also has the capability of saving captured images on the built-in SD card. The task would run a web server that accepts input from the web client specifying either to capture or save an image.

Captured images are sent directly to the web client and, at the same time, stored in a temporary file on the SD card, which is committed to a permanent file only at the user's request. By doing this, I am actually building a surveillance camera system that would traditionally run on the default Core-1 of the ESP32-CAM.

I built a secondary dummy task, Task1, to run on the default Core-1 instead and to place a reasonable workload on that core and on the device's memory. This task loops, incrementing variables in batches of 100,000,000 rounds and printing a message on the serial monitor of the Arduino IDE after each batch. When running this project, the user will experience seamless operation of both tasks without any noticeable effect on streaming images. This demonstrates the potential of multiprocessing on multi-core CPUs, in which the ESP32-CAM excels.

It is worth noting that the Task1 printout scrolled too fast for Task0's messages to be noticed on the serial monitor, despite the 100 million rounds between its messages. To overcome this nuisance, I adopted the concept of semaphores, normally used to control tasks' access to shared resources such as global variables. I implemented a rather primitive case in which Task0 sets the shared variables flag1 and flag2 in order to suspend Task1's execution. Task1 continues its computational loops only while both flags are cleared; otherwise it waits for the flags to reset.

Task0 uses flag1 when writing the temporary image file to the SD card, and flag2 when writing the committed image file. I wrote the code to make sure that only Task0 changes the flags' values, while Task1 only reads them.
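The flag handshake described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the volatile qualifier and the helper names task1MayRun(), holdTask1() and releaseTask1() are assumptions introduced here, while flag1 and flag2 come from the article.

```cpp
// Shared flags: written only by Task0, read only by Task1.
// "volatile" (an assumption here) makes the compiler re-read them on every check.
volatile int flag1 = 0;  // 1 while Task0 writes the temporary image file
volatile int flag2 = 0;  // 1 while Task0 writes the committed image file

// Hypothetical helper for Task1: true when it may continue counting.
bool task1MayRun() {
  return flag1 == 0 && flag2 == 0;
}

// Hypothetical helpers for Task0: raise a flag before an SD card write
// and clear it again once the write has finished.
void holdTask1(volatile int &flag) { flag = 1; }
void releaseTask1(volatile int &flag) { flag = 0; }
```

Task0 would call holdTask1(flag1) before writing the temporary file and releaseTask1(flag1) afterwards, and likewise with flag2 for the committed file. Because only one task ever writes the flags, plain variables are enough for this simple purpose.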

Diving deeper into FreeRTOS, the user will find a variety of semaphores that could be included in the code, but I chose not to use them in order to keep the project as simple as possible.

Figure 1. The Capture command URL showing the streamed image.

Figure 1 illustrates the display of a captured image following the submission of the URL: http://esp32-cam-IP/cam-lo.jpg which invokes the handleJpgLo() function for capturing a low-resolution image and saving it to a temporary file.

Figure 2. The Save command issued to save captured image to SD card.

Figure 2 shows the URL command, http://esp32-cam-IP/save, for saving the image in a sequentially numbered file on the SD card.

Figure 3. Tasks startup printout on serial monitor.

Both tasks will declare themselves on reboot as shown in Figure 3.

Figure 4. The output of media streaming Task0 while holding Task1 execution.

Figure 4 illustrates the mechanism of suspending the computational work of Task1 until Task0 completes capturing and saving the image, and then prints out its relevant messages.

I used four distinct FreeRTOS statements for setting up and running the streaming and the counting tasks, which would otherwise have run sequentially in a single-core environment.

The first is the task handle type, TaskHandle_t, which is used to declare Task0 and Task1. The second is the xTaskCreatePinnedToCore() function, which I call once per task in the setup() section of the sketch; it specifies the core on which the task is to run, as well as the task's stack size and priority.

The third statement is the TaskXcode() function, where X is 0 or 1 to associate it with Task0 or Task1 respectively. This function holds the executable code of the task, identified by the for (;;) { } segment, which causes the code between the curly braces to run continuously for as long as the task is active. That segment replaces the code that would have been located in the loop() section of the sketch in the absence of FreeRTOS. That explains why the loop() function, which normally contains the code run on Core-1, is left empty.
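The pieces described so far can be pulled together in a skeleton sketch for the Arduino-ESP32 platform. It is only a sketch under stated assumptions: the stack size (10000), the priority (1), the baud rate and the placeholder delays are illustrative values rather than the project's settings, and the task bodies are left empty. It also uses xPortGetCoreID(), the FreeRTOS call that returns the core on which the calling code runs.

```cpp
// Skeleton: two FreeRTOS tasks pinned to different ESP32 cores.
// Stack size, priority, baud rate and delays are illustrative assumptions.

TaskHandle_t Task0;  // handle for the task pinned to Core-0
TaskHandle_t Task1;  // handle for the task pinned to Core-1

// Body of Task0: the for (;;) loop takes the place of loop().
void Task0code(void *pvParameters) {
  Serial.print("<Task0> running on core ");
  Serial.println(xPortGetCoreID());
  for (;;) {
    // Task-specific work goes here (the project serves web requests).
    vTaskDelay(pdMS_TO_TICKS(10));  // placeholder: lets Core-0's idle task run
  }
}

// Body of Task1.
void Task1code(void *pvParameters) {
  Serial.print("<Task1> running on core ");
  Serial.println(xPortGetCoreID());
  for (;;) {
    // Task-specific work goes here (the project runs counting loops).
    vTaskDelay(pdMS_TO_TICKS(10));  // placeholder
  }
}

void setup() {
  Serial.begin(115200);  // serial port shared by both tasks

  xTaskCreatePinnedToCore(
      Task0code,  // function that implements the task
      "Task0",    // task name, useful for debugging
      10000,      // stack size in bytes (assumed value)
      NULL,       // no parameter passed to the task
      1,          // priority (assumed value)
      &Task0,     // handle filled in by FreeRTOS
      0);         // pin to Core-0

  xTaskCreatePinnedToCore(Task1code, "Task1", 10000, NULL, 1, &Task1, 1);
}

void loop() {
  // Intentionally empty: all work happens inside the two tasks.
}
```

The short vTaskDelay() calls are only there because the empty task bodies would otherwise spin without yielding; a task that already blocks or delays in its own work may not need them.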

Snippet 1 (below) shows the loop code of the streaming Task0, which is a single statement for serving user requests sent to the web server running on the ESP32. That very statement would otherwise be placed in the loop() section, as argued earlier.

  for (;;) {
    server.handleClient();
  }

Snippet 2 illustrates the loop code of Task1 and shows how the flags are applied to control its computation loops. I intentionally delay the task by 4 seconds whenever it is put on hold by Task0, to stop its fast-scrolling messages while Task0's printout is being viewed.

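  for (;;) {
    // Task0 raises flag1 or flag2 while it writes to the SD card.
    if (flag1 == 1 || flag2 == 1) {
      Serial.print("<Task1> ");
      Serial.println("Task1 is set on hold by Task0!");
      delay(4000);  // pause 4 s so Task0's messages stay readable
    }
    // Count in batches only while both flags are cleared.
    while (counter2 < 1000000.0 && flag1 == 0 && flag2 == 0) {
      // One batch: 100 million increments of counter1.
      while (counter1 < 100000000.0) {
        ++counter1;
      }
      counter1 = 0;
      ++counter2;
      Serial.print("<Task1> ");
      Serial.println(counter2);  // report the number of completed batches
    }
    // Restart the batch count once the loop exits (limit reached or on hold).
    counter2 = 0;
  }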

The function of the third statement may also contain any task-specific resource setup. For example, the streaming task Task0 initializes the WiFi connectivity, the camera and the SD card. If those resources are initialized in the sketch's setup() section, Task0 will not be able to access them.

In contrast, common resources such as the serial interface can be initialized in the setup() section and shared between both tasks. The fourth statement is the xPortGetCoreID() function, which returns the number of the core on which it is being run. I used this function to identify the source core of messages printed on the serial monitor.

It is worth noting that when the SD card is initialized on the ESP32-CAM device, the integrated flash LED, connected to GPIO4, is disabled. I added to the project the capability of switching the flash on or off using the URLs http://esp32-cam-IP/flash-on and http://esp32-cam-IP/flash-off respectively. However, instead of accessing GPIO4, I set the code to GPIO33, which controls a built-in red LED on the back of the ESP32 module.
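A minimal sketch of such LED commands, using the WebServer library that ships with the Arduino-ESP32 platform, might look like the following. Several details are assumptions rather than the project's code: the handler names, the WiFi credentials, port 80, the reply texts, and the active-low wiring of the red LED (LOW switches it on). For brevity it runs in the ordinary setup() and loop() instead of inside Task0.

```cpp
#include <WiFi.h>
#include <WebServer.h>

const char *WIFI_SSID = "your-ssid";      // placeholder credentials
const char *WIFI_PASS = "your-password";  // placeholder credentials
const int LED_PIN = 33;                   // built-in red LED on GPIO33

WebServer server(80);  // port 80 is an assumption

// Hypothetical handler for /flash-on.
void handleFlashOn() {
  digitalWrite(LED_PIN, LOW);  // assumption: the LED is wired active-low
  server.send(200, "text/plain", "Flash on");
}

// Hypothetical handler for /flash-off.
void handleFlashOff() {
  digitalWrite(LED_PIN, HIGH);
  server.send(200, "text/plain", "Flash off");
}

void setup() {
  Serial.begin(115200);
  pinMode(LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);  // start with the LED off

  WiFi.begin(WIFI_SSID, WIFI_PASS);
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);  // wait for the connection
  }
  Serial.println(WiFi.localIP());  // address to use in the URLs

  server.on("/flash-on", handleFlashOn);
  server.on("/flash-off", handleFlashOff);
  server.begin();
}

void loop() {
  server.handleClient();  // serve pending web requests
}
```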

Another thing to note is that Task0 starts a little sluggishly and takes between 5 and 10 seconds to start processing user commands. I chose low-resolution image capture to speed up streaming, but high-resolution capture works as well.

Conclusion

In this project, I demonstrated the strength of the ESP32's dual cores. Working with FreeRTOS enabled me to exploit the device's potential without delving deeply into the RTOS.

Multi-core processors revolutionised the CPU market. Performance increased dramatically while clock speeds (a previously important indicator of performance) barely shifted. Multi-core processors now dominate the market. In fact, there are some wild multi-core options out there, including the Cerebras CS-1, a computer built around a 400,000-core processor and targeted at AI applications. Don't expect to find it in your favourite computer store though.

Object Tracking?

For another project using the ESP32-CAM, refer to our Object Tracking with an ESP32-CAM project from Issue 46. We show you how easy it is to set up a portable web server and to perform object tracking using an ESP32-CAM with its digital camera and onboard processing.

https://diyode.io/046nkfb