I found a crazy bug in the unofficial Arduino support for the RP2040 v 4.01!
This is a strange tale... I've been using the excellent Raspberry Pi Pico Arduino Core, managed by Earle F. Philhower III, since its beginnings. It was recently updated to support the new Pico2 and RP2350.
As part of my preparations for testing the new Pico2 (received it yesterday), I was playing with some old microcontroller benchmarks. When I tried a Whetstone benchmark on the old Pico, the program just stopped somewhere in the middle of the calculations, and here starts the tale.
double X; // declared here so the multiplication is not removed by optimization

void setup() {
  pinMode(LED_BUILTIN, OUTPUT);
  digitalWrite(LED_BUILTIN, LOW);
  double T = millis();
  X = -1.3 * T;
  digitalWrite(LED_BUILTIN, HIGH);
}

void loop() {
  delay(100);
}
This code forces the bug!
The first messages from the full code appeared in the Serial Monitor, and then there was no more activity. Loading new software using the Arduino bootloader would not work; I had to manually enter boot mode to try a new version.
I should have been more methodical. It all started with what I thought would be a quick test, re-running software that had worked fine in the past. I grabbed one of my Pico boards and a USB cable and started testing. After having to unplug and reconnect the USB cable to enter boot mode a few times, I should have moved to a protoboard with a reset button attached to the Pico board, but I never got around to it.
The first thing I did was place a few Serial.print()s. The program crashed somewhere in the middle of the first floating point calculations. I tried a few changes with no results.
I then reverted to version 3.9 and everything worked fine!
I started trying to isolate a minimal program that showed the problem. The problem disappeared with some changes, but I could not make sense of why. After a few hours, I put it aside and went to sleep.
The next day I took a look at the generated code, and found out that the error occurred on the first floating point operation. It disappeared when the compiler optimization took out the operation! I ended up with the code shown at the start, where I even removed the serial outputs and used the LED to show that the code stopped at the floating point multiply.
Out of ideas, I opened an issue on the project. To my surprise, one hour later Earle confirmed that it was a bug, by looking at the logs (why didn't I do that?). It turns out that a single typo had resulted in the RP2040 ROM FPU calls not being added to the Pico library.
What "ROM FPU calls", you ask? You see, the RP2040 does not have a Floating Point Unit (FPU), so floating point calculations must be done by software routines. Raspberry Pi have written some very optimized routines (better than the ones in the gcc compiler) and placed them in ROM. Thanks to the typo, the compiler was using neither the ROM routines nor the non-optimized routines; it just panicked.
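To make "done by software routines" a bit more concrete, here is a minimal sketch of a floating point multiply that uses only integer operations. It is not the ROM code nor the gcc routine: soft_mul and its two helpers are names made up for this illustration, it works on 32-bit float instead of double to stay short, and it assumes normal finite inputs whose product neither overflows nor underflows (zeros and subnormals are simply flushed to zero).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers: view a float as raw bits and back again.
static uint32_t float_bits(float f) { uint32_t u; std::memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; std::memcpy(&f, &u, sizeof f); return f; }

// Illustrative software multiply for IEEE 754 single precision.
// Assumptions: inputs are normal finite numbers (or zero) and the
// result stays in the normal range. No NaN, infinity, overflow or
// subnormal handling, which a real routine must provide.
float soft_mul(float a, float b) {
    uint32_t ua = float_bits(a), ub = float_bits(b);
    uint32_t sign = (ua ^ ub) & 0x80000000u;      // sign of the product
    int32_t ea = (ua >> 23) & 0xFF;               // biased exponents
    int32_t eb = (ub >> 23) & 0xFF;
    if (ea == 0 || eb == 0) return bits_float(sign); // zero or subnormal: return signed zero

    // 24-bit significands with the implicit leading 1 restored.
    uint64_t ma = (ua & 0x7FFFFFu) | 0x800000u;
    uint64_t mb = (ub & 0x7FFFFFu) | 0x800000u;
    uint64_t prod = ma * mb;                      // 47 or 48 significant bits
    int32_t exp = ea + eb - 127;                  // remove one copy of the bias

    // Normalize so that the leading 1 sits at bit 47.
    if (prod & (1ull << 47)) exp += 1; else prod <<= 1;

    uint32_t mant = (uint32_t)(prod >> 24);       // top 24 bits
    uint32_t rest = (uint32_t)(prod & 0xFFFFFFu); // discarded low 24 bits

    // Round to nearest, ties to even.
    if (rest > 0x800000u || (rest == 0x800000u && (mant & 1u))) {
        mant++;
        if (mant == 0x1000000u) { mant >>= 1; exp++; } // rounding carried out
    }
    return bits_float(sign | ((uint32_t)exp << 23) | (mant & 0x7FFFFFu));
}
```

On a PC, soft_mul(-1.3f, 1000.0f) should give the same value as -1.3f * 1000.0f, since both round to nearest even. On a chip without an FPU every single multiplication, like the one in the program at the start, turns into a call to a routine of this kind, so if that routine is not linked in correctly nothing floating point can work.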
The new version should be generally available soon.
Many thanks to Earle for taking the time to test my crazy error report and for quickly finding the cause and fixing it.