fix(stm32f4): Hangs during flashing firmware #7415

Open
richardclli wants to merge 1 commit into main from richardclli/fix-f4-flash-hangs

richardclli commented Jun 1, 2026

Trying to fix a problem where flashing occasionally stops partway through. The fix was found using opencode + deepseek v4.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced flash memory operation reliability by protecting critical erase and program sequences from interruption, ensuring more stable and dependable performance during firmware programming and flash storage operations. Improves overall device stability during flash-related tasks.

richardclli self-assigned this Jun 1, 2026

coderabbitai Bot commented Jun 1, 2026

📝 Walkthrough

Flash erase and program operations in the STM32 driver are wrapped with interrupt masking. __disable_irq() gates the unlock/erase/program/lock sequences, and __enable_irq() restores interrupts after the critical section completes, preventing IRQ handlers from executing during volatile flash state changes.

Changes

Flash Operation Interrupt Masking

| Layer / File(s) | Summary |
| --- | --- |
| Interrupt masking in erase and program critical sections: radio/src/targets/common/arm/stm32/flash_driver.cpp | stm32_flash_erase_sector and stm32_flash_program now disable interrupts before unlock/erase/program operations and re-enable them after lock, ensuring flash state transitions are not interrupted by IRQ handlers. |

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description is vague and lacks required template structure. It mentions using AI tools but doesn't clearly explain what problem exists, why the fix works, or provide a proper issue reference. | Add a clear problem statement, explain why disabling interrupts prevents hangs, reference the issue number (Fixes #), and follow the repository template structure. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the main change: fixing hangs during STM32F4 firmware flashing, which aligns with the actual code changes that disable interrupts during flash operations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

richardclli added this to the 2.11.7 milestone Jun 1, 2026
richardclli marked this pull request as draft June 1, 2026 00:41
richardclli changed the title from "fix(stm32): disable interrupts during flash erase/program to prevent …" to "fix(stm32f4): Hangs during flashing firmware" Jun 1, 2026
richardclli force-pushed the richardclli/fix-f4-flash-hangs branch from 3ff0440 to b14c674 June 1, 2026 05:05
richardclli changed the base branch from 2.11 to main June 1, 2026 05:10
richardclli marked this pull request as ready for review June 1, 2026 05:11

coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
radio/src/targets/common/arm/stm32/flash_driver.cpp (1)

179-192: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Same issue: restore previous interrupt state instead of unconditionally enabling.

Apply the same PRIMASK save/restore pattern here for consistency and correctness.

Proposed fix
   int ret = 0;
+  uint32_t primask = __get_PRIMASK();
   __disable_irq();
   stm32_flash_unlock();
   while (address < end_addr) {
     if (_FLASH_PROGRAM(address, p_data) != HAL_OK) {
       ret = -1;
       break;
     }

     address += sizeof(uint32_t) * FLASH_PROG_WORDS;
     p_data += FLASH_PROG_WORDS;
   }

   stm32_flash_lock();
-  __enable_irq();
+  __set_PRIMASK(primask);
   return ret;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radio/src/targets/common/arm/stm32/flash_driver.cpp` around lines 179 - 192,
The block currently unconditionally calls __disable_irq() and later
__enable_irq(); change it to save and restore the prior interrupt state using
the PRIMASK pattern: capture the current PRIMASK (via __get_PRIMASK() or
equivalent) before disabling, call __disable_irq(), perform
stm32_flash_unlock(), the programming loop using _FLASH_PROGRAM, and
stm32_flash_lock(), then restore the saved PRIMASK (via __set_PRIMASK(saved) or
equivalent) instead of calling __enable_irq() directly so the interrupt state is
preserved; update the surrounding code in flash_driver.cpp where
__disable_irq(), stm32_flash_unlock(), stm32_flash_lock(), and __enable_irq()
are used.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 00d7544 and b14c674.

📒 Files selected for processing (1)
  • radio/src/targets/common/arm/stm32/flash_driver.cpp

Comment on lines +148 to +155
  __disable_irq();
  stm32_flash_unlock();
  if (HAL_FLASHEx_Erase(&eraseInit, &sector_errors) != HAL_OK) {
    ret = -1;
  }

  stm32_flash_lock();
  __enable_irq();

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unconditional __enable_irq() may unexpectedly alter caller's interrupt state.

If interrupts were already disabled before entering this function, __enable_irq() will enable them unexpectedly. Use PRIMASK save/restore pattern to preserve the original interrupt state.

Proposed fix
   int ret = 0;
   uint32_t sector_errors = 0;

+  uint32_t primask = __get_PRIMASK();
   __disable_irq();
   stm32_flash_unlock();
   if (HAL_FLASHEx_Erase(&eraseInit, &sector_errors) != HAL_OK) {
     ret = -1;
   }

   stm32_flash_lock();
-  __enable_irq();
+  __set_PRIMASK(primask);
   return ret;
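
To see why the unconditional `__enable_irq()` matters, here is a minimal host-side sketch. It replaces the CMSIS intrinsics with mock functions over a plain variable so it can run on a PC. The `mock_*` functions, `g_primask`, the two `flash_op_*` functions and `primask_after_nested_call` are illustrative assumptions, not code from flash_driver.cpp, and the actual flash operations are elided.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-ins for the CMSIS intrinsics (hypothetical mocks).
// On target, the real ones come from the CMSIS core headers.
static uint32_t g_primask = 0;  // 0 = interrupts enabled, 1 = masked

static inline uint32_t mock_get_PRIMASK() { return g_primask; }
static inline void mock_set_PRIMASK(uint32_t value) { g_primask = value & 1u; }
static inline void mock_disable_irq() { g_primask = 1u; }
static inline void mock_enable_irq() { g_primask = 0u; }

// Pattern under review: interrupts are always re-enabled on exit.
static void flash_op_unconditional() {
  mock_disable_irq();
  // ... unlock, erase or program, lock (elided) ...
  mock_enable_irq();
}

// Pattern suggested by the review: restore whatever state the caller had.
static void flash_op_save_restore() {
  uint32_t primask = mock_get_PRIMASK();
  mock_disable_irq();
  // ... unlock, erase or program, lock (elided) ...
  mock_set_PRIMASK(primask);
}

// Simulates a caller that has already masked interrupts before calling the
// flash operation, and reports the PRIMASK value it sees afterwards.
static uint32_t primask_after_nested_call(void (*flash_op)()) {
  mock_disable_irq();                  // caller's own critical section
  flash_op();
  uint32_t seen = mock_get_PRIMASK();  // 1 means the caller's mask survived
  mock_enable_irq();                   // caller leaves its critical section
  return seen;
}
```

With these mocks, `primask_after_nested_call(flash_op_unconditional)` yields 0 (the caller's mask was silently dropped), while `primask_after_nested_call(flash_op_save_restore)` yields 1. When interrupts are enabled on entry, both variants behave identically.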
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radio/src/targets/common/arm/stm32/flash_driver.cpp` around lines 148 - 155,
The code unconditionally calls __enable_irq() after using __disable_irq(), which
can enable interrupts that were previously disabled by the caller; change the
pattern to save and restore the caller's PRIMASK instead: at function entry
(around where __disable_irq() is currently called) read the current PRIMASK via
__get_PRIMASK(), then call __disable_irq(), perform stm32_flash_unlock(),
HAL_FLASHEx_Erase(...), stm32_flash_lock(), and finally restore the original
interrupt state by calling __set_PRIMASK(saved_primask) instead of
unconditionally calling __enable_irq(); update usage around
stm32_flash_unlock()/stm32_flash_lock()/HAL_FLASHEx_Erase to use this
save/restore PRIMASK pattern.

richardclli commented Jun 1, 2026

Tested flashing more than 10 times on my PL18U with no hangs. However, the hang is rare to begin with, so more testing is probably needed to confirm it is gone.

The flash operation still works properly, so there is no harm in merging anyway.

pfeerick commented Jun 3, 2026

> Maybe need more to test to confirm if it is gone.

Probably quite a few more... I think we were counting something like 1 in 30 flashes... very strange gremlin
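
As a rough guide to how many clean flashes would be convincing, one can assume each flash hangs independently with probability p. The 1-in-30 figure is only the estimate quoted above, independence is an assumption, and the helper names below are hypothetical. The chance of seeing no hang in n flashes without any fix is then (1 - p)^n.

```cpp
#include <cmath>

// Probability of observing zero hangs in n flashes when each flash hangs
// independently with probability p (assumed model, not measured data).
double prob_no_hang(double p, int n) {
  return std::pow(1.0 - p, n);
}

// Smallest n for which n consecutive clean flashes would happen by luck
// with probability at most (1 - confidence), i.e. (1 - p)^n <= 1 - confidence.
int flashes_needed(double p, double confidence) {
  return static_cast<int>(
      std::ceil(std::log(1.0 - confidence) / std::log(1.0 - p)));
}
```

Under those assumptions, `prob_no_hang(1.0 / 30.0, 10)` is about 0.71, so 10 clean flashes would occur roughly 71% of the time even on unfixed firmware. `flashes_needed(1.0 / 30.0, 0.95)` gives 89, meaning about 89 consecutive clean flashes are needed before the chance of that being luck drops below 5%.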

pfeerick added the labels backport/2.11 (to be backported to a 2.11 release also) and backport/2.12 (to be backported to a 2.12 release also) Jun 3, 2026