[BugFix][Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern#19568
Open
swjng wants to merge 3 commits into
Open
[BugFix][Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern#19568swjng wants to merge 3 commits into
swjng wants to merge 3 commits into
Conversation
…tern Five LLVM legalize rules used inline mathematical identities that fail on representable inputs because the intermediate computation overflows or cancels, even though the true result is in range: - `sinh`/`cosh`: `(exp(x) ± exp(-x)) / 2` — `exp(89) > FLT_MAX` so `exp(x)` itself overflows. True `sinh(89) ≈ 2.24e38` fits in float32. (apache#19559) - `atan`: `asin(x / sqrt(x*x + 1))` — `x*x` overflows for `|x| > sqrt(FLT_MAX) ≈ 1.84e19`, then `sqrt(inf) = inf`, `x / inf = 0`, `asin(0) = 0`. (apache#19560) - `asinh`: `log(x + sqrt(x*x + 1))` — same `x*x` overflow. True `asinh(3e22) ≈ 52.45`. (apache#19561) - `erf`: Abramowitz–Stegun `1 - poly(t) * exp(-x*x)` — for small `|x|`, `poly * exp(-x*x) ≈ 1` and the subtraction cancels to 0 in float32, flushing `erf(3e-12)` to 0 instead of the true `~3.4e-12`. (apache#19562) All four route through the existing `DispatchPureExtern<FloatSuffix>` helper (i.e. `sinhf`, `coshf`, `atanf`, `asinhf`, `erff`), the same pattern already used by `asin`/`acos`. ULP-grade accuracy across representative ranges; `Atan` is re-enabled in `test_unary` since the overflow that previously broke it is fixed. Note for reviewers: if the inline identities were a deliberate fast-path (e.g. for autovectorization or to avoid extern call overhead in tight loops), please flag it and I'll switch to stable inline forms (`exp(x − ln 2) ± exp(−x − ln 2)` for sinh/cosh, range-reduced asinh, small-x Taylor for erf, etc.). I could not find evidence of such intent in the git history. Acosh shows the same `sqrt(x*x − 1)` overflow pattern but is not covered by any of the listed issues; happy to include it as a follow-up if maintainers want. Fixes apache#19559. Fixes apache#19560. Fixes apache#19561. Fixes apache#19562.
Contributor
There was a problem hiding this comment.
Code Review
This pull request refactors the LLVM legalization rules for several intrinsic operations, including cosh, sinh, atan, asinh, and erf, by replacing manual mathematical expansions with a unified DispatchPureExtern call. Additionally, the Atan operator has been enabled in the ONNX frontend tests. The reviewer pointed out that the using namespace intrin; statements, local call variable assignments, and manual null checks are redundant in the new implementation because DispatchPureExtern is fully qualified and performs its own internal validation.
`acosh` has the same `sqrt(x*x - 1)` overflow pattern as `asinh`: intermediate `x*x` overflows float32 for `|x| > sqrt(FLT_MAX) ≈ 1.84e19`, so `sqrt(inf) = inf`, `log(x + inf) = inf`, while the true result `acosh(3e22) ≈ 52.45` is well within range. No issue was filed for this op but the bug is identical to apache#19561 and the fix is the same.
…rappers Per review feedback: with the inline math identities gone, the `DispatchPureExtern<FloatSuffix>` call is fully qualified (`::tvm::codegen::intrin::...`) and the `using namespace intrin;` line inside each lambda no longer brings anything into scope. Drop it from the six ops touched in this PR (sinh, cosh, atan, asinh, acosh, erf). The `CallNode* call` ICHECK is kept for parity with the rest of the file (every legalize lambda in this translation unit performs that check).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Six LLVM legalize rules in
src/target/llvm/intrin_rule_llvm.ccuse inline mathematical identities that fail on representable inputs because the intermediate computation overflows or cancels, even though the true result is infloat32range:sinh/cosh(#19559)(exp(x) ± exp(-x)) / 2exp(89) > FLT_MAX, intermediate isinfsinh(89) ≈ 2.24e38atan(#19560)asin(x / sqrt(x²+1))x²overflows for `asinh(#19561)log(x + sqrt(x²+1))x²overflow →log(inf)=infasinh(3e22) ≈ 52.45erf(#19562)1 − poly(t)·exp(−x²)poly·exp(−x²) ≈ 1for tiny `acosh(no issue)log(x + sqrt(x²−1))x²overflow →infacosh(3e22) ≈ 52.45acoshwas not in the original issue cluster but shows the identical bug pattern toasinh; folding it in keeps this PR's scope consistent ("naive math identity → libm extern"). Happy to split it out if reviewers prefer.Fix
Route all six through the existing
DispatchPureExtern<FloatSuffix>helper — i.e.sinhf,coshf,atanf,asinhf,acoshf,erff— the same patternasin/acosuse after #19567. ULP-grade accuracy across the reported ranges.Atanis re-enabled intest_unary; the overflow that previously broke it is fixed.Notes for reviewers
Inline-vs-extern decision. If the inline identities were a deliberate fast-path (e.g. for autovectorization or to avoid extern-call overhead in tight loops), please flag it and I'll switch to stable inline forms instead —
exp(x − ln 2) ± exp(−x − ln 2)for sinh/cosh, range-reduced asinh/acoshsign(x)·log(2|x|)for large|x|, small-|x|Taylor branch for erf, etc. I could not find evidence of such intent in the git history (sinh/cosh: original commit; atan/asinh/acosh: #17945 / #17969 follow-ups; erf: #18104 was framed as "more precise than tanh-approx", not "fast inline").Fixes #19559.
Fixes #19560.
Fixes #19561.
Fixes #19562.