Add missing tests and fixes from 4647 #4751
Merged
97 commits
4efdd1a Add channelwise conv (pfultz2)
a0c6b07 Format (pfultz2)
efeafca Use shared memory (pfultz2)
4498934 Format (pfultz2)
1792edb Update slice functions (pfultz2)
0304972 Format (pfultz2)
1389ae5 Update to use slices instead (pfultz2)
9c9b9a5 Format (pfultz2)
207e5d6 Add reduce_schedule for outer batches (pfultz2)
cdae8f4 Format (pfultz2)
b51b82f Use pooling_reduce (pfultz2)
b5f4f0f Format (pfultz2)
15fd39f Some refactoring to use tiling (pfultz2)
b61daa3 FOrmat (pfultz2)
c9d258f Access directly (pfultz2)
6d979f5 Format (pfultz2)
ecbce52 Add join (pfultz2)
4bd6556 Update tuning (pfultz2)
d1da333 Format (pfultz2)
9cc6906 Add multi-output (pfultz2)
0942c87 Format (pfultz2)
ca147d2 Add spatial tiler (pfultz2)
3b17a09 Format (pfultz2)
037d10f Avoid bounds check when there is no padding (pfultz2)
7bc6d78 Remove lines (pfultz2)
e3077b8 Use functions instead of variables (pfultz2)
414aab4 Format (pfultz2)
e56c4f1 Inine methods (pfultz2)
b51c74f Format (pfultz2)
3d4bfe4 Update quick tuning list (pfultz2)
a362a19 Format (pfultz2)
208c7ad Add another config (pfultz2)
f2daa29 Add more configs (pfultz2)
36110cf Format (pfultz2)
882fe3b Add pointwise fusion (pfultz2)
24a2645 Format (pfultz2)
28e32af Only enable for float and navi (pfultz2)
e35373c Format (pfultz2)
f69d9bb Fix tidy (pfultz2)
fb48be7 Format (pfultz2)
ef923a8 Fix tidy (pfultz2)
513fafc Update year (pfultz2)
ec3c657 Fix cppcheck (pfultz2)
5d8051b Format (pfultz2)
99c896c Use std algos (pfultz2)
9f0903d Format (pfultz2)
680328b Move in_bounds function (pfultz2)
1120309 Rename type (pfultz2)
7645792 Format (pfultz2)
32b5894 Fix compilation failure (pfultz2)
2141264 Format (pfultz2)
19cf173 Simplify some more (pfultz2)
b39416e Format (pfultz2)
6c990fd Use std::transform (pfultz2)
90638f8 Precompute slices (pfultz2)
053bf4f Format (pfultz2)
ffaa5c3 Update src/targets/gpu/kernels/include/migraphx/kernels/slice.hpp (pfultz2)
8a06baf Change the navi check (pfultz2)
a3fd388 Merge branch 'channelwise-conv2' of github.com:ROCmSoftwarePlatform/A… (pfultz2)
258af41 Split verify classes (pfultz2)
bcd468d Revert the reduce and index changes (pfultz2)
7ba2cca Revert pooling changes (pfultz2)
61f6ffb Use signed integer (pfultz2)
2a770dd Merge branch 'develop' into channelwise-conv2 (pfultz2)
b5cad75 Update year (pfultz2)
5b49459 Format (pfultz2)
dc7f7e5 Fix merge conflicts (pfultz2)
9eb50da Merge branch 'develop' into channelwise-conv2 (TedThemistokleous)
18a7efa Support padding (pfultz2)
c23a8e8 Format (pfultz2)
747292c Fix selection (pfultz2)
ad9b8d1 Fix padding (pfultz2)
77dac35 Cleanup (pfultz2)
3c3e0ac Merge (pfultz2)
362ce5f Merge branch 'develop' into channelwise-conv2 (pfultz2)
c47b394 Use generate_array instead (pfultz2)
604d408 Use generate array (pfultz2)
5fc446a Format (pfultz2)
21442c4 Add padding tests (pfultz2)
371f79b Format (pfultz2)
df1676f Merge branch 'channelwise-conv2' of github.com:ROCmSoftwarePlatform/A… (pfultz2)
8949117 Update is_padded() check (pfultz2)
be32bda Format (pfultz2)
4f5221e Add unit tests (pfultz2)
b0e4634 Format (pfultz2)
de0d67a Merge branch 'develop' into channelwise-conv2 (pfultz2)
eecf785 Fix tidy (pfultz2)
7949e82 Merge branch 'develop' into channelwise-conv2 (pfultz2)
a3b61a2 Fix cppcheck warnings (pfultz2)
1457b47 Format (pfultz2)
7a5abf9 Update year (pfultz2)
b68efb5 Merge branch 'develop' into channelwise-conv2 (causten)
003f033 Merge branch 'develop' into channelwise-conv2 (pfultz2)
1f22234 Merge branch 'develop' into channelwise-conv2 (pfultz2)
7d9e876 Fix tile miscompilation (pfultz2)
c19c8b5 update license (kahmed10)
75f1529 Merge branch 'develop' of https://github.com/ROCm/AMDMIGraphX into ch… (kahmed10)
`is_padded()` now returns true whenever convolution padding is present (because `total_padding()` is included in the equality check). This forces `for_each()` to do per-element output bounds checks even when the output tile exactly covers the output (i.e., only the input halo needs padding checks). Consider splitting this into two constexpr predicates (e.g., `is_output_padded()` for tile overhang vs `needs_input_padding()` for halo/padding), so output bounds checks remain compiled out for the common "exact tiling + conv padding" case.
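A rough sketch of the split suggested above, to make the intent concrete. Everything here is hypothetical: `tile_config`, its template parameters, and `count_outputs` are illustrative stand-ins in one dimension, not the actual tiler types in the MIGraphX kernels. Only the two predicate names come from the suggestion.

```cpp
// Hypothetical 1-D stand-in for the tiler's compile-time configuration.
// Output: output length, Tile: tile length, Padding: convolution padding.
template <int Output, int Tile, int Padding>
struct tile_config
{
    // True only when the tiles overhang the output, so stores in the last
    // tile have to be masked.
    static constexpr bool is_output_padded() { return Output % Tile != 0; }

    // True when the convolution reads past the input edge (halo), so loads
    // need bounds checks. Independent of the output tiling.
    static constexpr bool needs_input_padding() { return Padding > 0; }

    // Visit every output index covered by the tiles. The bounds check is
    // discarded at compile time when the tiling is exact, even if the
    // convolution itself is padded.
    template <class F>
    static constexpr void for_each(F f)
    {
        constexpr int tiles = (Output + Tile - 1) / Tile;
        for(int i = 0; i < tiles * Tile; i++)
        {
            if constexpr(is_output_padded())
            {
                if(i >= Output)
                    continue;
            }
            f(i);
        }
    }
};

// Helper for illustration: number of output elements actually visited.
template <class Config>
constexpr int count_outputs()
{
    int n = 0;
    Config::for_each([&](int) { n++; });
    return n;
}
```

With this shape, a config such as `tile_config<64, 16, 1>` (exact tiling, padded convolution) reports `needs_input_padding()` but not `is_output_padded()`, so the output check in `for_each()` is compiled out while input loads can still be guarded separately. Requires C++17 for `if constexpr`.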