test(workflow-operator): add unit test coverage for source operator enums#6042
test(workflow-operator): add unit test coverage for source operator enums#6042aglinxinyuan wants to merge 2 commits into
Conversation
…nums (FileAttributeType, FileDecodingMethod, DecodingMethod, RedditSourceOperatorFunction)
Automated Reviewer SuggestionsBased on the
|
There was a problem hiding this comment.
Pull request overview
This PR adds ScalaTest unit coverage in common/workflow-operator to pin the JSON “wire” values and related helper behaviors of several existing source-operator enums, ensuring Jackson serialization/deserialization and enum contracts remain stable without changing production code.
Changes:
- Add enum contract tests for
FileDecodingMethodandFileAttributeTypein the scan source operator package. - Add enum contract tests for
DecodingMethodin the fetcher source operator package. - Add enum contract tests for
RedditSourceOperatorFunctionin the Reddit source API package.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/FileDecodingMethodSpec.scala | Adds tests for wire-name mapping and Jackson round-trip of FileDecodingMethod. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/FileAttributeTypeSpec.scala | Adds tests for wire-name mapping, core AttributeType mapping, isSingle, and Jackson round-trip of FileAttributeType. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/fetcher/DecodingMethodSpec.scala | Adds tests for wire-name mapping and Jackson round-trip of DecodingMethod. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/apis/reddit/RedditSourceOperatorFunctionSpec.scala | Adds tests for wire-name mapping and Jackson round-trip of RedditSourceOperatorFunction. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #6042 +/- ##
============================================
- Coverage 56.93% 56.93% -0.01%
+ Complexity 3026 3023 -3
============================================
Files 1129 1129
Lines 43794 43794
Branches 4743 4743
============================================
- Hits 24936 24932 -4
- Misses 17384 17385 +1
- Partials 1474 1477 +3
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 426 | 0.26 | 22,152/36,002/36,002 us | 🔴 +21.6% / 🔴 +138.9% |
| 🔴 | bs=100 sw=10 sl=64 | 954 | 0.582 | 103,161/127,307/127,307 us | 🔴 +6.3% / 🔴 +18.3% |
| ⚪ | bs=1000 sw=10 sl=64 | 1,099 | 0.67 | 914,709/964,753/964,753 us | ⚪ within ±5% / 🟢 -8.9% |
Baseline details
Latest main 434bcae from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 426 tuples/sec | 462 tuples/sec | 777.62 tuples/sec | -7.8% | -45.2% |
| bs=10 sw=10 sl=64 | MB/s | 0.26 MB/s | 0.282 MB/s | 0.475 MB/s | -7.8% | -45.2% |
| bs=10 sw=10 sl=64 | p50 | 22,152 us | 21,162 us | 12,612 us | +4.7% | +75.6% |
| bs=10 sw=10 sl=64 | p95 | 36,002 us | 29,616 us | 15,070 us | +21.6% | +138.9% |
| bs=10 sw=10 sl=64 | p99 | 36,002 us | 29,616 us | 18,360 us | +21.6% | +96.1% |
| bs=100 sw=10 sl=64 | throughput | 954 tuples/sec | 981 tuples/sec | 988.31 tuples/sec | -2.8% | -3.5% |
| bs=100 sw=10 sl=64 | MB/s | 0.582 MB/s | 0.599 MB/s | 0.603 MB/s | -2.8% | -3.5% |
| bs=100 sw=10 sl=64 | p50 | 103,161 us | 99,005 us | 101,066 us | +4.2% | +2.1% |
| bs=100 sw=10 sl=64 | p95 | 127,307 us | 119,817 us | 107,594 us | +6.3% | +18.3% |
| bs=100 sw=10 sl=64 | p99 | 127,307 us | 119,817 us | 115,830 us | +6.3% | +9.9% |
| bs=1000 sw=10 sl=64 | throughput | 1,099 tuples/sec | 1,114 tuples/sec | 1,019 tuples/sec | -1.3% | +7.8% |
| bs=1000 sw=10 sl=64 | MB/s | 0.67 MB/s | 0.68 MB/s | 0.622 MB/s | -1.5% | +7.7% |
| bs=1000 sw=10 sl=64 | p50 | 914,709 us | 892,791 us | 986,982 us | +2.5% | -7.3% |
| bs=1000 sw=10 sl=64 | p95 | 964,753 us | 974,798 us | 1,028,491 us | -1.0% | -6.2% |
| bs=1000 sw=10 sl=64 | p99 | 964,753 us | 974,798 us | 1,058,493 us | -1.0% | -8.9% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,470.03,200,128000,426,0.260,22151.94,36002.12,36002.12
1,100,10,64,20,2096.85,2000,1280000,954,0.582,103161.18,127307.29,127307.29
2,1000,10,64,20,18206.17,20000,12800000,1099,0.670,914708.93,964752.75,964752.75…ributeType constant Per review: replace the single SINGLE_STRING toString assertion with a loop over all constants so the toString contract is fully pinned.
What changes were proposed in this PR?
Pin behavior of four previously-untested source operator enums in
common/workflow-operator. No production-code changes.FileAttributeTypeSpecFileAttributeTypeFileDecodingMethodSpecFileDecodingMethodDecodingMethodSpecDecodingMethodRedditSourceOperatorFunctionSpecRedditSourceOperatorFunctionBehavior pinned
@JsonValuestring (e.g.FileDecodingMethod.ASCII→US_ASCII,FileAttributeType.LARGE_BINARY→large binary,DecodingMethod.RAW_BYTES→RAW BYTES) + constant countsFileAttributeTypeextrasgetType→ coreAttributeType;isSingletrue only forSINGLE_STRING/BINARY/LARGE_BINARY;toString== wire name@JsonValueserialize + deserialize back to the constantAny related issues, documentation, discussions?
Part of the ongoing
workflow-operatorunit-test coverage effort.How was this PR tested?
sbt "WorkflowOperator/testOnly *FileAttributeTypeSpec *FileDecodingMethodSpec *DecodingMethodSpec *RedditSourceOperatorFunctionSpec"— 10 tests, all greensbt "WorkflowOperator/Test/scalafmtCheck"andsbt "WorkflowOperator/scalafixAll --check"— cleanWas this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8 [1M context])