RATIS-2546. Add input stream to DataStreamApi for read operations in Client#1481
RATIS-2546. Add input stream to DataStreamApi for read operations in Client#1481peterxcli wants to merge 17 commits into
Conversation
Signed-off-by: peterxcli <peterxcli@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
|
cc @szetszwo |
|
The no-MD5 sequential reads comparison:
The no-MD5 random reads comparison:
I'm looking into why larger buffer would cause throughput to degrade. |
There was a problem hiding this comment.
@peterxcli , thanks a lot for working on this!
The PR is quite big. Let's separate the DataStreamReply change first. Please see the comments inlined and also https://issues.apache.org/jira/secure/attachment/13082807/1481_review_DataStreamReply.patch (updated the link in the next comment.)
Signed-off-by: peterxcli <peterxcli@gmail.com>
szetszwo
left a comment
There was a problem hiding this comment.
@peterxcli , thanks for updating this!
Let's have one more simple change to add an executor and terminalReply (RATIS-2564). Please see the comments inlined and also https://issues.apache.org/jira/secure/attachment/13082846/1481_review3_refactoring.patch
Signed-off-by: peterxcli <peterxcli@gmail.com>
|
|
||
| /** Async call to send a request and receive multiple replies for the request. */ | ||
| default CompletableFuture<DataStreamReply> streamAsync( | ||
| DataStreamRequest request, Consumer<DataStreamReply> replyConsumer) { |
There was a problem hiding this comment.
@peterxcli , Let's borrow the idea from gRPC StreamObserver and add a similar interface. It is more flexible for future changes.
//ratis-common
package org.apache.ratis.datastream;
/** An interface similar to gRPC {@link org.apache.ratis.thirdparty.io.grpc.stub.StreamObserver}. */
public interface DataStreamObserver<V> {
void onNext(V value);
// see if onError(Throwable) and onCompleted() are useful. Or we may add them later.
}Then our interface will be similar to gRPC service https://github.com/apache/ratis/blob/master/ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcServerProtocolClient.java#L174-L181
//DataStreamClientRpc
/** Async call to send a request and receive multiple replies for the request. */
default CompletableFuture<DataStreamReply> streamAsync(DataStreamRequest request,
DataStreamObserver<DataStreamReplyByteBuf> replyHandler) {
throw new UnsupportedOperationException(getClass() + " does not support "
+ JavaUtils.getCurrentStackTraceElement().getMethodName());
}Signed-off-by: peterxcli <peterxcli@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
|
@szetszwo thanks for the grpc observer suggestion, just added a initial version of it. I also add onError and onComplete interface method for the observer, because we can decide which hook to call base on the terminal reply. Please take a look, Thanks! |
| failReads(exception); | ||
| } else { | ||
| markEndOfStream(); | ||
| } |
There was a problem hiding this comment.
Do we need to call close() or mark closed = true; here?
There was a problem hiding this comment.
no, the caller might have not counsumed the fetched content yet, if we set close, the readAsync would fail,
There was a problem hiding this comment.
I think I can refactor this to get better readability.
There was a problem hiding this comment.
It reads much better, thanks!
| for (DataStreamReply reply; (reply = replies.poll()) != null;) { | ||
| DataStreamReplyByteBuf.release(reply); | ||
| } | ||
| failReads(new AlreadyClosedException(clientId + ": stream already closed, request=" + header)); |
There was a problem hiding this comment.
I do not understand why we need to do a failReads upon a ordinary close. Do we not expect close be call in the happy path and we only call close in bad cases?
There was a problem hiding this comment.
close() could happen while readAsync() calls are pending.
| public static void release(DataStreamReply reply) { | ||
| if (reply instanceof DataStreamReplyByteBuf) { | ||
| ((DataStreamReplyByteBuf) reply).release(); | ||
| } |
There was a problem hiding this comment.
What about else? Is that possible other instances are passed here? If no maybe add an assertion, otherwise handle other instances?
There was a problem hiding this comment.
There are two classes implement the DataStreamReply, which is DataStreamReplyByteBuf and DataStreamReplyByteBuffer.
And only DataStreamReplyByteBuf need to release the Netty byteBuffer reference count
Co-Authored-By: amaliujia <amaliujia@apache.org> Signed-off-by: peterxcli <peterxcli@gmail.com>
aa3ab09 to
7b15a13
Compare
Signed-off-by: peterxcli <peterxcli@gmail.com>
szetszwo
left a comment
There was a problem hiding this comment.
@peterxcli , thanks for the update!
In DataStreamInput, readAsync() should return a ReferenceCountedObject so the user can retain/release the buffer.
Please see the comments inlined and also https://issues.apache.org/jira/secure/attachment/13082885/1481_review4.patch
Co-Authored-By: szetszwo <szetszwo@apache.org> Signed-off-by: peterxcli <peterxcli@gmail.com>
What changes were proposed in this pull request?
DataStreamInputinterface and its implementationDataStreamInputImplfor asynchronous, zero-copy read-only data streaming, including thereadAsync()method for consuming replies and proper resource release on close.DataStreamApiinterface and its implementation to providestreamReadOnly()methods for creating read-only streams.DataStreamReplyByteBufclass to support replies backed by NettyByteBuf.streamAsync()in theDataStreamClientRpcinterface that takes a reply consumer, supporting multiple replies per request.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/RATIS-2546
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests)
(If this patch involves UI changes, please attach a screen-shot; otherwise, remove this)