Fix flow of ACKs to ensure we do not hang on final buf write #155
shaunrs wants to merge 2 commits into dhylands:master from
I think that having an initial ACK is also important. It basically tells the host that the pyboard is running the code, and has opened the file and is ready to receive data.
Usually the reason for one side to hang is because it's waiting for a block of data and one or more bytes get lost. The ACK isn't really a data integrity ACK but rather a flow control ACK. This way the host only sends data to the board when the board is waiting for it. Writing to flash or sdcard can cause interrupts to be disabled, and characters arriving during that time interval can get lost.
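To make the flow-control idea concrete, here is a minimal sketch of it, not rshell's actual code. It simulates the host and the board with two in-memory queues. The names `host_send`, `board_recv`, `transfer`, the `ACK` value and the tiny `BUF_SIZE` are all assumptions made for illustration.

```python
import queue
import threading

ACK = b"\x06"   # assumed ACK byte for this sketch
BUF_SIZE = 4    # hypothetical block size, kept tiny for demonstration


def host_send(data, to_board, from_board):
    """Send data in blocks, waiting for an ACK before each block."""
    for offset in range(0, len(data), BUF_SIZE):
        # Block until the board says it is ready. This is flow control,
        # not an integrity check: nothing about the payload is verified.
        if from_board.get(timeout=1) != ACK:
            raise RuntimeError("unexpected byte instead of ACK")
        to_board.put(data[offset:offset + BUF_SIZE])


def board_recv(total, to_board, from_board, out):
    """Receive `total` bytes, announcing readiness before each block."""
    remaining = total
    while remaining > 0:
        from_board.put(ACK)              # "I am waiting for data now"
        block = to_board.get(timeout=1)
        out.extend(block)                # stands in for a slow flash write
        remaining -= len(block)


def transfer(data):
    """Run both sides and return what the simulated board stored."""
    to_board, from_board = queue.Queue(), queue.Queue()
    out = bytearray()
    board = threading.Thread(
        target=board_recv, args=(len(data), to_board, from_board, out)
    )
    board.start()
    host_send(data, to_board, from_board)
    board.join()
    return bytes(out)
```

Because the host never sends a block until the board has asked for one, no bytes are in flight while the board is busy writing. The first ACK also doubles as the "file is open and I am ready" signal.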
Good point, that makes sense. I'd like to keep the last-packet ACK in there: whilst it isn't a data integrity ACK, there is value in being able to detect last-packet timeouts which otherwise cause I'm happy to implement this, and get feedback. Which method would you prefer:
Personally: I'd go for 2 purely because it is much more explicit. It accounts for cases where the code changes in future and/or logic is added to this feature that may introduce other edge cases. But I will implement this however you see best fits the project :)
The actual character used doesn't really matter. I'd avoid using Control-C, but DC1 (0x11) should be fine.
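As a rough illustration of why Control-C is a poor choice, the sketch below lists the bytes that a MicroPython REPL treats specially and checks a candidate marker against them. The helper `is_safe_marker` and the table name are hypothetical, and the sketch does not claim anything about how the marker is used in this PR.

```python
# Bytes that a MicroPython REPL gives special meaning to.
REPL_CONTROL_BYTES = {
    0x01: "Ctrl-A: enter raw REPL",
    0x02: "Ctrl-B: exit raw REPL",
    0x03: "Ctrl-C: interrupt running code",
    0x04: "Ctrl-D: soft reset / end of input",
    0x05: "Ctrl-E: paste mode",
}

# Device Control 1. It is also XON in software flow control, so it is only
# a safe choice when XON/XOFF handling is disabled on the serial port.
DC1 = 0x11


def is_safe_marker(byte_value):
    """Return True if the byte does not collide with a REPL control byte."""
    return byte_value not in REPL_CONTROL_BYTES
```

With this table, `is_safe_marker(DC1)` is true while `is_safe_marker(0x03)` is false, which matches the advice above.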
Sorry for the long delay, I've been very busy on other things. This is pushed using DC1.
f2d71d9 to 9da4ca1
@davehylands I understand you're likely super busy, but I'm keen to get this merged if possible? I'm currently using a local fork in all my projects :)
Using Windows 10 and a pyboard, a file transfer often hangs, and the only solution is to unplug the board and kill rshell. I observe this behaviour with 9 out of 10 larger file transfers (4KiB). Smaller transfers do not seem affected.

I'm not familiar enough with the specific mechanism of file transfer to describe the root cause here, but this behaviour appears to happen when send_file_to_remote has returned (no more data to send), but recv_file_from_remote is still executing, causing Device.remote("recv_file_from_host", "send_file_to_remote" .. ) to hang indefinitely. Presumably recv is waiting for more data that is never coming?

It seems the current code would never send/receive the final packet's ack, as they are processed at the start of the while loop. It is therefore possible for a last-packet timeout to go unnoticed by rshell. By ensuring acks are only sent after the remote has received and written the packet, and subsequently verified after local has sent a full packet to the remote, we can make this more robust. This resolves my file transfer issues.
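
The ordering described above (ack only after the packet has been written, verify the ack after every packet including the last) can be sketched as follows. This is a simplified, single-threaded illustration and not the diff in this PR. `send_with_trailing_acks`, `FakeRemote`, `AckTimeout`, the `ACK` value and `BUF_SIZE` are all hypothetical names and values. An empty read stands in for a serial read that timed out.

```python
ACK = b"\x06"   # assumed ACK byte for this sketch
BUF_SIZE = 4    # hypothetical block size, kept tiny for demonstration


class AckTimeout(Exception):
    """Raised when no ACK arrives after a block has been sent."""


def send_with_trailing_acks(data, write_block, read_ack):
    """Send blocks; require an ACK after each one, including the last."""
    sent = 0
    for offset in range(0, len(data), BUF_SIZE):
        block = data[offset:offset + BUF_SIZE]
        write_block(block)
        # The check sits after the write, so the final block is covered too.
        if read_ack() != ACK:
            raise AckTimeout(f"no ACK after block at offset {offset}")
        sent += len(block)
    return sent


class FakeRemote:
    """Stand-in for the board; can be told to lose the ACK for one block."""

    def __init__(self, lose_ack_for_block=None):
        self.stored = bytearray()
        self.pending_acks = []
        self.blocks_seen = 0
        self.lose_ack_for_block = lose_ack_for_block

    def write_block(self, block):
        self.stored.extend(block)              # write first...
        if self.blocks_seen != self.lose_ack_for_block:
            self.pending_acks.append(ACK)      # ...then acknowledge
        self.blocks_seen += 1

    def read_ack(self):
        # An empty bytes object models a read that timed out.
        return self.pending_acks.pop(0) if self.pending_acks else b""
```

If the ACK for the last block goes missing, the sender raises `AckTimeout` instead of returning as though the transfer had succeeded, which is the failure mode described in this PR.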