diff options
author | Marius Andra <marius.andra@gmail.com> | 2024-05-08 17:33:43 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-05-08 09:33:43 -0600 |
commit | e6f66e4d131fb8cf3236df6cb14e219dbb5d0c77 (patch) | |
tree | eb07246f455353bc5bd842e1bfd2559fb75221f9 /lib/pure/net.nim | |
parent | e662043fd1c43e8447fb7c30e97823f7d0a46a83 (diff) | |
download | Nim-e6f66e4d131fb8cf3236df6cb14e219dbb5d0c77.tar.gz |
fixes 12381, HttpClient socket handle leak (#23575)
## Bug Fixes https://github.com/nim-lang/Nim/issues/12381 - HttpClient socket handle leak To replicate the bug, run the following code in a loop: ```nim import httpclient while true: echo "New loop" var client = newHttpClient(timeout = 1000) try: let response = client.request("http://10.44.0.4/bla", httpMethod = HttpPost, body = "boo") echo "HTTP " & $response.status except CatchableError as e: echo "Error sending logs: " & $e.msg finally: echo "Finally" client.close() ``` Note the IP address as the hostname. I'm directly connecting to a plausible local IP, but one that does not resolve, as I have everything under 10.4.x.x. The output looks like this to me: ``` New loop Error sending logs: Operation timed out Finally New loop Error sending logs: Operation timed out Finally New loop ... ``` In Nim 2.0.4, running the code above leaks the socket: <img width="944" alt="Screenshot 2024-05-05 at 22 00 13" src="https://github.com/nim-lang/Nim/assets/53387/ddac67db-d7df-45e6-b7a5-3d42f79775ea"> ## Fix With the added line of code, each old socket is cleanly removed: <img width="938" alt="Screenshot 2024-05-05 at 21 54 18" src="https://github.com/nim-lang/Nim/assets/53387/5b0b4b2d-d4f0-4e74-a9cf-74aec0c50d2e"> I believe the line below, `closeUnusedFds(ord(domain))` was supposed to clean up the failed connection attempts, but it failed to do so for the last one, assuming it succeeded. Yet it didn't. This fix makes sure failed connections are closed immediately. ## Tests I don't have a test with this PR. When testing locally, the `connect(lastFd, ..)` call on line 2032 blocks for ~75 seconds, ignoring the http timeout. I fear any test I could add would either 1) take way too long, 2) one day run in an environment where my randomly chosen IP is real, yielding in weird flakes. The only bug i can imagine is if running `lastFd.close()` twice is a bad idea. I tested by actually running it twice, and... no crash/op? So seems safe? I'm hoping the CI run will be green, and this will be enough. However I'm happy to take feedback on how I should test this, and do the necessary changes. ~Edit: looks like a test does fail, so moving to a draft while I figure this out.~ Attempt 2 fixed it.
Diffstat (limited to 'lib/pure/net.nim')
-rw-r--r-- | lib/pure/net.nim | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/lib/pure/net.nim b/lib/pure/net.nim index d77ab5db1..7f6a36557 100644 --- a/lib/pure/net.nim +++ b/lib/pure/net.nim @@ -2040,8 +2040,10 @@ proc dial*(address: string, port: Port, if success: result = newSocket(lastFd, domain, sockType, protocol, buffered) elif lastError != 0.OSErrorCode: + lastFd.close() raiseOSError(lastError) else: + lastFd.close() raise newException(IOError, "Couldn't resolve address: " & address) proc connect*(socket: Socket, address: string, |