The popular got HTTP client from the NodeJS ecosystem uses the http-timer package, which in turn uses the “finish” event on an http.ClientRequest to determine the point in time when the HTTP request body has been written to the remote end (has been “uploaded”).
This point in time is then used as the reference for measuring the time it takes for the HTTP server on the other end to generate a response. Often called “time to first byte” (TTFB), this client-side metric measures the duration between sending out the last byte of the request and receiving the first byte of the response. TTFB is often used as a server performance metric, indicating the time it took the server to process the request.
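To make the measurement concrete, here is a minimal sketch of the approach using only NodeJS built-ins. It is not the http-timer implementation: `timedPost` and `demo` are hypothetical names, the “response” event stands in for “first byte received”, and the local server exists only to make the example runnable.

```javascript
const http = require('http');

// Hypothetical helper (not http-timer's code): POST `body` to `url` and
// derive upload duration and TTFB from the "finish" and "response" events.
function timedPost(url, body) {
  return new Promise((resolve, reject) => {
    const startedAt = process.hrtime.bigint();
    let finishedAt = null;

    // agent: false uses a one-off connection, so no keep-alive socket lingers.
    const req = http.request(url, { method: 'POST', agent: false }, (res) => {
      const responseAt = process.hrtime.bigint();
      // A server may answer before the upload is done; fall back in that case.
      const uploadDoneAt = finishedAt ?? responseAt;
      res.resume(); // discard the response body
      res.on('end', () => resolve({
        uploadMs: Number(uploadDoneAt - startedAt) / 1e6,
        ttfbMs: Number(responseAt - uploadDoneAt) / 1e6,
      }));
    });

    req.on('finish', () => { finishedAt = process.hrtime.bigint(); });
    req.on('error', reject);
    req.end(body);
  });
}

// Local test server that responds as soon as it has consumed the request body.
async function demo() {
  const server = http.createServer((req, res) => {
    req.resume();
    req.on('end', () => res.end('ok'));
  });
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const { port } = server.address();
  try {
    return await timedPost(`http://127.0.0.1:${port}/`, Buffer.alloc(1024 * 1024));
  } finally {
    server.close();
  }
}

demo().then((timings) => console.log(timings));
```

Against a loopback server both numbers will be tiny; the interesting case is a slow link, where the question is whether “finish” really coincides with the last byte leaving the machine.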
My suspicion was that the TTFB numbers I saw in my scenario were pretty off.
I then ran a quick verification experiment in which I sent an HTTP request with a body of about 10 MB to an HTTP server under my control. I confirmed that the TCP upload takes roughly 30 seconds (reproducibly, through my slowish but stable Internet connection), and that the HTTP server sends a response immediately once it has consumed the request body. For all intents and purposes of this quick sanity check, the actual request upload duration therefore is ~30 seconds, and the actual TTFB is practically zero.
What did http-timer measure? In one attempt, the “finish” event on the http.ClientRequest fired after about 17 seconds, resulting in an alleged TTFB of about 13 seconds. Repetitions yielded 20 s / 10 s, 15 s / 15 s, and more samples in the same ballpark. That is, the method seems to significantly underestimate the request upload duration (it places the point in time when the HTTP request body has been written to the remote end too early), resulting in an overestimated TTFB (many seconds instead of ~zero seconds).
I suspected that the presence of some big buffer(s) architecturally skews the numbers. So I did this verification experiment with small TCP write buffers on my host operating system:
$ echo 'net.ipv4.tcp_wmem = 4096 16384 65536' >> /etc/sysctl.conf
$ echo 'net.core.wmem_max=65536' >> /etc/sysctl.conf
$ sysctl -p
$ cat /proc/sys/net/core/wmem_max
$ cat /proc/sys/net/ipv4/tcp_wmem
4096 16384 65536
From here I thought that most probably some kind of request buffering is going on within the got/NodeJS system, the internals of which are largely unknown to me.
At least, I believe that a big buffer between the client code and the host’s TCP stack would make the problem of “determining the point in time when the HTTP request body has been written to the remote end” kind of an ill-posed problem, explaining the discrepancy between the actual timings and the measured timings.
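The effect of such a buffer can be put into numbers with a deliberately simplified model. Everything here is an assumption for illustration: a single FIFO buffer that the client fills instantly, a link that drains it at a constant rate, and a “finish” that fires when the last byte has entered the buffer. `modelTimings` and `impliedBufferBytes` are hypothetical names, and the model says nothing about where such a buffer would live.

```javascript
// Simplified model, an assumption for illustration only: the client writes
// into one FIFO buffer that drains into the link at a constant rate, and
// "finish" fires when the last byte has entered that buffer.
function modelTimings({ bodyBytes, linkBytesPerSec, bufferBytes }) {
  const actualUploadSec = bodyBytes / linkBytesPerSec;
  // Bytes still sitting in the buffer at the moment "finish" would fire.
  const stillBuffered = Math.min(bodyBytes, bufferBytes);
  const apparentUploadSec = (bodyBytes - stillBuffered) / linkBytesPerSec;
  return {
    actualUploadSec,
    apparentUploadSec,
    // With a server that responds instantly, the remaining time looks like TTFB.
    apparentTtfbSec: actualUploadSec - apparentUploadSec,
  };
}

// Inverse question: how much buffered data would explain an apparent TTFB?
function impliedBufferBytes({ apparentTtfbSec, linkBytesPerSec }) {
  return apparentTtfbSec * linkBytesPerSec;
}

const linkBytesPerSec = 10e6 / 30; // ~10 MB in ~30 s, as in the experiment
console.log(modelTimings({ bodyBytes: 10e6, linkBytesPerSec, bufferBytes: 16384 }));
console.log(impliedBufferBytes({ apparentTtfbSec: 13, linkBytesPerSec }));
```

Under this model a 16 kB buffer accounts for only about 0.05 s of apparent TTFB, whereas an apparent TTFB of 13 s corresponds to roughly 4.3 MB of data that were accepted but not yet on the wire.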
In NodeJS, an http.ClientRequest is a Writable stream. It has the concept of a highWaterMark (which I understand to be a buffer capacity, roughly) which by default seems to be set to 16 kB, and it can be configured upon stream construction.
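A small sketch with a plain `stream.Writable` shows how the highWaterMark behaves in practice. The slow sink and the sizes are made up for illustration; the point is that the highWaterMark is the threshold at which `write()` starts returning `false`, not a hard cap on how much gets buffered.

```javascript
const { Writable } = require('stream');

// A deliberately slow sink with a small highWaterMark (1024 bytes).
const sink = new Writable({
  highWaterMark: 1024,
  write(chunk, encoding, callback) {
    setTimeout(callback, 10); // pretend the underlying resource is slow
  },
});

// write() returns false once the buffered amount reaches the highWaterMark,
// but the chunk is accepted and buffered anyway.
const accepted = sink.write(Buffer.alloc(4096));

const snapshot = {
  highWaterMark: sink.writableHighWaterMark,
  writeReturned: accepted,
  bufferedBytes: sink.writableLength,
};
console.log(snapshot);

// "finish" fires after end() once the sink has acknowledged all buffered data.
sink.on('finish', () => console.log('finish'));
sink.end();
```

A writer that ignores the `false` return value and does not wait for “drain” can therefore queue far more than the highWaterMark inside the stream.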
In the NodeJS standard library I have found that the highWaterMark is not explicitly set in the routine which largely implements http.ClientRequest construction. Is it set somewhere else? It does not seem to be set in the got code base either.
grep has revealed that _http_outgoing.js is a place worth looking:
_http_outgoing.js:65:const HIGH_WATER_MARK = getDefaultHighWaterMark();
All I found here is that this simply sets the default (16 kB). I fiddled with the code a bit to be really sure, and found that an http.ClientRequest object in the context of got indeed uses the default of 16 kB (16384 bytes) for
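The same values can also be read off a request object at runtime, without patching NodeJS. The following is a sketch under assumptions: it uses plain `http.request` rather than got, relies on the `writableHighWaterMark` and `writableLength` properties that NodeJS exposes on outgoing messages (available since v12.9), and `inspectRequestBuffering` is a hypothetical name. The reported default may differ between NodeJS versions.

```javascript
const http = require('http');

// Hypothetical helper: report what a fresh http.ClientRequest says about its
// buffering right after a single large write().
function inspectRequestBuffering() {
  return new Promise((resolve, reject) => {
    // Local server that responds once it has consumed the request body.
    const server = http.createServer((req, res) => {
      req.resume();
      req.on('end', () => res.end());
    });

    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const req = http.request({ host: '127.0.0.1', port, method: 'POST', agent: false });

      const observed = { highWaterMark: req.writableHighWaterMark };
      // Write 1 MiB in one go, without waiting for "drain".
      observed.writeReturned = req.write(Buffer.alloc(1024 * 1024));
      observed.bufferedBytes = req.writableLength;

      req.on('response', (res) => {
        res.resume();
        res.on('end', () => { server.close(); resolve(observed); });
      });
      req.on('error', (err) => { server.close(); reject(err); });
      req.end();
    });
  });
}

inspectRequestBuffering().then((observed) => console.log(observed));
```

If this runs as expected, `writeReturned` is `false` and `bufferedBytes` is at least the chunk size, which would mean the request object holds far more than the highWaterMark when the caller does not wait for “drain”. Whether that plays any role in the measurements above is not established here.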
From here I am pretty clueless. I am reasonably confident that my quick verification experiment has shown that there is a significant deviation between reality and what’s measured, but as of today, with limited knowledge about NodeJS inner workings, I cannot explain this deviation. Can you? Is the TCP stack of my Linux system tricking me? Is there some additional buffering going on in the inner workings of libuv (the event loop underlying NodeJS)? Let me know!
An interesting thing I noticed is that similar timing measurement work happens in the established HTTP client request. Its timing measurement code, however, does not even attempt to determine the point in time when the HTTP request body has been written to the remote end. The timing measurement code was introduced with request/pull/2452, and there is some lovely discussion about the details. This mildly suggests that the problem is indeed ill-posed in NodeJS (and I’d still love to understand why!).