NPM Registry HTTP Semantics

January 30, 2025 ∙ 4 minute read

Here at nice try, I’m working on an NPM proxy and internal registry. All is fine and dandy until you notice that the NPM Registry, although it returns ETag headers, simply ignores them on conditional requests. For instance, let’s try to get the latest manifest for, say, react, and then use If-None-Match to check whether our copy is still fresh.

First Request: No If-None-Match

GET /react HTTP/1.1
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

The response is expected:

HTTP/1.1 200 OK
Date: Thu, 30 Jan 2025 12:35:30 GMT
Content-Type: application/json
Content-Length: 5387821
Connection: close
CF-Ray: 90a18092cb3802e5-GRU
CF-Cache-Status: HIT
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 75
Cache-Control: public, max-age=300
ETag: "5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Server: cloudflare

Fancy stuff. Very nice. Even an ETag! So the expectation is that If-None-Match yields an HTTP 304, right? Let’s try it again:

GET /react HTTP/1.1
If-None-Match: "5611f329debbb41fda058497d0d4c7d8"
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

And…

HTTP/1.1 200 OK
Date: Thu, 30 Jan 2025 12:37:37 GMT
Content-Type: application/json
Content-Length: 5387821
Connection: close
CF-Ray: 90a183af0d59ae90-GRU
CF-Cache-Status: HIT
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 202
Cache-Control: public, max-age=300
ETag: "5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Server: cloudflare

OK? My If-None-Match is exactly the same ETag returned in the response, and yet we got a 200 with the full payload again.
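For anyone who wants to reproduce this, here is a minimal Python sketch of the two requests above, using only the standard library. The helper names (conditional_headers, fetch_manifest) and the User-Agent value are made up for illustration; nothing is sent until fetch_manifest is called.

```python
import http.client


def conditional_headers(etag=None, user_agent="example-proxy/0.1.0"):
    """Build request headers, adding If-None-Match only when we hold an ETag."""
    headers = {"User-Agent": user_agent, "Connection": "close"}
    if etag is not None:
        headers["If-None-Match"] = etag
    return headers


def fetch_manifest(name, etag=None, host="registry.npmjs.org"):
    """GET /<name> and return (status, etag, body).

    A server honouring If-None-Match would answer 304 with an empty body
    when the ETag still matches.
    """
    conn = http.client.HTTPSConnection(host, timeout=30)
    try:
        conn.request("GET", "/" + name, headers=conditional_headers(etag))
        resp = conn.getresponse()
        body = resp.read()
        return resp.status, resp.getheader("ETag"), body
    finally:
        conn.close()
```

Calling fetch_manifest("react") and then calling it again with the returned ETag should reproduce the pair of 200 responses shown above, for as long as the registry keeps behaving this way.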

Second Request: Should HEAD help us?

Okay, but WHAT IF we issued a HEAD in order to compare ETags? That would be enough to assert we have the same payload, right?

HEAD /react HTTP/1.1
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

HTTP/1.1 200 OK
Date: Thu, 30 Jan 2025 12:39:41 GMT
Content-Type: application/json
Connection: close
CF-Ray: 90a186b3687af23f-GRU
CF-Cache-Status: HIT
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 24
Cache-Control: public, max-age=300
ETag: W/"5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Server: cloudflare

Wha-What? A Weak ETag? How? Why?
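For comparison purposes the W/ prefix matters less than it looks: RFC 9110 defines a weak comparison (used by If-None-Match) that ignores the prefix and compares only the quoted opaque tag, and a strong comparison that additionally requires both sides to be strong. A small sketch of both, with helper names of my own choosing:

```python
def split_etag(etag):
    """Return (is_weak, opaque_tag) for a single ETag value."""
    etag = etag.strip()
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def weak_match(a, b):
    """Weak comparison: opaque tags are equal, W/ prefix is ignored."""
    return split_etag(a)[1] == split_etag(b)[1]


def strong_match(a, b):
    """Strong comparison: neither side is weak and opaque tags are equal."""
    weak_a, tag_a = split_etag(a)
    weak_b, tag_b = split_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b
```

With the values above, the ETag from the GET and the one from the HEAD match under weak comparison but not under strong comparison.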

How NPM CLI handles this

After some digging, it seems NPM’s CLI is quite… simple. It caches the response internally for 5 minutes. After that, it just downloads the whole thing again.
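That behaviour boils down to a plain time-to-live cache. The sketch below is not NPM's actual code, just an illustration of the idea; the class name, the 300-second default (borrowed from the max-age=300 in the responses above) and the injectable clock are all assumptions.

```python
import time


class TTLCache:
    """Serve a cached value until it is ttl_seconds old, then fetch it again."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key, fetch):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh, no request made
        value = fetch(key)  # expired or missing: download everything again
        self._entries[key] = (now, value)
        return value
```

Note that there is no revalidation step at all here: once the entry expires, the full payload is fetched again whether or not it changed.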

Workarounds

We could (which does not mean we should) read the response headers and then drop the connection. That would not stop the server from sending the payload on a GET request, though: the remote keeps shoving the body down our connection regardless, so all we would achieve is breaking HTTP semantics.

Another option would be to perform the same GET, but with a Range header so we don’t download the whole body, as the server seems to support it. Let’s try:

Leveraging Range Header

The server seems to support byte ranges, as announced by the Accept-Ranges: bytes header. So let’s ask for a single byte. We will waste a single byte, but at least it’s not the 30-50MB some manifests weigh. I’ll keep the If-None-Match just for good measure.

GET /react HTTP/1.1
If-None-Match: "5611f329debbb41fda058497d0d4c7d8"
Range: bytes=0-0
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

HTTP/1.1 206 Partial Content
Date: Thu, 30 Jan 2025 12:54:36 GMT
Content-Type: application/json
Content-Length: 1
Connection: close
Content-Range: bytes 0-0/5387821
CF-Ray: 90a19c9078a51b24-GRU
CF-Cache-Status: HIT
Access-Control-Allow-Origin: *
Age: 11
Cache-Control: public, max-age=300
ETag: "5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Set-Cookie: _cfuvid=hY7zxXExF2P4mja1cX6ynhOfcoAuJzxR42SmGTiLJvY-1738241676925-0.0.1.1-604800000; path=/; domain=.npmjs.org; HttpOnly; Secure; SameSite=None
Server: cloudflare

Success! We got a single byte and a 206 status. Is that good enough? I honestly don’t think so, considering that the server returns an ETag. But if that’s the only alternative, I’ll happily stick with it, although it feels like a really, really hacky solution.
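Here is a rough Python sketch of how a proxy could use that one-byte probe as a freshness check: compare the ETag and the total size from Content-Range against what is already cached. All function names are hypothetical, and treating "same ETag and same size" as "unchanged" is my own heuristic, not something the registry documents.

```python
import http.client
import re

_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def parse_content_range(value):
    """Parse 'bytes 0-0/5387821' into (first, last, total); total is None for '*'."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)


def opaque_tag(etag):
    """Drop an optional W/ prefix so weak and strong ETags compare equal."""
    return etag[2:] if etag.startswith("W/") else etag


def looks_unchanged(cached_etag, cached_size, etag, content_range):
    """True when the probe's ETag and total size both match the cached copy."""
    if etag is None or content_range is None:
        return False
    _, _, total = parse_content_range(content_range)
    return opaque_tag(etag) == opaque_tag(cached_etag) and total == cached_size


def probe(name, host="registry.npmjs.org"):
    """GET with Range: bytes=0-0; return (status, etag, content_range)."""
    conn = http.client.HTTPSConnection(host, timeout=30)
    try:
        headers = {"Range": "bytes=0-0", "Connection": "close"}
        conn.request("GET", "/" + name, headers=headers)
        resp = conn.getresponse()
        if resp.status == 206:
            resp.read()  # a single byte; on a 200 we skip the full body
        return resp.status, resp.getheader("ETag"), resp.getheader("Content-Range")
    finally:
        conn.close()
```

A caller would run probe("react"), make sure the status is 206, and only re-download the manifest when looks_unchanged returns False.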

Contacting NPM Support

I raised a ticket with NPM support to understand what is going on. In the meantime, I haven’t found a better alternative than the Range trick.

Will update this once I get a response from the fine folks at NPM.

One month later, I got the following reply from NPM Support:

So sorry for the oversight and delay addressing your concerns.

As previously mentioned, with the behavior you're seeing with the If-None-Match
header for registry.npmjs.org, it’s not expected for it to always send the full
payload when the ETag matches. Normally, if the If-None-Match header matches the
ETag returned by the registry, the server should respond with a 304 Not Modified
status, not the full payload.

It might be worth checking if there's anything else in the request (such as
additional headers or network issues) affecting the behavior.

As we’ve reached the maximum allowable time for keeping this ticket open, we
will be closing it. However, the issue is documented and escalated it to our
engineering team for further resolution.

We will continue working on this and encourage you to check periodically for any
updates on the status. I'll reach out with any updates.

We truly appreciate understanding throughout this process.

So… Not fixed, and perhaps won’t be. Bummer.