
NPM Registry HTTP Semantics

January 30, 2025 ∙ 3 minute read

Here at nice try, I’m working on an NPM proxy and internal registry. All is fine and dandy until you notice that the NPM Registry, although it returns ETag headers, simply ignores them. For instance, let’s try to get the manifest for, say, react, and then use If-None-Match to check whether our copy is still fresh.

First Request: No If-None-Match

GET /react HTTP/1.1
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

The response is expected:

HTTP/1.1 200 OK
Date: Thu, 30 Jan 2025 12:35:30 GMT
Content-Type: application/json
Content-Length: 5387821
Connection: close
CF-Ray: 90a18092cb3802e5-GRU
CF-Cache-Status: HIT
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 75
Cache-Control: public, max-age=300
ETag: "5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Server: cloudflare

Fancy stuff. Very nice. Even an ETag! So the expectation is that If-None-Match yields an HTTP 304, right? Let’s try it again:

GET /react HTTP/1.1
If-None-Match: "5611f329debbb41fda058497d0d4c7d8"
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

And…

HTTP/1.1 200 OK
Date: Thu, 30 Jan 2025 12:37:37 GMT
Content-Type: application/json
Content-Length: 5387821
Connection: close
CF-Ray: 90a183af0d59ae90-GRU
CF-Cache-Status: HIT
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 202
Cache-Control: public, max-age=300
ETag: "5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Server: cloudflare

OK? My If-None-Match is exactly the same ETag returned in the response, and yet I got a 200 with the full payload instead of a 304.
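For reference, the exchange above can be reproduced with a short Python sketch. It assumes a connection object with the interface of the standard library's http.client.HTTPSConnection; the function name fetch_if_changed and the User-Agent string are made up for illustration. Against a server that honors If-None-Match this would return a 304 and no body; per the responses above, the registry answered 200 with the full payload.

```python
from typing import Optional, Tuple


def fetch_if_changed(conn, path: str, etag: Optional[str]) -> Tuple[int, Optional[bytes], Optional[str]]:
    """Issue a conditional GET and return (status, body, etag).

    `conn` is assumed to behave like http.client.HTTPSConnection.
    The body is None when the server answers 304 Not Modified.
    """
    headers = {"User-Agent": "example-proxy/0.1.0"}  # hypothetical agent string
    if etag:
        # Ask the server to skip the body if our copy is still current.
        headers["If-None-Match"] = etag
    conn.request("GET", path, headers=headers)
    resp = conn.getresponse()
    body = resp.read()  # always drain the response, even when it is empty
    if resp.status == 304:
        return 304, None, etag
    return resp.status, body, resp.getheader("ETag")
```

Usage would look like fetch_if_changed(http.client.HTTPSConnection("registry.npmjs.org"), "/react", cached_etag), where cached_etag is whatever ETag was stored alongside the cached manifest.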

Second Request: Should HEAD help us?

Okay, but WHAT IF we issued a HEAD in order to compare ETags? That would be enough to assert we have the same payload, right?

HEAD /react HTTP/1.1
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

HTTP/1.1 200 OK
Date: Thu, 30 Jan 2025 12:39:41 GMT
Content-Type: application/json
Connection: close
CF-Ray: 90a186b3687af23f-GRU
CF-Cache-Status: HIT
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 24
Cache-Control: public, max-age=300
ETag: W/"5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Server: cloudflare

Wha-What? A Weak ETag? How? Why?
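The W/ prefix should not matter for revalidation, though: RFC 9110 says If-None-Match uses the weak comparison, which ignores the prefix and compares only the quoted opaque tag. Here is a minimal sketch of both comparison functions; the helper names are illustrative, not taken from any library.

```python
from typing import Tuple


def split_etag(etag: str) -> Tuple[bool, str]:
    """Return (is_weak, opaque_tag) for a single entity tag."""
    etag = etag.strip()
    if etag.startswith("W/"):  # the weak indicator is case-sensitive
        return True, etag[2:]
    return False, etag


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque tags are equal, weakness is ignored."""
    return split_etag(a)[1] == split_etag(b)[1]


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag is weak and opaque tags are equal."""
    weak_a, tag_a = split_etag(a)
    weak_b, tag_b = split_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b
```

With these, the ETag from the HEAD response and the one from the GET response match under the weak comparison but not under the strong one, so a proxy comparing them itself should use weak_match.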

How NPM CLI handles this

After some digging, it seems NPM’s CLI is quite… simple. It internally caches the contents for 5 minutes. After that, it just downloads them again.
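Those 5 minutes line up with the Cache-Control: public, max-age=300 header in the responses above. Whether the CLI actually derives its window from that header is an assumption on my part, but a proxy can do the same arithmetic. The sketch below computes how many seconds a response stays fresh from Cache-Control and Age; the function names are hypothetical.

```python
import re
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract the max-age directive in seconds, or None if absent."""
    m = re.search(r"(?:^|,)\s*max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(m.group(1)) if m else None


def remaining_freshness(cache_control: str, age: int) -> Optional[int]:
    """Seconds left before the response goes stale (never negative).

    `age` is the value of the Age header, i.e. how long the response
    has already been sitting in an upstream cache.
    """
    max_age = parse_max_age(cache_control)
    if max_age is None:
        return None
    return max(max_age - age, 0)
```

For the first response above (max-age=300, Age: 75) this leaves 225 seconds before the cached manifest has to be fetched again.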

Workarounds

We could (which does not mean we should) read the response headers and then drop the connection. That does not spare us the payload on a GET request, though: the remote will be shoving the body down our connection regardless, and all we achieve is breaking HTTP semantics.

Another option would be to perform the same GET, but use Range to skip nearly all of the body, as the server seems to support it. Let’s try:

Leveraging Range Header

The server seems to support byte ranges, as announced by the Accept-Ranges: bytes header. So let’s ask for a single byte. We will waste a single byte, but at least it’s not the 30-50MB some manifests weigh. I’ll keep the If-None-Match for good measure.

GET /react HTTP/1.1
If-None-Match: "5611f329debbb41fda058497d0d4c7d8"
Range: bytes=0-0
Host: registry.npmjs.org
Connection: close
User-Agent: [REDACTED]/0.1.0

HTTP/1.1 206 Partial Content
Date: Thu, 30 Jan 2025 12:54:36 GMT
Content-Type: application/json
Content-Length: 1
Connection: close
Content-Range: bytes 0-0/5387821
CF-Ray: 90a19c9078a51b24-GRU
CF-Cache-Status: HIT
Access-Control-Allow-Origin: *
Age: 11
Cache-Control: public, max-age=300
ETag: "5611f329debbb41fda058497d0d4c7d8"
Last-Modified: Wed, 29 Jan 2025 16:20:20 GMT
Vary: accept-encoding, accept
Set-Cookie: _cfuvid=hY7zxXExF2P4mja1cX6ynhOfcoAuJzxR42SmGTiLJvY-1738241676925-0.0.1.1-604800000; path=/; domain=.npmjs.org; HttpOnly; Secure; SameSite=None
Server: cloudflare

{

Success! We got a single byte and a 206 status response. Is that good enough? I honestly don’t think so: a server that returns an ETag should honor If-None-Match. But if that’s the only alternative, I’ll happily stick with it, although it feels like a really, really hacky solution.
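To turn the 206 response into a freshness check, a proxy could compare the ETag and the total size from Content-Range against what it has cached. This is only a sketch under my own assumptions: the names are hypothetical, and treating a matching ETag plus a matching size as "unchanged" is a heuristic, not something the registry documents.

```python
import re
from typing import NamedTuple, Optional


class ContentRange(NamedTuple):
    first: int
    last: int
    total: Optional[int]  # None when the server sends "*"


def parse_content_range(value: str) -> ContentRange:
    """Parse a header value such as 'bytes 0-0/5387821'."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    total = None if m.group(3) == "*" else int(m.group(3))
    return ContentRange(int(m.group(1)), int(m.group(2)), total)


def opaque_tag(etag: str) -> str:
    """Drop the weak indicator so strong and weak tags compare equal."""
    etag = etag.strip()
    return etag[2:] if etag.startswith("W/") else etag


def probably_unchanged(cached_etag: str, cached_size: int,
                       etag: str, content_range: str) -> bool:
    """True when both the ETag and the total size match the cached copy."""
    total = parse_content_range(content_range).total
    if total is None:
        return False  # size unknown, so we cannot confirm anything
    return opaque_tag(cached_etag) == opaque_tag(etag) and total == cached_size
```

When probably_unchanged returns False, the proxy falls back to a full GET; when it returns True, the single wasted byte is all the check cost.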

Contacting NPM Support

I raised a ticket with NPM support to understand what is going on, but so far, I haven’t found a suitable alternative.

Will update this once I get a response from the fine folks at NPM.