The Accept Header

davisp · on Dec 19, 2011

As in all things web, this is related to user agents (specifically IE) having a horrendous misunderstanding of what the Accept header is for. If you google "Accept header IE" you'll see a long list of hits about its particular brand of gnarly.

An example IE Accept header might look like this:

  image/jpeg, application/x-ms-application, image/gif,
  application/xaml+xml, image/pjpeg, application/x-ms-xbap,  
  application/msword, application/vnd.ms-excel,  
  application/x-shockwave-flash, */*

Notice the obviously missing entries like text/html, text/plain, etc. By default, IE is saying "I prefer all these other things to HTML."

The second follow-on here is that if a server provides two content types for the same resource, and an Accept header matches both with equal priority, then the server gets to decide which representation to return. The first example the author gives is technically correct in this regard. His probing shows that Netflix is being a bit naughty, but not without just cause.

Basically, they're making the assumption "Anyone that asks for JSON really wants it," which does violate the spec, but given the long history of user agents that have broken this spec, I can only say that it's not a question of whether the spec is broken, but how it's broken.

For instance, the spec mentions in passing that given an Accept header of:

  text/html, text/*

That text/html should be preferred to text/plain even though both would end up with a quality of 1.0. Although there's no actual algorithm defined for dealing with this situation. The entire decision tree is left implicit based on the examples provided.

Anyway, getting in a huff that your browser extension breaks a website that tries to cater to developers while providing reasonable content to people that still use IE is a bit awkward. Especially if you've written a book on HTTP and haven't realized that maybe there's a certain amount of spec breakage to deal with weirdo user agents.

shiflett · on Dec 19, 2011

A few corrections.

IE is not preferring anything in your example. The default quality is 1.0, and since no quality is indicated for any type, they all share the highest preference.
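To make the default-quality point concrete, here is a minimal sketch of a parser that applies the "no q means 1.0" rule to the header quoted above. The function name `parse_accept` and the constant `IE_ACCEPT` are made up for illustration, and the parsing is deliberately simplistic (no quoted-string handling), so treat it as a rough demonstration rather than a complete implementation.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs; q defaults to 1.0."""
    entries = []
    for item in header.split(","):
        pieces = [p.strip() for p in item.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue  # skip empty items such as a trailing comma
        q = 1.0  # no q parameter means full preference
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        entries.append((media_range, q))
    return entries


# The IE header from the parent comment, joined onto one logical line.
IE_ACCEPT = (
    "image/jpeg, application/x-ms-application, image/gif, "
    "application/xaml+xml, image/pjpeg, application/x-ms-xbap, "
    "application/msword, application/vnd.ms-excel, "
    "application/x-shockwave-flash, */*"
)

for media_range, q in parse_accept(IE_ACCEPT):
    print(f"{media_range:35} q={q}")
```

Every entry, including the trailing `*/*`, comes out with the same q, so the header expresses no preference among them.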

Although IE's Accept header is technically useless, it's more likely to be a response to developers incorrectly parsing Accept than the other way around. This post is about sites using primitive string matching to parse Accept, and it's not unreasonable to imagine that this problem is not new.

You also claim that the spec leaves the resolution between text/html and text/* ambiguous, but it clearly states why text/html has precedence:

"If more than one media range applies to a given type, the most specific reference has precedence."

Lastly, I only get in a huff when people use illogic, especially ad hominems, to prop up a poor argument. In fact, I've written about this, too:

http://shiflett.org/blog/2007/sep/logic

I wrote a book on HTTP precisely because the spec is not always right, but that doesn't excuse everything.

masklinn · on Dec 19, 2011

> That text/html should be preferred to text/plain even though both would end up with a quality of 1.0. Although there's no actual algorithm defined for dealing with this situation. The entire decision tree is left implicit based on the examples provided.

It does a bit more than show an example in passing; it specifies the following:

> Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence.

And indicates how precedence works: accept extensions > "full" mime > type/wildcard > wildcard.

The only situation left undefined is a type/wildcard with an extension.

This tie-breaker happens after quality-based precedence resolution.
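One way to read the "most specific reference has precedence" rule is sketched below: for a given media type, find the most specific range that matches it and use that range's q. The helper names `parse_range` and `quality_for` are invented for this sketch, and it makes a simplifying assumption by treating every non-q parameter as part of the media range. The sample header is the worked example from RFC 2616 section 14.1.

```python
def parse_range(item):
    """Parse one Accept entry into (type, subtype, params, q)."""
    pieces = [p.strip() for p in item.split(";")]
    mtype, _, subtype = pieces[0].lower().partition("/")
    params, q = {}, 1.0
    for piece in pieces[1:]:
        name, _, value = piece.partition("=")
        name, value = name.strip().lower(), value.strip()
        if name == "q":
            q = float(value)
        else:
            params[name] = value  # assumption: all other params belong to the range
    return mtype, subtype, params, q


def quality_for(media_type, accept):
    """Return the q of the most specific range in `accept` matching media_type."""
    ctype, csub, cparams, _ = parse_range(media_type)
    best_specificity, best_q = -1, 0.0  # no matching range means q = 0
    for item in accept.split(","):
        if not item.strip():
            continue
        rtype, rsub, rparams, q = parse_range(item)
        if rtype == "*" and rsub == "*":
            specificity = 0  # */*
        elif rtype == ctype and rsub == "*":
            specificity = 1  # type/*
        elif rtype == ctype and rsub == csub:
            # Every parameter on the range must be present on the candidate.
            if any(cparams.get(k) != v for k, v in rparams.items()):
                continue
            specificity = 2 + len(rparams)  # type/subtype, plus parameters
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


RFC_EXAMPLE = ("text/*;q=0.3, text/html;q=0.7, text/html;level=1, "
               "text/html;level=2;q=0.4, */*;q=0.5")

for candidate in ("text/html;level=1", "text/html", "text/plain",
                  "image/jpeg", "text/html;level=2", "text/html;level=3"):
    print(candidate, quality_for(candidate, RFC_EXAMPLE))
```

With `text/html, text/*`, both text/html and text/plain still resolve to q=1.0 under this reading; the rule only settles which range's q applies to each type.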

(full agreement on Accept headers of browsers, they're completely bonkers)

orthecreedence · on Dec 18, 2011

If I'm making an API that speaks only JSON, I'm not going to bother with Accept. There is the case where certain resources have a different media type (eg "application/vnd.mycompany.user+json") which is something I'd probably just parse out as JSON. That's not to say I'm not looking at "Accept," but I really only care if there's JSON somewhere in there or not. Also, forcing clients to pick the correct media type for the Accept header, while it may be "proper" REST, seems a bit cruel for the average developer.

I don't assume building a REST API means that clients will be REST clients. Some people just can't wrap their heads around the concept, and I try to build for both.

alexchamberlain · on Dec 19, 2011

Is it really cruel to expect a developer to understand the technology they are developing?

orthecreedence · on Dec 19, 2011

Yes. Building an effective REST client is actually really hard, and if I want widespread use of my API, I'm not going to be handing out REST manifestos and expecting everyone to read all 800 pages and retain everything. Otherwise, what's stopping developers from creating on an "easier" platform that uses GET site.com?action=delete&user_id=3 syntax? I can't assume anyone, not even devs, is smart enough to use an elegantly programmed system, so I provide the ability for RESTafarians OR infant PHP developers to use it.

masklinn · on Dec 18, 2011

Interestingly, for the first version of Accept

> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/json

it's perfectly valid to return `application/json` (even if a `text/html` representation is available) since, as far as I can tell from the RFC, the default quality is always 1 and ordering is irrelevant, so the relative ordering between `text/html`, `application/xhtml+xml` and `application/json` is left to the server.

Of course further tests show that the server code is in fact broken, but still...

(and don't use the PHP code in the comment, it won't work for arbitrary accept extensions and does not return those extensions to the caller)

sounds · on Dec 19, 2011

The question asked in the article is: why does the presence of the text "json" in the Accept header change the output unconditionally? If a more-preferred mime type is available, why does the json always dominate?
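To illustrate the symptom (this is a hypothetical sketch, not the site's actual code), compare a substring check with a q-aware pick. The names `wants_json_naive`, `parse_q`, `pick` and the sample header are all invented here, and `pick` only handles exact media type matches, ignoring wildcards for brevity.

```python
def wants_json_naive(accept):
    # Hypothetical primitive check: any mention of "json" wins.
    return "json" in accept.lower()


def parse_q(accept):
    """Map each listed media range to its q value (default 1.0)."""
    qualities = {}
    for item in accept.split(","):
        pieces = [p.strip() for p in item.split(";")]
        if not pieces[0]:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        qualities[pieces[0].lower()] = q
    return qualities


def pick(accept, available):
    """Pick the available type with the highest q (exact matches only).

    On a tie, the first type in `available` wins, i.e. the server decides.
    """
    qualities = parse_q(accept)
    return max(available, key=lambda t: qualities.get(t, 0.0))


HEADER = "text/html, application/json;q=0.1"  # JSON is clearly less preferred
AVAILABLE = ["text/html", "application/json"]

print(wants_json_naive(HEADER))  # the substring check still says "serve JSON"
print(pick(HEADER, AVAILABLE))   # the q-aware pick chooses the HTML representation
```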

Although I don't want to dig up the details now, it's an EJB bug. http://en.wikipedia.org/wiki/Enterprise_JavaBean