
> isn't this actually mostly spot-on

Only if you accept the initial premise, which is nonsense. The article is based on the idea of HTML as an interchange format, something that's only the case in dysfunctional situations (scraping data illegally, or a horrific breakdown in communication/collaboration between two business entities).

Sure, there was for a time a big focus on XHTML as a "hybrid" format - interweaving Microformats/RDF & other machine-readable metadata into display documents to give them a dual purpose - but even then, the primary purpose was always human display, not machine-readability.
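To make the "dual purpose" idea concrete, here's a rough sketch of what reading that kind of embedded metadata might look like. Everything in it is an assumption for illustration: the page fragment, the names, and the `PropertyExtractor` / `extract_properties` helpers are made up, and it only handles flat, non-nested properties using microformats2-style `p-` class names. A real parser would need to do a lot more.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: ordinary display markup that also carries
# microformats2-style class names (h-card, p-name, p-org).
PAGE = """
<div class="h-card">
  <span class="p-name">Ada Example</span> works at
  <span class="p-org">Example Corp</span>.
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects the text of elements whose class starts with 'p-'.

    Assumes properties are flat (no nested tags inside a property element).
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # name of the property being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name.startswith("p-"):
                self._current = name[2:]
                self.properties[self._current] = ""

    def handle_endtag(self, tag):
        # Any closing tag ends the current property (flat markup only).
        self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data


def extract_properties(html):
    parser = PropertyExtractor()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    print(extract_properties(PAGE))
```

The same markup renders as a normal sentence for a human reader, and the class names are just along for the ride. That's the point: the metadata is layered on top of a display document, not the other way round.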

HTML isn't designed as, nor primarily intended to be, a machine interchange format. And espousing hyperbole such as "HTML is a strategic dead end", based on it not meeting a use-case for which it was never designed, is harmful.

> The main thrust in this article is that scraping HTML display markup is a terrible form of data interchange between systems.

This is a perfect summary. Would it be nice if more HTML websites were machine-readable? Sure. For me, a hacker, it would make life nicer. But should it be a prerequisite for business transactions & e-commerce to function? Absolutely not.



This article predates XHTML, and was almost certainly not informed by the then-current work on RDF/Dublin Core. At the time, terminal screen-scraping was ... not an uncommon method of data interchange between systems (hence the context of 3270 terminals); using BMS maps was an improvement over the naive approach, basically letting you hook into the screen formatter. The article's point is that these approaches were legacy baggage, and the attempt to "modernize" BMS maps by letting them output HTML in addition to green-screen was doomed to fail. Instead, Duquaine advocates where we landed, with SOAP (and successor) services making data available instead of forcing integrations to go through human-readable display functions. (It's probably worth noting that this is what he focused on at Sybase, specifically his work on their RPC gateway that hooked into legacy mainframe transports such as CICS.)


This article was written at a time when the idea of a semantic web was still bright and strong. Scrapable HTML websites would have been at the forefront of interchange ideas then.


He didn't realize that gardens need walls, and HTML is the perfect building block for walls.


> the initial premise, which is nonsense.

The initial premise is: "HTML is a strategic dead end _for business transactions and e-commerce_". That premise is absolutely spot-on.


> That premise is absolutely spot-on.

So... if you feel like it, we can split hairs and say that HTML is a dead end as a data interchange format, in the same way that orange juice is a dead end as motor fuel. That's not what's really being discussed here though.

The pertinent quote in the article is:

> HTML is ultimately as strategically dead as 3270 is. HTML suffers from the same ultimate fatal weaknesses that doomed 3270

The implication is that HTML is a dead end in general because it doesn't act as an interchange format, not that it's specifically & narrowly a dead end within that use-case.

As for "business transactions & ecommerce", that's a vaguer phrase. If you mean using HTML to exchange transaction data between business application APIs, then of course it's not appropriate. If you mean using HTML to provide human interfaces to ecommerce & business transactions, then that's a different debate (not at all touched on by this article).


> The article is based on the idea of HTML as an interchange format, something that's only the case in dysfunctional situations (scraping data illegally, or a horrific breakdown in communication/collaboration between two business entities).

Think again:

https://schema.org/docs/gs.html


> Think again

I mentioned microformats & RDF in my comment...



