JSON as a replacement for HTML

Bharatvaj Hemanth

September 1, 2022

Note: Before you read, I no longer believe JSON to be the pinnacle of data exchange formats. I have written a spec for gt which you may be interested in.

Recently I had to rely on a JSON parser for a C application that does some OAuth stuff. When writing in C, there are two things one should look out for: performance and simplicity. After sifting through a plethora of JSON libraries I finally found jsmn. It was beautiful. When I was checking its code, all I could see was a glorified bracket-matching algorithm written in pure C99 without any dependencies whatsoever. And it's FAST!
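To show what I mean by "glorified bracket matching", here is a minimal sketch in C. It is not jsmn's code or its API; the function name brackets_balanced and the fixed nesting limit of 64 are assumptions I made up for illustration. It only checks that braces and brackets outside of strings pair up, which is the skeleton a small JSON tokenizer hangs everything else on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of jsmn: returns true when every '{' and '['
 * outside a string literal is closed by the matching '}' or ']'. */
bool brackets_balanced(const char *json)
{
    char stack[64];            /* assumed maximum nesting depth */
    size_t depth = 0;
    bool in_string = false;

    for (const char *p = json; *p != '\0'; p++) {
        char c = *p;

        if (in_string) {
            if (c == '\\' && p[1] != '\0')
                p++;           /* skip the escaped character */
            else if (c == '"')
                in_string = false;
            continue;
        }

        switch (c) {
        case '"':
            in_string = true;
            break;
        case '{':
        case '[':
            if (depth == sizeof stack)
                return false;  /* nested deeper than this sketch supports */
            stack[depth++] = c;
            break;
        case '}':
        case ']': {
            if (depth == 0)
                return false;  /* closer without an opener */
            char open = stack[--depth];
            if ((c == '}' && open != '{') || (c == ']' && open != '['))
                return false;  /* mismatched pair */
            break;
        }
        default:
            break;
        }
    }
    return depth == 0 && !in_string;
}
```

A real parser also has to recognise strings, numbers and literals as tokens, but the control flow stays about this small.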

What XML cost us

This made me question why we still use XML in many places. And I still don't know the answer. XML is many things, but what it is not is easy to parse. And what's not easy to parse is slow.

To find the matching tag for, say, <gradleDependency>, we basically have to compare the name inside </gradleDependency> against it. That is 16 unnecessary character comparisons for no good reason. This might not seem like a big deal, but imagine having to parse thousands of tags. It definitely adds up. It certainly did bite us when we were writing a design tool whose renderer was based on XML/DOM.
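The difference can be sketched in C like this. Both function names are hypothetical and the XML side is heavily simplified (no whitespace, namespaces or attributes); it is only meant to show where the per-character work comes from.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* XML: the closing tag repeats the element name, so the name has to be
 * compared character by character against the one from the opening tag. */
bool xml_close_matches(const char *open_name, const char *close_tag)
{
    size_t n = strlen(open_name);
    return close_tag[0] == '<' && close_tag[1] == '/'
        && strncmp(close_tag + 2, open_name, n) == 0
        && close_tag[2 + n] == '>';
}

/* JSON: the closer is a single byte that is determined by the opener. */
bool json_close_matches(char open, char close)
{
    return (open == '{' && close == '}') || (open == '[' && close == ']');
}
```

For <gradleDependency> the XML check walks all 16 characters of the name before it can accept the closing tag, while the JSON check is one comparison no matter how the key is named.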

At least XML is somewhat strict about its syntax. Its close neighbour HTML is even worse.

Here is some HTML that renders perfectly in a browser:

<body>
    <img src="pepe.png">
    <br>
    This will work without any tags
    <p>This will work too! It's free real estate. Yoohoo!
    <strong>HTML go brrr</strong>
</body>

Do you see it? Why is <br> even working? Why is <p> still standing without a closing tag?! WHY? Who thought mixing semantics and syntax was a good idea? How do I even fit this into my brain? If I'm a parser reading HTML byte by byte and I have to check whether the next byte is valid, I have to carry around a sheer mountain of syntactic information.
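One small piece of that information, sketched in C: an HTML parser cannot tell from the syntax alone that <br> and <img> never get a closing tag, so it has to carry a table of element names. The list below is the set of void elements as I understand it from the HTML Living Standard; is_void_element is a hypothetical name, and the sketch assumes the tag name has already been lowercased.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the (lowercase) tag name is an HTML void
 * element, i.e. one that must not have a closing tag. */
bool is_void_element(const char *name)
{
    static const char *const void_elements[] = {
        "area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr",
    };
    size_t count = sizeof void_elements / sizeof void_elements[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, void_elements[i]) == 0)
            return true;
    }
    return false;
}
```

And that is only the void elements. The unclosed <p> in the example needs yet another set of rules about which tags may omit their end tag and what implicitly closes them.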

JSON as a DOM frontend

It gets less weird the more you think about it.

A possible variation of the DOM in JSON:

{
    "body": [
        { "img": { "src": "image.png" } },
        "br"
    ]
}

To be honest, JSON, even though it's better than XML, sucks readability-wise when we compare it with alternatives like JSON5 and edn. JSON is not something I would call an ideal data format.

So then why JSON? Because JSON is freakishly simple, and FAST! This is basically the spec. A 30-minute read and you are armed with the knowledge to create yet another JSON parser. Compare this with XML's spec.

To summarize,

"JSON is not the data format we deserve but it's the one we need."

When we have JSON for both the renderer and data exchange, it makes most things on the web simple and efficient. Easy to parse, easy to validate, easy to write and easy to read.