Why is JSON so popular? Developers want out of the syntax business (mongolab.com)
169 points by william-shulman on June 23, 2011 | 132 comments


The post actually appears to endorse using eval to parse JSON. Not only does that allow invalid JSON through, it disallows some valid JSON, and of course is a huge security hole.

If you want to handle JSON data in JavaScript use JSON.parse -- it's the safest, fastest, and most correct path to having your data available to you.

[update: edited to remove bizarre use of whole vs hole... boggles]


Yes! I was shocked to see him recommend the use of a straight eval for parsing JSON.

json2.js is a must! JSON.parse and JSON.stringify are your friends.

https://github.com/douglascrockford/JSON-js/blob/master/json...


Keep in mind that json2.js' JSON.parse() ultimately uses eval() after a few sanity checks. Definitely better than a blind eval(), but not quite the panacea it's often purported to be.


Just pointing out to others (although the parent probably already knew): JSON.parse is native in all modern browsers; json2 is only needed if you support older browsers (IE7).


Holy crap, I did not know that.


To my knowledge all browsers now ship with JSON built in -- if you must work with older browsers (not an entirely unreasonable requirement) you should check for the JSON object first, and if it isn't present load json2.js. Otherwise you're just adding an additional load penalty to your site (and on mobile browsers that can be 100s of milliseconds)
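A minimal sketch of that check (the /js/json2.js path is hypothetical; host the file wherever you like):

    if (!window.JSON || typeof JSON.parse !== 'function') {
        // Native JSON is missing (e.g. IE7), so pull in json2.js.
        var script = document.createElement('script');
        script.src = '/js/json2.js';
        document.getElementsByTagName('head')[0].appendChild(script);
    }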


Json2.js does the check. However, it has to be loaded in order to do the check. Not a big issue if you are minifying and bundling all your js files before sending them to the browser client.


I gotta ask: that just sounds wrong to me. The fact that it used a built-in parser was supposed to have been a feature of JSON. Have we pedantricized that into a bad thing now too? What's the disadvantage of "allowing invalid JSON" in an application protocol you control? Likewise, what's the value of valid JSON (I honestly don't know what the example here is) that can't be parsed by a Javascript interpreter?

And where does the "huge security hole" come from? I certainly hope you're not saying that you trust requests generated by client code...


"And where does the "huge security hole" come from"

eval'ing a response from a third party essentially runs their code in your context. JSON.parse does not.


The moment you use eval to parse "JSON" data you _are_ trusting content from the client. eval _executes_ javascript, JSON just happens to be mostly compatible with JS object and array literal syntax so it "Just Works".

Because eval is executing the data it is using the full JS parser. That means that while '{"name":"bill"}' works as expected '{"name": window.location = "myevildownload.com"}' does too.

JSON.parse is built into the language. It enforces strict JSON conformance so you can't end up accidentally having invalid content that won't be parsed by other JSON libraries, and it does not execute data -- it creates the object graph and nothing else. If there's anything that is not valid JSON it fails and has no side effects. When constructing the object graph it uses the real Object and Array constructors, so nothing can be injected that way. When setting properties on objects it sets them directly and does not call setters.

If you use JSON.parse to parse your JSON data, it is not possible for an attacker to either run or inject code in your site.

And it's faster than eval.
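For illustration, a minimal sketch of the difference (the malicious payload is hypothetical):

    var payload = '{"name": alert("injected")}';

    eval('(' + payload + ')'); // runs alert(): attacker code executes in your page
    JSON.parse(payload);       // throws a SyntaxError; nothing executes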


> The moment you use eval to parse "JSON" data you _are_ trusting content from the client. eval _executes_ javascript, JSON just happens to be mostly compatible with JS object and array literal syntax so it "Just Works".

Not necessarily. This attack could easily be mitigated if the supposed JSON string is first parsed and validated on the server, and only then sent back to eval() in the browser.

So it is not inherently unsafe to use eval() on JSON.


Your server-side validation would have to be a full JSON parser. So in order to use eval, you're adding a full server-side parse of the data on each request, increasing server load and request latency (I've seen sites sending megs of JSON to the browser).

All so that you can save 6 characters of typing to load the JSON less efficiently on the client side.

Of course, because people _do_ do this, most engines these days preflight calls to eval to see if they can be parsed as a subset of pseudo-JSON. Note: this doesn't make it safe (any injected XSS is not valid JSON, so it will still be a hole), and these preparsers try to bail out quickly, so they handle only a minimal subset of JSON. In JavaScriptCore (so all WebKit browsers other than Chrome) you can't have escaped characters in string literals nor any non-ASCII characters anywhere.


JS is just not enough of a Lisp in that way. In Lisps you have the parser in your language (read), so calling eval on a string can't open this security hole.

(You can call eval on a string but it will just return that same string)


Cross site scripting. Evaling json enables a vector for javascript code injection and execution.


The best way I've encountered to construct XML:

    <person>
        <property>
            <name>first-name</name>
            <value>John</value>
        </property>
        <property>
            <name>last-name</name>
            <value>Smith</value>
        </property>
    </person>
I kid you not. That's what I'm dealing with at work right now. Thank you, enterprise SOAP solutions.


If that is the worst XML that you have experienced then you have been very very lucky. I've seen stuff that:

- Has been built by string manipulation and therefore isn't well formed and needs hacky preprocessing before being parsed (no, not HTML)

- Is full of redundant information (e.g. count attributes giving the number of child elements)

- Makes evil use of vast numbers of namespaces where the element names are all the same

- Is basically a container for delimited or fixed-format data

- Had attributes that contained entire encoded XML documents

<sob>

There are probably some other horrors that therapy and/or alcohol have let me forget (like systems doing SQL string compares on lumps of XML).

I really like JSON these days...


> - Had attributes that contained entire encoded XML documents

I've got that beat. I've dealt with XML that was basically a wrapper for JSON, which contained - you know where this is going - an XML string.

It's retarded elephants all the way down.


This is getting downright Yorkshiremenesque.


Had to look that one up before dispensing an upvote.


Notice the "encoded" - they were (if I recall correctly) base64 encoded XML documents...


Hm. I never heard of XJsonX, but it would be a trivial logical extension of JsonX (http://publib.boulder.ibm.com/infocenter/wsdatap/v3r8m1/inde...)


I've seen CDATA elements that contained complete XML documents including other CDATA elements. Good thing they had a hand rolled non-standard XML parser that allowed nested CDATA tags.

I noticed it when I wanted to read the doc in with tinyXML while working 70 hour weeks to fix up some other issue before a deadline. I ended up doing a memset(nestedXMLStart, ' ', nestedXMLLength) as a preprocessing step to bulldoze the whole construct. Not pretty, but it worked.


Two words: external references [1].

XML is a data representation that desperately wants to be Turing complete through syntax.

And then they call their schema definition language (because apparently defining XML schemas calls for its own specialized language) RELAX...

You can't make this stuff up.

[1] http://books.xmlschemata.org/relaxng/relax-CHP-10-SECT-1.htm...


I've worked with worse:

  <PERSON>
    <PERSON_HEADER_1>
      <PERSON_HEADER_1_NAME>
        <PERSON_HEADER_1_NAME_FIRST_NAME>John</PERSON_HEADER_1_NAME_FIRST_NAME>
        <PERSON_HEADER_1_NAME_LAST_NAME>Smith</PERSON_HEADER_1_NAME_LAST_NAME>
      </PERSON_HEADER_1_NAME>
      <PERSON_HEADER_1_ADDRESS>
        <PERSON_HEADER_1_ADDRESS_1>123 Some Street</PERSON_HEADER_1_ADDRESS_1>
        <PERSON_HEADER_1_ADDRESS_2>Blah</PERSON_HEADER_1_ADDRESS_2>
        <PERSON_HEADER_1_ADDRESS_3>Blah</PERSON_HEADER_1_ADDRESS_3>
        <PERSON_HEADER_1_ADDRESS_4>Blah</PERSON_HEADER_1_ADDRESS_4>
      </PERSON_HEADER_1_ADDRESS>
... and so on. I can't think of any advantage to (or excuse for) doing it this way. The only possible reason for it that I can think of, is rather comprehensive ignorance of how XML and related standards work.


It looks like it was written by someone who wanted to parse their XML using regexps instead of a real parser.


I can top this; I once worked with an API that returned XML with one node that contained a string of more XML.


The apache module that provides the gateway to Ecometry? That thing was a bear. I can't believe they did all the processing in the apache module rather than with a CGI.

It didn't actually use regular expressions, but parsed the input with a lot of char pointer manipulation.

The reason it worked like this was to have the "XML" tags map directly to the terminal input screens run on the backend, because it essentially input data as if a human was typing it into the terminal and navigating the forms.

I keep meaning to write this up as the worst example of XML abuse I've ever seen; I'm actually surprised someone else here has used the same thing (or even more surprised that more than one party has implemented the same braindead thing).


I have to work with MBs of this type of XML. It forces the use of special XML tools to even make it readable.


I'm a little embarrassed to admit this, but when I first read your post, I honestly thought that you were giving an example of how simple XML can be! After all, I can quickly glance at this bit of XML and understand what it is intended to represent. This doesn't look like SOAP-style xml to me at all, it looks like simple XML over HTTP.

It gets so much worse than this. Much of the generated XML I've seen (especially the stuff used for SOAP) is nearly unreadable by humans.


In fairness, I pared down the example for clarity (and to protect the guilty).


Oh, it wasn't meant as a criticism. I think my original misperception just strengthens your point. XML written as simply as possible is still more complicated than JSON.

That said, I feel about XML much the same way I feel about Java. They are typically more complicated and verbose than the alternatives, but I don't think you can pin the horrendous level of complexity on the languages themselves. There seems to be a stronger cultural disdain for complexity in the communities around other frameworks and languages, and JSON reflects this.

Everyone is (in theory) opposed to "unnecessary" complexity. The real issue is where you draw the line. For some reason, the python/django and ruby/rails communities seem more inclined to say "no" to complexity, even if it means giving something up. To turn it around, you have to get a hell of a lot of benefit to convince rails/django people to accept the overhead of additional complexity. I'm not sure why - maybe this is because these frameworks were created by people who were aghast at what they saw developing in the Java world?

This seems to be the case with JSON vs XML as well. XML isn't going to be as simple as JSON, but there's no reason you can't describe data in XML in a way that is concise, reasonably simple, and easy to read (both for people and computers).


> but there's no reason you can't describe data in XML in a way that is concise, reasonably simple, and easy to read (both for people and computers)

Yes, there is. Each item in XML requires 2·len(tag) + 5 extra characters, minimally: <name>asdf</name>. Unless you make everything an attribute (e.g. <x name="asdf"/>), in which case you only need 5 extra characters (<x />) per parent item. JSON is minimal: you need no extra characters (or, arguably, 2 extra characters for strings with no spaces)
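Counting it out for a four-character tag and a four-character value:

    <name>asdf</name>    17 characters: 4 payload + 8 for two copies of the tag + 5 markup
    "name":"asdf"        13 characters: 4 payload + 4 key + 4 quotes + 1 colon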

That's for people. It's even worse for machines, because they have to be able to parse <name>asdf</name> and <person name="asdf"/> and <person name="asdf">...</person> and then be able to verify it using a schema external to the XML in question.

What I don't understand is why people ever thought XML was a good idea. I took one look at it back in '98 when it came out (or whenever it was), said "this isn't human readable (unless maybe you like writing your HTML by hand)", and proceeded to successfully ignore it except when I have to use it.


> I feel about XML much the same way I feel about Java.

Heh. The SOAP WSDL I'm dealing with was generated by a Java application, and it has the Java cultural stamp all over it.


Heh, I'm embarrassed to admit that the protocol-buffer system I'm using at work looks kind of like that. Our most common message types contain a repeated field, each entry holding a name/value pair, instead of just containing a list of non-required fields.

I'm embarrassed because I was the lead designer of said system. In my defense, protobufs had just been open-sourced, and it was my first time using them. I thought that name/value pairs would give me greater flexibility. We're using a dynamic language, and I thought re-compiling the .proto files every time I added a new item would be a pain. It turns out that there's still a file somewhere in the system listing all the valid names, so we could just as easily have put the list of valid names into a .proto file, avoiding sending the names over the wire with every message, saving them in every log record, etc. Our build system is sufficiently well automated that recompiling the .protos is painless and almost unnoticeable.

Oh, well, maybe someday I'll get around to redoing our protocol. Fortunately, it's purely internal, and we don't yet store long-lived data in that format, so there's still hope.


xmlrpc[1] is similar.

    <?xml version="1.0"?>
    <methodCall>
       <methodName>examples.getStateName</methodName>
       <params>
          <param>
             <value><i4>41</i4></value>
          </param>
       </params>
    </methodCall>
My favorite part[2] is how <params> only contains <param> elements within. Each <param> has exactly one child. Why is it not just <params><i4>41</i4></params>? It's no surprise that the creator of XML-RPC was involved with SOAP.

[1] http://www.xmlrpc.com/spec [2] not really.


At least you get a <person> tag. I deal with a web service that returns a <dataset> <row> ... </row> <row> ... </row> </dataset>


You should check out the format iTunes uses for its music database. It's just as bad as this.


That makes me want to claw my eyes out, set fire to my keyboard and emigrate to China, where I will start a new life as a blind monk.

Most of the worst experiences of my career as a developer involve some enterprise SOAP API or another.


Are you really having to deal with the SOAP protocol yourself? I mean, there are libraries that handle all this for you. Just read in the WSDL, and call the methods as you need them. No need to muck around with XML.


Of course there are libraries that do all of this stuff - and they work nearly all the time. However, when they stop working you end up having to look at what is actually going over the wire (or WDSL if you are really unlucky) to try and work out what on earth is going on.

98% of the time SOAP works without too many problems - the trouble is that those other 2% of cases can be truly awful to debug because you have then got to wade through all of the crap that is in there to support the "easy to use" functionality.

SOAP should die.


Typo: should have been WSDL


Afraid not. The WSDL is so baroque, so deeply nested with objects inside objects inside objects, that no SOAP client is capable of parsing it.

To write requests, I've had to resort to building templates manually. To read responses, I've had to resort to walking the DOM tree to hunt-and-peck for the fields I need.


Would a JSON implementation be better then? I mean, it sounds like the design is just painful, and regardless of the implementation it would be horrible.

Or do you think JSON encourages simplicity enough to overcome these issues? That the person who created his interface would have created something cleaner?

I guess what I'm asking is, is it the API that sucks, or the implementation (or both)?

* I've never had issues with XML-RPC or SOAP implementations. I prefer JSON because I can use it from JavaScript easily. But having consumed SOAP and XML-RPC APIs (mostly with banks), I've never had problems that I'd blame on the implementation.


I can't fairly evaluate that without understanding why it was done that way. It looks to me like something that might have been dumped from a database table named "person" with two columns one called "name" the other called "value."

It might be a lazy way to dump the database but that's not XML's fault.


I've encountered that many times, with all the headaches involved. What is the reason for this?


I think the difference is that XML protocols are meant to be generated while JSON protocols are meant to be hand-written to some extent.

Otherwise, what's the difference with:

{'ret': 'pump me up'}

And

<object> <slot name="ret">pump me up</slot> </object>

If it's a computer generating it?


The difference is that in JSON there's pretty much only one way to do it, the way you have there. With XML there are many ways to do it (see examples on this thread). So when using XML, you need to write code adapted to the way it's 'encoded'.


Huh? That's not the motivation or intent of either format's design.


Replace "hand-written" and "machine generated" with "human consumption" and "machine consumption"


I think it is the other way around. XML is explicitly designed to be writable and editable by hand, which is also the reason for some of its syntactic redundancy.


If you think XML is writable, you are a bigger man than me.


You need to understand the initial use case for XML. It was invented for document-oriented markup languages like HTML, MathML, DocBook etc. You can definitely write XHTML by hand, and a JSON-based syntax for the same kind of documents (with mixed content and so on) would be a lot harder to read and write.


My understanding is that XML was derived from document-oriented SGML, to beat SGML into a form that would work well with XSL and XPath.

But I'd like to point out that the way SGML-derived markup distinguishes attributes and child nodes is entirely arbitrary. You could as easily make attributes child nodes - it's all in how you interpret what's written. Likewise, you can "convert" SGML to JSON (or YAML or S-expr or whatever) very easily, bearing in mind that attributes and child nodes sit in the same space with each other - a well-formed XHTML document, for example, can be re-expressed in JSON without ambiguity, since tags have a well-specified, unambiguous list of allowed children and attributes - just give text nodes the name "text" and you're golden.
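As an illustrative sketch (this particular mapping is my own, not any standard), the fragment <p class="intro">Hi <b>there</b></p> could become:

    {"tag": "p",
     "attrs": {"class": "intro"},
     "children": [{"text": "Hi "},
                  {"tag": "b", "children": [{"text": "there"}]}]}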


Apparently one doesn't need to be awake, sane or sober to become an enterprise "programmer".


Apparently one doesn't need to be awake, sane or sober to become a "programmer".

Fixed. I've seen plenty of crap programming everywhere, no need to pick on enterprise programmers.


Except that bad programmers in other domains will typically shrug and say "so what" when you point out the lack of code quality, while enterprise programmers will try to convince you that their twisted ways are actually better. IME, of course.


JSON is so popular because its data description abilities are poorer than XML's. Everything in JSON can be easily mapped to a few basic types in any popular language, and so all the JSON writers/readers do exactly that. The consequence is that it's very easy to work with JSON.

Contrast with XML, which has node attributes and children, to start with. This alone is sufficient to make it not as naturally representable in any of the languages (Python's ElementTree does come close), so you have to worry about the "syntax". And that's not even considering namespaces or schema.

XML's much richer way to represent data, and to reason about the meaning of the data, makes it a good format for cases where that is important. But for everything else, JSON wins.


JSON is an example of 'worse is better'.


You mean "less is more".



No, in my opinion it is an example of "good enough".


Simple is more useful.


XML is good as a markup language. Unfortunately many people use it as a data serialization language.


The problem with XML is that it seemed to encourage more complicated data structures. People sort of forgot that you still have to process all this data. And that, at the end of the day, a lot of it ends up needing to be represented in columns and rows.


No question that's a huge problem; that's the gist of this thread after all.

To my eye the problem with XML is that relational data feels like a hack and people specifying XML default to flattening their data.

My favorite example is a thing called ONIX (for representing bibliographic metadata) which is irredeemably broken, but in world-wide use, none the less.


The baby that has been thrown out with the bathwater here is a schema/data description layer.

NOT, I repeat NOT, I repeat NOT for verification but rather for tools, so that people working in strongly typed languages can interact with JSON services in a reasonable manner.

Unfortunately the one option I see, JSON Schema, appears to have caught the XML/Java bug, and has gotten very complicated.


That's because schemas are complicated, no matter which way you encode the data.

JSON is so nice and easy and simple until it needs all those things that made it into XML to get the job done.

I wholeheartedly agree; there is nothing sweeter than giving another vendor our WSDL with schema when they ask how to work with our system. It eliminates huge effort and ambiguity in system interaction.


Not sure if serious...

SOAP, WSDL and everything else XML and XML Schema related is overkill for simple interfaces. For complex interfaces it is just - well too complex.

If the time spent creating XML-based web services that kinda-but-not-really work (we're only talking interfaces here!) were instead spent documenting your JSON WS, you'd get a much better experience.

And this is the reason everybody has switched to JSON.


I am serious, document a JSON interface with what, English? Are you serious?

JSON has no way to describe, in a machine-readable way, what to expect; instead you have to read a document and then go hand-code the interface in your language of choice.

WSDL and SOAP are not complicated, and I have implemented my own SOAP stack. Then again, nowadays everyone thinks relational databases are complicated.

If you have tools that implement it for you then there is no excuse.


Well, I think there is a middle ground: non-recursive structs, lists, maps, and some judicious "primitives" and basically allowing only natural tree mappings would go a long way, IMO.


I couldn't agree more!

Out of frustration with existing schema languages and tool support for them, I've been working on Piqi[1] which is concise, expressive and agnostic to data representation formats. It works for JSON, XML and Protocol Buffers.

[1] http://piqi.org


I've been using Piqi on a side-project of mine and I must say, I'm horribly impressed (using piqi_rpc with erlang for a web service and javascript based client to retrieve/interact with the data). The only way it could be nicer is if there was the same sort of generator for javascript so I didn't end up having to check/enforce the format at all.

Didn't realize you hung out on HN, so many thanks and great job.


ismarc, thank you for the comment! I really appreciate it -- I didn't know anyone else was using Piqi-RPC (other than myself).

I'd like to know more about your use case and the need for JavaScript generation. I'm open to new ideas. Could you please contact me at alavrik@piqi.org Thanks!


That looks very cool. I'll look at it, and maybe steal a few ideas :). I've been working on a data language myself in my spare time, mostly concentrating on generative structures like conditionals and loops and a syntax-independent data model. But Piqi looks to be a lot more ambitious, not to mention mature.


Thanks! Feel free to reuse as many ideas as you like :) It took me more than 2 years to get it there -- it is by far the toughest system I've ever designed.


Use protocol buffers! As easy to use as JSON in dynamic languages, gives you strongly-typed accessors in static languages.


Yeah... no. I'm not a big code gen guy.

But the google people seem to love them, so I'm probably in the minority.


> Yeah... no. I'm not a big code gen guy.

Me either! Which is why I'm working on a high-performance Protocol Buffer library that does not do code generation: https://github.com/haberman/upb/wiki


Excellent readme for that project. It details several issues of which I wasn't aware.

Thanks for that.

(edit: It wasn't the readme, it was the wiki page. Still, excellent documentation.)


>[…] so that people working in strongly typed languages can interact with JSON services in a reasonable manner

You mean languages that lack algebraic data types? Because JSON encodes beautifully in ML and Haskell, and their type systems are quite static and quite strong.


JSON-schema, if you really need it

http://tools.ietf.org/html/draft-zyp-json-schema-03
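For a taste, a small schema in that draft's syntax might look like this (a sketch; draft-03 marks required fields with a per-property boolean):

    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "required": true},
            "age":  {"type": "integer", "minimum": 0}
        }
    }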


I haven't been able to find any solid, broad implementations of anything that brings schemas to JSON, either this IETF draft or something else. Only some half-assed implementations, not developed for years, each for a single programming language. Are there implementations of the spec you link to?


It's not JSON-schema, but Rx is a schema validation tool for JSON/YAML/etc. that works with many languages:

http://rx.codesimply.com/


Yes, this is one of the half-assed, half-dead efforts I was referring to. Last commit Jan 2010, and even that one touched only some files, previous commits are all from 2008. It only supports a few languages - no C++, no Java, no .Net. The website itself says on the first page "relying on any implementation in production would be a bad idea". Plus, Rx doesn't offer the same functionality XML Schema does, from what I can find (only does type checking, no element order/count, default values etc).


The mostly-just-one-way-to-organize-it approach cannot be overstated here. XPath is neat, but iterating through parser results is still a pain.


That's because XML does not map well to Simula-style OO.

"Iterating" XML in XSLT (a language designed to manipulate XML) is way less of a pain.


I don't disagree with this, but I think there's something wrong when you need another markup language to deal with the first markup language.


XSLT is XML; that's what's nice about it: you can reflect on it.

XML has no operators for manipulating or evaluating; it is a metalanguage. XSLT adds these on top of XML to process XML.


You've got to be kidding. I see the usefulness of XSLT, but still XSLT is a synonym for pain. Definitely one of the ugliest syntaxes I've ever seen.

It has a place, it can be useful, but I thoroughly detest it.


Not kidding. XSLT is functional; it's very different from imperative-style OO, it's not complex, and you can get a lot done with a few lines of code.

If you think it's painful and ugly, then don't dig any deeper in computer science than javascript.


JSON is popular because:

JSON can be consumed by JavaScript. This makes pushing JSON easy. I can create an API once in JSON. Now, someone can create a web app and make API calls with JavaScript.

That's the reason it's popular.

Another, smaller reason, is:

Many languages/libraries/frameworks provide poor XML support. If you're hand-coding your XML-RPC/SOAP calls, or dealing with your average XML in a way different than you would JSON, I really don't know what to say other than: why?

All the other reasons: verbosity, confusion over XML, etc. All those are okay, I guess, but aren't good reasons.

If JSON couldn't be parsed by JavaScript, it wouldn't be used.


XML was a solution to a problem that didn't exist.

Actually, no, I take that back. XML was a solution that was created for a problem that was devised to justify the work. It's insanity for the sake of itself.

JSON is simply a serialized object format for a popular programming language - that happened to fit the general case very well - yet it's also very much closer to S-expressions, which seem to be the most efficient way to model data in the general case.

It makes me wonder how people manage to convince themselves that a format like XML is somehow a "good idea."


One problem I have with JSON is that there is no canonical way to represent union types. For example, if you have the Haskell type

  data C = A { field1 :: T1, field2 :: T2 } | B { field3 :: T3 }
How do you represent this in JSON? The contents of the 'A' or 'B' constructor are easy:

  { field1: value1, field2: value2 }
But how do you indicate which constructor it is? I can come up with a couple less than ideal ways. Which one do others use?
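For what it's worth, the two conventions I've seen most often (neither is canonical): a tag field alongside the contents, or a wrapper object keyed by the constructor name:

    { "tag": "A", "field1": value1, "field2": value2 }

    { "A": { "field1": value1, "field2": value2 } }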


I still prefer binary IDL over json/bson/xml.

Keep the schema off the wire and out of storage. People tend to not get as crazy when they stick to lists.


just gimme sexps


Hard to believe we had the right answer in 1960 yet are still afraid of the parens as an industry.


(we (are (very (afraid)))


)


:)


More like (afraid (very (are (we))))

or clojurish: (-> we are very afraid)


Master Yoda called, he wants his cool back


(declare (very-afraid? we))


To be clear, the parent post is referring to the invention of LISP.

As a side note, for those who don't know, JavaScript shares a number of features with LISP/Scheme. JSON has the trappings of the very LISPy idea of "Data = Code". It doesn't quite live up to it though. Now, if we only had a parenthesized syntax for JavaScript...


You can write "lispy" code with CoffeeScript, if you're feeling looney:

    (afraid (very (are (we))))
is valid syntax.


You could do {"name": {"first": "John", "last": "Smith" } }. Even for JSON there are multiple ways you could do it. You could also include the person like XML did, but oh well...


With JSON, there's still a culture of doing things the simplest possible way. Sure, it won't eliminate all the variability, but the lighter syntax makes needless complexity more obvious.

For instance, if you don't need to separate the first name from the last name, {"name":"John Smith"} is obviously simplest. If you need to, and you have few other parameters besides the name, the OP's solution is obviously simplest. If there are lots, or if the name itself is complicated (first, last, mother's maiden name, pseudonym…), then your solution is obviously simplest.


I recall having an easy time de/serializing XML in C# ... you don't HAVE to get into the syntax business if you don't want to.

And I think Java also has something analogous in the JAXB library.


If you work with any data that's remotely complicated (i.e. practically any relational data), then you get into some pretty hefty messes with C#'s and JAXB's serializers. Unless you know XML schemas thoroughly, trying to map pointers in serialized XML is a mess.


The article mentions SAX parsers and then kinda just moves on and talks a bit about loading JSON with eval. I realize there are other ways of doing things and a fair bit between his two data processing points. My experience with JSON is mostly limited to simple APIs.

Is there a way to handle streaming JSON? I guess it'd be doable in a language like JavaScript where you build up the prototype. Others would probably vary substantially. But I can't say I've tried it yet.


Yes, there is support for stream parsing of JSON. I haven't tried it myself, https://github.com/lloyd/yajl is one example, which has various bindings in other languages.
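For a rough idea of the shape of this, here's a minimal sketch of one common workaround in JavaScript: newline-delimited JSON records handled incrementally. (This is not YAJL's API, which delivers true per-token callbacks; onChunk and handleRecord are hypothetical hooks you'd wire to your stream source.)

    var buffer = '';
    function onChunk(chunk) {
        buffer += chunk;                  // accumulate raw text as it arrives
        var lines = buffer.split('\n');
        buffer = lines.pop();             // keep any trailing partial record
        for (var i = 0; i < lines.length; i++) {
            if (lines[i]) handleRecord(JSON.parse(lines[i]));
        }
    }
    function handleRecord(obj) {
        // process one complete record at a time
    }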


Ahh, silly me. I never realized YAJL handled that. It should be a good place to learn from. The YAJL code base is pretty tight IIRC.


  <Product>
    <RecordReference>T0142</RecordReference>
    <NotificationType>03</NotificationType>
The horror.


    Map m = new HashMap();
    m.put("a", 1);
    m.put("b", 2);
    m.put("c", 3);
    .
    .
    .
The single worst thing in Java?


I think the current way to write that is to either use Google Guava's ImmutableMap.of("a", 1, "b", 2, "c", 3) or, when mutable maps are required, to write: Map m = new HashMap() {{ put("a", 1); put("b", 2); put("c", 3); }};

Both of these are a bit lighter yet still far away from JavaScript or Ruby special syntaxes.
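For contrast, the JavaScript literal being envied:

    var m = {a: 1, b: 2, c: 3};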


I blogged a detailed response here: http://bit.ly/lyCwCH

But the bottom line is IMO the reason JSON is so popular (and it is popular) has more to do with the ubiquity of JavaScript in browsers than anything in the format itself.


Ironically, PHP provides a way to access XML data almost as easily as JSON data: SimpleXML. Do other languages/frameworks really have nothing similar?


While SimpleXML gives you an easy way to walk the tree, it still requires you to actually do it. All 3 of those examples would require slightly different calls to SimpleXML to handle.


The moment I realized I could do json_decode(json_encode($simple_xml_object),true); to turn a complex SimpleXML object into an array was the happiest moment of my life.


Every day you learn a new thing. Thank you very much.


If that works, you're my hero.


What else do you do with a data structure besides walk it at some point? JSON comes out as an acyclic object graph in JavaScript; it still needs to be walked to actually do something with the data.

The difference between walking JSON and XML is simply the syntax of the parser library you are using.

The problem with XML is it has things that don't map directly to most OO languages: node order matters, attributes and elements are similar-but-different things, and there's mixed content.

So JSON is nice simply because it's close to the OO style everyone is used to today, and is in fact a subset of one of those languages.

IMO either way we need to settle on a way to interchange structured data, and agree on data types, but I would prefer some binary format; text-encoding all this data is such a waste of space/time.


Reiterating the point the blog post makes, but: It's very very easy to structure the same data in different ways in XML. While JSON has the same issue, it's a lot harder to do and you have less flexibility in making bad storage decisions.

You are absolutely right on walking the structure. JSON just tends to be easier (in my experience), even when working with XML libraries that make it less of a pain.


I'll give the opposing opinion that I find working with XML easier than JSON, but mostly because of QueryPath.


XSLT Parser, is that you?


The only problem with SimpleXML is that it needs to have the complete file in memory. This isn't a problem for small XML files, but reading a file of, let's say, 100MB will produce tons of objects and thus a large amount of used memory.


I'd argue that in any case where you'd have a large XML file and reach for SAX, you'd need to deal with JSON in the same manner. The problem is solved with XML (use SAX), but with JSON, what do I use?


"all you need to do is call eval on a JSON string to obtain a first-class Javascript object"

Need I say more?


LINQ to XML. Couldn't be easier!


var yesItCan = JSON.parse(json);


First thing that comes to mind is saving on bandwidth. Imagine sending the same amount of data down the line with less bandwidth.


Actually, early proponents of XML used to downplay this by claiming that it would be compressed on the wire anyway, and that it's well suited to high compression.


Why did XML happen at all? JSON looks like the obvious solution. :\



