Can You Spot The Duplications?
<foo> <bar>baz</bar> </foo>
Now the quiz:
- Is this a good thing?
- Why was it allowed in the first place?
- Are the reasons still valid today?
Details at 11:00.
Re: Can You Spot The Duplications?
1. The 'information' is not duplicated. Duplicated information would be the bigger problem, because it introduces complexity and bugs into a system. The 'syntax' is indeed verbose and repetitive.
2. Because XML was based on SGML, most likely to leverage existing tools and parsers.
3. Yes, because (almost) every mainstream language, operating system, text editor, and IDE supports this syntax. Using anything else forces you back into "writing your own parser" mode, diverting attention away from solving your customers' problems.
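To illustrate point 3, here is a minimal sketch that parses the snippet from the original post with nothing but a standard library. Python and its `xml.etree.ElementTree` module are my choice for the example, not something the thread specifies, and `describe` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The snippet from the original post.
DOC = "<foo> <bar>baz</bar> </foo>"


def describe(xml_text):
    """Return (tag, stripped text) pairs for the root and its descendants,
    in document order. Hypothetical helper, for illustration only."""
    root = ET.fromstring(xml_text)
    return [(elem.tag, (elem.text or "").strip()) for elem in root.iter()]


print(describe(DOC))  # [('foo', ''), ('bar', 'baz')]
```

No hand-written parser is involved; the verbose closing tags are the price of that convenience.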
Re: Can You Spot The Duplications?
Answers:
1. It's not a good thing.
2. SGML allows minimization:
<foo> <bar>baz</> </>
The feature was left out of XML to make it easy to write XML parsers.
3. The reason is no longer valid. Modern validating XML parsers are monsters (think XML Schema support):
[weiqi@gao] $ ls -l xercesImpl.jar
-rw-r--r-- 1 weiqi weiqi 1010675 Feb 20 2004 xercesImpl.jar
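Regarding answer 2, the following sketch checks how an XML parser treats the minimized form. It assumes Python's standard `xml.etree.ElementTree` as a stand-in for "an XML parser"; `is_well_formed` is a hypothetical helper name, and other parsers may report the error differently.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it.
    Hypothetical helper, for illustration only."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Full end tags, as XML requires.
print(is_well_formed("<foo> <bar>baz</bar> </foo>"))  # True
# SGML-style empty end tags are not part of XML.
print(is_well_formed("<foo> <bar>baz</> </>"))        # False
```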
Re: Can You Spot The Duplications?
The same way you figure out which closing brace goes with which opening brace in C: with the help of your editor.
In SGML, you can choose not to use minimization for a particular closing tag. There is also a normalization program that fills in the missing tag names for you.
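To give a feel for what such a normalization step does, here is a rough sketch in Python. It is not the actual normalization program mentioned above; it handles only plain start tags, named end tags, and empty end tags (`</>`), and it assumes attribute values contain no `<` or `>`. The function name `expand_empty_end_tags` is made up for this example.

```python
import re

# Matches anything of the form <...>; the optional name group is empty
# for "</>" and for constructs such as comments, which pass through untouched.
# Assumption: attribute values never contain "<" or ">".
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)?([^<>]*)>")


def expand_empty_end_tags(text):
    """Replace each empty end tag "</>" with the name of the innermost
    open element. Hypothetical helper, for illustration only."""
    stack = []  # names of the elements that are currently open

    def fix(match):
        closing, name, rest = match.groups()
        if not closing:
            # Start tag: remember it unless it is XML-style self-closing.
            if name is not None and not rest.rstrip().endswith("/"):
                stack.append(name)
            return match.group(0)
        if not stack:
            raise ValueError("end tag without a matching start tag")
        open_name = stack.pop()
        if name is None:
            return "</" + open_name + ">"  # fill in the missing name
        if name != open_name:
            raise ValueError("mismatched end tag: " + name)
        return match.group(0)

    result = TAG.sub(fix, text)
    if stack:
        raise ValueError("unclosed element: " + stack[-1])
    return result


print(expand_empty_end_tags("<foo> <bar>baz</> </>"))
# <foo> <bar>baz</bar> </foo>
```

Named and empty end tags can be mixed freely, which matches the point that minimization is optional for each closing tag.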
The mandatory quotes around attribute values are another irritation.
The "my site is W3C validated XHTML 1.1 strict" fanatics probably don't realize that those untidy HTML documents were valid SGML back in 1994.