This Week in D December 27, 2015

Welcome to This Week in D! Each week, we'll summarize what's been going on in the D community and write brief advice columns to help you get the most out of the D Programming Language.

The D Programming Language is a general purpose programming language that offers modern convenience, modeling power, and native efficiency with a familiar C-style syntax.

This Week in D has an RSS feed.

This Week in D is edited by Adam D. Ruppe. Contact me with any questions, comments, or contributions.

Statistics

1 bug fixed
16 bugs and enhancement requests opened
12 pull requests merged into the language: 5 into DMD, 4 into Phobos, and 3 into druntime.
0 pull requests merged into the website.

In the community

Community announcements

See more at the announce forum.

Project Spotlight

This week, I, Adam Ruppe, am going to talk about what I started working on a few days ago: a new documentation generator.

First, why another documentation generator? The major alternatives for D are ddoc, built into the compiler; ddox, from the vibe.d project; and harbored, by Brian Schott. There's also an old project called candydoc, which just styles and enhances ddoc with CSS and JS while still using the dmd generator, and there is the old version of my dpldocs.info, which just navigates the JSON output from dmd.

The three big ones, though, are dmd -D, ddox, and harbored. All three share a number of flaws in my eyes.

First, let me go over the problems with built-in ddoc. The first and most obvious one is that it sucks out of the box. ddoc's greatest strength is being built into the compiler, so there's little excuse not to write some inline docs in your code. However, the default output is so bad that I think you're better off just looking at the source instead of actually running the generator!

However, even if you do fix the output with ample amounts of redefined ddoc macros and CSS, the generated documentation, while passable, is still short of ideal:

The compiler's output for function prototypes is readable only for the simplest functions.
There's no interlinking capability except hand-written links and often hand-written or hard-to-discover anchors. (The anchors can be automated by macros too, but this step still needs to be done by the author and noticed by the reader.)
Similarly, it has no table of contents feature at all.
The only capabilities for encoding characters in the output are the code block and inline backtick syntaxes. A random <, for example, needs to be encoded as $(LT). Ddoc calls this a feature, embedded HTML, but it makes it easy to corrupt output.
The macros have limited semantic capabilities, both for text and for code.
It always outputs one file per module.
It is tied too closely to the dmd compilation model, including conditional compilation. Building docs on Linux can omit Windows functions entirely since they are versioned out of the build, as is code behind other -version flags! D and ddoc offer version(D_Ddoc) to work around this, but its use in practice is awkward at best.
The macro syntax is awkward and error-prone, using common text characters like commas as significant syntax. The "current symbol" is highlighted, which is virtually never useful to me, and words with colons in them are assumed to denote a new section, which often isn't what I want.
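To make the conditional compilation point concrete, here is a minimal sketch of the usual workaround. The function consoleTitle is made up for this example and its body is a placeholder. The predefined version identifier that dmd sets while generating docs is spelled D_Ddoc.

```d
import std.stdio : writeln;

version (D_Ddoc)
{
    /// Returns the title of the console window. Windows only.
    /// (Hypothetical function; this bodiless declaration exists only
    /// so that a doc build on any platform can see the comment.)
    string consoleTitle();
}
else version (Windows)
{
    string consoleTitle()
    {
        // Placeholder; a real version would call into the Win32 API.
        return "placeholder title";
    }
}

void main()
{
    version (Windows)
        writeln(consoleTitle());
    else
        writeln("consoleTitle is versioned out on this platform");
}
```

The awkward part is that every such declaration ends up written twice, and the doc build generally has to be a separate pass (for example dmd -D -o-), since the documented declaration has no body to link against.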

None of these are fatal flaws to me. I do find ddoc usable, with a handful of redefined macros and CSS improvements, but each one is a limitation that adds up to hurt the end user experience.

While I have tried to improve ddoc in the past, including writing the `code` feature myself, before which it was impossible to automatically encode samples for html output, it isn't an easy process to get changes into the compiler due to the worry of breaking everyone else's documentation. Everybody who uses ddoc has built up careful workarounds that work for them and any change to the status quo may break their process.
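As a rough illustration of the encoding issue, here is a small doc comment that needs a literal less-than sign. The function isLess is invented for this sketch; $(LT), backtick inline code, and --- code blocks are standard ddoc.

```d
/**
 * Compares two integers.
 *
 * In plain comment text the comparison has to be spelled with a macro:
 * returns true when a $(LT) b. A raw less-than sign here would be
 * passed through to the HTML output unescaped.
 *
 * Inside backticks the character can be written directly: `a < b`.
 *
 * Examples:
 * ---
 * assert(isLess(1, 2));
 * assert(!isLess(2, 1));
 * ---
 */
bool isLess(int a, int b)
{
    return a < b;
}

void main()
{
    assert(isLess(1, 2));
    assert(!isLess(2, 1));
}
```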

These problems have spawned the other replacements. Let's look at ddox.

ddox's major change is one file per entity instead of one file per module, and it also makes an attempt at automatic interlinking and does a good job at generating tables of contents. Otherwise though, it is a pretty conservative reimplementation of ddoc, aiming for full compatibility.

ddox also works by parsing the JSON file dmd outputs with its -X -D options. I also went this route with the first draft of my dpldocs.info website because it is easy to do - just run dmd -X -D and run a standard json parser on the resulting output to get a list of declarations and their attached comments, with some preprocessing done by dmd itself.
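The JSON route can be sketched in a few lines with std.json. The sample below is hand-written and heavily simplified; it only assumes that the output is an array of modules, each with a "members" array whose entries carry "name", "kind", and, when documented, "comment". The Decl struct and the documented function are names made up for this example.

```d
import std.json;
import std.stdio : writeln;

// Hand-written sample in the assumed (simplified) shape of dmd's JSON output.
enum sample = `[
  {
    "name": "example",
    "kind": "module",
    "file": "example.d",
    "members": [
      { "name": "add", "kind": "function", "comment": "Adds two numbers.", "line": 4 },
      { "name": "helper", "kind": "function", "line": 9 }
    ]
  }
]`;

// Hypothetical record type for one documented declaration.
struct Decl
{
    string name;
    string kind;
    string comment;
}

// Collects every member that has a doc comment attached.
Decl[] documented(JSONValue root)
{
    Decl[] result;
    foreach (mod; root.array)
    {
        if ("members" !in mod)
            continue;
        foreach (member; mod["members"].array)
        {
            if (auto c = "comment" in member)
                result ~= Decl(member["name"].str, member["kind"].str, c.str);
        }
    }
    return result;
}

void main()
{
    foreach (d; documented(parseJSON(sample)))
        writeln(d.kind, " ", d.name, ": ", d.comment);
}
```

In a real run the string would come from the file dmd writes rather than from an embedded sample.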

The problem with this approach is it retains several of ddoc's quirks related to the D compilation model. Versioned out declarations are still missing from this output. Moreover, the semantic richness is limited by what dmd offers in the JSON output. Now, dmd outputs a lot of information in that file, and the richness is always growing, but going beyond it necessitates reading strings of D source code, either from the JSON output or by opening the original file yourself. But, if you have to parse D anyway, the ease of JSON starts to lose its appeal. In some cases, you must demangle names to get type information out of the JSON, and in others you must parse strings. In either case though, you lose out on some of the details of the original source.

ddox today does very little additional parsing.

Finally, harbored, written by the author of libdparse, dcd, dscanner, and other useful D analyzer tools, tries to transcend the limits of dmd's model by using its own lexer and parser, and it does a reasonable job. However, it still falls short for me:

It syntax highlights function prototypes, but does not further format them.
It has individual pages, but hides them inside HTML frames, making linking awkward.
It still doesn't exploit semantic information through source code analysis as much as I'd like.
It removes some of ddoc's useless features, but retains other ddoc flaws in the name of compatibility.

I believe all three doc generators are competent implementations and useful... but I also believe I can do better.

I started writing mine using libdparse to analyze the source itself, independently of dmd's output, just like harbored does. After two days of hacking, I created something fairly usable.

Compare and contrast these links to see what I'm talking about:

Getting the links for the ddox version and my version was easy: I simply copy/pasted from the URL bar when I found the function. Getting them out of ddoc and harbored meant finding the doc, then going back and using copy link location because the destination URL was not apparent on the page. In ddoc, it was hidden as an anchor in the source, not visible in the output. In harbored, it was hidden behind an HTML frame.

I believe copy/pasting a URL is the easiest way to share a doc link with someone and you shouldn't have to search for it. Of the existing ones, only ddox passed this test. My generator does too by generating one page per entity.

Next, look at the function prototypes. Phobos' ddoc generated pages are nearly universally considered completely unreadable. All the text is crammed together on one line with no attempt at highlighting (except the function name itself). Harbored highlights and ddox does some formatting, but none of them pass the readability test to me.

Inspired by MSDN, which manages to make complex functions like Win32's CreateWindowEx readable despite its many parameters, I make liberal use of whitespace to organize things. My code is only a few days old and still has a few bugs, but I think it is already significantly more readable than the competition.
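To give a feel for the whitespace idea, here is a toy sketch that lays out a prototype with one parameter per line. formatPrototype is a made-up helper, not code from my generator; the real thing works on the parsed syntax tree rather than on strings.

```d
import std.algorithm : map;
import std.array : join;
import std.stdio : writeln;

/// Lays out a prototype with one parameter per line.
/// Hypothetical helper for illustration only.
string formatPrototype(string returnType, string name, string[] params)
{
    if (params.length == 0)
        return returnType ~ " " ~ name ~ "();";

    // Indent each parameter and put it on its own line.
    auto lines = params.map!(p => "    " ~ p).join(",\n");
    return returnType ~ " " ~ name ~ "(\n" ~ lines ~ "\n);";
}

void main()
{
    writeln(formatPrototype("HWND", "CreateWindowEx", [
        "DWORD dwExStyle",
        "LPCTSTR lpClassName",
        "LPCTSTR lpWindowName",
        "DWORD dwStyle",
        "int x",
        "int y",
        "int nWidth",
        "int nHeight",
        "HWND hWndParent",
        "HMENU hMenu",
        "HINSTANCE hInstance",
        "LPVOID lpParam",
    ]));
}
```

Twelve parameters on one line are a wall of text; one per line, each can be scanned, linked, and annotated on its own.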

My prototype is more than just text, though, it is also links. I still have a lot of work to do on this, but I am parsing the source and can pick out language features in use, such as alias parameters or auto returns, and link those keywords to supplemental articles explaining those features.

The target audience for these docs is intermediate users. Beginners won't even know how to get here and should have tutorials to guide them at a higher level, and advanced users probably already know this stuff and only use docs to refresh their memory. Intermediate users, as well as beginners who land here via a web search, will appreciate the additional context and learning opportunities the links allow without impacting the immediate doc's usability.

I will also parse these features into an automatically generated list of links for the See Also section.

Next, the parameters. These are defined in ddoc as a section, but each generator formats them a little differently. Phobos's ddoc output is a straight table. ddox copies the table, but also attempts to automatically interlink... and fails, due to the lack of semantic richness in its source material. It thinks range in the description refers to the parameter and makes it a link, but it actually refers to the concept. Harbored also uses a table.

I decided to go with a vertical list, again making liberal use of whitespace, inspired by MSDN. Some parameters require more than one line of explanation and I don't want to limit it to half the screen in a table cell. The additional space also gives me room to add other information from the parser such as type, and coming later, I will also extract some information out of template constraints.

For example, for range, it is possible to make the parameter table also link to an explanation of forward ranges and infinite ranges. It is possible to recognize the is(typeof(binaryFun)) pattern and explain that too. (I haven't finished writing this yet.) With the extra space, I don't feel like I have to try to cram everything in.
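For readers who haven't met it, the is(typeof(...)) constraint pattern looks roughly like this in user code. The function first is invented for this sketch; binaryFun is the real helper from std.functional. A generator that recognizes this shape could explain in plain words that pred must be callable with two values of type T and return something convertible to bool.

```d
import std.functional : binaryFun;
import std.stdio : writeln;

/// Returns whichever of `a` and `b` comes first according to `pred`.
/// Hypothetical function, written only to show the constraint pattern.
T first(alias pred = "a < b", T)(T a, T b)
    if (is(typeof(binaryFun!pred(T.init, T.init)) : bool))
{
    return binaryFun!pred(a, b) ? a : b;
}

void main()
{
    writeln(first(3, 7));                   // uses the default predicate
    writeln(first!((a, b) => a > b)(3, 7)); // uses a custom predicate
}
```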

The return value is basically the same on all, it just has the type and text. I will, however, also link in some information for auto returns.

Finally, the examples are the same in all, though ddox doesn't syntax highlight them (probably because they aren't syntax highlighted in the original input JSON...). Since many of these examples are unittests which must compile with import name resolution, though, I will be able to link function names in them to the exact overload they refer to.

One final subtle change on my doc is the use of dynamic JavaScript to enhance the experience, which I feel none of the others have even tried to exploit. Try hovering your mouse over the word Range in the function prototype of my link. Notice how it highlights the other occurrences of it, giving you at-a-glance eyeballing of where the type is used - template arg, runtime arg, return value, constraint, docs. All now stand out at a glance as being related.

My docs are still very young, but I believe I have already improved upon the UX and that my approach brings a lot of potential that the competition leaves either virtually impossible or unutilized. My approach of being willing to break ddoc compatibility and hack a bit of common pattern recognition will also allow me to make doc writing easier and richer without major limitation.

I really look forward to continuing work on this in the coming weeks to make my whole dream of stellar D documentation a reality. We can innovate in this field and set a new standard, using D's rich semantics, to make docs that other languages cannot parallel.

Learn more about D

To learn more about D and what's happening in D:

Read http://dlang.org and the D wiki.
Want in-depth material? Check out the Books on D.
Join us on IRC: channel #d on irc.freenode.net.
Check out the forums (TIP - check out the NNTP and mailing list links under "Also via" on the forum to subscribe to email updates or access the forum with a newsgroup client!)
Follow D Programming on Twitter, search for #dlang on Twitter, and/or follow This Week in D's editor on Twitter.
Check out the D tag on Stack Overflow