This Week in D May 31, 2015

Welcome to This Week in D! Each week, we'll summarize what's been going on in the D community and write brief advice columns to help you get the most out of the D Programming Language.

The D Programming Language is a general purpose programming language that offers modern convenience, modeling power, and native efficiency with a familiar C-style syntax.

This Week in D has an RSS feed.

This Week in D is edited by Adam D. Ruppe. Contact me with any questions, comments, or contributions.

Statistics

13 bugs fixed
16 bugs and enhancement requests opened
52 pull requests merged into the language: 18 into DMD, 29 into Phobos, and 5 into druntime.
1 pull request merged into the website.

DConf 2015

DConf 2015 was this past week! About 30 men gathered in person at Utah Valley University for about nine hours a day over three days to discuss D, with the majority of the conference also being livestreamed over YouTube to many other people.

The conference was also professionally recorded and those videos will be made available later, once editing is finished.

This Week in D had an intrepid reporter on-site for all three days of the conference.

Major announcements

Walter and Andrei were asked if a macro system was coming to D. They both answered flatly: no.

DConf 2016 has already been announced: it will be held in Berlin, Germany, graciously hosted by Sociomantic, a company whose major infrastructure is written in D and that is always looking for talented D programmers for their back end as well as developers for their web front end and various other positions.

Wednesday Morning Session

Walter Bright kicked off the conference with a talk on memory allocation and performance. Walter's slides are online.

He started by outlining allocation strategies with pros and cons - garbage collection, reference counting, and manual management - arguing none of them are quite ideal.

The thrust of his talk was to use lazy, non-allocating ranges as much as possible to write code that does not need to allocate memory at all until the last possible moment. Then, it should leave the allocation strategy up to the high-level user who will have more understanding about the data's lifetime and usage pattern. Then, and only then, can the programmer pick an optimal strategy.

He issued a call to action to scrutinize all functions that return arrays and examine if they can be made to return a range instead, to make Phobos more efficient and friendly to GC-less (and other allocation free) programming.
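To make the idea concrete, here is a minimal sketch in D. It is not taken from Walter's slides: the function name evenSquares is hypothetical, while iota, filter, map, copy, and array are existing Phobos functions. The function only describes the computation; each caller then decides whether to allocate, and how.

```d
import std.algorithm : copy, filter, map;
import std.array : array;
import std.range : iota;

// Hypothetical library function: it returns a lazy range rather than
// building and returning an array itself.
auto evenSquares(int limit)
{
    return iota(0, limit)
        .filter!(n => n % 2 == 0)
        .map!(n => n * n);
}

unittest
{
    // Caller's choice 1: consume the values without storing them.
    int total = 0;
    foreach (v; evenSquares(10))
        total += v;
    assert(total == 120);

    // Caller's choice 2: copy into a fixed-size stack buffer.
    int[5] buffer;
    auto unused = evenSquares(10).copy(buffer[]);
    assert(unused.length == 0);
    assert(buffer == [0, 4, 16, 36, 64]);

    // Caller's choice 3: materialize on the GC heap, at the last moment.
    int[] stored = evenSquares(10).array;
    assert(stored == [0, 4, 16, 36, 64]);
}
```

Had evenSquares returned int[] directly, every caller would pay for a GC allocation, including the first two, which do not need one.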

Walter also teased us with extended language support for ranges, but these probably aren't actually going to happen - the library can already do it, and he made a point about how people get too excited over a cosmetic language change relative to a more functional library paradigm.

Next, Daniel Murphy (aka yebblies) talked about the ddmd project - translating the DMD source code from C++ to D. His slides are available here.

He started by explaining the why: why move from C++ to D? Editor's note: I'd like to point out that these reasons can apply to other C++ code bases too. Sometimes, porting existing code to a new language isn't worth it, but sometimes it is and the option ought to be explored in your individual case. Among the reasons he listed: D is generally more pleasant to write (which improves morale), is easier to refactor, requires less time spent working around C++ quirks, and offers new features that can improve performance.

The problem is porting code to a new language is non-trivial. dmd has about 120,000 lines of code and about twenty pull requests per week, so it is big and moving. These problems make porting by hand and rewriting from scratch pretty unrealistic options. (Though, SDC, a rewrite from scratch, is progressing as a third-party compiler.)

To solve these problems, Daniel showed how an automatic conversion program was written.

His first try was to translate the code on a lexical level, replacing tokens in the string after running the C++ preprocessor. This got agonizingly close... but couldn't go the whole way because the preprocessor code is important and the remaining changes became skyrocketingly difficult.

An AST converter was also thwarted by the incredibly difficult final 5% of work as well as the complication of porting preprocessor code. The preprocessor brings the wrath of many programmers trying to translate C or C++ - it is a different language that needs to be handled too. Combining the semantics of both the preprocessor and the actual language into one language - the target D code - is practically impossible in the general case, even for humans. This is one reason why D ditched the idea of a preprocessor. Another difficulty with this approach was keeping comments in the right place in the new code, which matters because the converted D is meant to be read and maintained by humans.

Instead, Daniel described how he was able to simplify the task at hand to make it doable. Rather than trying to cover all of C++, he would just cover the easy parts of C++, with no error handling, using just what dmd does, and he'd modify the dmd source itself at times to make it easier to handle. So, basically, the magicport conversion is an automatic process assisted by hand-porting.

Preprocessor rules were chief among the code that had to be hand-simplified to expedite the automatic conversion process. Also, dmd's code style was to declare all functions for a class in the header, then implement them in files based on related functionality, not by class. So, for example, each class would have a toJSON method, and then toJSON functions for all classes were defined in the json.c[pp] file instead of spread across various class implementation files. This style was not conducive to conversion to D, so the C++ source was refactored to instead use a visitor class, which was easy to convert automatically.
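As an illustration of that visitor-style layout, here is a small D sketch. It is not dmd's actual code: IntExp, AddExp, ToJsonVisitor, and toJson are hypothetical names, and the real AST has far more node types. The point is only that all the JSON logic sits in one class while each node carries a one-line accept method.

```d
// Base visitor: one visit overload per node type.
class Visitor
{
    void visit(IntExp e) {}
    void visit(AddExp e) {}
}

abstract class Expression
{
    abstract void accept(Visitor v);
}

class IntExp : Expression
{
    int value;
    this(int value) { this.value = value; }
    override void accept(Visitor v) { v.visit(this); }
}

class AddExp : Expression
{
    Expression lhs, rhs;
    this(Expression lhs, Expression rhs) { this.lhs = lhs; this.rhs = rhs; }
    override void accept(Visitor v) { v.visit(this); }
}

// All JSON-related logic for every node type lives in this one class.
class ToJsonVisitor : Visitor
{
    string result;

    override void visit(IntExp e)
    {
        import std.conv : to;
        result ~= `{"int":` ~ e.value.to!string ~ `}`;
    }

    override void visit(AddExp e)
    {
        result ~= `{"add":[`;
        e.lhs.accept(this);
        result ~= `,`;
        e.rhs.accept(this);
        result ~= `]}`;
    }
}

// Convenience wrapper around the visitor.
string toJson(Expression e)
{
    auto v = new ToJsonVisitor;
    e.accept(v);
    return v.result;
}

unittest
{
    auto tree = new AddExp(new IntExp(1), new IntExp(2));
    assert(toJson(tree) == `{"add":[{"int":1},{"int":2}]}`);
}
```

Because the visitor is an ordinary class with ordinary methods, it has the same shape in C++ and in D, which is presumably what made it easy to convert automatically.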

Even after this, a handful of source files had to be manually ported. However, with the bulk of the work done automatically, it was possible to keep up with changes to the C++ source as they happened, meaning development did not need to stop as the conversion happened, and there was room to experiment with improvements.

As a result, a lot of work was done on the C++ source and the magicport file is a load of hacks... but it worked and created compilable D code. Daniel describes, however, that this was not the end of the work. The code still had to link - including the new D frontend and the C++ backend - and, of course, to run correctly. This prompted him to do a lot of work on improved C++ interoperation and a few codegen bugs, as the converted C++ source exercised a different part of D than typical D code.

The remaining issues were hacked around and ddmd now works! The main blocker to getting it mainlined is performance, which is expected to be fixed within a couple of months when gdc and ldc get updated to the newest D code. Daniel notes that gdc and ldc will also benefit from the change, in that the frontend forks should all be unified, allowing them to work with much less maintenance work for new releases.

Existing pull requests should still work as they too can be automatically converted and rebased against the D code, without requiring a lot of manual effort.

This effort has taken over two years to complete and accounts for about 8% of all dmd pull requests at this time (though Daniel noted that most of them were small changes).

Daniel concluded by noting how this approach might work with other projects. It is a lot of work and needs to modify the original source, but it does not need to pause development of the original code and yields benefits long term. He notes, however, that his magicport should NOT be expected to work as-is on other projects; the concept is more reusable than the code.

At 11:00 am on Wednesday, Brian Schott (aka hackerpilot - he explains he is also an aviator) took to the mic to talk about the D tools he has written and the grammar issues he found in the process. His slides can be downloaded from the dconf.org website.

Brian, in a similar fashion to his lightning talk last year, used humor quickly, making a list of bad ideas early in his talk to segue into the work he has been doing.

He discussed dfmt, a D style lint and formatter including a fuzzy length algorithm, which judges breaking line length limits against how ugly the code would become split into multiple lines. Editor's note: as a user of the rubocop lint for Ruby which includes a line limit checker which often, in my opinion, leads to uglier code to pass its checks, I found this a very fascinating feature.

He went on to discuss dcd, a D completion daemon which uses a client and server model to provide editor-agnostic (or IDE-agnostic) code completion, function call help, go-to-definition, and related functionality. He did a live demo and described its architecture.

When optimizing its performance, Brian moved to fewer allocations and garbage collector use. He noted the trade-off of this approach - while he was able to improve performance, it came at the cost of more debugging work, as manual memory management was tricky to get right in all cases. His biggest problem was making sure manually allocated memory was appropriately scanned for GC objects, to avoid false-free bugs.

Brian next discussed harbored, a doc generation tool. (The source of the name is that boats are docked at a harbor... so documentation, docs, implies harbor...)

Unlike the built-in doc generation, harbored uses the same syntax but a different approach - just looking at the source code rather than a fully analyzed AST. The advantages of this include original type names rather than resolved aliases for easier readability, platform-agnostic docs (it works across version statements), and the ability to generate docs for one file without needing the entire import tree to run the compiler.

The next tool Brian demonstrated was dscanner, a multi-use scanner and lint program for D code. Dscanner can find and warn about problematic source forms (such as if(a==a), a condition that is obviously always true and thus useless or, at the very least, confusingly operator-overloaded), and it also offers D-aware, grep-like functionality.

One of the things he demonstrated was a find declaration function. By running dscanner -d some_symbol_name source/*.d, the program immediately spit out the file, line, and column numbers of the function declaration. This has an advantage over plain grep because grepping for a definition isn't always easy - knowing the pattern for the return value, for example, is not always obvious (indeed, you might want the definition precisely in order to read the return type!) and searching for the name alone turns up a lot of calls as well as the definition - giving a lot of false positives to wade through. Dscanner skips all this and goes straight to the meat.

Brian showed how this is just the beginning of what Dscanner can do. He also demoed outline printing, showing a class with its members printed to the console, ctags generation for use in editors, and an upload to SonarQube, a bug report dashboard his company uses for quality assurance.

Next, Brian talked about the library he wrote to support these tools, libdparse, and the challenges in implementing it efficiently. He started by pointing out that writing a D lexer and parser was easier said than done.

Among the tips he offered were: use perf (a performance profiler), compile with ldc or gdc, perf some more, and optimize algorithms, for example, using a binary search instead of a large switch statement. Editor's note: the string switch implementation is a binary search over a sorted list of cases, but Brian's implementation must be better optimized for his specific data set than the generic implementation in druntime.
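As a rough illustration of the binary search tip, here is a minimal D sketch. It is not Brian's code from libdparse: the keyword list is a small hypothetical subset and isKeyword is a made-up name, while assumeSorted and SortedRange.contains are existing Phobos facilities. Whether this beats a switch depends on the data set and the compiler, so it should be profiled rather than assumed.

```d
import std.range : assumeSorted;

// A small, hypothetical subset of keywords. The list must stay sorted
// for the binary search to be valid.
immutable string[] keywords = [
    "alias", "auto", "class", "else", "foreach",
    "if", "return", "struct", "while"
];

// Returns true if word is in the table.
bool isKeyword(string word)
{
    // contains() on a SortedRange does a binary search, so the number
    // of string comparisons grows logarithmically with the table size.
    return keywords[].assumeSorted.contains(word);
}

unittest
{
    assert(isKeyword("foreach"));
    assert(isKeyword("alias"));   // first entry
    assert(isKeyword("while"));   // last entry
    assert(!isKeyword("foo"));
}
```

Note that assumeSorted trusts the caller: if the table is not actually sorted, lookups can silently give wrong answers.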

He also noted that SIMD is not necessarily a magic bullet for these tasks, but can be used to quickly rule out input where further analysis is certainly useless. His tip was to write basic code that always works, then add SIMD filters to avoid looking deeper when possible.

Brian then returned to some good-point-backed humor by pointing out silly aspects of the D grammar. As the language has grown, some bizarre cases have developed. As Brian wrote his independent implementation of the lexer and parser, he found several bugs in the spec (where the documentation did not match the behavior of dmd) and bugs in the compiler (where dmd's implementation does silly things). He showed a few notable examples, such as ambiguous empty catch blocks and int[] i(T) = 10;, which was introduced by a change that was meant only to enable short-form template alias statements.

Brian explained how he'd like to fix these issues, but recognizes that it may break some code. As a response, he wrote dfix, a program to automatically fix code and aid in migration to new compiler versions and showed it in his talk.

Finally, he discussed some future directions for dfix, such as making it into a more generic code refactoring tool, starting with the ability to rename a module or function and have the instances of it automatically updated (which is harder than it might sound due to scoping and importing rules, but dfix can parse D code and understand what name actually refers to the changed symbol and which one is a separate variable with the same name).

Editor's note: I have been aware of Brian's tools for some time, often noting a new release in This Week in D, but I never wanted to actually try them until this talk. If you are skeptical like me, give this talk a closer look and consider trying them. I was pretty impressed, enough to download the programs and start playing with them. While I haven't yet used them for anything serious, I can see that Brian did nice work and am looking forward to exploring these in more depth.

After Brian's talk, we broke for lunch, ending the Wednesday morning session.

The remaining five sessions of DConf 2015 will be reported on in future editions of This Week in D, so be sure to watch for next week's issue! Also watch out for our announcement for when the videos are available. In the meantime, this post lists the starting times of each talk in the recorded ad-hoc livestream. (Note that the coming recordings will be of much higher quality; the livestream was done at the last minute by an attendee using his webcam.)

From the editor: Please note that it is my plan to finish these write-ups within the next two weeks, not six, but I have not yet returned home; I still had other business in Utah and my time is limited with that and the trip back.

Open D Jobs

A new page has been added to the D Wiki listing open D jobs. Take a look if you're interested, and add yours if you know of one that is available!

In the community

Community announcements

DerelictMantle - unofficial, experimental, reverse-engineered

See more at digitalmars.D.announce.

Significant Forum Discussions

Make dub part of the standard dmd distribution talks about including dub with dmd and people's criticism of it as a build tool or package manager, with proposed solutions.
Entry point a la "git" or "go" discusses pros and cons of putting the D tools under a common command namespace with greater integration amongst them.
Why aren't you using D at work? discusses people's successes and failures in using D professionally. Many of the failures come because the company's existing code and processes are good enough for them and switching doesn't look appealing in the short term, or because D's tools aren't a good fit for their company (sometimes not good enough, sometimes just too different). And more discussion.

See more at forum.dlang.org and keep up with community blogs at Planet D.

Learn more about D

To learn more about D and what's happening in D:

Read http://dlang.org and the D wiki.
Want in-depth material? Check out the Books on D.
Join us on IRC: channel #d on irc.freenode.net.
Check out the forums (TIP - check out the NNTP and mailing list links under "Also via" on the forum to subscribe to email updates or access the forum with a newsgroup client!)
Follow D Programming on Twitter
Search for #dlang on Twitter
and/or follow This Week in D's editor on Twitter.
Check out the D tag on Stack Overflow