cgi.d overview

cgi.d is a lower level interface for writing web applications in the D Programming Language. I've been using it in production since 2010.

import arsd.cgi;

void yourFunction(Cgi cgi) {
    cgi.write("Hello, world!");
}

mixin GenericMain!yourFunction;

CGI is a standard protocol for web servers to communicate with applications. Underneath, it passes variables via environment variables and stdin, and takes your output via stdout.
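To make the protocol concrete, here is a minimal sketch of a CGI program written without cgi.d, using only the D standard library. It assumes plain CGI, where the server sets the standard REQUEST_METHOD and QUERY_STRING environment variables; buildResponse is a name invented for this example.

```d
import std.process : environment;
import std.stdio : write;

// Hypothetical helper for this sketch: builds the raw CGI output,
// which is a header block, a blank line, then the body.
string buildResponse(string method, string queryString) {
    return "Content-Type: text/plain; charset=utf-8\r\n\r\n"
        ~ "Method: " ~ method ~ "\n"
        ~ "Query string: " ~ queryString ~ "\n";
}

void main() {
    // The web server hands request details over in environment variables...
    auto method = environment.get("REQUEST_METHOD", "GET");
    auto query = environment.get("QUERY_STRING", "");

    // ...and reads the response from this program's stdout.
    write(buildResponse(method, query));
}
```

Even this tiny program has to know the variable names and the header layout by hand, which is the kind of detail the library is meant to hide.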

class Cgi and mixin template GenericMain in cgi.d simplify access to this protocol and make it uniform, along with, if you use add-on libraries, FastCGI or an embedded http server.

Hello, world! Explained.

Above is a basic hello world application. You can compile with:

dmd yourapp.d cgi.d

To run the resulting program on the web, you'll want to copy the binary into a CGI-enabled directory on your web server, for example, cgi-bin on Apache. You can enable CGI on specific files as well.

First, of course, you'll be importing arsd.cgi - the module name of my cgi.d. Next, you'll notice that there isn't a traditional main() function.

GenericMain mixin

Why?

Writing your own main has a little bit of boilerplate. To avoid this, I made a mixin template, GenericMain, in cgi.d that you can use instead.

GenericMain constructs a Cgi object for you, then passes it to your function. It wraps the call in a try/catch to make exception handling prettier, and calls cgi.close() for you on exit.

GenericMain also provides a single point of changes inside the library. If you wanted to switch to FastCGI or an embedded http server using your own main function, you'd probably have to make modifications to your code.

However, with GenericMain, those other branches are built into the library, so switching to them is as simple as recompiling with a different version switch. Then, the mixed-in main handles the different initialization, allowing your app to remain the same no matter how it communicates with the outside world.

Note: if you compile in debug mode, GenericMain's exception handler will write out the exception message and stack trace to the browser as well as to the server log (via stderr). This is meant to make finding errors faster and easier when developing.

When compiling your application for production use, simply do not use the -debug flag to dmd. Then, cgi.d will only write the details to stderr, writing nothing but a generic error message to the browser.

How do I use it?

Instead of writing main(), write any function that takes a single parameter - an object of class Cgi. Then, at the bottom of your file (it does have to be at the bottom since otherwise dmd sometimes complains about forward references), write "mixin GenericMain!YourFunctionNameHere;".

The name of the function doesn't matter. But, the signature should be "void function(Cgi);".

GenericMain can also take a tuple of additional arguments to pass to Cgi's constructor, but usually, you don't want to use this. Trust the defaults whenever possible because the constructors might change with other protocols.

Be sure not to use the -debug switch to dmd when compiling for your production server. If you compile with -debug, any exceptions will display details to the users like stack traces which are both ugly and potentially will give bad guys info they don't need to know. Learn more about this in the preceding "Why?" section.

cgi.write()

cgi.write() is the low level function for outputting your response body. It's my opinion that you should try to minimize calls to this function, by using some kind of template system that outputs your page at once at the end of program execution.

I said in the overview that cgi uses stdout. So, why use this function instead of std.stdio? There's two major reasons:

See the section on output for more information.

Handling Input

There's several ways for the user (or their browser) to pass your program data. cgi.d aims to give easy access to all this data in a simple, uniform way, regardless of the low level details of data encoding, etc.

Form and query string data

cgi.get[], cgi.getArray[], cgi.post, cgi.postArray[], cgi.request!T. Explain why get and getArray both exist. Explain why they are fully immutable and strings. Discuss coolness with request!enum and how std.conv.to rocks. Perhaps discuss why it is buffered and how the forms and http work - especially display: none, disabled, and type=checkbox to be aware of.
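As a rough sketch of how these members might be used together: the handler below reads a single value from cgi.get, a typed value through cgi.request!T, and a repeated value from cgi.getArray. The field names (name, page, tag) and the describe helper are invented for the example, and the two-argument form of request (name, then a default) is an assumption about its signature.

```d
import arsd.cgi;
import std.array : join;
import std.conv : to;

// Hypothetical helper: builds the whole response as one string,
// so the handler only needs a single cgi.write() call.
string describe(string name, int page, const(string)[] tags) {
    return "name: " ~ name ~ "\n"
        ~ "page: " ~ to!string(page) ~ "\n"
        ~ "tags: " ~ tags.join(", ") ~ "\n";
}

void formExample(Cgi cgi) {
    // We echo user input, so send plain text rather than the default HTML.
    cgi.setResponseContentType("text/plain");

    // cgi.get holds one string per name; check with `in` before indexing.
    string name = "anonymous";
    if (auto p = "name" in cgi.get)
        name = *p;

    // request!T converts the string for you; the second argument is
    // assumed to be the default used when the field is missing or invalid.
    int page = cgi.request!int("page", 1);

    // cgi.getArray keeps every value of a repeated name, e.g. ?tag=a&tag=b.
    const(string)[] tags;
    if (auto p = "tag" in cgi.getArray)
        tags = *p;

    cgi.write(describe(name, page, tags), true);
}

mixin GenericMain!formExample;
```

A request such as ?name=Ada&page=2&tag=d&tag=web would be expected to fill all three variables.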

Files

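import arsd.cgi;
static import std.file;

void fileExample(Cgi cgi) {
    // "upload" is a placeholder for the name of your form's file field.
    if (auto file = "upload" in cgi.files)
        std.file.write("your-filename", file.content);
}

mixin GenericMain!fileExample;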

A special type of form submission is an uploaded file. On the low level, file forms are submitted as a multipart MIME style data stream. A MIME part has some header metadata about the field and then the field's data.

cgi.d shields you from the encoding type. Regular fields are still available through cgi.post and cgi.postArray, just like with normal forms.

The uploaded files themselves are in another immutable associative array, cgi.files[]. The key is still a string - the name of the element in your form.

The value of the associative array is not a string though, unlike the other fields. Instead, it's an instance of struct Cgi.UploadedFile.

UploadedFile lets you access the header data from the MIME part - a content type (unreliable by the way, browsers often don't report anything useful), the name - same as the key type - and the original filename the file had on the user's machine (note this may not be usable either).

Then, of course, the content of the file. It's an immutable ubyte array. This means inspecting it and writing it out somewhere is trivial: std.file.write("your-filename", cgi.files["name"].content);. No special function calls required.

Note: one of the arguments to Cgi's constructor is the maximum file size. It defaults to about 5 MB. Anything bigger will be rejected with the HTTP error "request body too large".

If you need bigger files, pass a bigger number to Cgi's constructor via the GenericMain mixin: mixin GenericMain!(myfunction, 15_000_000); will give a limit of about 15 MB.

Be sure your server has enough memory to handle this though, since the entire file is stored as an array in RAM. See the future directions sidebox for why this is and what might be done about it in the future.

The biggest FIXME left in cgi.d is to provide files to you through a streaming range interface. But, 9/10 times, I prefer the simple array anyway, so I haven't gotten around to this yet. One of the tricky parts of this is the data is streamed in the order in which it appears in the form... so the file might not be last. It'd likely change the interface to get and post vars too, so I really don't want it to be the default.

Advantages of the range interface though would be less memory usage, since it isn't buffering the whole file, faster response since you can process it by chunk as it comes in, and the possibility of reporting size somewhere, for client side upload progress scripts.

The big disadvantage is losing the super simplicity the current setup has.

Cookies

cgi.cookies[]

Additional path information

cgi.pathInfo, cgi.scriptName, cgi.queryString, cgi.getCurrentCompleteUri(). Explain why you'd use these.
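One common use for cgi.pathInfo is simple routing: one binary serves several pages depending on the extra path after the script name. The sketch below assumes the usual CGI split, where a request to /cgi-bin/app/about gives a scriptName of "/cgi-bin/app" and a pathInfo of "/about". The pageFor helper and the page names are invented for the example.

```d
import arsd.cgi;

// Hypothetical helper: maps the extra path to a page body,
// or null when nothing matches.
string pageFor(string pathInfo) {
    switch (pathInfo) {
        case "", "/":
            return "Home page\n";
        case "/about":
            return "About page\n";
        default:
            return null;
    }
}

void router(Cgi cgi) {
    cgi.setResponseContentType("text/plain");

    auto page = pageFor(cgi.pathInfo);
    if (page is null) {
        // The status is a header, so it is set before the first write.
        cgi.setResponseStatus("404 Not Found");
        cgi.write("No such page: " ~ cgi.pathInfo ~ "\n", true);
        return;
    }

    cgi.write(page, true);
}

mixin GenericMain!router;
```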

Authentication data

cgi.authentication

Other headers

FIXME: add these to cgi.d so you don't have to use getenv
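Until then, a sketch of the getenv approach: under plain CGI, the server passes a request header such as Accept-Language in the environment variable HTTP_ACCEPT_LANGUAGE. This will likely not work the same way under FastCGI or an embedded server, where the request does not arrive through the process environment. The helper name envNameFor is invented for the example.

```d
import arsd.cgi;
import std.array : replace;
import std.process : environment;
import std.string : toUpper;

// Hypothetical helper: turns "Accept-Language" into "HTTP_ACCEPT_LANGUAGE",
// following the CGI naming convention for request headers.
string envNameFor(string headerName) {
    return "HTTP_" ~ headerName.toUpper.replace("-", "_");
}

void headerDump(Cgi cgi) {
    cgi.setResponseContentType("text/plain");

    string output;
    foreach (name; ["User-Agent", "Accept-Language", "Referer"]) {
        // Fall back to a marker when the server did not pass the header.
        auto value = environment.get(envNameFor(name), "(not sent)");
        output ~= name ~ ": " ~ value ~ "\n";
    }

    cgi.write(output, true);
}

mixin GenericMain!headerDump;
```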

Output

Response Body

Writing your main output is done via the cgi.write() method. You can pass any kind of array to it, since outputting binary data is just as valid on the low level as outputting a string.

If you aren't outputting UTF-8 encoded HTML though, you should first call cgi.setResponseContentType(), giving the MIME type of your output as the argument.

Indeed, all headers should be set before the first time you call cgi.write()! See the next section for more information.

The second argument to cgi.write() is a boolean telling if you outputted your entire response at once. This is never required, but may help the library optimize its network output.
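For instance, a handler that outputs CSV rather than HTML might look like the sketch below. The toCsv helper and the sample rows are made up for illustration, and the helper does no quoting or escaping.

```d
import arsd.cgi;
import std.array : join;

// Hypothetical helper: renders rows as very simple CSV
// (no quoting or escaping of commas inside fields).
string toCsv(const string[][] rows) {
    string result;
    foreach (row; rows)
        result ~= row.join(",") ~ "\n";
    return result;
}

void csvExample(Cgi cgi) {
    // Not HTML, so declare the MIME type before the first write.
    cgi.setResponseContentType("text/csv");

    auto data = [["id", "name"], ["1", "Ada"], ["2", "Grace"]];

    // true: this single call carries the complete response body.
    cgi.write(toCsv(data), true);
}

mixin GenericMain!csvExample;
```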

You'll probably want to use a template system most of the time instead of calling cgi.write() directly all over your program. cgi.d does not provide one. I often use my dom.d as a template. Start with a static html file, use the dom library to add dynamic data to it, then do a cgi.write(document.toString(), true); once at the end of the program.

You're free to do it however you want though.

Note: if you write portions at a time and want to ensure each one is sent to the user's browser, call cgi.flush() after calling cgi.write(). If you don't, the output may be automatically buffered.
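A small sketch of that pattern, with a sleep standing in for slow work. The step names are invented, and keep in mind that the web server or a proxy in front of it may still add buffering of its own.

```d
import arsd.cgi;
import core.thread : Thread;
import core.time : seconds;

void progressExample(Cgi cgi) {
    cgi.setResponseContentType("text/plain");

    foreach (step; ["Loading data", "Crunching numbers", "Done"]) {
        Thread.sleep(1.seconds); // stand-in for real work

        cgi.write(step ~ "\n");
        cgi.flush(); // ask the library to send this portion now
    }
}

mixin GenericMain!progressExample;
```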

Response Headers

An http response consists of headers and a response body. (It can also include footers, but they are rare, of limited use, and not currently wrapped by cgi.d anyway.)

All headers must be set before you call cgi.write() the first time. If you try to set a header after calling cgi.write, it will trigger an assert failure.

The library tries to set sensible defaults, so if you're just outputting dynamic html, you don't have to set any headers at all. Just skip to using cgi.write(). See also: the sections on caching and cookies, though.

There's two ways to set headers in cgi.d: specific calls for certain types and a generic header(string); call for everything else. When possible, you should use the specific calls.

You should avoid setting Content-Length or Transfer-Encoding yourself unless you know better, because the library or the web server usually takes care of these for you.

Do not use header() to set the status line. Instead, use setResponseStatus.
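Putting those rules together, a maintenance page might be sketched like this. It assumes header() takes one complete "Name: value" line and that setResponseStatus takes the status code followed by its text. The Retry-After value is arbitrary.

```d
import arsd.cgi;

void maintenancePage(Cgi cgi) {
    // All headers first: specific calls where they exist...
    cgi.setResponseStatus("503 Service Unavailable");
    cgi.setResponseContentType("text/plain");

    // ...and the generic header() for everything else.
    // Assumption: it expects a full "Name: value" line.
    cgi.header("Retry-After: 120");

    // Only then the body. Setting headers after this point is an error.
    cgi.write("Down for maintenance, back in a couple of minutes.\n", true);
}

mixin GenericMain!maintenancePage;
```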

The next sections will discuss some of the special header functions in more detail. Whenever possible, you should use these functions because then, your intent is clear, it can keep the output cleaner (no duplicates, etc.), and can sometimes use additional logic or structure to make sure it gets it right.

Status lines

cgi.setResponseStatus(), explain what some common ones need and why you probably don't need to use this in your app.

Authentication

Add user/pass to cgi.d itself. Discuss requireBasicAuth and Apache's silliness. Explain why it isn't necessarily what you want. Link to oauth.d.

Redirects

cgi.setResponseLocation(). Discuss why the alternatives suck. Point out why the second parameter exists.
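A minimal sketch of a redirect, using only the first parameter: if the request has no id, send the browser to a default one. The id parameter is invented for the example, and the sketch assumes that returning without writing a body is fine because GenericMain closes the response and sends the headers.

```d
import arsd.cgi;

void redirectExample(Cgi cgi) {
    if ("id" !in cgi.get) {
        // Location is a header like any other: set it before any write().
        cgi.setResponseLocation(cgi.scriptName ~ "?id=1");
        return; // no body needed for a redirect
    }

    cgi.setResponseContentType("text/plain");
    cgi.write("Showing item " ~ cgi.get["id"] ~ "\n", true);
}

mixin GenericMain!redirectExample;
```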

Cookies

cgi.cookies[] and cgi.setCookie
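As an illustration, here is a sketch of a visit counter that reads the old value from cgi.cookies and stores the new one with cgi.setCookie. The cookie name and the nextVisit helper are invented, and only the two-argument form of setCookie (name, value) is used, which should produce a session cookie.

```d
import arsd.cgi;
import std.conv : to, ConvException;

// Hypothetical helper: a missing or malformed cookie counts as zero visits.
int nextVisit(string previous) {
    try {
        return to!int(previous) + 1;
    } catch (ConvException) {
        return 1;
    }
}

void cookieExample(Cgi cgi) {
    // Cookies sent by the browser arrive as plain strings.
    string previous;
    if (auto p = "visits" in cgi.cookies)
        previous = *p;

    int visits = nextVisit(previous);

    // Setting a cookie adds a response header, so do it before write().
    cgi.setCookie("visits", to!string(visits));
    cgi.setResponseContentType("text/plain");

    cgi.write("Visit number " ~ to!string(visits) ~ "\n", true);
}

mixin GenericMain!cookieExample;
```

Never trust the value, of course: the user can edit cookies freely.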

Caching

HTTP has support for client-side caching, based on tags or dates, conditional or absolute, on any given url retrieved with the GET request method. This saves the user a lot of time by allowing them to skip asking your server at all in some situations! However, in my personal experience, it's rare for a dynamic web app to take advantage of this capability, meaning they are slower than they have to be.

cgi.d aims to make HTTP caching easy to use, in both simple and complex applications. I've found websites at least feel faster even if the cache only lasts a minute or two.

Currently, cgi.d only addresses absolute expiration dates on cached pages. This is still enough for many sites though.

Trivial caching - always or never

The simplest form of cache control in cgi.d is the setCache() method. It takes a boolean.

True means always cache publicly, never expire. Use that if your output for this URL is never going to change. (Conceptually, if your application is strongly pure - all output dependent on nothing but the URL and GET variables - you can use setCache(true).)

My gradient generator, for example, uses setCache(true) because the generated image is always the same given the same input parameters.

False means never cache. If your page is private to the viewing user and/or changes very rapidly and those changes must be reflected as they happen, use setCache(false). A call to setCache(false) will turn future cache calls into no-ops - it means never, even if another part of your app is cachable. This strong statement means you can use cache calls in individual functions of the application, worrying only about their own behavior, and you still get the right result for the aggregate.

This makes caching a lot easier to use. Since you can use cache calls in individual functions, it'd be nice to have more control over it than always/never, right? That's where expiration dates come in.
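Before getting to dates, here is a sketch of the always-cache case: a page whose output depends on nothing but a GET variable. The parameter n and the squareLine helper are made up for the example.

```d
import arsd.cgi;
import std.conv : to, ConvException;

// Hypothetical helper: a pure function of its input,
// so its output is safe to cache forever.
string squareLine(int n) {
    return to!string(n) ~ " squared is " ~ to!string(n * n) ~ "\n";
}

void squareExample(Cgi cgi) {
    // Cache control is sent as headers, so call it before write().
    cgi.setCache(true);
    cgi.setResponseContentType("text/plain");

    // Read only from the GET variables, so the URL fully determines the output.
    int n = 0;
    if (auto p = "n" in cgi.get) {
        try {
            n = to!int(*p);
        } catch (ConvException) {
            // keep the default of 0 for malformed input
        }
    }

    cgi.write(squareLine(n), true);
}

mixin GenericMain!squareExample;
```

If any helper called along the way had used setCache(false), for example because it printed the logged-in user's name, that call would win.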

Setting Expiration Dates

cgi.setResponseExpires, why the buffering algorithm works the way it does and how even little things can help a lot - esp. with javascript. Briefly explain how I haven't gotten around to moving to std.datetime yet.

Sessions

cgi.d does not provide any kind of sessions at this time, aside from simple cookies. I've been using the database instead, storing a session id in a cookie.

While I've considered a PHP style session object or using a signed and encrypted cookie, I haven't felt a big need for it yet, so it doesn't exist at this point.

I recommend setting a HTTP only cookie with the session id in it, and using that session ID as a key into a database table designed for your app with your session data in it.

Alternatively, if you use dom.d, passing data around via forms might be preferable while still being trivially easy. (of course this can be preferable even with full server side sessions for certain applications!) [FIXME: link to an explanation on how and why to do that]

Next Steps

With cgi.d alone, you can write a web app, faster, more easily, with better performance, and more correct than with most (if not all!) the other language alternatives out there.

But, you'll probably want to use a database, a template system, and maybe even a higher level library for accessing functions through the web.

Some libraries I've written that you might find useful are:

database.d (and the associated mysql.d, postgres.d, or sqlite.d)
This provides access to relational databases.
dom.d
This provides a Javascript inspired DOM to the server side. I use it as the base for my own templating system.
web.d
This automatically generates code to make a class' methods available via the web interface. The goal is to use D's reflection to automate as much work as possible. It's built on top of cgi.d and, to a lesser extent, dom.d.

See the next Alternative libraries section and the See Also section at the end of the document for more.

Other libraries for writing web apps in D

I tried to keep cgi.d generally useful both on its own and as a foundation for other libraries. But, if it isn't right for you, fear not! There's other options available for D as well.

Among the ones I've heard of are:

If you've written web libraries for D, send me a link to and a brief description of your product and I'll include it here too.

Appendix

CGI Speed

The costs associated with CGI are very low and are usually made up for very quickly thanks to D's superior speed over other web app platforms.

I've found the difference to be negligible on most servers. Don't blindly believe that CGI is slow! However, if your data proves it is too slow for you specifically, it's no problem - try switching to FastCGI - as simple as a recompile - and see if the situation improves.

However, there is no magic bullet to writing fast programs. You'll want to profile to ensure you're attacking the right bottleneck!

CGI Setup

In addition to using cgi-bin, you can set individual files to run with the CGI protocol in other folders on Apache.

To do this, open .htaccess and write:

Options +ExecCGI
SetHandler cgi-script

Be sure "AllowOverride All" is set in your main httpd.conf file for the server for this to work.

You can also "SetHandler fcgid-script" to use FastCGI, if installed on your server. You'll want to compile with "-version=fastcgi" when building your app.

IIS Setup

Coming later... short of it is check the handlers in the web site section and then you might have to change CGI restrictions on the main iis setup.

Examples

See Also

Miscellaneous notes of tangential interest

The cgi.d module was originally an offshoot of my bugbar project, which uses an embedded http server written from scratch. The Cgi class actually spoke a fair amount of http for that little server, which is why much of the code goes beyond what's required of a simple Cgi wrapper.

I've found these capabilities to be useful though - not only did it give me a better understanding of the underlying http protocol, but it also ensured the Cgi class is flexible. So, when I added FastCGI support to it, it was very easy!

Eventually, I might add other bottom-ends. If you use the guidelines I've determined that work with the existing implementations, your programs should get these for free with nothing but a recompile.

Reporting bugs, requesting features, and about the author

Email me, Adam D. Ruppe, at destructionator@gmail.com if you find a bug or need an extra feature. Please include [cgi.d] in the subject line.