Literate programming is an old idea: Donald Knuth invented it at the end of the 1970s to write TeX, his typesetting system.[1] The main idea of literate programming is to document a computer program while writing it, by interleaving pieces of code and explanation. In other words, instead of writing a program, one writes a book that documents and contains the program. Two programs are then used to extract (i) the compilable code and (ii) the book, ready to be printed.

The main philosophy behind literate programming is that:

  • One should document one's program;
  • By putting program documentation very close to the program text, we significantly improve our ability to update the documentation when the program changes.

Where we are

Several literate programming systems now exist. Most of them are descendants of Knuth's original WEB system. They can be either generic, like noweb, or specifically tailored to one programming language, like OCamlWeb for OCaml. The main idea of those tools is to write a LaTeX document and put pieces of code in it. For example, I have used noweb to write demexp, a special voting system for massive direct democracy. The OCaml code of demexp can be read like a book.
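
To make the extraction step concrete, here is a minimal sketch of how the compilable code can be pulled out of such a document. It assumes noweb-style chunk markers (a `<<name>>=` line opens a code chunk, a line starting with `@` closes it, and `<<name>>` inside a chunk refers to another chunk). The function names `parse_chunks` and `tangle` and the sample document are mine, chosen for illustration; this is not how noweb itself is implemented, and it ignores most of noweb's features.

```python
import re

# Assumed, simplified noweb-style markers (illustrative only).
CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")      # "<<name>>=" opens a code chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")  # "<<name>>" alone on a line is a reference


def parse_chunks(source):
    """Collect code chunks by name; documentation lines are skipped."""
    chunks = {}
    current = None
    for line in source.splitlines():
        m = CHUNK_DEF.match(line)
        if m:
            current = m.group(1)
            # Chunks defined several times under one name are concatenated.
            chunks.setdefault(current, [])
        elif line == "@" or line.startswith("@ "):
            current = None  # back to documentation
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand chunk `name` recursively into plain code lines.

    Cyclic references are not detected in this sketch.
    """
    out = []
    for line in chunks[name]:
        m = CHUNK_REF.match(line)
        if m:
            # Keep the indentation of the line holding the reference.
            out.extend(tangle(chunks, m.group(2), indent + m.group(1)))
        else:
            out.append(indent + line)
    return out


# Hypothetical literate source: prose and code chunks interleaved.
DOC = """\
The program prints a greeting.
<<main>>=
def main():
    <<say hello>>
@
The greeting itself is a single call.
<<say hello>>=
print("hello")
@
"""

code = "\n".join(tangle(parse_chunks(DOC), "main"))
print(code)
```

The second extraction, the one producing the book, would walk the same file but keep the prose and typeset the chunks instead of discarding the documentation lines.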

The main issue with those tools is that you need to write a LaTeX document. You cannot use your preferred (or mandatory) IDE to edit the code, and they are rather cumbersome to use in a Windows environment.

That's why we have developed Lp4all. Lp4all is a literate programming system where the documentation is put within the language's comments, using a special syntax close to a Wiki one. From those special comments one can generate web pages, like those for Lp4all itself.

However, I'm not using Lp4all!

I'm not using it because, in my last document at work, I needed all the power of LaTeX:

  • Advanced LaTeX packages to make special figures, add hyperlinks to the generated PDF document, ...
  • LaTeX's ability to create special counters for dedicated cross-references;
  • Ability to pretty-print a relatively confidential programming language;
  • etc.

So, on the one hand, I need the flexibility of Lp4all and its ability to integrate with any IDE or development environment. On the other hand, I need a sophisticated documentation system that gives me all the tools to fine-tune my documents and produce quality output. Moreover, Lp4all is limited to a single program, while I need to generate various documents from a single source.

Where we should aim

(or at least some ideas about it ;-)

One simple solution to the above dilemma would be to add LaTeX capabilities to Lp4all. Some tools, like ocamldoc (the OCaml equivalent of JavaDoc), follow this approach, with extensions to mark parts written in LaTeX syntax. This is probably the most practical approach. After all, Wikipedia uses LaTeX syntax for mathematical formulas.

However, I find this approach unsatisfying.

First of all, LaTeX is a very fragile system (one package can break another) and is very complex to extend. It would be much simpler if we could have a documentation system built upon a sane language.

Moreover, to really exploit the full potential of literate programming, one needs to produce and track dependencies between several documents. For example, if I modify the specification, I need to modify some parts of the code. Conversely, if I change my code, I need to know which parts of the "specification" have to be updated. This tracking between pieces of code and documentation extends to tests, user documentation, code documentation, test reviews, Q&A documents, GUI design, mathematical scripts describing an algorithm, formal models to check the software, etc. More importantly, if I modify one of these pieces, I need a list of all the document parts I have to review and possibly update: a kind of make on steroids.
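
To give a rough idea of what this "make on steroids" could compute, here is a small sketch. It assumes that the links between parts are recorded as a simple mapping from one part to the parts that must be reviewed when it changes. All the names in it (`parts_to_review`, `LINKS`, the file and section identifiers) are hypothetical and only serve the illustration; a real system would also need to store the links somewhere and decide what counts as a "part".

```python
from collections import deque


def parts_to_review(links, changed):
    """List every part reachable from `changed` through `links`.

    `links` maps a part to the parts that depend on it directly.
    The result is in breadth-first order: direct dependents first,
    then the parts that depend on those, and so on.
    """
    seen = {changed}          # also protects against cycles in the links
    queue = deque([changed])
    order = []
    while queue:
        part = queue.popleft()
        for dep in links.get(part, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


# Hypothetical links between specification, code, tests and documentation.
# The code points back to the specification, so a change on either side
# puts the other one on the review list.
LINKS = {
    "spec/voting-rules": ["code/tally.ml", "doc/user-guide#voting"],
    "code/tally.ml": ["test/tally_test.ml", "doc/code-notes#tally", "spec/voting-rules"],
    "test/tally_test.ml": ["review/test-review#tally"],
}

for part in parts_to_review(LINKS, "spec/voting-rules"):
    print("review:", part)
```

Unlike make, nothing can be rebuilt automatically here: the output is only a checklist for a human, which is exactly the point.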

A third point is that one needs more graphical interfaces: it would be easier to graphically link a piece of code to the document paragraph describing it than to create a LaTeX label and reference it later in the document. Moreover, documents nowadays use graphics extensively, and we need to link pieces of code and documentation to those graphics. For example, it would be interesting to link the drawing of an automaton to the set of functions implementing this automaton in the code. And when navigating through the software, it is crucial to be able to go from a given automaton state in the drawing to the code implementing it, and vice versa.

And of course, one needs system-agnostic documentation software, running on many platforms and working with all possible kinds of IDEs and languages.

Overall, one needs an integrated documentation system that is able to encompass all the kinds of artefacts that make up a piece of software and that can be used to maintain those artefacts along the whole life of the software, from design to maintenance. And this system should be flexible enough not to tie you to a fixed development environment, letting you choose the tools that you prefer.

An interesting proposal in that regard is Scribble, a system for writing library documentation, user guides and tutorials. Scribble lets you write LaTeX-like documents, JavaDoc-like documents to document libraries, or WEB-like literate programming documents. However, Scribble is tightly linked to the Scheme language and thus lacks the important platform-independence feature.

As you can see, my proposed ideas are rather general and I am very far from being able to specify the documentation software I would like to have. I know for sure that the current documentation systems are unsatisfactory, and I hope somebody will one day propose a better system, matching the needs we have to build better software. I would prefer not to have to implement it myself. ;-)


[1] Knuth's literate programming system was called WEB, long before the Web was invented. :-)