Designing Next-Generation Applications

- or -

Application Development in Crisis

Linas Vepstas <linas@linas.org>

April 2003

Abstract

Any programmer, systems architect or CTO/VP Technology who has tried to develop a sophisticated software application in the last 5-10 years knows that application development is hard, time-consuming and costly. It does not matter if the application is commercial software intended for retail or industrial use (Microsoft, Oracle), an 'in-house' business application (IBM Global Services), a multi-user weblication, or an open source KDE/Gnome desktop software project. There seems to be no practical way of scaling any given software technology to accomplish the tasks and implement the features that are easily envisioned by the programmer/user/visionary. This essay tries to review what the problems are, and hopefully to find, out of the fog of war, a strategy for dealing with things and a few ideas on how things might get better.

This essay deals primarily with the issues facing Free Software developers, and in particular, desktop and server application developers. I personally participate in several Free Software projects, and my personal motivation for writing this is to better clarify the issues so that I can make a rational technology choice for the ongoing development of these projects. I hope to touch on the following technologies: SQL and databases in general, GUIs such as Gnome, KDE or XHTML, languages such as Java and Scheme, and what one can expect from operating systems. I want to think about accounting systems, bug tracking systems, workflow systems, CRM, B-to-B, ERP, to-do list managers, project management systems, chat and collaborative systems, peer-to-peer and digital cash systems. All of these seemingly very diverse systems in fact share a lot of conceptual commonality. The fact that this commonality is not apparent to most programmers is due in large measure to the broken-ness and inadequacy of the software development environment and software tools that we have today.

This is a draft, a work in progress.

Introduction

I want to start explaining what the problems are by explaining some of the projects I'm involved in, and some of the projects I'd like to be involved in, and pointing out what barriers these projects face. By necessity, these are "small" projects, with "small" development teams. This is not only because most Open Source software projects are "small", but also because there are severe economic and technical constraints working against "large" projects. Medium and even large corporations often find it hard to justify the costs of development for large projects, and even when there is a good business plan to back the effort, projects often founder under the weight of complexity. This is no accident, and indeed is the kernel of this essay: "small" teams can function efficiently and effectively in ways that "large" teams cannot. I put quotes around "small" because "small" can mean 50 to 100 people: a dozen developers, plus a dozen test and performance people, plus a dozen 2nd level support/bug fixers, plus even more individuals playing all sorts of sales, marketing, planning and support roles. The key here is that "small" refers to the total number of architects and lead/core developers, which by necessity must be fewer than a dozen. "Medium" and "large" projects are those that attempt to put more than a dozen or two developers on a project. These are projects that flirt dangerously with Brooks's Law: putting more engineers on a late project will make it later.

One "small" project I'm involved with is GnuCash, a personal financial organizer, similar to Quicken, but for Unix systems. The similarity is not just in the features it provides, but also in its design and heritage. It is firmly rooted in the classical desktop paradigm. It uses a standard desktop GUI toolkit (Gnome) for its buttons and sliders and menus, although its report-generation subsystem uses HTML under the covers. It was believed that using HTML would make the reports easier to customize, and prettier to look at (a hope that remains unrealized). It can save its data to a file, although it can also use an SQL database to store its data; and, by means of the expressiveness of SQL, it is able to support multiple users. It has some embryonic business accounting features, although it lacks many other features that accountants and even home users might take for granted. It has some online capabilities: it supports the German Home Banking Computer Interface (HBCI), it can import OFX (Open Financial Exchange XML) files, and it uses XML for its native file format. It is written in C and Scheme, and uses HTML, XML and SQL.

GnuCash

The Apps

Here are the apps I've worked on. Let me explain why each one seems to balloon out of control, why one single individual cannot carry them to fruition, and why even a "small team" of dozens of programmers cannot possibly carry them to fruition. All this is a symptom of the limitations of today's programming technologies.

GnuCash
GTT
WISE & workflow arches
Remote-sensor data gathering system
warehouse stock management system
Data Driven application programming
MICROS~1 db builder
Distributed storage
peer to peer

Similarities and Differences

Storage & Persistence

I am soo pissed off that I have to invent C structs, and I have to invent SQL DB schemas, and I have to write the glue code that goes from one to the other, and back again. Maybe I should be coding using object-oriented databases. But this has faults, the most obvious of which is a lack of standards & portability.

I am further pissed off that I have to convert C structs into XML, and back again.
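
To make the complaint concrete, here is a minimal sketch of the glue in question, written in Python rather than C for brevity. The `Account` record, its fields and the table name are hypothetical, invented for illustration; they are not GnuCash's actual data model. Note that the same three fields end up spelled out in the record, in the schema, and in each of the four converters.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass


# Hypothetical record, standing in for a C struct.
@dataclass
class Account:
    guid: str
    name: str
    balance_cents: int


# The same fields again, this time as an SQL schema.
SCHEMA = "CREATE TABLE account (guid TEXT PRIMARY KEY, name TEXT, balance_cents INTEGER)"


def save_sql(db: sqlite3.Connection, acct: Account) -> None:
    # Glue, direction one: record -> row.
    db.execute(
        "INSERT OR REPLACE INTO account (guid, name, balance_cents) VALUES (?, ?, ?)",
        (acct.guid, acct.name, acct.balance_cents),
    )


def load_sql(db: sqlite3.Connection, guid: str) -> Account:
    # Glue, direction two: row -> record.
    row = db.execute(
        "SELECT guid, name, balance_cents FROM account WHERE guid = ?", (guid,)
    ).fetchone()
    return Account(guid=row[0], name=row[1], balance_cents=row[2])


def to_xml(acct: Account) -> str:
    # And again for XML: record -> element tree -> text.
    elem = ET.Element("account", guid=acct.guid)
    ET.SubElement(elem, "name").text = acct.name
    ET.SubElement(elem, "balance_cents").text = str(acct.balance_cents)
    return ET.tostring(elem, encoding="unicode")


def from_xml(text: str) -> Account:
    # ... and back: text -> element tree -> record.
    elem = ET.fromstring(text)
    return Account(
        guid=elem.get("guid"),
        name=elem.findtext("name"),
        balance_cents=int(elem.findtext("balance_cents")),
    )


# Round trip: record -> SQL -> record -> XML -> record.
db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
original = Account(guid="a1", name="Checking", balance_cents=12345)
save_sql(db, original)
round_tripped = from_xml(to_xml(load_sql(db, "a1")))
```

Adding a fourth field to `Account` means touching every one of these pieces by hand, which is exactly the tedium described above.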

SQL
Objects that save themselves. Smalltalk. The versioning problem. I want to hard-code the ontology and the ontological relationships, but want to leave many details open to the user or application programmer to extend and modify. There is no coding framework that helps me make these decisions.
Funny thing: we have GUI's like Gnome that hard-code certain things, for example, that buttons can only be pushed, and have only a few 'clicked-on' callbacks, while leaving other things completely configurable: the button title, the button shape and appearance (via themes). Why is it that there is no comparable infrastructure for REA accounting ontologies?
The query problem. The Q in SQL.
Stove-pipes, and why most app programming today is mostly stove-pipes. By stove-pipe, I mean converting from C or C++ class/struct to SQL, and back. Or converting from LDAP to my native format. Or converting OFX to what GnuCash wants. Or in general doing format conversion between space and time (storage and procedure).
Ontology. What is it? What do I mean when I use this word?
The ability for a (power) user to modify and extend the visual display as well as the type & nature of data that is stored.

Data vs. GUI

Model-view-controller, yada yada, yada. It's surprisingly time consuming to connect a good GUI to a database so that it manipulates that data in a reasonable, intuitive fashion. And, what's worse, for the most part, a procedural connection between what's in the database, and what's shown in the GUI, is flatly not needed.

Based on personal experience, most of this can be done much better with a declarative language. This is what I try to do with the DUI/DWI designer for Gnome/Glade GUI's. But in fact, DUI/DWI is a symptom of the problem: there are no, none, zero popular declarative languages. Most are procedural (python, perl, C) or evaluative (scheme, lisp). Neither of these language categories is suitable for expressing what DUI/DWI is trying to do. I picked XML as the markup language, since XML is about as close as we come to a popular declarative language with bindings to procedural languages. But, for DWI/DUI, XML is clunky. Really clunky. A "real" programming language is one where the programmer wants to create it in a text editor. A sucky programming language is one where you want to program it with a GUI WYSIWYG tool. Writing XML by hand sucks; almost everyone wants to write XML with a WYSIWYG tool. This is a symptom of the fact that XML is too clunky to be used elegantly.
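
Here is a minimal sketch of the declarative idea, in Python, which assumes nothing about how DUI/DWI actually works: the widget names, the `BINDINGS` table and the plain dict standing in for a form are all hypothetical. The point is only that the mapping between database columns and widgets is stated as data, and one generic routine interprets it, so no per-field procedural code gets written.

```python
import sqlite3

# Declarative part: which column of which table feeds which widget.
# All names here are made up for illustration.
BINDINGS = {
    "table": "customer",
    "key": "id",
    "fields": [
        {"widget": "name_entry", "column": "name"},
        {"widget": "phone_entry", "column": "phone"},
    ],
}


def fill_form(db, bindings, key_value):
    """Generic interpreter: read one row and return {widget name: value}."""
    # Table and column names come from the trusted declaration above;
    # only the key value is user data, so only it is a bound parameter.
    columns = ", ".join(field["column"] for field in bindings["fields"])
    sql = f"SELECT {columns} FROM {bindings['table']} WHERE {bindings['key']} = ?"
    row = db.execute(sql, (key_value,)).fetchone()
    return {field["widget"]: value for field, value in zip(bindings["fields"], row)}


def store_form(db, bindings, key_value, form):
    """Generic interpreter, other direction: write widget values back."""
    assignments = ", ".join(field["column"] + " = ?" for field in bindings["fields"])
    values = [form[field["widget"]] for field in bindings["fields"]]
    sql = f"UPDATE {bindings['table']} SET {assignments} WHERE {bindings['key']} = ?"
    db.execute(sql, values + [key_value])


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
db.execute("INSERT INTO customer VALUES (1, 'Ada', '555-0100')")

form = fill_form(db, BINDINGS, 1)   # a plain dict stands in for real widgets
form["phone_entry"] = "555-0199"    # the "user" edits one field
store_form(db, BINDINGS, 1, form)
```

Adding a field to the form is then a one-line change to `BINDINGS`; neither `fill_form` nor `store_form` needs to be touched.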

Desktop vs. Webtop

The difference between programming applications in Gnome, and applications in PHP.

The GUI issue: visual appearance & responsiveness of traditional desktop client vs. a web-top client. Can this issue truly be solved? Why can't Java applets solve this issue?
The data movement issue: where is the data? Is it cached in the app? Is it stored at the server? Can it be peer-to-peer distributed? How is interactivity and ease of use affected by caching? What's the bandwidth? Why are there no transactional local data-caching API's and interfaces?
Multi-user issues lead to the traditional client-server coding model, and what happens is that inadequate client-server protocols are invented out of thin air, and are badly documented, and sometimes turn into RFC's, and become monsters in their own right.
Why CORBA sounds great in principle, but sucks in practice. CORBA (and RPC's, to some extent) was invented to keep the programmer from having to invent a custom protocol and document it in an RFC. But it falters. May SOAP avoid its traps, though it's possible that it will not.
The versioning problem. How the protocol-version problem of CORBA or SOAP or RPC's is a variant of the object-versioning problem of persistent-object architectures. Why hasn't the object-versioning/protocol-versioning problem been solved yet? Note that even traditional SQL database schemas have versioning problems. The data-caching problem exacerbates the versioning problem: It's bad enough when an app at one version wants to connect to a server at another version. It's even worse when there are more intermediaries, or if the data is cached on a peer-to-peer distributed storage. If my "database schema" is on one-hundred thousand peer-to-peer nodes, how can I make changes to it? How do I upgrade?
Why does Java suck, let me count the ways.
Languages vs. Libraries. How language design is like library design, and how it's different. Programming languages 100 years from now. Garbage collection: language, not library. Semantic closure: language, not library. Versioning, persistence, data caching, distributed messaging: are not these candidates for language, as opposed to library?
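
The schema-versioning half of the problem can at least be illustrated in the easy, single-database case. The sketch below (Python with SQLite; the `txn` table and the two migration steps are hypothetical) stamps the stored data with a version number and upgrades it stepwise when a newer application opens it. It says nothing about the hard cases raised above, such as intermediaries, caches and peer-to-peer copies, where there is no single place to run the upgrade.

```python
import sqlite3

# Ordered upgrade steps; entry N takes the schema from version N to N+1.
# Both the table and the steps are invented for illustration.
MIGRATIONS = [
    "CREATE TABLE txn (id INTEGER PRIMARY KEY, amount_cents INTEGER)",
    "ALTER TABLE txn ADD COLUMN currency TEXT DEFAULT 'USD'",
]
APP_SCHEMA_VERSION = len(MIGRATIONS)


def open_and_upgrade(db):
    """Bring the stored schema up to the version this application expects."""
    stored = db.execute("PRAGMA user_version").fetchone()[0]
    if stored > APP_SCHEMA_VERSION:
        # An old app meeting newer data: this sketch simply refuses.
        raise RuntimeError(
            f"data is at version {stored}, app only knows {APP_SCHEMA_VERSION}"
        )
    for version in range(stored, APP_SCHEMA_VERSION):
        db.execute(MIGRATIONS[version])
        # PRAGMA does not accept bound parameters; version is a trusted int.
        db.execute(f"PRAGMA user_version = {version + 1}")
    db.commit()
    return stored


db = sqlite3.connect(":memory:")
db.execute(MIGRATIONS[0])               # simulate data written by a version-1 app
db.execute("PRAGMA user_version = 1")
db.execute("INSERT INTO txn (amount_cents) VALUES (500)")

upgraded_from = open_and_upgrade(db)
row = db.execute("SELECT amount_cents, currency FROM txn").fetchone()
```

Even here the version check is bolted on by hand as library-level convention, which is part of the argument for asking whether versioning belongs in the language instead.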

Ontology

An ontology is like a database schema, in that it defines objects and their inter-relations. But a true ontology is different:

"This definition resembles the traditional description of a database conceptual schema; however, it does differ in at least three important ways: objective, scope and content. First, the objective of an ontology is to represent a conceptualization that is shareable/reusable and where idiosyncrasies of specific applications are ignored. Second, the scope of an ontology is all applications in the domain, not just one. And finally, an ontology contains knowledge specifications where the meaning of the structures represented is explicitly specified and constrained and where the rules to infer further knowledge are explicitly defined." (Quoted from "The Ontological Foundation of REA Enterprise Information Systems", Geerts & McCarthy, 2000.)