In January of 2003, we published an interview with virtual reality pioneer Jaron Lanier (http://java.sun.com/features/2003/01/lanier_qa1.html) that raised some basic questions about programming: Is there something fundamentally misguided about the way we write programs today? Why is it so difficult, if not impossible, to write bug-free programs that contain more than 20 to 30 million lines of code? Do we need a radical new paradigm shift in programming? If so, what might it look like?

The interview provoked a strong response, both inside and outside of Sun Microsystems. One response came from Sun's Victoria Livschitz, a senior IT architect and Java Evangelist who has an interesting history. Livschitz grew up in Lithuania, where she was the women's chess champion and a National Chess Master in 1988 -- the same year in which she won the prestigious Russian national junior mathematical competition. She studied applied mathematics at Kharkov University in the Ukraine before coming to the US, where she subsequently received a degree in Computer Science from Case Western Reserve University. After a four-year stint at the Ford Motor Company, she came to Sun in 1997, where she has served as principal architect on several high-profile eCommerce and EAI projects, while managing all aspects of Sun's technical presence at General Motors. In 2001, she was named System Engineer of the Year for the company's Central Area, and won the Trusted Advisor Award at Sun. In addition, she is a founding member of the World Wide Institute of Software Architects.

We met with her recently to talk about programming, chess, and other challenging matters.

Playing Chess and Creating Software
The parallel between chess and programming is rather obvious. Programming is also about knowledge, creativity, and technique. Good programmers must have a vast body of knowledge at their fingertips: the programming syntax of one or more languages, standard and special-purpose data structures, typical (as well as advanced) coding techniques, many kinds of libraries and APIs, a multitude of design patterns, and so on. Good programmers use their creative vision to recognize many patterns that may be relevant to the solution of the specific design problem at hand, and correctly choose the best approach. Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless.

It's not surprising that so many people at Sun like chess. I am actually hoping to organize an internal chess championship sometime this year. Seems like that would be a lot of fun. Of course, chess is also a sport, which demands unique talents and qualifications that are required of a champion, such as willpower, stamina, the ability to take risks, and so forth; competitive chess is really more similar to football than programming. Then again, coding marathons to meet deadlines are very much like chess tournaments, don't you think?

Software Problems
And here's what's really sad -- the overwhelming majority of so-called "successful" development projects produce mediocre software. Take almost any corporate accounting application, and you'll find it poor in quality, unimpressive in capabilities, difficult to extend, misaligned with other enterprise systems, technologically obsolete by the time of release, and functionally identical to dozens of other accounting systems. Hundreds of thousands of dollars are spent on development, and millions afterwards on maintenance -- and for what? From an engineering standpoint, zero innovation and zero incremental value have been produced.
Jaron's emphasis on "pattern recognition" as a substitute for the rigid, error-prone, binary "match/no match" constructs that are dominant in today's programs is intriguing to me, especially because I've always thought that the principles of fuzzy logic should be exploited far more widely in software engineering. Still, my quest for the answer to Jaron's question seems to yield ideas orthogonal to his own.
I can see two reasonable ways to create complex programs that are less susceptible to bugs. As in medicine, there is prevention and there is recovery. Both the objectives and the means involved in prevention and recovery are so different that they should be considered separately.

The preventive measures attempt to ensure that bugs are not possible in the first place. A lot of progress has been made in the last twenty years along these lines. Such programming practices as strong typing that allows compile-time assignment safety checking, garbage collectors that automatically manage memory, and exception mechanisms that trap and propagate errors in a traceable and recoverable manner do make programming safer. The Java language, of course, personifies the modern general-purpose programming language with first-class systemic safety qualities. It's a huge improvement over its predecessor, C++. Much can also be said about the visual development tools that simplify and automate more mundane and error-prone aspects of programming.

Having said that, these technological advances are still inadequate in dealing with many categories of bugs. You see, a "bug" is often just a sign of recognition that a program is behaving undesirably. Such "undesirability" may indeed be caused by mechanical problems in which code does something different from what it was intended to do. But all too often the code is doing exactly what the programmer wanted at the time, which (in the end) turned out to be a really bad idea. The former is a programming bug, and the latter a design bug, or in some exceptionally lethal cases, an architectural bug. The constant security-related problems associated with Microsoft's products are due to its fundamental platform architecture. Java technology, in contrast, enjoys exceptional immunity to viruses because of its sandbox architecture.
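As an editorial illustration of the exception mechanisms mentioned above, the following is a minimal Java sketch of an error being trapped, propagated with its original cause attached, and recovered from by the caller. The class `PortParser` and its methods are hypothetical names invented for this example; only `Integer.parseInt`, `NumberFormatException`, and `IllegalArgumentException` come from the standard library.

```java
// Hypothetical example class; the names are invented for illustration.
class PortParser {

    // Trap the low-level failure and propagate it with added context.
    // Passing the original exception as the cause keeps the error traceable.
    static int parsePort(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("invalid port: " + raw, e);
        }
    }

    // Recovery: the caller decides what a sensible fallback is.
    static int portOrDefault(String raw, int fallback) {
        try {
            return parsePort(raw);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }
}
```

Note that this only guards against the mechanical kind of failure. If the fallback port is itself the wrong choice for the application, that is a design bug in the sense described above, and no exception will report it.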
I don't believe that future advances in software engineering will prevent developers from making mistakes that lead to design bugs. Over time, any successful software evolves to address new requirements. A piece of code that behaved appropriately in previous versions suddenly turns out to have deficiencies -- or bugs. That's OK! The reality of the program domain has changed, so the program must change too. A bug is simply a manifestation of the newly discovered misalignment. It must be expected to happen, really! From that vantage point, it's not the prevention of bugs but the recovery -- the ability to gracefully exterminate them -- that counts.

In regard to recovery, I can't think of a recent technological breakthrough. Polymorphism and inheritance help developers write new classes without affecting the rest of the program. However, most bug fixes require some degree of refactoring, which is always dangerous and unpredictable.

Fighting Software Complexity
Things appear simple to us when we can operate intuitively, at the level of consciousness well below fully focused, concentrated, strenuous thinking. Thus, the opposite of complexity -- and the best weapon against it -- is intuitiveness. Software engineering should flow from the intuitiveness of the programming experience. A programmer who works with complex programs comfortably does not see them as complex, thanks to the way our perception and cognition work. A forest is a complex ecosystem, but for the average hiker the woods do not appear complex.
Now back to your question. For a long time, programmers have been manipulating subroutines, functions, data structures, loops, and other totally abstract constructs that neglect -- no, numb -- human intuition. Then object-oriented programming took off. Developers could, for the first time, create programming constructs that resembled elements of the real world -- in name, characteristics, and relationships to other objects. Even a non-programmer understands, at a basic level, the concept of a "Bank Account" object. The power of intuitively understanding the meaning and relationship between things is the proverbial silver bullet, if there is one, in the war against complexity.

Object-oriented programming allowed developers to create industrial software that is far more complex than what procedural programming allowed. However, we seem to have reached the point where OO is no longer effective. No one can comfortably negotiate a system with thousands of classes. So, unfortunately, object-oriented programming has a fundamental flaw, ironically related to its main strength.

In object-oriented systems, "object" is the one and only basic abstraction. The universe always gets reduced to a set of pre-defined object classes, some of which are structural supersets of others. The simplicity of this model is both its blessing and its curse. Einstein once noted that an explanation should be as simple as possible, but no simpler. This is a remarkably subtle point that is often overlooked. Explaining the world through a collection of objects is just too simple! The world is richer than what can be expressed with object-oriented syntax.

Consider a few common concepts that people universally use to understand and describe all systems -- concepts that do not fit the object mold. The "before/after" paradigm, as well as that of "cause/effect," and the notion of the "state of the system" are amongst the most vivid examples.
Indeed, the process of "brewing coffee," or "assembling a vehicle," or "landing a rover on Mars" cannot be decomposed into simple objects. Yes, they are being treated that way in OO languages, but that's contrived and counter-intuitive. The sequence of the routine itself -- what comes before what under what conditions based on what causality -- simply has no meaningful representation in OO, because OO has no concept of sequencing, or state, or cause.
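As an editorial illustration of this point, here is a minimal Java sketch of how a process such as "brewing coffee" tends to be modeled with objects. The class `CoffeeMaker` and everything in it are hypothetical names invented for this example. The ordering rule "grind before brew" has no declarative form in the language, so in this sketch it is tracked by hand in a field and enforced with a runtime check.

```java
// Hypothetical example class; the names are invented for illustration.
class CoffeeMaker {

    // The "state of the system" is an ordinary field maintained by hand.
    private enum Stage { EMPTY, GROUND, BREWED }

    private Stage stage = Stage.EMPTY;

    void grind() {
        stage = Stage.GROUND;
    }

    // "grind before brew" is a before/after rule, but nothing in the class
    // declaration states it. It exists only as this runtime check.
    void brew() {
        if (stage != Stage.GROUND) {
            throw new IllegalStateException("grind() must be called before brew()");
        }
        stage = Stage.BREWED;
    }

    boolean isBrewed() {
        return stage == Stage.BREWED;
    }
}
```

A caller who invokes `brew()` on a fresh `CoffeeMaker` compiles without complaint and fails only when the program runs, because the sequence is invisible to the type system.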
Processes are extremely common in the real world and in programming. Elaborate mechanisms have been devised over the years to handle transactions, workflow, orchestration, threads, protocols, and other inherently "procedural" concepts. Those mechanisms breed complexity as they try to compensate for the inherent time-invariant deficiency in OO programming. Instead, the problem should be addressed at the root by allowing process-specific constructs, such as "before/after," "cause/effect," and, perhaps, "system state" to be a core part of the language.

I envision a programming language that is a notch richer than OO. It would be based on a small number of primitive concepts, intuitively obvious to any mature human being, and tied to well-understood metaphors, such as objects, conditions, and processes. I hope to preserve many features of the object-oriented systems that made them so safe and convenient, such as abstract typing, polymorphism, encapsulation and so on. The work so far has been promising.
But expanding the pure object-oriented paradigm to allow for a richer set of basic abstractions -- like processes and conditions -- is only half of the arsenal in the war on complexity. The other half is a powerful aggregation/decomposition model that is rather weak, convoluted, and fragmented in modern programming. In order to deal with complexity, the organization of the software elements is of utmost importance. Hierarchies and collections are pretty much the only tools we've got to define how things relate to each other and how they should be organized into manageable structures. Hierarchical aggregation fits well with the fractal nature of many organic and artificial systems, and it is intuitively obvious to most people. Plus, the depth of the aggregation scales linearly with the exponential growth of elements, which is hugely important. Collections are similarly plentiful in the natural and virtual worlds, fit well with peer-to-peer systems, and once again, are totally intuitive. Unfortunately, this wonderfully simple division of structures into hierarchies and collections is, again, too simple for our needs. There is a plethora of other relationships that also don't fit very neatly. Master/slave, many-to-many, component/container, interval, element/metadata, and so on, are just a few common ones that we deal with every day. We treat each structural relationship differently every time. Theoretically, the object is the one and only "unit of software" in object-oriented systems, but is that really true? We have explicit distinctions between classes, packages, resource files and application bundles, containers and components, classes and interfaces, applications and services, and so on. Each new technology introduces new concepts. Inside the source code, we've got "Is-A" and "Has-A" as two alternative mechanisms to create new software components out of existing ones. 
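For readers less familiar with the two mechanisms, here is a minimal Java sketch of "Is-A" (inheritance) and "Has-A" (composition), offered as an editorial illustration. It borrows the "Bank Account" idea mentioned earlier; the classes `Account`, `SavingsAccount`, and `Customer` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example classes; the names are invented for illustration.
class Account {
    protected long balanceCents;

    void deposit(long cents) {
        balanceCents += cents;
    }

    long balance() {
        return balanceCents;
    }
}

// "Is-A": a SavingsAccount is an Account, expressed through inheritance.
class SavingsAccount extends Account {
    void addInterest(int percent) {
        balanceCents += balanceCents * percent / 100;
    }
}

// "Has-A": a Customer has accounts, expressed through composition.
class Customer {
    private final List<Account> accounts = new ArrayList<>();

    void open(Account account) {
        accounts.add(account);
    }

    long totalBalance() {
        long total = 0;
        for (Account account : accounts) {
            total += account.balance();
        }
        return total;
    }
}
```

The code records that a `Customer` holds a list of `Account` objects, but not what kind of relationship that is (ownership, many-to-many, container and component, and so on). That meaning lives outside the source.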
Still, all these things combined cannot express the simplest aggregation of several elements with particular semantic relationships; therefore, an external graphical "design pattern" is needed to document which elements are aggregated and how the collective system works. Talk about the complexity and counter-intuitiveness of programming! What seems to be missing is a unified component architecture rich enough to cover the whole spectrum of needs, from distribution to reuse. I am convinced that it isn't that hard to do. First, a notion of a "component" as a fully autonomous element of software must be strictly defined. An object, to be sure, is not a component, although many components may be implemented with objects. Then the rules of relation, composition, and aggregation of sub-components into higher-level components will be defined, in fully codifiable form. Familiar "Is-A" and "Has-A" relationships will be present, among many others. Finally, the rules of derivation will be defined and codified to enable a comprehensive reuse framework. Inheritance, for example, will be only one form of derivation made possible under the new model.
Equipped with such a powerful component architecture, a new theory of reuse may be developed, this time addressing the entire software lifecycle over a project's lifetime in a graceful, truly evolutionary way. Refactoring will no longer be a brutal, destructive operation. Instead, a safe, almost organic rejuvenation of the old components by the new ones -- guaranteed at compile time to be semantically, as well as syntactically, correct -- will become possible, analogous to the cyclical rejuvenation found in every corner of nature.

Software is truly amazing media, unlike anything else found in nature or created by humankind. Like information in general, software is not an entirely physical substance, for it has no mass, volume, or density. Neither is it an entirely metaphysical concept, for it interacts with real, physical entities, and causes very concrete physical impacts, such as the rotation of a turbine, the flow of electricity, or the imprint of an image on the page. Software is a product of our imagination, like a book, a painting or a movie, designed to synthesize a particular representation of the real world. But unlike all other forms of pure art, software is constructed for utilitarian purposes to do more than merely reflect the real world; software interacts with the world and in many cases even controls it. And what is truly amazing -- software is replicable: instantaneously, in arbitrary numbers, at zero cost! I believe there has to be a better way to harness the power of software media than what we came up with in the last millennium.

Advice to Developers
As far as optimism about the future goes, I see a lot of interesting work around presentation of data to end users. Sun's Project Looking Glass is a good example of the innovative thinking and good use of intuitive metaphors that make interactions with complex multi-media information effortless. Apple and Microsoft seem to be working on similarly interesting technologies.

Sadly, none of it is going into basic research and the development of principally innovative general-purpose programming languages. The complacency around C/C++ and the Java language is pervasive. C#, the first programming language in years, looks more like the Java language. Enormous productivity gains remain to be uncovered and difficult problems are yet to be solved. The world has gone crazy with XML and then web services; SOAP and UDDI are getting enormous attention, and, yet, from a software engineering standpoint, they seem to me a setback rather than a step forward. We now have a generation of young programmers who think of software in terms of angle brackets. An enormous mess of XML documents that are now being created by enterprises at an alarming rate will be haunting our industry for decades. With all that excitement, no one seems to have the slightest interest in basic computer science.

Still, there must be people out there who think differently. Jaron Lanier is clearly one of them. Recently, one project at Sun Labs appeared to be genuinely interested in beginning work on the "next thing after Java technology" as part of far-reaching research into new computing platforms. So, I don't know, things may begin turning around.

See Also
Interview with Jaron Lanier
Contact Victoria Livschitz