Edict™ is the dictionary editor that builds archives for Gaius Navigator. Edict is a complex and artificially intelligent compiler of natural languages, a linker, a grammar-based compressor, a correlator, and a workflow engine.
To prove and justify some of our core operating system technology, we looked for a simple application that would involve a large dataset with a potentially large number of cross-references but a small number of dimensions in terms of directions of navigation. We started with legal text, which has the nice property of throwing references from one text to another in a quasi-regular fashion. We began collecting laws from the legislative bodies of the world, but settled on the Romanian legal system, where we seem to have found the most resources.
The first steps along this way were seductively encouraging. Then a long, thorny road awaited us, with the milestone, the Romanian Legal Dictionary, eluding us. The Authorities here are issuing laws in a frenzy. The datasets are huge, much larger than we bargained for. The text is nowhere near homogeneous. While gathering data we faced a huge amount of duplicated material: items that were supposed to be identical differed by, say, one comma. We had to develop a system to reduce it all and keep it sane.
The only problem here was that the kind of language we were dealing with was natural (for those of you who have forgotten, "natural" is any language other than C++, Pascal, and Assembler, the kind people use to talk to each other). We started to learn the grammar and the vocabulary of our language. In doing so we discovered that the archive must be held in a format that is grammar-compressed but still correlatively traversable. We spent well over a year developing the grammar-based codec, which is not a big piece of code but was excruciatingly difficult to write.
While doing this, we were watching with concern major shifts in the way the Authority published its legal text. We had to start building an automated learning system at the gate of our Correlator. We were constantly racing against time, but it all came together, and we now have a powerful tool that does the data entry automatically, so we can handle anything they throw at us. This database production system, called Edict, also plays a more generic role in the scheme of our Operating System: it is the Artificial Intelligence core for anything that's spoken, the body that spawns other compilers and data import/export processes.
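To make the duplication problem mentioned above concrete, here is a minimal sketch in Python of how texts that differ only by a comma or by spacing could be grouped together. It is an illustration under our own assumptions, not Edict's actual mechanism, which is grammar-based; the names normalize, fingerprint, and group_duplicates are hypothetical.

```python
import hashlib
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Reduce a text to a canonical form that ignores case, punctuation, and spacing."""
    text = unicodedata.normalize("NFKC", text).casefold()
    # Replace every Unicode punctuation character with a space.
    text = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


def fingerprint(text: str) -> str:
    """Hash the canonical form so that equal forms share one key."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def group_duplicates(items: dict[str, str]) -> list[list[str]]:
    """Return lists of item ids whose texts are identical after normalization."""
    groups = defaultdict(list)
    for item_id, text in items.items():
        groups[fingerprint(text)].append(item_id)
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Hypothetical sample texts: "a" and "b" differ by one comma.
    laws = {
        "a": "Art. 1, The law applies.",
        "b": "Art. 1 The law applies.",
        "c": "Art. 2 Something else.",
    }
    print(group_duplicates(laws))
```

A scheme like this only catches differences in punctuation, case, and whitespace; texts that differ by a changed word would need a fuzzier comparison.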
EDICT5 at work compiling a legal archive
The screenshot above shows one of the working frames of Anna K5 with Edict running the learning system over some internet resources and updating its knowledge of law. The result is an updated Romanian Legal Dictionary that Gaius Navigator uses.
As is the case with operating system mechanisms, Edict cannot be identified as a single application. It is rather a set of extensions on top of Anna's file system, an extension of the system's browser, an extension of the runtime evaluator, and a set of additions to the family of type editors. It cannot be untangled from the OS, which is one of the reasons why we cannot package this production system independently as a product.
If there's continuing interest in this kind of tool, we might consider opening up a line of business where we build similar archives for you. We'll see.