1st July, 2002
Real-world objects and fuzzy models

We live in a world whose boundaries of classification are
fuzzy. Our Western style of thinking has forced us to
continually categorise all objects that we encounter. However,
the categorisations we encounter in the programming world
enforce an exactness of labelling that we do not necessarily
employ in real life. Object-oriented thinking is a very rigid
and declarative means of forcing data into well-defined
clusters, each of which represents a real-world object. The
exactness appeals to our scientific senses and
states exactly what constitutes a type of object. For example,
to paraphrase one of my lecturers, a cat can never be a dog. It
has an inherent cat-ness. Specifying the exact attributes that
distinguish a cat from a dog is very difficult, yet we are able
to make the distinction intuitively. As a simple exercise, try
to define a programmatic test that separates a cat object from
a dog object and holds for every instance of dogs and cats.

At the other extreme, our quest for exactness causes problems
in correctly attributing an instance of an object to a class.
If we say that a dog has four legs and a tail, and we encounter
a dog with three legs and no tail, is this instance a non-dog?
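
The dilemma can be made concrete with a small sketch. The
Animal record and the is_dog_strict test are hypothetical names
invented for illustration; they only encode the "four legs and
a tail" definition used in the text.

```python
from dataclasses import dataclass


@dataclass
class Animal:
    # Hypothetical record; all fields are illustrative assumptions.
    species: str    # what a person would call the animal
    legs: int
    has_tail: bool


def is_dog_strict(animal: Animal) -> bool:
    """Exact definition from the text: four legs and a tail."""
    return animal.legs == 4 and animal.has_tail


typical_dog = Animal("dog", legs=4, has_tail=True)
injured_dog = Animal("dog", legs=3, has_tail=False)
cat = Animal("cat", legs=4, has_tail=True)

for animal in (typical_dog, injured_dog, cat):
    print(animal.species, is_dog_strict(animal))
```

The exact test fails in both directions: it rejects the injured
dog, and it accepts the cat. Tightening the definition to
exclude the cat only moves the problem to the next exception.
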

Although it may seem that I am delving into the neural-network
area of classification, the problems I outline here are ones we
encounter to some degree in all business systems. We spend a
large amount of time modelling real-world objects so that we
can develop representations for use in the system and define
the exact interactions between those representations. Rigid
contracts for interaction give us highly predictable operation
and, if we equate quality with predictability, high program
quality.

The price we pay for such exactness is rigidity, which makes
the system resistant to change. In areas of system control this
is the desired effect, as we do not want change to introduce
unpredictability of operation. There must be an investment in
analysis to determine how the system should react to new
controls or new factors, and the system must be modified
accordingly. In other areas, however, we discover that even
though a change does not affect any critical decisions in the
current system, it imposes a programming cost out of proportion
to the nature of the change. In the object-oriented world we
must explicitly declare each characteristic of an object that
we wish to access. The object paradigm applies one broad-brush
level of severity to every access: should we access a
characteristic that does not exist, the operation is penalised
(for example, with a compile-time or run-time error), with no
distinction between critical and non-critical characteristics.
A dog with three legs becomes a specialisation of a dog that
demands explicit attention from the programmer, attention that
the user of the information does not give the same weight.
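
As a minimal sketch of that broad-brush penalty, consider a
rigidly declared class in Python. The Dog class and the
tail_length characteristic are hypothetical, chosen only to
show that a missing non-critical characteristic is punished as
severely as a missing critical one would be.

```python
class Dog:
    """Rigid model: a dog has exactly the attributes declared here."""

    def __init__(self, name: str, legs: int) -> None:
        self.name = name
        self.legs = legs


rex = Dog("Rex", 3)

# A declared characteristic is accessed without ceremony.
print(rex.name)

# A characteristic that was never declared raises an error, however
# unimportant it may be to the person reading the information.
try:
    print(rex.tail_length)
except AttributeError as err:
    print("penalty:", err)

# A forgiving access is possible, but the programmer has to ask for
# it explicitly at every point of use.
print(getattr(rex, "tail_length", None))
```

The language makes no distinction between the two kinds of
access; any leniency is extra programming.
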

Business entities that tightly couple specific semantics into
their structures lock the system in which they reside into a
rigid specification for operation. While this provides
precision, it reduces the grace with which a system can adapt
to changing conditions. It also lacks the subtlety to represent
ambiguous characteristics of entities.

The practical sense of this does not become apparent until we
turn to our particular area of interest. Product catalogues
contain so many exceptions and product-specific features that
it is very difficult to represent the characteristics of a
product through a direct object-oriented approach. The degree
of product specialisation and the special language used to
describe product features, coupled with the explicit
specificity of a programming "object", lead either to a single
representation that holds the superset of all product
characteristics or to many representations that match the
specialisations. If we are to deliver a general catalogue that
works across industries, and even across product lines, without
requiring a high degree of on-site programming customisation,
a traditional object model clearly works against these
requirements. We also have to contend with products that span
multiple categories, and with products in the same category
that are missing certain characteristics or have additional
ones. For example, one model of CD Walkman may have an FM radio
while another may not. Yet both are still Walkmans, and the
missing characteristic is of no great importance in any
programming sense, as each is still a "product". Beyond that,
the characteristics have a language to describe them, and the
representation must be able to access that language.
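
The two outcomes of a direct object-oriented approach can be
sketched as follows. Every class and field name here is
invented for illustration and is not taken from any real
catalogue system.

```python
from dataclasses import dataclass
from typing import Optional


# Outcome 1: a single class holding the superset of characteristics.
# Most fields are meaningless for most products.
@dataclass
class Product:
    name: str
    has_fm_radio: Optional[bool] = None   # only relevant to audio players
    page_count: Optional[int] = None      # only relevant to books


# Outcome 2: one class per specialisation. Each new variation of a
# product needs another class and another round of programming.
@dataclass
class CdWalkman:
    name: str


@dataclass
class CdWalkmanWithRadio(CdWalkman):
    radio_band: str = "FM"


book = Product("Some novel", page_count=320)
plain = CdWalkman("Model A")
with_radio = CdWalkmanWithRadio("Model B")

print(book.has_fm_radio)                   # a field that means nothing here
print(isinstance(with_radio, CdWalkman))   # both are still Walkmans
print(hasattr(plain, "radio_band"))        # but the plain model lacks the field
```

Neither outcome generalises well: the first grows a field for
every product line, and the second grows a class for every
exception.
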

For these reasons, we worked on Flexcorp to model "product"
objects. We wanted a fuzzy representation for product
characteristics with sufficient power to contain both the
descriptive language for an instance of an object and a means
to describe the set of characteristics. It is fuzzy in so far
as it allows an object to be recognised as being of a given
type even if certain characteristics are missing. Furthermore,
we wanted to be able to produce these representations without
explicit programmatic overhead to access them: an object
without a characteristic should not be treated any differently
from another object declared to be of that type. The longer we
worked on the concept, the more convinced we became that the
fuzzy representation contained in Flexcorp could be applied to
many real-world objects, such as company descriptions or user
descriptions. An address within a user description may have two
address lines or three. The presence or absence of the
additional line should not be a reason to discriminate between
objects once we are aware that the object is of a certain type.
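
One way to picture such a representation is the sketch below.
It is not Flexcorp's actual design, which is not described
here; FuzzyType, FuzzyObject and their methods are hypothetical
names, and the sketch assumes Python 3.9 or later. The type
holds the descriptive language for its characteristics, and an
instance is of that type whichever characteristics are present.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FuzzyType:
    # Maps each characteristic key to its human-readable label.
    name: str
    labels: dict[str, str] = field(default_factory=dict)

    def describe(self, key: str, label: str) -> None:
        self.labels[key] = label


@dataclass
class FuzzyObject:
    # The declared type decides what the object is; the values
    # present do not have to cover every described characteristic.
    kind: FuzzyType
    values: dict[str, str] = field(default_factory=dict)

    def get(self, key: str) -> Optional[str]:
        # A missing characteristic is simply absent, never an error.
        return self.values.get(key)

    def render(self) -> list[str]:
        # List only the characteristics this instance actually has,
        # in the order the type describes them.
        return [
            f"{label}: {self.values[key]}"
            for key, label in self.kind.labels.items()
            if key in self.values
        ]


address = FuzzyType("address")
address.describe("line1", "Address line 1")
address.describe("line2", "Address line 2")
address.describe("line3", "Address line 3")

two_lines = FuzzyObject(
    address, {"line1": "1 High Street", "line2": "Springfield"}
)
three_lines = FuzzyObject(
    address,
    {"line1": "Flat 2", "line2": "1 High Street", "line3": "Springfield"},
)

for obj in (two_lines, three_lines):
    print(obj.kind.name, obj.render())
```

Both objects are addresses and pass through the same code; the
third line is shown when present and silently omitted when not.
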

Such a liberal attitude brings several unsought-for advantages.
First, there is the flexibility to add new characteristics to
an object after implementation without serious ramifications
for its usage. Suppose a new requirement arises to store a
user's age. Other than adding programming to receive the input,
display the characteristic or export it to another application,
there should be no major exercise to rework the rest of the
plumbing in the application. Second, a product may hold
multiple values for a single characteristic. For example, a
book may be both science fiction and a classic. The
multi-valued quality allows a fuzziness of categorisation.
These are real-world requirements for representing objects.
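
Both advantages can be illustrated with a small sketch. The
Entity class is a hypothetical stand-in for a fuzzy
representation, assuming Python 3.9 or later; the sample values
are invented.

```python
class Entity:
    """Hypothetical bag of characteristics, each holding many values."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._values: dict[str, list[str]] = {}

    def add(self, key: str, value: str) -> None:
        # Any characteristic can be added at any time; no class
        # definition has to change first.
        self._values.setdefault(key, []).append(value)

    def values(self, key: str) -> list[str]:
        # An unknown characteristic yields an empty list, not an error.
        return list(self._values.get(key, []))

    def has(self, key: str, value: str) -> bool:
        return value in self._values.get(key, [])


# Multiple values for one characteristic.
book = Entity("book")
book.add("genre", "science fiction")
book.add("genre", "classic")

# A characteristic introduced after the original implementation.
user = Entity("user")
user.add("name", "Alice")
user.add("age", "34")

print(book.values("genre"))
print(book.has("genre", "classic"))
print(user.values("age"))
print(user.values("shoe_size"))
```

Storing the age needed no change to Entity or to any code that
already reads users; only the code that wants the age has to
know about it.
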

Now, what we have described here may seem to undermine the
theories of object-oriented programming. We have tried to
develop a working theory for modelling real-world objects in a
manner suited mainly to human consumption. We do not shun
object-oriented approaches, except in their direct application
to imperfect real-world objects, where rigid definitions lose
expressive information. Object programming and object modelling
cannot reconcile the idea that a cat has degrees of cat-ness
while retaining a characteristic cat-ness. The exactness and
programming discipline of a programming "object" are
appropriate for control and predictability. We recognise these
strengths and uses but suggest that the approach is not
appropriate in certain instances of data modelling,
particularly where we must retain nuances for human
interpretation.