At my library, we’re currently working on a project in conjunction with several other regional knowledge institutions to put online our full collection of historical documents regarding the Civil War in Missouri and Kansas. One piece of functionality we’re creating is a way to visually represent the relationships between people, places, and things within this pool of data. These visualizations are based on a relationship database that we constructed, using a basic semantic structure: “Object A [relationship] Object B” and we can verify this relationship with “Document X”. Thus, for example:
Iskabibble Jones is married to Bridgette Jones and we know this because of information contained in Bridgette’s letter dated …
Only, instead of statements, we represent this all graphically with links to images and documents. It’s a pretty nifty function!
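For readers who like to see the shape of the data, the "Object A [relationship] Object B, verified by Document X" structure described above can be sketched as a simple list of triples with a source attached. Everything here is an illustrative assumption (the class name, field names, and sample record are mine, not the project's actual schema):

```python
# A minimal sketch of the relationship database described above.
# All names and the sample data are hypothetical, not the real schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    subject: str    # "Object A"
    predicate: str  # "[relationship]"
    obj: str        # "Object B"
    source: str     # "Document X" that verifies the claim

store = [
    Relation("Iskabibble Jones", "is married to", "Bridgette Jones",
             "Bridgette's letter"),
]

def relations_for(name, store):
    """Return every verified relationship that mentions `name`."""
    return [r for r in store if name in (r.subject, r.obj)]

for r in relations_for("Iskabibble Jones", store):
    print(f"{r.subject} {r.predicate} {r.obj} (source: {r.source})")
```

A visualization layer would then draw each `Relation` as an edge between two nodes, with the `source` field linking out to the scanned document image.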
The way we’re building the database for this relationship visualization tool is representative of how online data gets handled in general. It illustrates the fundamental paradigm that has governed computer development from the beginning – and, indeed, the development of mechanized data handling even before the advent of computers.
Namely, we parse data into as many different discrete elements as needed, define those elements, and define the relationships between them (these relationships are themselves discretely defined elements). It’s all about granularity: isolating various definable aspects of a given body of information and placing these elements in relationship to each other. This allows us to create programs and apps and coding structures that manipulate this data as needed. The holes in the punch cards of the earliest computers; the 1s and 0s of binary code, which reduce all data and actions to either “on” or “off”, “true” or “false”; the element tags of XML schemas; even the packet-switching technology that underlies the entire internet – they all function within this same overall paradigm. In these mechanized and digital systems, information is understood, at its root, as collections of discretely definable and related elements.
It brings to mind the state of mathematics before the creation of calculus. Before calculus, if you wanted to mathematically define the fluid flow of a river, the best you could do was to measure the flow rate and volume at a series of points along the river; mathematically, the river was then understood as a series of discrete points, taken in sequence. To understand the river as a whole, you would technically need data from every point along its length. Of course, mathematically speaking, there are an infinite number of points in any given length, so this method of defining a river would always be an estimate. There was no way to mathematically define it as a cohesive whole. There was no way to deal with the river directly.
To measure the area enclosed by an irregular curve (or series of irregular curves) you had to measure the area of as many discrete polygons, circles, ellipses, and/or parabolas as fit inside the confines of the irregularly shaped area – in other words, you had to mathematically redefine the original irregular shape as a structure made up of other, regular shapes. The area, then, was the sum of the areas of all the discrete shapes contained within it. As accurate as this method could be (but never completely accurate, as there would always be odd, irregular sections of the original shape that it couldn’t cover), it was still always and essentially an approximation that stood in reference to the actual area being measured. There was no way to define the irregular area in-and-of itself.
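The "fill the shape with regular shapes and sum them" method can be sketched numerically. Here the irregular "curve" is just y = x² (chosen purely for illustration), the regular shapes are rectangles, and the exact area under it from 0 to 1 is 1/3 – a value the discrete sum approaches but, for any finite number of rectangles, never reaches:

```python
# Sketch of the pre-calculus approach: approximate the area under a curve
# by summing many small regular shapes (rectangles, via the midpoint rule).
# The function and interval are illustrative choices, not from the post.
def discrete_area(f, a, b, n):
    """Sum n rectangle areas under f on [a, b] -- always an estimate."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) * width for i in range(n))

f = lambda x: x * x  # exact area under x^2 on [0, 1] is 1/3

for n in (10, 100, 1000):
    print(n, discrete_area(f, 0.0, 1.0, n))
```

The estimates get arbitrarily close to 1/3 as n grows, but each one remains an approximation standing in reference to the true area; the definite integral of calculus yields the exact value as a single, holistic operation.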
Calculus changes everything. Calculus allows you to mathematically define the flow of a river as a singular unit – no longer is a river just a series of discreet points but a meaningful thing all its own. Calculus allows you to measure the area enclosed by an irregular curve without having to break it down into sections – at last, you can understand it mathematically as itself and not in reference to other shapes.
Pre-calculus mathematics was based on breaking down systems into discrete elements. Calculus is based on defining things holistically, without artificial parsing.
This is not to say that pre-calculus methods are incorrect or that they don’t serve very useful functions – it’s just that they’re fundamentally limited and incapable of representing larger synergistic aspects of the systems they seek to define. Calculus gives us an additional set of tools that allow us to understand these systems in far more comprehensive ways.
The way we currently handle online data is pre-calculus: bodies of information are parsed and broken down and redefined as structures of other inter-related elements. It’s the only way we can meaningfully handle data in current electronic systems.
What we need is a calculus for digital data handling: a way to handle bodies of information without the need to break them down and break them apart; a way to define and manipulate information that comprehends it holistically and not as a structure of other, smaller elements.
Just imagine what we could accomplish then…
2 thoughts on “Data Handling in Electronic Systems – Inspiration for a Paradigm Reassessment”
Wait a second…just what COULD be done with holistic, calculus-style data handling? What are we currently missing out on? What are the Internet equivalents of the river and the area enclosed by the irregular curve?
I’m intrigued, and I’m probably going to have to stop working for the rest of the day just to think about this.
It’s just a random thought that’s been flitting around in my head for a couple weeks. We spend so much time focusing in and trying to identify the base essentials of a piece of information – I can’t help wondering what we’re missing. What we’d discover if we focused in the other direction, so to speak.
My analogies start to show their flaws if we push them too far.
I also know that I have absolutely no concept of how one would build a “calculus”-based digital information system. That’s waaay beyond my ken and my purview.