1.1 The Discipline of Organizing
To organize is to create capabilities by intentionally imposing order and structure. Organizing is such a common activity that we often do it without thinking much about it. We organize the shoes in our closet, the books on our bookshelves, the spices in our kitchen, and the folders into which we file information for tax and other purposes. Quite a few of us have jobs that involve specific types of organizing tasks. We might even have been explicitly trained to perform them by following specialized disciplinary practices. We might learn to do these tasks very well, but even then we often do not reflect on the similarity of the organizing tasks we do and those done by others, or on the similarity of those we do at work and those we do at home. We take as givens the concepts and methods used in the Organizing System we work with most often.
The goal of this book is to help readers become more self-conscious about what it means to organize things — whether they are physical resources like printed books and shoes or digital resources like web pages and MP3 files — and about the principles by which the resources are organized. In particular, this book introduces the concept of an Organizing System: an intentionally arranged collection of resources and the interactions they support. The book analyzes the design decisions that go into any systematic organization of resources and the design patterns for the interactions that make use of the resources.
This book evolved from a master’s-level university course on “Information Organization & Retrieval” I taught for several years at the University of California, Berkeley’s School of Information. My goal was to synthesize insights from library science, information science, cognitive science, systems analysis, and computer science to provide my students with a richer understanding of information organization than any discipline alone could provide. I came to realize that information was just one of the many types of resources to organize and that it would be beneficial to think about the art and science of organizing in a more abstract way. This book is the product of countless discussions with students and faculty colleagues at Berkeley and other schools. Together we are developing a new discipline that unifies four types of organizing, as follows:
We organize physical things. Each of us organizes many kinds of things in our lives—our books on bookshelves; printed financial records in folders and filing cabinets; clothes in dressers and closets; cooking and eating utensils in kitchen drawers and cabinets. Public libraries organize printed books, periodicals, maps, CDs, DVDs, and maybe some old record albums. Research libraries also organize rare manuscripts, pamphlets, musical scores, and many other kinds of printed information. Museums organize paintings, sculptures, and other artifacts of cultural, historical, or scientific value. Stores and suppliers organize their goods for sale to consumers and to each other.
We organize information about physical things. Each of us organizes information about things when we inventory the contents of our house for insurance purposes, when we sell our unwanted stuff on eBay, or when we rate a restaurant on Yelp. Library card catalogs, and their online replacements, tell us what books a library’s collection contains and where to find them. Sensors and RFID tags track the movement of goods, even library books, through supply chains, and the movement (or lack of movement) of cars on highways.
We organize digital things. Each of us organizes personal digital information (email, documents, e-books, MP3 and video files, appointments, and contacts) on our computers, smartphones, and e-book readers, or in “the cloud,” through information services that use Internet protocols. Large research libraries organize digital journals and books, computer programs, government and scientific datasets, databases, and many other kinds of digital information. Companies organize their digital business records and customer information in enterprise applications, content repositories, and databases. Hospitals and medical clinics maintain and exchange electronic health records and digital X-rays and scans.
We organize information about digital things. Digital library catalogs, web portals and aggregation websites organize links to other digital resources. Web search engines use content and link analysis along with relevance ratings to organize the billions of web pages competing for our attention. Web-based services, data feeds and other information resources can be combined as “mash-ups” or choreographed to carry out information-intensive business models.
Let’s take a closer look at these four different types or contexts of organizing. Are there clear, systematic and useful distinctions between them? We contrasted “organizing things” with “organizing information.” At first glance it might seem that organizing physical things like books, compact discs, machine parts, or cooking utensils has an entirely different character than organizing intangible digital things. We often arrange physical things according to their shapes, sizes, material of manufacture, or other visible properties; for example, we might arrange our shirts in the clothes closet by style and color, and we might organize our music collection by separating the old vinyl albums from the CDs. We might arrange books on bookshelves by their sizes, putting all the big heavy picture books on the bottom shelf. Organization for clothes and information artifacts in tangible formats that is based on visible properties does not seem much like how you store and organize digital books on your Kindle or arrange digital music on your music player. Arranging, storing, and accessing X-rays printed on film might appear to have little in common with these activities when the X-rays are in digital form.
It is hardly surprising that organizing things and organizing information sometimes do not differ much when information is represented in a tangible way. The era of ubiquitous digital information of the last decade or two is just a blip in time compared with the more than ten thousand years of human experience with information carved in stone, etched in clay, or printed with ink on papyrus, parchment or paper. These tangible information artifacts have deeply embedded the notion of information as a physical thing in culture, language, and methods of information design and organization. This perspective toward tangible information artifacts is especially prominent in rare book collections where books are revered as physical objects with a focus on their distinctive binding, calligraphy, and typesetting.
Nevertheless, at other times there are substantial differences in how we organize things and how we organize information, even when the latter is in physical form. We more often organize our “information things” according to what they are about rather than on the basis of their visible properties. At home we sort our CDs by artist or genre; we keep cookbooks separate from travel books, and fiction books apart from reference books. Libraries employ subject-based classification schemes that have a few hundred thousand distinct categories.
Likewise, there are times when we pay little attention to the visible properties of tangible things when we organize them and instead arrange them according to functional or task properties. We keep screwdrivers, pliers, a hammer, a saw, a drill, and a level in a tool box or together on a work bench, even though they have few visual properties in common. We are not organizing them because of what we see about them, but because of what we know about how to use them. The task-based organization of the tools has some similarity to the subject-based organization of the library.
We also contrasted “organizing things” with “organizing information about things.” This difference seems clear if we consider the traditional library card catalog, whose printed cards describe and specify the location of books on library shelves. When the things and the information about them are both in physical format, it is easy to see that the former is a primary resource and the latter a surrogate or associated resource that describes or relates to it.
When it comes to “organizing information about digital things,” the contrast is much less clear. When you search for a book using a search engine, first you get the catalog description of the book, and if you’re lucky the book itself is just a click away. When the things and the information about them are both digital, the contrast we posed is not as sharp as when one or both of them are in a physical format. And while we used X-rays, on film or in digital format, as examples of things we might organize, when a physician studies an X-ray, is it not being used as information about the subject of the X-ray, namely the patient?
These differences and relationships between “physical things” and “digital things” have long been discussed and debated by philosophers, linguists, psychologists and others. (See the sidebar, “What is Information?”)
The distinctions among organizing physical things, organizing digital things, or organizing information about physical things or digital things are challenging to describe because many of the words we might use are as overloaded with multiple meanings as information itself. For example, some people use the term “document” to refer only to traditional physical forms, while others use it more abstractly to refer to any self-contained unit of information independent of its instantiation in physical or digital form. The most abstract definition appears in “What is a Document?”, where Buckland provocatively asserts that an antelope is both “information as thing” and also a “document” when it is in a zoo, even though it is just an animal when it is running wild on the plains of Africa. Similar definitional variation occurs with “author” or “creator.”2
If we allow the concept of information to be anything we can study — to be “anything that informs” — the concept becomes unbounded. Our goal in this book is to bridge the intellectual gulf that separates the many disciplines that share the goal of organizing but that differ in what they organize. This requires us to focus on situations where information exists because of intentional acts to create or organize.