New Sales Channels for TDO book

It has been a while since I sent out any TDO news… here is something urgent.

O’Reilly, the publisher of the 2nd->4th editions of the book, has decided to stop selling one book at a time so that it can encourage adoption of its entire collection of books in the ProQuest “Safari” package. Your university library might subscribe to that; if so, you can point your students to TDO there.

But if you go to the O’Reilly online catalog and search for TDO, it directs you to Safari or to Amazon, where you can buy the 2013 print edition (which you don’t want!)

I prevailed upon O’Reilly to put the current editions of TDO into other channels, so you need to have your students go to:

Apple iTunes – selling MITP 1st edition, selling O’Reilly 2nd and 4th editions (you’ll want one of the latter)

Google Play


“Why I Think Books Should Die”

My friend Noz Urbina, who shares our vision of adaptive content, recently interviewed me for this provocative essay. It very clearly (and very flatteringly) explains the motivation for the content architecture of The Discipline of Organizing and how this content architecture is reflected in the processes for building and customizing different editions of the book.

Noz ends the essay with this too clever statement:

We need to kill books as they’re envisioned today and move to something like “content collections” or “author-maintained IAs” or “Bodies of Organized Knowledge.” I like that last one… we could call them BoOKs for short.

“The Man Who Organized Everything”

The Boston Globe newspaper has an "Ideas" section that has a story today about The Discipline of Organizing with the amusing title "The Man Who Organized Everything," meaning me. The writer (Chris Wright) is a freelancer living in London who interviewed me by Skype a couple of weeks ago when I was on vacation in southern France. I was very surprised and flattered that a major mainstream paper would pick up on our book, because it is a serious scholarly textbook, but Wright had read a fair amount of the book and very nicely conveyed some examples that regular folks would appreciate while also making the point that the book is serious and deep.

Wright and I talked a fair amount about southern France because he had vacationed there in the past, and that included a discussion about vineyards because that part of France, called Cote du Rhone, is well known for its wine. I observed that vineyards are good examples of organizing systems with very systematic arrangement of grapevines based on the "terroir" - the soil, the drainage, the way the sun hits the vineyard, wind, and other factors - as well as on the particular varietal of grape. Two adjacent vineyards can differ a great deal in how they are organized as a result. A couple of days later as I was driving back to Paris through the Burgundy area I noticed that the vineyards were very tightly arranged, with a lot less space between the rows than in Cote du Rhone because Burgundy wine commands higher prices, which induces grape growers to squeeze as many vines as they can on a plot of land.

-bob glushko

JK Rowling & Stylometric Analysis

In mid July 2013 The Sunday Times in London revealed that Robert Galbraith, who had written a detective novel called The Cuckoo's Calling, was in fact JK Rowling, the author of the Harry Potter books. This story has been widely covered (NY Times story here) but most accounts don't say much about how the "stylometric" analysis was carried out. I think that this is an excellent story for those of us who put a little NLP and "computational classification" in our courses, so I started looking for some relevant sources.

The best one I have found so far is a report by Patrick Juola of Duquesne University, who did the analysis, so I'd trust him as the best source of how it was done. The report is called Rowling and "Galbraith": an authorial analysis, an extremely readable case study that also tells a bit of the history of authorship analysis, including a mention of the classic work by Mosteller and Wallace that determined that James Madison and not Alexander Hamilton was the author of some of the Federalist Papers (discussed in my lecture notes from Oct 18 of last year).
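
For readers who want something concrete to show in class, here is a minimal sketch of the function-word idea associated with Mosteller and Wallace: compare how often a few common marker words occur in two texts. It is not Juola's actual method, and the marker list, the function names, and the two sample sentences are made up purely for illustration.

```python
import re
from collections import Counter

# Hypothetical marker words, chosen only for illustration.
MARKERS = ("upon", "while", "whilst", "the")


def function_word_profile(text, markers=MARKERS):
    """Return the rate per 1,000 words of each marker word in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [1000 * counts[word] / total for word in markers]


def profile_distance(a, b):
    """Sum of absolute differences between two profiles (smaller = more similar)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Made-up sample sentences standing in for a known and a disputed text.
known = function_word_profile("Upon the whole, the plan rests upon consent.")
disputed = function_word_profile("While the plan stands, the states consent.")
print(profile_distance(known, disputed))
```

A real analysis would use much longer texts, many more features, and a proper statistical model; the point here is only that "style" can be reduced to countable features.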

There is a story today (July 29) in the Chronicle of Higher Education titled The Professor Who Declared, It's JK Rowling that goes beyond Juola's piece, even introducing the concept of "adversarial stylometry" - what you do to disguise the authorship of a text. A tool that stripped out or changed the key words that would identify an author would be useful for whistleblowers or dissidents.

(Sorry 'bout the link, which might smack you into a paywall... it works for me but that's because I have the massive resources of the UC Berkeley library working in the background to make links work).

-bob glushko

The Chaos of Personal Digital Archiving

It uses only simple visualization with bar charts and Venn diagrams, but Dog House Diaries cogently shows the problem with the lack of standards and app/context proliferation when it comes to keeping track of your personal digital resources. Think about where you keep your:

  • Photos
  • Documents
  • Music
  • Video
  • TV
  • Messages
  • Friends (where you interact with them digitally)

I wonder if doing this kind of self-assessment is a useful assignment for our courses... maybe a two-person assignment where each person makes the assessment and then they jointly reflect?

Peter Brantley on Intelligent, Adaptable Books – Like TDO!

Peter Brantley, who heads the open annotation project (where I have a student this summer working to integrate a browser version of TDO with annotation capability), has written an interesting post in his blog on Publishers Weekly called "The Intelligent, Adaptable Book." Brantley and I met earlier this week to discuss the TDO/ integration and we talked for a long while about some of the ambitious things we're trying in TDO to balance breadth and depth. We needed breadth to include all of the disciplines and domains that care about organizing, and we needed depth to discuss them in a way that was credible - hence, the tagged endnotes. Brantley describes our approach and includes a screen shot from the latest version of the "Academic EBook Edition," which is going to be coming out in just a few weeks.

A Webinar about The Discipline of Organizing

Today I did a webinar hosted by Scott Abel, the "content wrangler" personality that some of you might know from the conferences he organizes. The topic of the webinar was - you guessed it - The Discipline of Organizing. I presented a 40 minute version of the talk that I gave at several ISchools this April, and since all of you know what the book is about, I didn't mention it here. But afterwards I realized that it might be useful for some of you to be able to point colleagues or students or other people to this as a good introduction to TDO. So here's the replay.

bob glushko

Two Books on “Organizing by Component Parts”

I've recently read (maybe "viewed" is more accurate because the books are "picture books") two books that fascinated me because they took novel approaches to what we might call "Organizing by Component Parts." The Discipline of Organizing discusses issues of Resource Identity (Section 3.3) in great detail. Nevertheless, these two books augment that discussion and I expect to use them to illustrate my lectures because of the novel way in which they answer questions about whether a resource should be considered a collection, a composite of parts, or a single resource whose internal composition isn't usefully deconstructed.

The first book is Things Come Apart by Todd McLellan

McLellan photographs 50 things - including a bicycle, chain saw, piano, typewriter and various electronic devices - both neatly arranged and "exploding" in mid-air. You'll be amazed at how many parts some of these artifacts contain.

The second book is The Art of Clean Up by Ursus Wehrli

See also some examples here.

Wehrli is a Swiss comedian whose gimmick in the book is to deconstruct paintings or photographs into their component parts and then fastidiously arrange them by shape, size, or texture. We didn't get the idea for TDO's cover from Wehrli but there is a family resemblance.

-bob glushko

The Faustian Bargain with Ebooks

Clifford Lynch, the executive director of the Coalition for Networked Information and an adjunct professor at the UC Berkeley School of Information, writes a very critical article whose title says it all: “Ebooks in 2013: Promises Broken, Promises Kept, and Faustian Bargains.” His most scathing comments are about how publishers make it difficult for libraries to lend ebooks and about the cartels created by ebook platforms. Definitely a potential additional reading for my fall course.

Classifying TDO

I suspected that a multi/trans-disciplinary book like TDO was going to be hard to classify, but it is almost amusing to see how it is being done. I am not surprised that the professionals at the Library of Congress put us in Z666.5, a category whose components are listed as:

  • Bibliography. Library Science. Information Resources
  • Libraries Topic Component
  • Library science. Information science Topic Component
  • Information organization Topic Component
  • General works Topic Component

On the other hand, at you find TDO if you search for books on "library management" - and in the Apple iBooks store it is categorized as "System Administration."

Provenance Problems and Repatriation – A Precedent

A couple of weeks ago I noted an interesting NY Times story about how provenance problems had led the Metropolitan Museum in NYC to decide to return a statue to Cambodia. There is a fascinating followup story in the May 16 NY Times ("From Jungle to Museum, and Back") that adds some additional twists to this story.

Careful research by some French scholars shows that several pieces from the same site were stolen and then found their way to different museums in Denver, Cleveland, and Pasadena (there are some very convincing photos and diagrams in the story). Now that the Metropolitan Museum in NY is returning its piece, it will be interesting to see if the other museums do the same given this compelling evidence.

-bob glushko

The Future of the Library

Libraries - in their historical, current, and future realizations - are an important topic in The Discipline of Organizing, and in particular Chapter 1 goes into some detail about "What Is a Library?"

For many people the abstract concept of a library as an organizing system is hard to see because of the powerful cultural notion of the library as a physical place realized in an inspiring and monumental building or collection of buildings. But as books and other traditional resources in libraries have migrated to digital forms, and as the web becomes the default library for many people, this puts pressure on the "library as physical place" to stay relevant by adapting to new purposes. Of course a big challenge here is that the new purposes that the library must satisfy are changing very quickly, much faster than the inertia of large institutions like big public libraries lets them keep up with.

This tension is the topic of a very informative article in the May 13, 2013 Wall Street Journal by Julie Iovine called "The Library's Future is Not an Open Book." It briefly describes some architecturally significant libraries, some like Boston's that were built in the late 19th century and others like Seattle's that were opened just a few years ago. It also discusses some of the challenges and controversies emerging in library renovation efforts like the one underway at the New York Public Library.

Definitely a good addition to my Fall 2013 course syllabus.

-bob glushko

Amazon’s Surprising Pricing for Print and Kindle TDO

Now that The Discipline of Organizing is available in print form, MIT Press has pushed the content-equivalent ebook versions into the usual channels. It didn't surprise me that Amazon was the first to offer an ebook version, but I was surprised by the pricing. The list price of the print book from MIT Press is $40, and Amazon is offering it for $30.63, a 23% discount. The list price for the digital version is $39.99 (I guess the marginal cost of printing, warehousing, and distribution comes to 1 cent), and the Kindle version is offered for $27.95, a 30% discount. I expected the digital version to be much cheaper.
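
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the two discounts from the prices quoted above. The function name is just an illustrative choice.

```python
def discount_percent(list_price, offer_price):
    """Percentage taken off the list price."""
    return 100 * (list_price - offer_price) / list_price


print(round(discount_percent(40.00, 30.63), 1))  # print edition: 23.4
print(round(discount_percent(39.99, 27.95), 1))  # Kindle edition: 30.1
```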

Provenance Problems -> Repatriation

Just about everything I know about provenance - which isn't a lot - made it into The Discipline of Organizing Section 3.5.4, so I was pleased this morning to read a story in the 4 May 2013 NY Times about how the Metropolitan Museum of Art has decided to return some 10th century Khmer sandstone sculptures to Cambodia after determining that they were looted during the 1970s when Cambodia was in chaos. I'll put this case study in my course syllabus because it nicely frames the dilemma that incomplete provenance poses for museums today. The Met has a staggering collection of artifacts from everywhere, but I especially remember galleries with Egyptian, Greek, and Roman artifacts - many of which were acquired when the conventions and practices for collecting were much looser than they are today, and indeed, the Met has returned some things to Egypt and Italy in recent years. But at the same time, one can argue that if the Met and other museums hadn't collected these resources, they wouldn't exist today, because Egypt and Italy weren't capable of caring for them.

One party who comes out looking bad in the story is Sotheby's, which in 2011 tried to auction a sculpture that undeniably comes from the same site as the ones the Met is repatriating; a Sotheby's statement argues that "the Met's voluntary agreement does not shed any light on the key issues in our case". The case Sotheby's mentions is one in which the US government is suing to obtain the sculpture on behalf of the Cambodian government.

Using “Big Data” in “Human Resource Organizing Systems”

When I defined an "Organizing System" in The Discipline of Organizing as "an intentionally arranged collection of resources" I recognized that this could also apply to people-as-resources in the familiar sense of "human resources." So as I said in Section 1.2.2, "We might discuss how human resources are selected, organized, and managed over time just as we might discuss these activities with respect to library resources." But I decided that "these topics are much more appropriate for texts on human resources management and industrial organization so we will not consider them much further in this book."

Now I'm not so sure. I've recently seen stories about the emergence of "work-force science" as a discipline that applies data analysis to HR management. For example, Steve Lohr wrote "Big Data, Trying to Build Better Workers" for the NY Times and described how some firms are analyzing patterns of workforce communication and document creation to characterize the efficiency and innovativeness of their employees. A firm called Ultimate Software offers software that analyzes data about employees to calculate how likely they are to leave the firm (a "Retention Predictor"). And of course academics are familiar with the use of the h-index as a "big data" measure of their productivity and impact based on the patterns and number of citations their most cited papers have received.

If big data and analytics are to become more important in human resources management, the design of the "resource descriptions" for the human resources is a critical activity. So I think The Discipline of Organizing has to expand in scope to accommodate this.
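
As a reminder of how simple this particular "resource description" is, here is a minimal sketch of the standard h-index definition: the largest h such that h of a researcher's papers have at least h citations each. The citation counts in the example are invented.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    ranked = sorted(citations, reverse=True)
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Invented citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```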

Diagnosing and Improving Textbooks with Data Mining

On May 1 I heard a stimulating lecture by Rakesh Agrawal from Microsoft Research in which he described work to diagnose the conceptual coherence and comprehensibility of textbooks and then automatically select additional web resources to augment the text at the problematic location. The key measure of coherence is called "dispersion" and it captures the intuitive idea that a section of text that discusses too many unrelated or weakly related concepts will be hard to understand. So you extract noun phrases (ignoring the very frequent ones), and then search Wikipedia to build a graph of the connections among the phrases. I.e., if a section mentions "metadata," "Dublin Core," and "gasoline" we'll probably find links between the first two but not with gasoline. This dispersion measure is then combined with a readability measure based on average word and sentence lengths, and sections with high dispersion and low readability become candidates for augmentation.
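
To make the dispersion idea concrete, here is a toy sketch based only on my description above, not on Agrawal's actual formula. It assumes dispersion can be approximated as the fraction of concept pairs in a section that have no link between them, and it uses a hand-written link list where the real system would consult Wikipedia. The function and variable names are hypothetical.

```python
from itertools import combinations


def dispersion(concepts, links):
    """Fraction of concept pairs with no known link between them."""
    pairs = list(combinations(sorted(set(concepts)), 2))
    if not pairs:
        return 0.0  # zero or one concept: nothing to be dispersed
    linked = {frozenset(pair) for pair in links}
    unlinked = sum(1 for pair in pairs if frozenset(pair) not in linked)
    return unlinked / len(pairs)


# The example from the text: two related concepts and one outlier.
section = ["metadata", "Dublin Core", "gasoline"]
toy_links = [("metadata", "Dublin Core")]  # hand-written stand-in for Wikipedia links
print(dispersion(section, toy_links))
```

With these three concepts, two of the three pairs are unlinked, so the section scores two thirds; dropping "gasoline" would bring the score to zero.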

I think that TDO is very tightly written and is highly readable, but nonetheless we want it to be as good as it can be, and in particular we are creating enhanced ebook versions where we are asking the questions "what kind of enhancements like photos or interactivity add value" and "where best to incorporate them."

So I'm making a field trip down from Berkeley to Microsoft Research in Mountain View soon and will let you know if we can apply Agrawal's work to TDO.

The home page for the research project is

-bob glushko

“Paperless” vs “Digital” and “Terror” vs “Ordinary Damage”

Two stories this week in the Wall Street Journal illustrate once again how critical the choice of a category can be. A story titled "Scalpers Beware: New Laws Redefine What is a Ticket" reports that California's state legislature just voted to preserve the concept of the "paperless" ticket that can be redeemed only at the venue by the purchaser with the credit card used to buy it. This is the "Screw Stubhub Act" designed to prevent ticket resales, and the law contrasts "paperless" with "digital" tickets, which are receipts with QR codes that are scanned to gain entry to the event. They never exist on paper but according to the law they are not paperless. The key difference is that digital tickets can be resold, and Ticketmaster doesn't like that.

The second story, titled 'Terror' Threatens Insurance Payouts, reports that the Boston businesses that suffered damage from the Marathon bombs are arguing that the bombings were not an act of terror, even though Obama and others have called them that. The reason it matters is that after the Sept 11 2001 "acts of terror" insurance companies changed their policies so that "terrorism coverage" was extra since it was a qualitatively different kind of incident than your ordinary fire or other property casualty that might damage a business. So if the government makes an official categorization that what happened in Boston was terrorism, most businesses won't get their damages covered because few of them paid the extra premiums.

Automated Selection of a Publishing Venue Using “Semantic Technology”

Springer publishes over 2600 journals and maybe has decided that this is too many.  So it has created a "journal selector service" to help authors target an appropriate venue.

The Springer Journal Selector uses semantic technology to help you quickly choose the Springer journal that is right for your paper. Enter your abstract, description of your research, or a sample text and the Springer Journal Selector provides a list of relevant journals.

I can access this service at

So I thought I'd try the abstract for The Discipline of Organizing (see below) to see what Springer's "semantic technology" thinks the book is about. I'm happy to report that TDO doesn't get shoved into a narrow niche; the top six journal recommendations are:

  • Multimedia Tools and Applications
  • International Journal on Digital Libraries
  • Personal and Ubiquitous Computing
  • Journal of Science Education and Technology
  • World Wide Web
  • Education and Information Technologies

I think this nicely demonstrates that we're introducing TDO in a way that manages to communicate its multi- or trans-disciplinary essence. As a further test, I tried the first two paragraphs of TDO Chapter 1 and four of the top six recommendations were the same.

Here's the book abstract:

Organizing is such a common activity that we often do it without thinking much about it. In our daily lives we organize physical things--books on shelves, cutlery in kitchen drawers--and digital things--Web pages, MP3 files, scientific datasets. Millions of people create and browse Web sites, blog, tag, tweet, and upload and download content of all media types without thinking "I'm organizing now" or "I'm retrieving now."

This book offers a framework for the theory and practice of organizing that integrates information organization (IO) and information retrieval (IR), bridging the disciplinary chasms between Library and Information Science and Computer Science, each of which views and teaches IO and IR as separate topics and in substantially different ways. It introduces the unifying concept of an Organizing System--an intentionally arranged collection of resources and the interactions they support--and then explains the key concepts and challenges in the design and deployment of Organizing Systems in many domains, including libraries, museums, business information systems, personal information management, and social computing. Intended for classroom use or as a professional reference, the book covers the activities common to all organizing systems: identifying resources to be organized; organizing resources by describing and classifying them; designing resource-based interactions; and maintaining resources and organization over time. The book is extensively annotated with disciplinary-specific notes to ground it with relevant concepts and references of library science, computing, cognitive science, law, and business.

Instructing Readers to Read

There are some innovations in textbook design in The Discipline of Organizing, most notably:

  • The large proportion of the text that was factored into discipline-tagged endnotes
  • The use of content transclusion in glossary entries. To ensure that the glossary definitions are correct, they are transcluded from the definitions where they appear in the text rather than being copied or rewritten.
  • The completeness of the “hypertexting” of explicit and implicit references in the ebooks; if it looks like a link in TDO, it acts like a link. For example, you can follow links from headings in the table of contents to the identified section. When you follow an endnote superscript to the note, if there are bibliographic citations in the note, you can follow them to the bibliography, and if there’s a URI or DOI in the citation, you can follow it to the source on the web or in a digital library.

But how would a reader of the print TDO know about any of this? People are pretty familiar with endnotes at the end of chapters, but no textbook I know of has ever factored out 1/4 of the book into disciplinary supplements. The disciplinary tagging is explained in the book preface, but do students read the preface to a textbook?

The other two innovations aren’t explicitly mentioned anywhere. Maybe hyperlinking from the T of C is common enough, as are cross references, so perhaps we don't need to explain those. But citation linking is rarely as completely implemented as it is in TDO. As long as we're using ebook readers like iBooks and Kindle, there isn’t any way for the reader of the printed or ebook to ask for an explanation or guidance in how to take advantage of these design features of the book. So we have to develop some kind of instruction for readers.

I know this concern about whether readers can make good use of a book is not a new one. People in “media studies” or “literacy studies” are well aware of it. See Serafini, Frank. “Reading multimodal texts in the 21st century.” Research in the Schools 19, no. 1 (2012): 26-32

“As the complexity of the texts readers encounter increases, decoding, as a separate skill, becomes less an indicator of comprehension and should be viewed as only one aspect of a reader’s ability to navigate the multimodal landscapes encountered. In addition, non-linear structures, hypertext, visual images, and multimodal compositional structures need to be navigated by readers if they are to be successful in today’s educational settings.” (p. 28)

So how can we best instruct people how to make good use of TDO in print and ebook formats? I would appreciate any suggestions... Maybe we could make a movie showing how to use the book? But how do we inform readers that it exists?

--Bob Glushko

Classification Arbitrage

In economic contexts, "arbitrage" involves taking advantage of a price difference between two markets - you buy where the price is low and sell where the price is high.  A recent NY Times story discusses what we might call "classification arbitrage" - choosing a classification that gives the classified resource some property or advantage that another classification won't provide.  The story, titled "In a New Aisle, Energy Drinks Sidestep Some Rules", says that Monster Energy has changed its self-classification from "dietary supplement" to "beverage".

So what?  It turns out that dietary supplements are regulated by the US FDA, and beverages aren't.  So by changing the classification Monster no longer needs to inform the FDA about reports that link its high caffeine levels to health risks.

-bob glushko

Minimalist Self-Assessment in Ebooks, Part 2

Readers don’t pay as much attention to the graphics in textbooks as an author would like. This is especially frustrating when the graphics go beyond “decoration” or “illustration” and are designed to be “explanatory”. Perhaps we can encourage TDO readers to study the graphics by enhancing them with self-assessment guidance?

For example, Figure 1.2 Presentation, Logic and Storage Tiers is a complex depiction that is partly explained in the text:

Figure 1.2 illustrates the separation of the presentation, logic, and storage tiers for four different types of library Organizing Systems and for Google Books. No two of them are the same in every tier. Note how a library that uses inventory robots to manage the storage of books does not reveal this in its higher tiers.

Just as we can easily convert the “key points” at the end of Chapters 2-10 into self-assessments we should be able to do the same with the figure titles. Selecting the figure title could display some questions or suggestions about what to look for in the graphic, with another link to the answers or explanation like the italicized extract from Chapter 1 here.

-bob glushko

Save Our Citrus App

Section 10.7.3 of TDO is a case study titled "Japanese farms look to the cloud" that describes the organizing system for a highly automated farm. The farmworkers use mobile phones for communication, location tracking, and to take pictures of infected crops for review by an expert. Today I discovered a US analog to this last function. The Department of Agriculture has written a smartphone app called "Save Our Citrus" so people can see images of diseased fruit (like the one from my phone screenshot here), and then take photos of their own lemons and oranges and send them in for an analysis. Because I have several citrus trees in my garden, I downloaded the app to check it out and will send a photo of my lamest-looking lemon just to see what the quality of service is. The app also has a link to another US DoA site called "" that provides advice about insects that do bad things to plants.


Minimalist Self-Assessment in EBooks

We have been talking about ways to enhance TDO with interactivity, and one is to turn the Key Points at the end of TDO chapters 2-10 into self-assessments.

For example, the first three key points for chapter 2 are:

  1. Selection, organizing, interaction design, and maintenance activities occur
    in every organizing system.
  2. These activities are not identical in every domain, but the general terms enable
    communication and learning about domain-specific methods and vocabularies.
  3. The most fundamental decision for an organizing system is determining its
    resource domain, the group or type of resources that are being organized.

The simplest approach would be to hide each of these behind a linked question, like

  1. What four activities occur in every organizing system?
  2. What is the benefit of using general terms rather than domain-specific ones?

…with the idea that the student answers the question and then selects the link to see how he did…

But this is pretty shallow because the student isn’t required to be precise. So a second approach might be to pop up a note window and have the student type in the answer, so he has something concrete to compare against the answer.

And so now we are blurring the line between self-assessment and annotation. Maybe the annotation mechanism can be used here.

-bob glushko


Organizing Home Media

I just discovered a nice little article:

Sease, Robin, and David W. McDonald. "The organization of home media." ACM Transactions on Computer-Human Interaction (TOCHI) 18, no. 2 (2011): 9.

McDonald is a professor at the University of Washington Information School and Sease was a master's student there. The paper analyzes 20 interviews with people about their music and video collections (mostly about music) and discusses their organizing systems. They report that people often use multiple organizing schemes, which they describe in nice detail. Sease and McDonald don't use the exact vocabulary we use in TDO, but their way of talking about principles, resource properties, and the interactions supported by each scheme is very consistent with TDO.

If I'd found this paper earlier, it would be in the book. I think it will make a good supplemental reading or could be used in an assignment.

Publication Schedule for the book and ebooks

The Discipline of Organizing book is on its way to the printer and should be available in April 2013. Epub2 (most ebook readers) and mobi (Kindle) formats will be available at the same time.

-bob glushko

L28. Mobile and Multimedia IR

Lecture Readings

Schmitz, Patrick and Black, Michael – “The Delphi Toolkit: Enabling Semantic Search for Museum Collections”

Hearst, Marti – “Search User Interfaces” (Ch. 12 – “Emerging Trends in Search User Interfaces”)
