Boing Boing points to a description of an unconventional filing system proposed by Japenese economist Noguchi Yukio:
- All documents, regardless of their class, level of importance, or perceived chance of being required at a later date are stored in A4-sized envelopes, which have the flaps cut off[...] By "all documents," Noguchi means just that. He puts all categories of documents, including things like membership lists and his passport in envelopes.
- The title and date of the document are marked on the side of the envelope, as shown, and the envelopes are stored vertically on a bookshelf.
- Absolutely no "classification" of documents is attempted. The color coding is optional, and used only to shorten the amount of time to find a document.
- New documents (envelopes) are added at the left end of the "envelope buffer," and whenever a document is used (i.e., the envelope removed from the shelf), it is returned to the left end of the bookshelf. The result of this system is that the most recent (and frequently) used documents migrate to the left, while documents that are not used often or not used at all migrate to the right.
Algorithms folks should immediately recognize this as the Move To Front heuristic for dynamic list maintenance: After accessing any item in a linked list, move it to the front of the list. The cost of accessing an element of the list is equal to its rank in the list. So if you access the same item twice in a row, the second access has cost 1.
Moreover, assuming your file drawer has a fixed size, the Noguchi Filing System is also the same as the Least Recently Used paging algorithm buried inside every operating system on the planet: When your memory (or filing cabinet) overflows, throw away the page (or file) that you accessed least recently. The cost accessing a page is 0 if the page is in memory, and 1 if it isn't.
These are both classical examples of competitive online algorithms. An online algorithm is an algorithm that responds to a sequence of requests—for example, "Put this file in the drawer", or "Get my passport out of the drawer". The algorithm must handle each request as soon as it arrives. There is some cost associated with the algorithm's behavior—for example, the number of files that we examine. At the end of the day, we will most likely look back and realize that the algorithm made some sub-optimal decisions that led to extra work, but such wasted effort is inevitable. The algorithm can't look into the future; it has to handle each request immediately. Our goal in designing a good online algorithm is to minimize the amount of wasted effort.
This goal can be formalized as follows. Let A(X) denote the cost of online algorithm A on request sequence X, and let OPT(X) be the cost of the best possible offline algorithm, meaning an algorithm that knows the entire request sequence in advance. The competitive ratio of algorithm A is the maximum, over all request sequences X, of the ratio A(X)/OPT(X). If the competitive ratio is R, we call the algorithm R-competitive. We want this ratio to be as small as possible.
My poor departed high school English teachers are spinning in their graves at the ugly phrase "competitive ratio". After all, it's not the ratio that's competitive, but the algorithm. It really should have been called "competitiveness". But once the term was introduced, it stuck. Alas, we theoretical computer scientists aren't known for our L33t gr@mm3r 5ki11z.
In fact, both MTF and LRU are motivating examples in the paper of Danny Sleator and Bob Tarjan [pdf] that first introduced competitive analysis. Sleator and Tarjan proved the following theorems, among others:
- Move to Front is 2-competitive, and this is the best possible competitive ratio of any (deterministic) online list-access algorithm.
- Least Recently Used is M-competitive, where M is the number of pages that fit into memory, and this is the best possible competitive ratio of any (deterministic) online paging algorithm.
So no wonder the Noguchi Filing System is so attractive—in a very real sense, it's the best possible filing system!
Well, almost. We can do slightly better, at least on average, if we introduce some randomness. The following list-access algorithm, called BIT, is 7/4-competitive. Whenever you access a file that you've accessed before, move it to the front of the list. But the first time you access a file, flip a coin, and move the file to the front of the list only if the coin comes up heads.
Even this isn't the best algorithm known; another slightly more complicated algorithm is 8/5-competitive. On the other hand, the best lower bound known for the competitive ratio of randomized list-access algorithms is 1.50084, so there is still lots of room for improvement.

Recent Comments