Redefining RA: The Ideal Tool

Large-scale tagging projects outside libraries put users at the center and offer a model for readers' advisory

By Neal Wyatt -- Library Journal, 10/15/2009

Libraries are not so unlike market giants Netflix, Amazon, and Pandora. That is, they all thrive when they succeed at helping users make the next choice. Despite our significantly different business models and missions, we share an interest in facilitating the search for information, including what to read, view, and listen to next. The desire to harness data, particularly user-oriented data, is driving innovation by these companies and others, and librarians of all stripes can and should learn from them. We, too, want our patrons to be able to use our tools to make connections among titles, whatever the format, match something with what they desire, and find the content they want, again and again.

In libraries, appeal is the readers' advisory (RA) framework used to help patrons find books they will enjoy. An entirely reader-driven concept, it rests upon the perceptions and desires of the reader. It considers the facets of a reading experience such as pacing, characterization, story line, and tone. But our RA tools at best make a cursory nod toward appeal and, at worst, ignore the concept entirely. We should be striving to integrate the concept of appeal better into the databases we use and make those tools continually responsive to the influence of the very people we serve.

The RA community has yet to demand appeal-based searching, and RA database designers have yet to deliver it, both of which need to change. It is a huge problem to figure out how to get appeal, that slippery stuff of ephemeral feeling and ill-defined synonyms, into what is basically a mathematical construct, but it is not impossible. Consider our cultural cousins in art and music. They, too, have complex appeal worlds to navigate, but Steve: The Museum Social Tagging Project and Pandora, an offshoot of the Music Genome Project, are both exploring innovative ways to make good use of the concept of appeal.

Project Steve

Steve is a collaboration of museum professionals who are exploring the use of social tagging to describe works of art and other collection objects. In a range of different interfaces, the museums working with Steve ask visitors to tell them what words describe the objects they see. Why do this? A few years ago, after the Metropolitan Museum of Art had created a state-of-the-art online collection for visitors to view objects, Susan Chun was in her office at the Met, looking at a report of search findings. She saw a lot of bad results. Visitors were getting either no hits or incomplete ones. They walked away from the catalog thinking the Met, with one of the largest collections in the United States, did not own a particular item—an item Chun knew could be hanging on a gallery wall. Now Chun is the project lead and co-principal investigator of Steve.

There was obviously a "semantic gap between the ways users queried and how we described," says Chun. "The system was not adequate to serving users who thought of art in systems of color, emotion, ideas, and events." It turns out that the Met's catalog was built to do what most library catalogs, and RA databases, do: support the work of the museum but not the work of the visitor. A painting is cataloged to support curatorial access, stewardship, and research. A record includes such things as condition, attribution, dates, materials, techniques, and storage or gallery location. Surprisingly, little in the record matches what the public sees when they look at an object, such things as color, content, and emotional response. Records designed for curators are largely useless to a visitor who knows she wants to view luminous paintings of sunsets but does not know Monet. Just like in a library, with our insider language "in processing" and "on order," the assumption in many museums is that the staff will train the viewer to see an object the way a curator does.

The visitor's view

The search report got Chun thinking. Just what could make the catalog work? Steve was born, named neither as an acronym nor after a real person but simply because Steve is easy to remember and approachable. Chun and the other museum professionals at Steve "started working on ways to gather terms reflecting the ways that users search online collections," says Chun. "The Steve team wanted to find a way to discover a visitor's view." Steve asks visitors to tag works of art themselves, providing their own frames of reference and access points. Users tag on color, shape, emotion, and descriptions of objects or events. They enter tags on materials and methods and sometimes offer evaluations. Tags range from very personal perceptions to general descriptions to expert terminology. For example, tags for Port of Saint-Cast, a 19th-century landscape by Paul Signac, cover style (Pointillism), feel (calm, quiet, idyllic), colors (blue, yellow), and content (beach, beach with sailboats, coast, harbor, water, sky, sailboats, sand, sea, seashore, ocean, dots).

Steve also hopes to gather tags from a wide range of experts and resources. From the start, the Steve team thought that visitors' tags would be most useful when combined with other kinds of descriptions including terms taken from the museum catalog, from other museum publications, and from subject experts in other disciplines. For example, sending images of landscapes to botanists to ask about plant names allows the team to gather information that has not existed as part of the object record before. The Steve researchers settled on tagging because, as Chun points out, tagging is "a low barrier way" to surface connections based on the visitor's point of view. It is also illuminating. The initial research from Steve revealed that 82 percent of the terms viewers use to tag objects do not appear in the catalog records of those objects.
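A figure like that 82 percent is, at bottom, a set comparison between what visitors say and what the record says. The following is a minimal sketch of that comparison, not the Steve team's actual method; the function name and the sample tags and record terms are invented for illustration.

```python
def share_missing_from_record(visitor_tags, record_terms):
    """Return the fraction of distinct visitor tags absent from the catalog record."""
    # Lowercase and trim so that "Harbor" and "harbor " count as the same term.
    tags = {t.strip().lower() for t in visitor_tags if t.strip()}
    record = {t.strip().lower() for t in record_terms if t.strip()}
    if not tags:
        return 0.0
    return len(tags - record) / len(tags)


# Hypothetical sample data, not taken from the Steve project.
visitor_tags = ["Pointillism", "calm", "blue", "beach", "sailboats", "harbor"]
record_terms = ["oil on canvas", "Pointillism", "landscape", "harbor"]

gap = share_missing_from_record(visitor_tags, record_terms)
print(f"{gap:.0%} of visitor tags do not appear in the record")
```

A library could run the same comparison between reader-supplied tags and the subject headings in a bibliographic record to see how wide its own semantic gap is.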

Putting tags to work

While Steve has collected hundreds of thousands of tags, gathering the data is the easiest part; putting it to work is more difficult. Ultimately, tags will have to be sifted and processed, weighted and classified. They will be turned into cataloging and database sets. The work is only beginning on what Chun calls "the hard science," the algorithms that churn the databases to pull out useful connections. These tools take a great deal of brainstorming and whiteboard space to develop, and there are lots of questions of vocabulary and connective tissue to discover. For example, Steve is working on developing comparison tables to create synonym charts so "beautiful" and "lovely" cross-connect. The team is also trying to figure out how to make tags less ambiguous and build interfaces that will encourage the production of the most useful tags.
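The synonym charts described above can be pictured as a lookup table that folds variant tags into one canonical term before anything is counted. This is only a sketch under that assumption; the table entries and function names are hypothetical and do not come from Steve.

```python
from collections import Counter

# Hypothetical synonym chart: each variant maps to one canonical term.
SYNONYMS = {
    "lovely": "beautiful",
    "pretty": "beautiful",
    "seashore": "coast",
    "sea": "ocean",
}


def normalize(tag):
    """Lowercase, collapse whitespace, and map a tag to its canonical term."""
    cleaned = " ".join(tag.lower().split())
    return SYNONYMS.get(cleaned, cleaned)


def weigh_tags(raw_tags):
    """Count tags after normalization so that synonyms reinforce one another."""
    return Counter(normalize(t) for t in raw_tags)


counts = weigh_tags(["Lovely", "beautiful", "  pretty ", "sea", "ocean", "dots"])
print(counts.most_common())
```

With the chart in place, three differently worded tags all add weight to "beautiful," which is the cross-connection the team is after. Deciding which terms truly belong together remains the hard, human part.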

Steve began as a way to create a better finding and discovery tool, but the results have shown that better access is only one of the outcomes. Chun points to a deeper engagement between the museum and its visitors and better understanding of how visitors view objects. It turns out that being asked to tag objects makes visitors feel that their opinions matter and are valuable to the museum. In an age when all cultural institutions are in dire need of public support, that is no small thing. Understanding how viewers see objects has been, in many ways, Steve's biggest breakthrough. Tagging has allowed users to speak without fear of saying the wrong thing about art to an expert. And museum curators and educators are gaining vital insight into what visitors seem to know, understand, like, dislike, and think of the collection, and what they connect and group together. This information "provides a wonderful opportunity to inform practice," says Chun, "what we teach, what we write about, how we write about it, what we exhibit, what we digitize, and how we catalog. This kind of evidence of visitor mind-set is something we have never had before. It leads to a fundamental shift in what we do and our relationship with our audience."

Pandora

Pandora is an online music service that creates playlists of surprising matches and new discoveries for a broad range of music based on listener input. If you have not yet played with Pandora, its book equivalent would be typing into a database that you enjoy Lorrie Moore and getting a rotating list of author read-alikes back. You might think we can do that now, and, indeed, there are databases we can buy that include read-alike suggestions. The question remains, what drives the match? When a database says that the early books of Anne Tyler are good matches for Lorrie Moore, is that connection built on appeal or on subject headings and genres? Odds are, unless it is a hand-built list, it is generated by subjects and narrowed by genre. But it does not have to be. Pandora offers an interesting model for thinking through how RA databases could one day work.

Once a listener enters a song or an artist's name, Pandora begins to make "some near-instantaneous, behind-the-scenes calculations," says Nathan Altice, an adjunct professor of sound communication and Ph.D. student in the media, art, and text program at Virginia Commonwealth University, Richmond. It then builds playlists of songs that are "musically similar" to the listener's starting song. The algorithms driving Pandora are engineered to account for multiple instances of listener input. As Altice explains, "If users so choose, they can make further refinements. The more time the user spends customizing, the better Pandora 'learns' about their musical preferences."

How does Pandora do this? Almost a decade ago, musician Tim Westergren and his colleagues Will Glaser and Jon Kraft created the Music Genome Project. Their goal was to break music down into its smallest fundamental parts and create a taxonomy. The results laid the groundwork for Pandora. The creation of the musical genome was a combination of scholarship and invention. "There is some amount of historical music theory," says Westergren, "but essentially it was a whiteboarding exercise and a lot of forensics." They started with the principles of SHMRFT (sound, harmony, melody, rhythm, form, and text) and went on from there. SHMRFT is one definition of the fundamental building blocks of music, and it is not so far away from the building blocks of appeal.

Extrapolating connections

The questions the project asked are both fascinating and fundamental. For example, the team knew that rhythm mattered, but did something like the wah-wah effect applied to a guitar factor into liking music in the same way? And if so, how deeply did it matter? When the Music Genome Project was finished, its creators had broken down music elements into hundreds of factors that accounted for such things as the reputation of the musician, the aggressiveness of the drumming, and the articulation of certain instrumental solos. The result was a system that could extrapolate connections based on multiple discrete elements and codify those elements together to create something very close to an appeal-based playlist. Unlike Steve, which relies heavily on user-generated tags, Pandora uses discrete elements assigned by a staff trained in music theory.

"A large part of the Music Genome Project's success," says Altice, "is that it closely models the brain's evolved cognitive processes. Music is separated into its constituent components and graded according to an exhaustive set of categorical criteria: Is the tempo fast or slow? Are the pitches organized into a major or minor key? Is the singer male or female? What is the overall melodic contour? Each characteristic recorded is then coordinated into broader, more manageable categories, such as genre or style. Thus, Pandora creates hierarchies of groupings that are parsed according to the listener's interest. Our brain goes through a similar process every time we hear music."

A Pandora-based match goes something like this. Say a listener really enjoys "These Are the Days" by Jamie Cullum. Pandora matches it to "Scenes from an Italian Restaurant" by Billy Joel. Here is why: both are piano based and have heavy percussion and straightforward instrumentation. The vocals are slightly nasal, are in alto, and have what Pandora calls "Acoustic Sonority." The lyrics are well enunciated, and, says Westergren, "they are really prominent as an element in the tune; it's a basic rock song structure, and there is a little blues in there, too."

What is most important when making a match? "The tempo," says Westergren, "the groove, the rhythmic qualities. Is it in 3, or 6, or 4/4?" After that it is a toss-up, based on the contents of the song, the instruments used, vocals, and major and minor keys. The toss-up factor might account for the shift in the Pandora model as it was implemented and gained a wide fan base. Now, moving closer to Steve, Pandora also weights listener feedback into the music played, changing how choices are constructed when the community makes it clear that the playlist needs correction.
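To make the idea concrete, here is a minimal sketch of a weighted match with a feedback nudge. It is not Pandora's algorithm, which is not described in that detail here. The trait names, the 0-to-1 scores, the weights (rhythmic traits counted most heavily, following Westergren's remark), and the size of the thumbs adjustment are all assumptions made for illustration.

```python
# Hypothetical trait weights: rhythmic traits count most, the rest equally.
WEIGHTS = {"tempo": 3.0, "groove": 3.0, "piano": 1.0, "vocal_prominence": 1.0}


def similarity(a, b, weights=WEIGHTS):
    """Weighted similarity in [0, 1] between two songs scored 0..1 per trait."""
    total = sum(weights.values())
    distance = sum(w * abs(a[t] - b[t]) for t, w in weights.items())
    return 1.0 - distance / total


def rank(seed, catalog, feedback=None, step=0.1):
    """Order catalog titles by similarity to the seed, nudged by net thumbs votes."""
    feedback = feedback or {}
    scores = {
        title: similarity(seed, traits) + step * feedback.get(title, 0)
        for title, traits in catalog.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scores; real trait data would come from trained analysts.
seed = {"tempo": 0.6, "groove": 0.7, "piano": 0.9, "vocal_prominence": 0.8}
catalog = {
    "Song A": {"tempo": 0.6, "groove": 0.6, "piano": 0.8, "vocal_prominence": 0.9},
    "Song B": {"tempo": 0.2, "groove": 0.3, "piano": 0.9, "vocal_prominence": 0.8},
}

print(rank(seed, catalog))                   # expert traits only
print(rank(seed, catalog, {"Song A": -3}))   # after three net thumbs down for Song A
```

On expert traits alone, Song A is the closer match. Once the community votes it down, Song B moves ahead, which is the kind of correction the paragraph above describes.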

Both Steve and Pandora strongly push the integration of the user perspective, be it through Steve's user-generated tags or Pandora's community thumbs up/thumbs down voting. The user's perspective, what the user sees, hears, and responds to, drives the connections.

The RA connection

Between Pandora and Steve there might be an answer for libraries. The elements that matter in the work of Steve (color, emotion, objects) and Pandora (tempo, lyric, key) are the appeal cousins of RA's pace, character, language, setting, tone, and detail. The underlying desire of a viewer to connect with Port of Saint-Cast or a listener to jump from "These Are the Days" to other songs is the same desire that makes a reader want to go from Lorrie Moore to Anne Tyler. Shouldn't we then be asking for RA databases to follow Steve and Pandora and be built upon expert suggestions from a trusted peer group, reader input, and expansive appeal and subject cataloging, all supported by finely detailed algorithms capable of taking the data we feed them and coming up with nuanced suggestions? Or is there another path?

We should be inspired by the creativity at Netflix, for example, which recently tapped the brains of the research community with a contest to provide better recommendations in exchange for the use of massive data. Similarly, libraries have access to data (that can be stripped of personal information) and a motivated and massive user group that could be mined. Could RA-rich data become open access? For instance, instead of being segmented among RA databases, LibraryThing, GoodReads, blogs and review sites, and individual library system catalogs, RA content could be shared. It could become a rich collaboration of authors, readers, reviewers, publishers, and librarians. Database companies could compete not to own the data but to be the best at constructing algorithms to crunch it, at creating the best interface for it, and at integrating the data into our systems.

What we need

However we frame the questions, we need to ask them, and we need to start seeking answers. To get started, LJ convened another RA Big Think (see "An RA Big Think," LJ 7/07, p. 40–43) to ask what should be in the next generation RA database. The suggestions reflect what readers are asking on the floor, and thus what RA librarians need, as well as the group's own wishes as avid readers and hopes as RA experts. [For visual takes on the brainstorm, see diagrams on p. 40–42.]

As might be expected, appeal comes up a great deal. Almost everyone stressed the need for appeal-based searching and description. Searching is also a big concern. It is clear that the current methods of searching are not serving all our needs, nor are they creative enough to serve our patrons' point of entry. Just like that picture hanging in the Met, the book our patron wants might be waiting on our shelves, if only we had a way to find it by the jacket cover or a misremembered quote.

The content of the item record, beyond the basic bibliographic data, was also the focus of much conversation. Seattle-based Nancy Pearl, author of Book Lust and Book Crush, among others, and Jen Baker, a fiction librarian and readers' advisor at the Seattle Public Library (SPL), focused on author data, requesting author photos, links to interviews, and biographies that move beyond the typical PR line and link to an author's web site. David Wright, also a fiction librarian and readers' advisor at SPL, wants more detail on the time period and location of a title. Cindy Orr, Cleveland-based library consultant and editor of ReadersAdvisorOnline.com, desires series information, both publication order and reading order. Everyone craves lots of reviews, from a wide range of sources including the library press and from selected specialized magazines, popular magazines, book blogs, web sites, and information, as Wright puts it, "off the mainstream radar." To augment this, everyone also hopes for richly detailed plot summaries. Also often discussed were read-alikes and booklists, especially those on authors we have never heard of, as well as authors in high demand. We want them supported by why the matches were made, not just a listing. Tell us why Sara Donati is a good pairing with Diana Gabaldon; otherwise, the list is fairly useless.

The paths Steve and Pandora take also came up. Almost everyone asked for tagging and the ability for readers to tag and give input. We want other tags added from reviews, along with the ability to contribute tags ourselves. And we need this searchable in what amounts to, as Wright puts it, "a sophisticated and fine-grained tag cloud supported by advanced search features." We also pine for databases to act like Pandora, allowing us to ask for a read-alike for a particular author and receive an appeal-based list of explained suggestions.
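What an "explained" read-alike might look like can be sketched in a few lines. This example assumes each title carries a set of appeal terms and scores matches by overlap. The placeholder titles, the appeal terms, and the scoring choice are hypothetical, and no existing RA database is being described.

```python
def read_alikes(seed, appeal_tags, limit=3):
    """Return (title, score, shared_terms) tuples, best match first."""
    seed_terms = appeal_tags[seed]
    results = []
    for title, terms in appeal_tags.items():
        if title == seed:
            continue
        shared = seed_terms & terms
        if not shared:
            continue  # nothing in common, so nothing to explain
        # Jaccard overlap: shared terms divided by all terms on either title.
        score = len(shared) / len(seed_terms | terms)
        results.append((title, score, sorted(shared)))
    results.sort(key=lambda r: (-r[1], r[0]))
    return results[:limit]


# Placeholder titles and appeal terms, invented for illustration.
appeal_tags = {
    "Title A": {"character-driven", "wry", "leisurely", "domestic"},
    "Title B": {"character-driven", "wry", "domestic", "bittersweet"},
    "Title C": {"fast-paced", "plot-driven", "suspenseful"},
    "Title D": {"leisurely", "lyrical", "character-driven"},
}

matches = read_alikes("Title A", appeal_tags)
for title, score, shared in matches:
    print(f"{title} ({score:.2f}): shares {', '.join(shared)}")
```

The shared terms are the explanation: each suggestion arrives with the reasons it was made, not just a place on a list.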

In addition to appeal, more subtle and high-concept ways to search, more reviews, and read-alikes and booklists that include why the titles were matched, the Big Think team wants more sophisticated interaction with the data, enabling something fairly elusive—that spark of, yeah, that might work. As Joyce Saricks, author of The Readers' Advisory Guide to Genre Fiction, points out, "We are looking for possibilities rather than answers."

Complementary human expertise

The team is looking for a way to explore connections. SPL's Baker imagines it as a mind-mapping tool that shows in visual form possible connections and tangents, one that is created based on the input from the widest array possible of readers, librarians, reviewers, authors, and other experts.

This gets pretty close to what Altice calls "curatorial cyborgs," a melding of human experts and intricate computer-based algorithms. And it might be the direction in which we need to venture. RA has the human experts; what we need now is a database that manages to meld rich RA-infused data with an algorithm that lets us use it as we will.

If the day comes when a reader can open an RA database, input the title of a beloved book, and get back a list of suggestions that was collaboratively developed based on appeal, a range of expert input, and the books other readers suggest who also loved that title, then we will be well on our way to a database that supports our work. When an RA librarian can spend an hour working on a booklist using a tool that enables her to explore and map, note and annotate, and jump from one idea to the next, knowing that what she is exploring has been culled from authors, readers, reviewers, other RA librarians, and a bevy of additional experts who have all been cooperatively building a rich pool of data for years, and that behind them the database is tracking and creating a map of everything the librarian asks to be noted, then we might have arrived. For now, we can draw on the vision of the perfect database record and search to inspire such a future.


Author Information
Neal Wyatt, author of The Readers' Advisory Guide to Nonfiction, writes LJ's Wyatt's World and RA Crossroads and edits LJ's Reader's Shelf column. She lives in Richmond.

 

The Big Think Team

Besides the author, Neal Wyatt, the Big Think Team consists of:

Jen Baker, book reviewer and librarian, Seattle Public Library

Cindy Orr, library consultant and editor, Reader's Advisor Online Blog

Nancy Pearl, creator of the "One Book" program and author of Book Lust, More Book Lust, and Book Crush

Joyce Saricks, readers' advisory consultant and author of The Readers' Advisory Guide to Genre Fiction and Readers' Advisory Service in the Public Library

David Wright, librarian at Seattle Public Library and "He Reads" columnist for Booklist





 
