You’re surfing the Web and discover a site that seems to have been waiting just for you. It knows your likes and dislikes, the last book you bought, and maybe even the kind of coat you’ll be shopping for this winter. Want to bring this technology to your own site? Read this book excerpt to learn more.
The Insider’s Guide to Collaborative Filtering and Recommender Systems
Any sufficiently advanced technology is indistinguishable from magic.
-ARTHUR C. CLARKE, PROFILES OF THE FUTURE (1962)
Man is a tool-using animal….
Without tools he is nothing, with tools, he is all.
-THOMAS CARLYLE, SARTOR RESARTUS (1833-34)
A Brief History of Collaborative Filtering
Collaborative filtering is at the same time very new and very old. At its core, collaborative filtering is any mechanism whereby members of a community collaborate to identify what is good and what is bad. Even in prehistoric days, our species relied upon informal collaborative filtering. When a tribe encountered some new berry or root, they didn’t all eat it simultaneously. Some people watched to see if the others became ill. If those who had eaten did get sick, the others would use this strong negative recommendation by avoiding that food (e.g., the deadly nightshade); if not, they would eat it themselves.
All sorts of knowledge was gained through observation of our neighbors. We learned which animals were dangerous and which were tasty. Then, as we developed into societies where people had time for art, philosophy, and science, the same process of collaborative filtering helped us decide which creations and theories were worth our time and money.
We imagine that many listeners sat transfixed by Homer’s stories. Others might have considered him overrated and lobbied for their own favorite storyteller. These people, though they may not have known it at the time, were critics. As early as we had choices to make, we found critics to guide us. Today, as then, we can choose from among a variety of critics. Film critics help us decide which movies to see, theater critics lead us to the right play, and restaurant critics suggest a place to eat beforehand or afterward. In the financial world, analysts, brokers, and advisers recommend places to invest our money, and then members of the press critique the critics, helping us to better select our analysts, brokers, and advisers.
In addition to critics, we have editors and publishers to filter material for us. Because printing and distributing information can be expensive, commercial editors and publishers assess the marketability of a submitted work. On the opposite end of the spectrum, university presses, religion-affiliated publishers, and other not-for-profits are charged with forwarding an agenda. In either case, the editor and publisher work to identify content that they feel is worth distributing and, by implication, worth reading.
All of these (editors, publishers, critics, and even cave people) are examples of collaborative filtering. Collaborative filtering exists wherever people help separate the wheat from the chaff, so the rest of us don’t have to. In return, we reward them (all except the cave people) with our patronage and purchases.
The examples we’ve described above are all manual processes of collaborative filtering, however. Editors, publishers, and critics pick products based upon their expert opinions. In other words, by hand. They don’t use automated systems to make these decisions for them. As a result, these editors and critics can’t tailor their products and reviews for each individual customer. The New Yorker, for instance, doesn’t publish a different magazine for each customer based upon his or her fields of interest. Instead, the New Yorker’s audience is universal (their entire customer base and beyond). Sometimes, though, people want and need personalized material. This is where automated information-gathering systems come in: namely, collaborative filtering and its two predecessors, information retrieval and information filtering.
Collaborative filtering is not the only, nor even the most prominent, technology for helping people find what they want. As soon as large collections were created, people organized them for better search performance. From the Great Library at Alexandria to the Library of Congress, organization and cataloging have made content more accessible. Through combinations of author, title, and subject indexes, library patrons can quickly and easily find books that match explicit search criteria.
Indeed, the problem of information retrieval—finding information in a large catalog—lent itself well to computerization. Gerard Salton’s seminal 1968 book, Automatic Information Organization and Retrieval, set forth the mechanisms for automatically indexing documents by examining the terms used within them (or within titles and summaries). Once a collection of documents is indexed in this way, a user can search for specific terms and retrieve documents of interest. Today, we see wide proliferation of such systems, including the widely used Web search engines. As the demand for search continues to grow, the quality of information-retrieval systems escalates as well.
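To make the indexing idea concrete, here is a minimal sketch in Python. It is our own illustration, not Salton’s method: the function names build_index and search, the sample documents, and the simple whitespace tokenization are all assumptions chosen for brevity.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return matches


# Hypothetical three-document collection, indexed once and searched many times.
documents = {
    "d1": "automatic information organization and retrieval",
    "d2": "collaborative filtering for movie recommendations",
    "d3": "information filtering and clipping services",
}
index = build_index(documents)
print(search(index, "information filtering"))
```

The collection is indexed once, up front, and each short-lived query is answered from the index. That is the "stable collection, ephemeral need" pattern in miniature.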
Information-retrieval systems address a particular information niche: cases where users have ephemeral information needs and want to meet those needs by using a relatively stable, indexed collection. In addition to Web searches and library catalogs, this niche includes a wide variety of search tasks from finding a file on your computer’s hard drive (when you’ve forgotten where you stored it) to searching through newspaper archives or corporate records.
Sometimes, however, the situation is reversed. Users may have a relatively stable information need, and want to check new information content to see if any of it meets that need. Information-filtering systems address this niche by either being told or learning the user’s need, and then examining a stream of new content to select items that meet it. The simplest information-filtering examples are clipping services. A corporate executive may want to see any newspaper articles that mention his company or its competitors. More sophisticated information-filtering systems help travelers find out when flights to a particular city go on sale, or avid readers discover when a favorite author has published a new book. Internally, information-filtering systems look like the mirror image of information-retrieval systems. The database stores a wide range of user profiles (or queries), and each new document gets passed through this database to see which profiles are triggered.
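The mirror-image structure described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each stored profile is a set of keywords, a profile is "triggered" when any keyword appears in the new document, and the profile names and sample article are invented.

```python
def triggered_profiles(profiles, document):
    """Return the users whose stored keywords appear in a newly arrived document."""
    words = set(document.lower().split())
    return sorted(user for user, keywords in profiles.items() if keywords & words)


# Hypothetical standing profiles: here the stored "database" is the queries.
profiles = {
    "executive": {"acme", "globex"},
    "traveler": {"flights", "fares"},
    "reader": {"novel"},
}

# Each new document is passed through the profiles as it arrives.
article = "Acme announces merger as flights to Chicago go on sale"
print(triggered_profiles(profiles, article))
```

Here the profiles are long-lived and the documents stream past them, the reverse of a search engine, where the documents are stored and the queries come and go.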
Computerized Collaborative Filtering
Both information retrieval and information filtering help people manage the problem of information overload by directing them to items that match their interests. Until recently, that seemed like enough, but the quantity of content available keeps increasing. In 1970, it may well have been possible for a corporate officer to read every article mentioning his company. By the year 2002, with the wide distribution of content on the Internet, it would take a team of officers to keep up. Something needed to be done to help people find items not only by topic, but also by quality or taste—and that something was the automation of collaborative filtering.
Computerized collaborative filtering sprouted in three directions in quick succession, resulting in three different but compatible systems: pull-active CF, push-active CF, and automated CF. Since the three systems perform different tasks, you might even find all three on the same Web site.
Tapestry is widely recognized as the first computerized collaborative filtering system. Developed at Xerox PARC as a research project, Tapestry was designed to help small workgroups team up to figure out which articles (usually electronic-bulletin-board articles) were worth reading. Tapestry users could annotate articles, for example, by marking them as “Fred should look at this” or “Excellent!!!” Other users could ask the system to find articles that met specific criteria, including the keywords in the article (à la information retrieval and filtering), the annotations left by others, and even the actions others took when seeing the article. For example, a user might say, “I want to see all the articles that Mary replied to, since if Mary replied to them, they must have been interesting.” This type of collaborative filtering has become known as pull-active collaborative filtering because a user takes an active role in pulling recommendations out of the system (by forming queries).
Soon afterwards, David Maltz and Kate Ehrlich at Lotus Research developed a prototype push-active collaborative-filtering system. In their unnamed system, users who read interesting messages could easily “push” the content to others who might also value it. In some ways, this resembles today’s joke-distribution chains, where jokes are forwarded to friends who (hopefully) share the same sense of humor. In organizations, a select number of people share the responsibility of gathering information and distributing it to the right people. These people, either officially or unofficially, serve as information hubs. Push-active CF made their tasks much easier.
At about the same time, the GroupLens project was developing automated collaborative filtering. The major difference between active and automated collaborative filtering is that active collaborative filtering requires human effort to establish the relationship between the people making and the people receiving recommendations. Accordingly, active solutions work best in small communities (workgroups, friends, or family) where people already know each other and their tastes.
Automated collaborative filtering uses each individual’s history of interaction with the system to identify good recommenders for that individual. In the simplest form, automated collaborative-filtering systems keep track of each item a user has rated, along with how well the user liked that item. The system then figures out which users are good predictors for each other by examining the similarity between user tastes. Finally, the system uses these good predictors to find new items to recommend.
Soon after GroupLens appeared, a number of other automated collaborative-filtering systems emerged—clearly systems that were developed in parallel. MIT’s Media Lab debuted the Ringo (later Firefly) music recommender, which used collaborative filtering to help people find music. And Bellcore created the Video Recommender, in which people rated movies by E-mail and received recommendations in reply. The number of independently generated collaborative-filtering systems suggests that its time had surely come.
The Role of Today’s Marketer
With all this new recommender technology, some marketers are understandably concerned about their future. Will they be replaced the way tollbooth operators are being replaced by E-ZPass? Or as assembly-line workers have been replaced by machines? No. Recommenders need the right data, placement, and follow-through. They need human insight and direction. And they’re only part of an overall marketing strategy. Recommenders are like tools on a carpenter’s belt. So can marketers sleep easy? Yes, provided they know how, when, and where to employ recommenders.
This book examines many different recommender systems, in addition to collaborative filtering, so that marketers can initiate, update, or overhaul their recommendation practices. First, though, we should explain what we mean by marketing. The role of marketing and the marketer has evolved to keep pace with technological advances. Now, when we think of marketing, two separate fields emerge:
1. Manufacturer and wholesale marketing
2. Retailer marketing
Manufacturer and wholesale marketing refers to the efforts undertaken by manufacturers and distributors to promote products to merchants and the public, increase brand awareness, and generally position, price, and otherwise define the brand identity of a product. This is not the type of marketing we’re addressing here.
Retailer marketing, which for small marketers has often been synonymous with merchandising, focuses on the smaller, more local decisions of how to promote, price, bundle, and sell products. Because this type of marketing can engage individual customers, it is the most ripe for recommender systems. This has been our area of study.
For small retailers, sales and marketing may overlap. A bikestore owner might decide that she needs to sell more Cannondales because of their high profit margins, so she advertises a free helmet with the purchase of a Cannondale mountain bike. When her customers come in, she can recommend the bike-and-helmet specials based upon biking preferences they’ve demonstrated in the past. And if she knows that the customers prefer recumbent bikes, she won’t waste their time on Cannondale’s Jekylls and Scalpels.
Marketers for large retailers have lost touch with the customer; they make decisions that guide and drive sales from a distance. To deliver the personalized service customers demand, they need to narrow the gap. We’re not here to suggest that marketers become salespeople. We simply want them to deploy marketing techniques to serve each customer personally, the way good salespeople do.
The first step away from sales was mass marketing: the idea that a catalog, advertisement, or flyer could be sent to an extremely wide audience to get them to come buy. These marketing tools were necessarily untargeted but relatively cheap to produce. In an age of few alternatives (Sears or Montgomery Ward?), mass marketing works fairly well. But generic advertising doesn’t reach out to people who are different from average. And it doesn’t work when customers have many shopping alternatives to choose from.
The chinks in mass marketing’s armor revealed themselves as media became more specialized. In the latter half of the twentieth century, a media explosion allowed marketers to narrowcast to audiences described by income, age, sex, race, religion, geography, and interest area. An advertisement for a product in a young women’s magazine, for example, could address a different audience than an advertisement in a men’s or parents’ venue. In addition, the availability of categorized mailing lists made it possible to send different mailings and offers to smaller groups of people. Instead of getting a generic message, people received messages they were more likely to identify with.
Even demographic-based marketing has its limits, though. Real people don’t fit cleanly into catalogs or simple categories. Some people reading young women’s magazines are those young women’s parents—people unlikely to be reached by the same advertisements. Some people in wealthy neighborhoods are cash-poor. Other “millionaires next door” live in modest neighborhoods and have no characteristics that reveal their wealth. These people fall through the cracks when using simple demographics, like wealthy neighborhoods, age, race, or sex, to determine marketing strategies.
Two things happened, largely in parallel, as technology continued to advance. Customer-relationship-management software and computerized record-keeping tools made it possible to pursue one-to-one marketing. This marketing model, first popularized by Peppers and Rogers in their 1993 book The One to One Future, makes an effort to treat customers individually, if only by tracking and remembering their preferences. At the same time, the Web and improvements in printing technology created cheaper delivery mechanisms. The Web, unlike physical stores, could present each customer with unique interfaces and tailored products at virtually no extra cost. And efficient custom printing allowed each customer to get a semi-custom catalog, newsletter, coupon book, or offer. The convergence of these technologies resulted in the ability to deliver personalized messages. The only thing missing was the knowledge. One-to-one marketing still relied too heavily on human use of information. Determining what offers or products to display to each customer, especially on a mass scale, takes a lot of effort.
That’s where automated recommender systems come in. They close the gap between the goal of one-to-one marketing and the reality of limited sales effort. With recommender systems, marketers can now set up general promotions (whether on-line sales, cross-sales by telephone, E-mail or physical mail campaigns, or in-store coupons and suggestions) and allow the technology to grind through the process of matching individual people with products and offers.
Rather than crunching numbers to figure out which income level gets which advertisement, today’s marketers decide which recommender technology and interfaces to implement and where. The variety and appeal of recommenders are growing rapidly. At Amazon, we discovered over twenty distinct recommenders! Marketers everywhere (not just on the Web) are boning up on the potential applications. For one thing, recommenders draw customers in like one-to-one merchants because they demonstrate knowledge of individual preferences. But by studying recommendations, marketers also learn more about product relationships and purchasing patterns. As they do, promotions and customer interests dovetail together in a way that mass marketers can only envy.
Recommender Technology and Interfaces
In our “Introduction,” we explained in general terms how collaborative filtering works. Now we’ll go into a little more depth and also introduce you to the other recommenders we examine throughout the book. In addition, we’ll explain how customers participate in the exchange of preferences and recommendations: the interfaces.
At the end of each company profile in Principles #1 through #8, we’ll remind you what recommenders these companies used and how the interfaces operated. With that in mind, you may find it helpful to refer back to this chapter for more detailed descriptions of these recommenders and interfaces.
Automated Collaborative Filtering-The Technology
Automated collaborative-filtering systems depend on one thing: customer preferences. Customer preferences not only illustrate the taste of an individual customer, they also build the mountain of data necessary to establish effective nearest neighbors. So how do you collect these preferences? Obviously, sales are a good indicator of what customers prefer (especially if a customer purchases an item repeatedly). By studying how long a customer spends on a Web page, companies can establish whether or not the customer was interested in the products displayed there. If a customer prints or forwards a Web page, that indicates her preferences, as do items placed on a wish list or in a shopping basket. Customers might also tip their hand with reactions to recommendations that they’re given: Do they click on the product, do they buy it, do they ignore it, do they rate it poorly after having purchased it?
Once we have a set of ratings and/or preferences for a population of users, we can start making recommendations and predictions.
My wife said I should really go see the movie Beaches. I ask MovieLens, the personal movie recommender, “How well will I like Beaches?” The system then fetches my history of ratings (also known as my profile) and compares it to other users’, trying to match their profiles against mine. Profile matches can be scored in two ways. The correlation is the degree to which, for movies we both saw, we rated them similarly. The overlap is the number of movies we both rated. Ideally, I’d like to find a set of people who have a high correlation with me and who also have a high overlap. The high correlation means that we agree, and the high overlap indicates that our agreement isn’t just a fluke—it is based on a lot of information.
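For readers who like to see the arithmetic, the two profile-matching scores can be sketched in Python. We assume here that "correlation" means the Pearson correlation computed over the movies both people rated. That is a common choice, but the text above does not commit to a particular formula, and the function name, movie titles, and ratings below are invented for illustration.

```python
from math import sqrt


def overlap_and_correlation(mine, theirs):
    """Return (overlap, correlation) for two profiles mapping movie -> rating."""
    shared = sorted(set(mine) & set(theirs))
    overlap = len(shared)
    if overlap < 2:
        return overlap, 0.0  # too little shared evidence to measure agreement
    mean_mine = sum(mine[m] for m in shared) / overlap
    mean_theirs = sum(theirs[m] for m in shared) / overlap
    covariance = sum(
        (mine[m] - mean_mine) * (theirs[m] - mean_theirs) for m in shared
    )
    spread_mine = sum((mine[m] - mean_mine) ** 2 for m in shared)
    spread_theirs = sum((theirs[m] - mean_theirs) ** 2 for m in shared)
    if spread_mine == 0 or spread_theirs == 0:
        return overlap, 0.0  # one profile has no variation on the shared movies
    return overlap, covariance / sqrt(spread_mine * spread_theirs)


# Hypothetical profiles on a 1-to-5 scale.
me = {"Alien": 5, "Big": 3, "Casablanca": 4, "Diner": 1}
ann = {"Alien": 5, "Big": 3, "Casablanca": 4, "Diner": 1, "Beaches": 4}
bob = {"Alien": 1, "Big": 3, "Casablanca": 2, "Diner": 5}
carl = {"Alien": 5}

for name, profile in [("ann", ann), ("bob", bob), ("carl", carl)]:
    print(name, overlap_and_correlation(me, profile))
```

In this toy data Ann agrees with me completely, Bob disagrees completely, and Carl shares only one movie with me, which is too little overlap to say anything either way.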
Next, we take the people who agree best with me (my nearest neighbors) and who have already seen Beaches. We then average their ratings for that movie to make a prediction for me. If these people liked it, I’ve got a date. If they didn’t, I have a discussion.
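The averaging step can be sketched as follows. We assume the agreement scores have already been computed and are simply handed in as numbers. The names predict and k, the neighbor names, and all of the values are hypothetical.

```python
def predict(movie, agreement, ratings, k=2):
    """Average the ratings of the k best-agreeing users who have rated the movie."""
    candidates = [
        (score, user) for user, score in agreement.items() if movie in ratings[user]
    ]
    candidates.sort(reverse=True)  # strongest agreement first
    neighbors = candidates[:k]
    if not neighbors:
        return None  # nobody who agrees with me has seen it
    return sum(ratings[user][movie] for _, user in neighbors) / len(neighbors)


# Hypothetical agreement scores between me and four other users.
agreement = {"ann": 0.9, "bob": 0.8, "cat": 0.1, "dan": 0.95}
ratings = {
    "ann": {"Beaches": 4},
    "bob": {"Beaches": 5},
    "cat": {"Beaches": 1},
    "dan": {},  # agrees with me best, but has not seen Beaches
}
print(predict("Beaches", agreement, ratings))
```

Dan is skipped because he has no rating to contribute, so the prediction comes from Ann and Bob, the next two best matches.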
It’s that simple—almost. There’s actually a lot of math behind these calculations. In part, we do this because people rate differently. On a scale of 1 to 5, many people rate almost all movies 4 or 5—they either like it or love it. Others rate movies all the way from 1 to 5. To help match these people together, we normalize their ratings, which is a mathematical way of adjusting them to a similar scale. If, for example, someone uses only 3’s, 4’s, or 5’s when they rate movies and their mean rating is 3.7 (User #1), we might match them with someone with a lower mean rating (User #2). A movie rating of 4 for User #1, in other words, might be a 3 for User #2.
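One simple way to normalize, sketched below, is to express each rating as a distance from that person’s own mean and then re-express it around the other person’s mean. This mean-offset approach is an assumption on our part: the text above does not say which normalization is used, and the two rating histories are invented so that User #1 averages 3.7.

```python
def mean(ratings):
    """Arithmetic mean of a list of ratings."""
    return sum(ratings) / len(ratings)


def translate(rating, from_history, to_history):
    """Re-express a rating from one user's scale on another user's scale."""
    offset = rating - mean(from_history)  # how far above or below their own mean
    return mean(to_history) + offset


# Hypothetical histories: User #1 uses only 3s, 4s, and 5s (mean 3.7);
# User #2 uses the whole 1-to-5 range (mean 2.7).
user1 = [3, 3, 3, 3, 4, 4, 4, 4, 4, 5]
user2 = [1, 1, 2, 2, 3, 3, 3, 4, 4, 4]

print(round(translate(4, user1, user2), 2))
```

With these numbers, a 4 from User #1 lands at about 3 on User #2’s scale, which matches the example in the text.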
Then things get complicated. We give different ratings different weights based on the correlation and overlap of that person. Then come the business rules. We want to suggest items the customer doesn’t know about, products that are in inventory, and products that are likely to lead to customer loyalty.
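The weighting idea can be sketched like this. The specific weighting scheme is our assumption, not a formula given in the text: each neighbor’s weight is their correlation with me, scaled down when our overlap falls short of a hypothetical full_overlap threshold, and each neighbor contributes how far their rating sits above or below their own mean.

```python
def weighted_prediction(my_mean, neighbors, full_overlap=50):
    """Predict my rating from neighbors weighted by correlation and overlap."""
    numerator = 0.0
    denominator = 0.0
    for n in neighbors:
        # Trust the correlation fully only when it rests on enough shared movies.
        weight = n["correlation"] * min(n["overlap"], full_overlap) / full_overlap
        numerator += weight * (n["rating"] - n["mean"])
        denominator += abs(weight)
    if denominator == 0:
        return my_mean  # no usable evidence, so fall back on my own average
    return my_mean + numerator / denominator


# Hypothetical neighbors who have rated the movie in question.
neighbors = [
    {"correlation": 0.8, "overlap": 50, "rating": 5, "mean": 4.0},
    {"correlation": 0.5, "overlap": 10, "rating": 2, "mean": 3.0},
]
print(round(weighted_prediction(3.5, neighbors), 2))
```

The second neighbor disliked the movie, but because that agreement rests on only ten shared movies, the opinion barely dents the prediction.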
If, instead of a prediction question, I asked a recommendation question (“What movie should I see?”), the collaborative-filtering system would again gather a neighborhood of people who agree well with me. It would then combine their ratings on all movies to determine which ones are best liked among people with tastes like mine, and would return that list of movies to me. Often these lists will be ranked based upon how strongly my nearest neighbors rate them. It’s interesting to note that movies that are “best liked” by my nearest neighbors are more useful to me than movies that are “most popular” (seen by all my nearest neighbors but not liked as strongly). “Best liked” movies may, in fact, not be popular at all. They may be very obscure and little seen, which makes these recommendations more valuable; after all, I may never have learned of their presence without the help of my nearest neighbors.
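A recommendation list of this kind can be sketched as a ranking by average neighbor rating. The function name recommend, the movie titles, and the ratings are all invented, and we assume the neighborhood has already been chosen.

```python
def recommend(already_seen, neighbor_ratings, top_n=3):
    """Rank unseen movies by how strongly my nearest neighbors liked them."""
    collected = {}
    for ratings in neighbor_ratings.values():
        for movie, rating in ratings.items():
            if movie not in already_seen:
                collected.setdefault(movie, []).append(rating)
    averages = {movie: sum(rs) / len(rs) for movie, rs in collected.items()}
    # Best liked first; ties are broken alphabetically for a stable order.
    return sorted(averages, key=lambda movie: (-averages[movie], movie))[:top_n]


# Hypothetical neighborhood: everyone saw the blockbuster, few saw the gem.
neighbor_ratings = {
    "ann": {"Blockbuster": 3, "Obscure Gem": 5, "Seen Already": 5},
    "bob": {"Blockbuster": 3, "Obscure Gem": 5},
    "cat": {"Blockbuster": 4},
}
print(recommend({"Seen Already"}, neighbor_ratings))
```

The little-seen title comes out on top even though fewer neighbors rated it, because the ranking rewards how much a movie was liked, not how many people saw it.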
Tuning Recommendations and Predictions
We should hasten to point out that there are dozens of research papers exploring specific details on how to tune collaborative-filtering algorithms to make them work best for particular applications. Tuning can be quite complicated. It’s affected, in particular, by the density of ratings (what percentage of items a person rates) and the number of people and items.
There are also both research papers and unpublished trade secrets about making collaborative-filtering algorithms fast. Most commercial systems store all preference information, and do their best to use that information. They may settle for a “good” neighborhood if it is faster than the “best” one, however. And there are lots of tradeoffs about how many neighbors to consider for different questions. In practice, we suggest leaving these factors to the professionals. A commercial-strength recommendation engine will be tuned already, and experts can adjust it to match your application even better.
Complete List of Recommenders
A manual recommender provides recommendations that have been hand-generated by sales or marketing staff. These may be broad, impersonal recommendations (e.g., our editor’s picks) or manually crafted personal recommendations (e.g., the salesperson’s suggestions to a regular customer).
A searchable database isn’t a recommender per se, but it may appear like one to a customer. When the database is indexed in meaningful ways (with categories like clothes for toddlers or winter clothes), customers may be able to narrow their search significantly just by following the categories. Indeed, the category descriptions are a form of recommendation—they recommend sets of items the marketer thinks are useful to view together.
Segmentation is the division of customers into groups. Stores may decide to suggest different products to people based on their age, where they live, their income, or other criteria. Often segmentation is the result of extensive off-line data mining to determine statistically different populations. Segmentation recommendations are, therefore, group recommendations.
Statistical summarization is the presentation of ratings data in aggregate, rather than an attempt to turn that data into a personalized recommendation. Examples include displays of the “average score” for a book or the “number of people who liked” a particular movie. Statistics are generally most effective when they are simple and when they can be presented visually.
Social navigation includes a variety of technologies that make the behavior of other customers visible. In the bricks-and-mortar world, we can see customers clustering around a bargain table. In the virtual world, this can be done by displaying markers of past usage or indicators of current customers.
Custom proprietary recommender systems take advantage of expert knowledge of a domain to evaluate candidates. Ticketmaster, which recommends seats at a concert or sporting venue, and DoubleClick, which recommends ads, are two examples. They employ a set of confidential strategies based on extensive data and preference analysis.
Machine-learning techniques build a model of customer behavior from a set of data and then apply that model to future data. The techniques vary widely. Some techniques, such as neural networks, build a usable model that humans cannot directly understand. Others, such as rule-induction learning systems, produce sets of rules that humans may read to understand what has been learned.
Information-filtering techniques allow users to specify or demonstrate their preferences. The filters then scan vast quantities of material, looking for matches. In content domains such as news, an information filter might be instructed that the user wants to read any news about Chinese telephony, or might learn that the user tends to read articles with terms such as “telephone switches.” In product domains, these systems may learn or be instructed that the user tends to buy men’s clothing, in extra large, and is partial to short-sleeved shirts.
Collaborative filtering refers to a set of algorithms that uses the preferences of a community to recommend items to specific individuals. While there are manual collaborative-filtering systems that depend on people explicitly making or requesting recommendations, most commercial applications use automated systems that gather customer preferences, identify customers with similar tastes, and use their experiences to recommend products for each individual.
Combination recommenders can employ a variety of the above techniques. Sometimes different techniques are used separately and the results are merged (e.g., a list of ten recommendations may include five generated manually and five more from automated collaborative filtering). In more sophisticated systems, the techniques are combined based on how much evidence there is of the correctness of a particular technique for that use. Hence, a customer with a detailed profile may get mostly machine-learning recommendations, while a new customer may get mostly statistical summaries.
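As a rough illustration of the evidence-based approach, the sketch below fills a ten-item list with more personalized picks as a customer’s profile grows. The linear rule, the full_profile threshold of twenty ratings, and the item names are all assumptions made for the example, not a description of any commercial system.

```python
def blend(personalized, summaries, profile_size, total=10, full_profile=20):
    """Mix personalized picks with statistical summaries by profile depth."""
    share = min(profile_size / full_profile, 1.0)  # evidence we have, from 0 to 1
    picks = personalized[: int(share * total)]
    for item in summaries:
        if len(picks) >= total:
            break
        if item not in picks:
            picks.append(item)
    return picks


# Hypothetical candidate lists from two different recommenders.
personalized = [f"personal-{i}" for i in range(10)]
summaries = [f"popular-{i}" for i in range(10)]

print(blend(personalized, summaries, profile_size=0)[:3])
print(blend(personalized, summaries, profile_size=10))
```

A brand-new customer sees only the statistical summaries, a customer halfway to a full profile sees five of each, and a customer with twenty or more ratings sees only personalized picks.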
Interfaces: Inputs and Outputs
Naturally, we don’t expect customers to know or even recognize all the recommenders we’ve just described. What’s important from the customers’ vantage point is what preference and product information they need to supply, and how recommendations are presented to them. In this section, we’ll describe the three different types of inputs (explicit, implicit, and community-based) and four outputs (suggestion, prediction, ratings, and reviews).
EXPLICIT AND IMPLICIT
Inputs are simply the ways customers demonstrate their preferences. These inputs can be explicit (specifically elicited for the purpose of building a profile) or implicit (observed inputs generated from a customer’s natural interaction with a site). The most common explicit inputs are ratings, numerical or symbolic assessments of a product, and keywords/attributes, declared interests of the customer. The most common implicit inputs are purchase history and navigation. Purchase history tells which products a customer found valuable, and navigation (including both products and information viewed and items placed in shopping carts) helps identify the customer’s current interests.
Other inputs reflect the community. These include the ratings, purchase history, and navigation of others, as well as reviews those others may have written. The classification of products itself (films and books sorted into genres, for example) often is derived from community-wide standards. And popularity measures such as box-office sales or best-seller status help customers understand what the community finds valuable.
The simplest output type is a suggestion; this is just the mention of a product, possibly not even explicitly identified as a recommendation. For example, when a product appears in the “would you like this while you’re checking out” area, it is usually a suggestion, as would be a product selected to appear on the home page.
Some systems go farther than simple suggestions by actually attaching a numerical or symbolic prediction of how well you’ll like the product. The Zagat restaurant guide, for example, posts numbers in the food, service, and décor categories.
RATINGS AND REVIEWS
A number of systems allow their customers to view directly the ratings or reviews entered by other customers. This is particularly common in venues where there are many different items to rate. Amazon.com, for example, encourages its customers to rate and review books; this information is then made available to other customers. eBay encourages buyers and sellers to rate (and review) each other, presenting both a summary of the ratings and the complete set of reviews for others considering doing business with the same party.
As we discussed earlier, recommendations can be proactively pushed to a customer, made available for the customer to pull, or simply placed in a natural location where they will appear passively.
Examples of pushing recommendations include the variety of E-mail interfaces where businesses promote a set of products as well as the obnoxious pop-up windows that force you to acknowledge a suggested product before continuing. Annotations (starring recommended products in an unpersonalized listing, for example) are a far more subtle means of securing the customer’s gaze. Query interfaces or links (to a top-ten products list, for example) allow customers to pull recommendations actively, giving them even more autonomy.
As recommendations become more pervasive and less novel, marketers are moving toward more passive displays. Just as supermarkets don’t put a sign on the eye-level shelf saying “These products were placed at eye level because we think you’ll buy them,” Web sites, too, are finding that they can simply place personally selected products in appropriate spots and increase business.
Recommenders in Action
Now that you’ve been introduced to the types of recommenders out on the market, we’ll examine them in their natural habitats. With the number and variety of companies we profile, you will be sure to find personalization strategies that fit your business.
Copyright © 2002 by Professors John Riedl and Joseph Konstan