Recommendation system

From Citizendium
Revision as of 14:13, 12 August 2010 by imported>Yash Prabhu (→‎General requirements for recommendation systems)

A recommendation system is a software program that attempts to narrow down selections for users based on their expressed preferences, past behavior, or other data that can be mined about the user or about other users with similar interests.

History

Recommendation systems have their roots in "Usenet," a worldwide distributed discussion system that originated at Duke University in the late 1970s. Usenet operated in a client/server format: posts made by users were categorized into specific "newsgroups," which were further divided into sub-categories where needed.

Information Filtering (IF) is a way of sifting through the overabundance of data on the Web. As newsgroups grew exponentially, database administrators were scrambling for a way to reduce e-clutter. Some of the early solutions for data overload include:
- Tapestry - developed by Xerox, whose developers coined the phrase "collaborative filtering"
- Lotus Notes - a component of this software had built-in collaborative filtering mechanisms
- GroupLens - started in 1992, this Open Source project built on the premise of Tapestry, with the intention of simplifying Usenet data by using distributed networks that addressed privacy issues and by making suggestions according to others' ratings

Pattie Maes played a leading role in the development of collaborative filtering through her work at MIT on a system called "Firefly", a recommendation system for music lovers. Firefly was later purchased by Microsoft for an estimated 40 million dollars.

Through the 1990s and beyond, collaborative filtering recommendation systems included:
- Mosaic - First graphical browser allowing users to publish comments to Web pages
- HOMR - Helpful Online Music Recommendations; predecessor to Firefly
- Ringo - Social information filtering system for music recommendations
- Firefly - Grew out of the Ringo project; music and movies
- Yahoo! - Started by Stanford students David Filo and Jerry Yang
- Point’s Top 5% - NYC-based qualitative website rating
- PHOAKS - People Helping One Another Know Stuff
- Fab - Allowed users to create content-based filters
- Webdoggie - Helped people find websites according to their likes
- Alexa Internet - When someone visits a website, Alexa displays other websites they might be interested in

Recommendation systems are now an integral part of how Amazon.com drives purchases.

Classification

The current generation of recommendation methods can be broadly classified into the following five categories, based on the knowledge sources they use to make recommendations:
1. Content-based recommendations.
2. Collaborative recommendations.
3. Knowledge-based recommendations.
4. Demographic recommendations.
5. Hybrid recommendations.

General requirements for recommendation systems

To make a viable recommendation, three things are needed:
(i) background information - the information that the system has before the recommendation process begins.
(ii) input information - the information that a user must enter to the system in order to trigger a recommendation.
(iii) an algorithm that combines background and input information to arrive at its suggestions.
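
The three ingredients above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, the `most_popular_in_genre` algorithm, and the function names are hypothetical and do not come from any particular system.

```python
def recommend(background, user_input, algorithm, k=3):
    """Combine background and input information using the supplied algorithm."""
    return algorithm(background, user_input)[:k]


def most_popular_in_genre(background, user_input):
    """Hypothetical algorithm: rank items of the requested genre by popularity.

    background: list of (title, genre, times_purchased) tuples
    user_input: the genre the user asked for
    """
    matches = [item for item in background if item[1] == user_input]
    matches.sort(key=lambda item: item[2], reverse=True)
    return [title for title, _, _ in matches]


# (i) background information: what the system knows before the request
catalog = [
    ("Dune", "sci-fi", 120),
    ("Emma", "romance", 80),
    ("Neuromancer", "sci-fi", 95),
    ("Foundation", "sci-fi", 150),
]

# (ii) input information: the genre entered by the user
# (iii) the algorithm that combines the two
print(recommend(catalog, "sci-fi", most_popular_in_genre, k=2))
```

Keeping the algorithm as a separate argument makes it easy to swap in any of the techniques described in the sections that follow.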

1. Content-based recommendation

The user is recommended items similar to the ones the user preferred in the past. For example, in a book recommendation application, in order to recommend books to user u, the content-based recommendation system looks for similarities among the books user u has rated highly in the past (specific writers, genres, subject matter). Only books with a high degree of similarity to the user’s preferences are then recommended. Content-based systems are designed mostly to recommend text-based items, and the preferences they evaluate are usually expressed as “keywords.” Content-based recommendations can be either:

  • Memory/Heuristic (uses the term frequency-inverse document frequency (TF-IDF) text retrieval method)
  • Model based (uses decision trees, neural networks, Bayesian classifiers, clustering or vector-based representations)
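
As a rough illustration of the memory/heuristic approach, the sketch below weights keywords with TF-IDF and ranks items by cosine similarity to one the user liked. The book descriptions, the function names, and the choice of cosine similarity are assumptions made for this example; real systems typically use more careful tokenization and weighting.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using TF-IDF weighting."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_similar(descriptions, liked_index, k=1):
    """Return the indices of the k items most similar to the liked item."""
    vectors = tfidf_vectors(descriptions)
    scores = [
        (cosine(vectors[liked_index], v), i)
        for i, v in enumerate(vectors)
        if i != liked_index
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]


# Hypothetical keyword descriptions of three books.
books = [
    "space opera empire desert planet",
    "regency romance village matchmaking",
    "space empire galactic collapse science",
]
print(recommend_similar(books, liked_index=0))
```

In this toy data the first and third descriptions share the keywords "space" and "empire", so the third book is the one suggested to a reader who liked the first.
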
Advantages of Content-based recommendation.
Disadvantages of Content-based recommendation.

2. Collaborative RS

Collaborative recommendation systems recommend items that people with similar taste preferred in the past. See also: Collaborative filtering
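
One common way to realize this idea is user-based collaborative filtering, sketched below. The ratings, the user names, and the use of cosine similarity with a similarity-weighted average are assumptions chosen for the example, not a description of any specific system mentioned in this article.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users' {item: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        num += sim * their_ratings[item]
        den += sim
    return num / den if den else None


def recommend(ratings, user, k=1):
    """Rank the items the user has not rated by predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(predict(ratings, user, i), i) for i in candidates]
    scored = [(s, i) for s, i in scored if s is not None]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"Dune": 5, "Emma": 1, "Foundation": 5},
    "bob": {"Dune": 5, "Emma": 1, "Neuromancer": 5, "Persuasion": 1},
    "cat": {"Dune": 1, "Emma": 5, "Neuromancer": 1, "Persuasion": 5},
}
print(recommend(ratings, "ann"))
```

Here "ann" rates books much as "bob" does, so bob's favorite that she has not yet rated is ranked ahead of the one preferred by "cat".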

3. Knowledge-based recommendation

A knowledge-based system uses knowledge about users and products to reason about which products meet the user's requirements. Some of the systems in use at present effectively walk the user down a discrimination tree of product attributes, whereas others have adopted a quantitative decision support tool for this task.
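
A minimal sketch of the attribute-narrowing idea is shown below, assuming the knowledge base is a list of products with known attributes and the user's requirements are expressed as one test per attribute. The camera data and function names are hypothetical, and a real discrimination tree would ask about one attribute at a time rather than apply all tests at once.

```python
def matches(product, requirements):
    """True if the product passes the test for every required attribute."""
    return all(test(product[attr]) for attr, test in requirements.items())


def recommend(products, requirements):
    """Return the names of the products that satisfy all requirements."""
    return [p["name"] for p in products if matches(p, requirements)]


# Hypothetical engineered knowledge base of product attributes.
cameras = [
    {"name": "Compact A", "price": 250, "zoom": 5, "weight_g": 200},
    {"name": "Bridge B", "price": 450, "zoom": 30, "weight_g": 600},
    {"name": "DSLR C", "price": 900, "zoom": 3, "weight_g": 800},
]

# Requirements elicited from the user: at most 500 in price, at least 10x zoom.
requirements = {
    "price": lambda price: price <= 500,
    "zoom": lambda zoom: zoom >= 10,
}
print(recommend(cameras, requirements))
```

Note that no user ratings are involved: the result depends only on the product attributes and the stated requirements.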

Advantages of Knowledge-based recommendation.

A knowledge-based system does not have the “ramp-up” problem, since its recommendations don’t depend on any database of user ratings. Users are encouraged to explore and understand the information space and, by doing so, they elaborate more on their needs.

Disadvantages of Knowledge-based recommendation.

It requires an engineered knowledge base to make useful recommendations. This knowledge base has to be updated to keep up with ever-changing consumer ratings and preferences. The system also tends to give static suggestions that limit the user to what is contained in the knowledge base.

4. Demographic-based recommendation

A demographic system categorizes the user based on personal attributes and makes recommendations based on demographic classes such as college students, teenagers, women, or men. The advantages and disadvantages of this approach are similar to those of knowledge-based recommendation systems.
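
The sketch below illustrates one way this could work: users are assigned to a class, and a new user is shown the items most often chosen by others in the same class. The classing rule (age band plus occupation), the sample data, and the function names are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def demographic_class(user):
    """Hypothetical classing rule: age band plus occupation."""
    band = "teen" if user["age"] < 20 else "adult"
    return (band, user["occupation"])


def build_class_profiles(users, purchases):
    """Count how often each item was purchased within each demographic class."""
    profiles = defaultdict(Counter)
    for user in users:
        profiles[demographic_class(user)].update(purchases.get(user["id"], []))
    return profiles


def recommend(new_user, profiles, k=2):
    """Return the k most purchased items in the new user's class."""
    counts = profiles.get(demographic_class(new_user), Counter())
    return [item for item, _ in counts.most_common(k)]


# Hypothetical users and purchase histories.
users = [
    {"id": 1, "age": 19, "occupation": "student"},
    {"id": 2, "age": 18, "occupation": "student"},
    {"id": 3, "age": 40, "occupation": "engineer"},
]
purchases = {
    1: ["textbook", "headphones"],
    2: ["headphones", "backpack"],
    3: ["espresso machine"],
}

profiles = build_class_profiles(users, purchases)
newcomer = {"id": 4, "age": 17, "occupation": "student"}
print(recommend(newcomer, profiles, k=1))
```

A user whose class has no recorded purchases receives an empty list, which reflects how coarse such classes can be.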

5. Hybrid RS

All of the systems described above have complementary strengths and weaknesses. A hybrid recommendation system combines two or more recommendation techniques to achieve better performance with fewer of the weaknesses of any individual technique. The most popular hybrids combine content-based and collaborative filtering.


Methods/Strategies of Hybridization.

There are different strategies by which hybridization can be achieved, and they are broadly classified into seven categories.
1. Weighted
For example, collaborative and content-based methods are implemented separately and their predictions are then combined.

2. Switching
The system uses a switching criterion to alternate between two recommendation systems operating on the same object.

3. Feature Combination
Features from different recommendation systems' data sources are put into a single recommendation algorithm.

4. Cascading
One recommendation system refines the results given by another.

5. Meta Level
A model learned by one recommendation system is used as input to another. It differs from feature augmentation in that the entire model is used as input.

6. Feature Augmentation
The output of one system is used as an input feature to another; for example, using the model generated by one to generate features that are used by another.

7. Mixed
Incorporates two or more techniques at the same time; for example, content-based and collaborative filtering.
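
Four of these strategies (weighted, switching, cascading, and mixed) are simple enough to sketch in a few lines. The example below assumes two component recommenders have already produced scores between 0 and 1 for each item. The score tables, the equal default weighting, the "fewer than five ratings" switching criterion, and the reading of "mixed" as showing both result lists together are all assumptions made for illustration.

```python
# Hypothetical precomputed scores in [0, 1] from two component recommenders.
CONTENT = {"Dune": 0.9, "Emma": 0.2, "Neuromancer": 0.8}
COLLAB = {"Dune": 0.6, "Emma": 0.7, "Neuromancer": 0.9}


def weighted(item, w_content=0.5):
    """Weighted hybrid: linear blend of the two component scores."""
    return w_content * CONTENT[item] + (1 - w_content) * COLLAB[item]


def switching(item, n_user_ratings, min_ratings=5):
    """Switching hybrid: use content scores until the user has enough ratings."""
    return COLLAB[item] if n_user_ratings >= min_ratings else CONTENT[item]


def cascade(items, shortlist_size=2):
    """Cascading hybrid: content scores pick a shortlist, collaborative scores re-rank it."""
    shortlist = sorted(items, key=CONTENT.get, reverse=True)[:shortlist_size]
    return sorted(shortlist, key=COLLAB.get, reverse=True)


def mixed(items, k=1):
    """Mixed hybrid: present the top picks of both recommenders together."""
    top_content = sorted(items, key=CONTENT.get, reverse=True)[:k]
    top_collab = sorted(items, key=COLLAB.get, reverse=True)[:k]
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(top_content + top_collab))


items = list(CONTENT)
print(weighted("Dune"))
print(switching("Emma", n_user_ratings=2))
print(cascade(items))
print(mixed(items))
```

The feature-based strategies (feature combination, feature augmentation, and meta level) depend on the internals of the component models, so they are not shown here.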

Issues

The five most challenging issues recommendation systems face are:
1) Changing Data - Trying to keep pace with people's tastes and changing opinions
2) Lack of Data - Getting users to rate products and enter information about their likes and dislikes
3) Updating User Preferences - Initial preferences are stored but need to be updated by users
4) Unpredictable Results - How would you ever guess that someone who listens to Barry Manilow is also a Depeche Mode fan?
5) Lots of Work! - Although they might look simple, recommendation systems require a large number of computations

More RS Issues:
- Constructing accurate user models
- Compatibility among models
- Shilling
- SMS Spam
- Preferences crossing domains
- Costs

Future

The future of recommendation systems is unclear. Options discussed include:

- OpenFolders (OpenCola) is a way of storing recommendations on a user's computer so that when he or she logs on each day, a folder containing current items of interest is available. This idea has had mixed reviews and controversy regarding privacy.

- Other recommendation systems might use commonality to promote diversity. This technique takes advantage of the commonalities between two parties and uses them to promote a product or service that one party favors but the other does not. It's based on a trust mechanism: if Sally and Sarah both like yoga, but Sally also likes motorcycle riding, maybe Sarah will, too.

Recommendation systems are also being targeted to the following industries:
- Intelligent tourist and restaurant guides
- Navigation aids
- Shopping systems that recommend based on user behavior

Recent Press

Wired.com recently published an article on Caterina Fake and her work with Hunch.com, especially with respect to the cold start problem.[1]

References

1. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions
2. Privacy-enhanced personalization
