\section{Related Work} \label{rel}
There have been a few attempts to measure personalization in online information systems that apply personalized recommendation. One study attempted to quantify personalization in Google Search \cite{hannak2013measuring}. It recruited 200 Amazon Mechanical Turk users with Gmail accounts to run search tasks in Google Search. It also used newly created accounts, which have no history and therefore offer nothing to personalize on, to generate baseline search results. The first page of results returned to each user was compared against the baseline results and against the other users' results, using Jaccard similarity and edit distance as metrics. The study reports that about $11.7\%$ of the results showed variation due to personalization. %The study also investigated the factors that caused variation in in personalization and it reports that loggingin and geography were were the factors.
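To make the first of these metrics concrete, the Jaccard similarity of two result pages can be written as below. The symbols $A$ and $B$, denoting the sets of URLs on the two pages, are our notation and are not taken from \cite{hannak2013measuring}.
\begin{equation}
J(A,B) = \frac{|A \cap B|}{|A \cup B|}
\end{equation}
As a hypothetical example, if $A = \{u_1, u_2, u_3, u_4\}$ and $B = \{u_3, u_4, u_5\}$, then $|A \cap B| = 2$, $|A \cup B| = 5$ and $J(A,B) = 0.4$. Jaccard similarity ignores the order of the results, so an order-sensitive measure such as edit distance captures reordering that Jaccard similarity misses.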
% Why recommender systems can have the risk of filter bubble, but they do offer an irresistible appeal in that they offer content targeted to one’s interest.
Another study \cite{nguyen2014exploring} examined the effect of a recommender system on the diversity of recommended and consumed items. The study was done on the MovieLens dataset\footnote{http://grouplens.org/datasets/movielens/}. It separated users into recommendation takers and recommendation ignorers and examined the content diversity in the two groups. Two movies are compared by the Euclidean distance between their attribute vectors (tag genome data). The distance between groups is measured as the average of the distances between the groups' movies.
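Under one reading of this description, with each movie represented by a vector of $n$ tag genome attribute values, the two distances can be sketched as below. The symbols $x$, $y$, $G_1$ and $G_2$ are our notation, and averaging over all cross-group pairs of movies is an assumption; \cite{nguyen2014exploring} may aggregate differently.
\begin{equation}
d(x,y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}
\end{equation}
\begin{equation}
D(G_1,G_2) = \frac{1}{|G_1|\,|G_2|} \sum_{x \in G_1} \sum_{y \in G_2} d(x,y)
\end{equation}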
The study found that recommender systems do create a condition in which both the recommended items and the items rated by users become narrower (less diverse) over time. However, when comparing the two groups, it found that recommendation takers consumed more diverse items than non-takers. The study was conducted on a recommender system that uses item-to-item collaborative filtering, so its conclusions may not carry over to other recommendation algorithms.
% Personalization is ubiquitous on the internet. Recommender systems, e-commerce sites, search engines and social media platforms use personalization to better provide users with content and products that are tailored to their needs. Big search engines such as google [1] and Bing[2] have also implemented personalization on on search results. Personalization in recommendation is a core part of company revenues. For example in 2012 NetFlix reported that 75% of what users watched came from recommendation[3], and Amazon reported that 35% of its sales came from recommendations[4]. Beyong sales, personalized recommendations have been found to have greater influence on users than peers and experts[5]. They lower users’ decision effort and increase users’ decision quality[6]
The study on the potential for personalization \cite{teevan2010potential} investigated how much improvement could be obtained if search engines personalized their search results, as opposed to returning the same results to everyone or to a group. It used three sources of relevance judgments: explicit relevance judgments, behavioral relevance judgments (clicks on items) and content-based relevance judgments. The work showed that there was a large potential for personalization under all three kinds of relevance judgment.
The work used discounted cumulative gain (DCG) to measure the quality of a search result ranking. The best ranking for a user is the one that orders the items as the user ranked them in the case of explicit judgments, as the user clicked them in the case of behavioral judgments, or by decreasing similarity to previously consumed content in the case of content-based judgments. The normalized DCG score of such an ideal ranking is $1$. When a single ranking must serve two or more people (a group), its normalized DCG score falls below $1$ because the members have different ideal rankings. The difference between the individual score and this group score is what the authors call the potential for personalization. The research reports a potential to improve the ranking by 70\%. Comparing the relevance judgments, it reports that content-based judgments show the highest potential, followed by explicit judgments and then click-based judgments.
The paper concludes that behavior-based relevance judgments can serve as a good proxy for explicit judgments in operational search engines.
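For reference, the quantities used in this measurement can be written down as a sketch. We use one common formulation of DCG; the gain $g_i$ of the item at rank $i$, the cutoff $k$ and the logarithmic discount are assumptions on our part, and the exact variant used in \cite{teevan2010potential} may differ.
\begin{equation}
\mathrm{DCG}_k = \sum_{i=1}^{k} \frac{g_i}{\log_2 (i+1)}
\end{equation}
\begin{equation}
\mathrm{nDCG}_k = \frac{\mathrm{DCG}_k}{\mathrm{IDCG}_k}
\end{equation}
Here $\mathrm{IDCG}_k$ is the DCG of the user's ideal ranking, so that the ideal ranking scores $1$. For a group $U$ of users who are all served a single ranking $r$, the potential for personalization is then the gap between the ideal individual score and the best score a shared ranking can achieve. Averaging over the members of the group is again our assumption.
\begin{equation}
P(U) = 1 - \max_{r} \frac{1}{|U|} \sum_{u \in U} \mathrm{nDCG}_k^{(u)}(r)
\end{equation}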