Changeset - 4328ade5acae
[Not reviewed]
Gebrekirstos Gebremeskel - 2016-02-11 16:36:48
destinycome@gmail.com
update
1 file changed with 3 insertions and 2 deletions:
main.tex
 
@@ -319,99 +319,100 @@ There are some interesting observations in the category-to-category transitions.
 
\hline
 
&berlin&politik&wirtschaft&sport&kultur&weltspiegel&meinung&medien&wissen\\
 
\hline
 
berlin&0.14&0.08&0.06&0.05&0.06&0.12&0.12&0.16&0.06\\
 
politik&0.2&0.39&0.06&0.12&0.04&0.3&0&0.73&0.1\\
 
wirtschaft&0.15&0.4&0.07&0.13&0.36&0.13&0.46&0.21&0\\
 
sport&0.14&0.27&0&0.68&0.05&0.18&0&0.27&0.07\\
 
kultur&0.11&0&0&0.06&0.07&0&0&0.07&0\\
 
weltspiegel&0.24&0.27&0.06&0.13&0.17&0.13&0&0.4&0.18\\
 
meinung&0.06&0&0&0&0&0&1.85&0.32&0\\
 
medien&0.1&0.85&0&0.06&0&0.08&0&0.16&0\\
 
wissen&0.02&0&0&0&0.11&0.15&0&0&0\\
 
 \hline
 
  \end{tabular}
 
  
 
\end{table*}
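% A minimal sketch (Python, hypothetical field names) of how the transition
% CTRs in the table above can be computed from an impression log, where each
% event records the base item's category, the recommended item's category,
% and whether the recommendation was clicked:
%
%   from collections import defaultdict
%
%   def transition_ctrs(events):
%       """events: iterable of (base_cat, reco_cat, clicked) tuples."""
%       views = defaultdict(int)
%       clicks = defaultdict(int)
%       for base_cat, reco_cat, clicked in events:
%           views[(base_cat, reco_cat)] += 1
%           if clicked:
%               clicks[(base_cat, reco_cat)] += 1
%       return {pair: clicks[pair] / views[pair] for pair in views}
%
%   # e.g. transition_ctrs([("politik", "medien", True), ("politik", "medien", False)])
%   # -> {('politik', 'medien'): 0.5}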
 
%  For example, if we look at the politics category, we see that the CTR from politics to politics is higher than from politics to any other category. We also observe that the CTR from the local category Berlin to politics is higher than from Berlin to any other category, including to itself. A somewhat surprising result is the high CTR from media to politics.
 
% The way we extracted our recommendations and clicks is a little uncertain. In the Plista setting, when clicks are reported, it is not known whose recommendations were clicked. So while we know our own recommendations, we do not know for sure how many of the click notifications we receive belong to them. To extract our clicks, we introduced a time frame of 5 minutes: if a click notification arrives within that range after our recommendation, we consider the click to be on our recommendation. We consider the click counts somewhat inflated, since users might not stay on a page for more than 5 minutes. While the actual CTR might be a bit inflated as a result of the inflated number of clicks, we consider the relative scores indicative of the true differences.
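% A commented sketch of the 5-minute attribution window described above
% (data structures and field names are assumptions, not the actual code):
%
%   WINDOW = 5 * 60  # attribution window in seconds
%
%   def attribute_clicks(our_recos, click_notifications):
%       """our_recos: {(user_id, item_id): reco_time (epoch seconds)};
%       click_notifications: iterable of (user_id, item_id, click_time).
%       A click counts as ours if the same item was recommended to the
%       same user at most WINDOW seconds before the click."""
%       ours = []
%       for user_id, item_id, click_time in click_notifications:
%           reco_time = our_recos.get((user_id, item_id))
%           if reco_time is not None and 0 <= click_time - reco_time <= WINDOW:
%               ours.append((user_id, item_id, click_time))
%       return ours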
 
% To find out the relationship between base-item--recommendation pairs that resulted in high CTR scores, we selected some item--recommendation pairs. To avoid selecting pairs with very few views and clicks, which is usually the kind of combination that produces high CTR scores, we first sorted our data by views and by clicks. Using cutoff values, we repeated the intersection until we found the items that have both the highest views and the highest clicks. Using this approach we selected 12 item--recommendation pairs, and out of them we selected the 5 pairs with the highest scores. These pairs are presented in Table \ref{}
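% A sketch of the selection procedure just described (the cutoff schedule is
% an assumption; the note above does not fix exact values): take the top
% pairs by views and by clicks, intersect, and relax the cutoff until enough
% pairs survive.
%
%   def select_pairs(stats, wanted=12):
%       """stats: {pair: (views, clicks)}; returns pairs high in both."""
%       cutoff = wanted
%       while cutoff <= len(stats):
%           by_views = sorted(stats, key=lambda p: -stats[p][0])[:cutoff]
%           by_clicks = sorted(stats, key=lambda p: -stats[p][1])[:cutoff]
%           chosen = set(by_views) & set(by_clicks)
%           if len(chosen) >= wanted:
%               return chosen
%           cutoff *= 2
%       return set(stats)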
 
\subsection{Item-level Base and Recommendation CTRs}
 
% We look at two types of item-level CTRs: the base item CTRs and the recommendation CTRs. The base item CTR measures how likely a base item is to trigger clicks on recommendations. We assume that clicking on recommendations is partly a function of the item the user is reading. This is corroborated by the category-level CTRs above, in the sense that some categories do not generate clicks even when the recommended items are from clickable categories. The recommendation CTR measures how likely an item is to receive a click when recommended to a user, regardless of the category of the base item. But should we not be concerned about the base item?
 
% We plan to extract a sample of base items with recommended and clicked items and separate the recommendations into clicked and rejected ones. We then compare the content of the clicked items with the content of the base item. We do the same with the rejected items and see whether there are any similarities/differences between these two groups. The separation into clicked and rejected items and their comparison to the base item is similar to the separation of recommended movies into viewed and ignored in \cite{nguyen2014exploring}.
 
% 
 
% On the same dataset, there has been a study on the transition probabilities of users over the categories. That study concerned general reading. In this study, 1) we repeat the same study on a dataset from a different time period and 2) we analyze the results in terms of the similarity of content with the base items.
 
% 
 
% 
 
% Question for myself: Is it maybe possible to compute the category CTRs? Like a heatmap of the CTRs where the recommendations are subdivided into their categories and a CTR is computed for each? I think so. We can also go further and look at the content similarities. Further, we can look at what type of items trigger more clicks by selecting some items which generated more clicks and analyzing them.
 
 
At the item level, we investigated whether the base items that are more likely to trigger clicks on recommendations are also the ones that are more likely to be clicked when they are themselves recommended. To accomplish this, we first computed the CTRs separately for base items and recommendation items and then intersected the two sets to find the items that occur in both. It is important to state here that we have more items among our recommendations than among our base items. This is because we are asked to provide recommendations for only some items, while we can use all items as recommendations. We had a total of $\mathit{55708}$ items among our recommendation items and $\mathit{18967}$ among our base items. The intersection resulted in $\mathit{15221}$ items, for which we looked at the CTRs they score when used as base items and as recommendation items.
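% A minimal sketch of this computation (per-item view and click counts are
% assumed to be pre-aggregated; names are illustrative):
%
%   def item_ctrs(views, clicks):
%       """views, clicks: {item_id: count}; returns {item_id: CTR}."""
%       return {i: clicks.get(i, 0) / v for i, v in views.items() if v > 0}
%
%   base_ctr = item_ctrs(base_views, base_clicks)  # 18967 base items
%   reco_ctr = item_ctrs(reco_views, reco_clicks)  # 55708 recommendation items
%   shared = base_ctr.keys() & reco_ctr.keys()     # the 15221 items in both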
 
 
To better visualize the results, we present two plots. In Figure \ref{fig:view_click_base}, we present the plots generated by sorting the results by base CTR scores. The blue plot is the base CTR and the red plot is the recommendation CTR. What we observe is that although the base items that are more likely to trigger clicks on recommendations are also the items that are more likely to be clicked when recommended, there are many other items that are likely to be clicked when recommended but do not trigger clicks as base items. To visualize this better, we also sorted the results by recommendation CTRs, obtaining the plots in Figure \ref{fig:view_click_reco}. Here we observe that the base items (the blue line) that are more likely to trigger clicks on recommendations are a subset of the recommendation items that are more likely to be clicked when recommended. %The discrepancy we observe might have to do with the fact that we had limited access to base items while we have full access to the items for recommendation.
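% A sketch of how the two figures can be produced with matplotlib, assuming
% `base_ctr`, `reco_ctr`, and `shared` as in the sketch above:
%
%   import matplotlib.pyplot as plt
%
%   def plot_sorted(base_ctr, reco_ctr, shared, by_base=True):
%       key = (lambda i: base_ctr[i]) if by_base else (lambda i: reco_ctr[i])
%       order = sorted(shared, key=key)
%       plt.figure()
%       plt.plot([base_ctr[i] for i in order], "b", label="base CTR")
%       plt.plot([reco_ctr[i] for i in order], "r", label="recommendation CTR")
%       plt.legend()
%       suffix = "base" if by_base else "reco"
%       plt.savefig("img/base_reco_ctr_sorted_by_%s.pdf" % suffix)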
 
\begin{figure}[t]
 
\centering
 
\includegraphics[scale=0.45]{img/base_reco_ctr_sorted_by_base.pdf}
 
\caption{Plots of CTRs on base items and recommended items, generated by first sorting the results according to base CTRs. The blue plot is the base CTR and the red plot is the recommendation CTR. \label{fig:view_click_base}}
 
\end{figure}
 
\begin{figure}[t]
 
\centering
 
\includegraphics[scale=0.45]{img/base_reco_ctr_sorted_by_reco.pdf}
 
\caption{Plots of CTRs on base items and recommended items, generated by first sorting the results according to recommendation CTRs. The blue plot is the base CTR and the red plot is the recommendation CTR. \label{fig:view_click_reco}}
 
\end{figure}
 
\section{Discussion and Conclusion}
 
In this study, we investigated the factors that cause clicks on recommendations. Specifically, we examined whether clicks on recommendations are a function of the base items or of the recommended items. We approached this by looking at the categories of items and the transitions between the categories. We found that the category of an item indeed explains some of the differences in the likelihood of triggering clicks, both for base items and for recommendation items, in the sense that some base categories and some recommendation categories are more likely to trigger clicks than others.
 
There is, however, a difference between the categories in their likelihood to trigger clicks as base categories and as recommendation categories. As a base category, politics is the most likely to trigger clicks on recommendations, followed by media. As a recommendation category, however, it is media, followed by politics, that is most likely to be clicked. The results suggest that clicks on recommendations are a function of both the base items and the recommended items. This is indicated by the fact that some categories are less likely to generate clicks on recommendations as base categories, which means we should not recommend to items in those categories, or as recommendation categories, which means we should not recommend items from them. The results also show that the performance of a category as base and as recommendation is not the same.
 
The investigation of the transition CTRs shows which categories are more likely to trigger clicks on which other categories. This result suggests that recommendation can be improved by recommending items from a category to those categories where they are more likely to be clicked. For example, we observe that recommending media items to the politics category and to the local category (Berlin) is more likely to receive clicks. Similarly, recommending sport items to sport items is much more likely to trigger clicks. These results suggest that recommender systems can be improved by leveraging the category information of items.
 
We hope that this work contributes to the understanding of the factors affecting recommender systems. We have shown that category-level information can take us a long way in explaining why clicks on recommendations happen. The item-level analysis also showed that there is a relationship between the base items that are more likely to trigger clicks and the recommendation items that are more likely to be clicked. All of this suggests that leveraging information at both the category and item levels is important for improving recommender systems. As future work, we would like to investigate the factors that lead to clicks on recommendations using a larger dataset and at the level of the items' content.
 
% An idea: maybe show the variance of the categories in terms of their CTRs? Another thing we can do is to explore the high-achieving base items and the high-achieving recommended items and see if they are somehow the same items. We can do a similar thing with low-achieving base items and recommended items. If this holds, then it clearly indicates that a big factor is not the current context, but just the nature of the items themselves, both for the base items and for the recommended items. This is going to be gold, as it already shows in the groups. But we can also zoom in on the politics items and see if it holds there too. Another thing we can consider is to find base items and recommended items with large variance and study them with a view to finding the causes, in terms of categories and also in terms of content. A recommendation item's variance tells us that when it is recommended to some items it makes sense, but to others it does not. This can also be studied at a particular group's level.
 
\bibliographystyle{abbrv}
 
\bibliography{ref} 
 
\end{document}