From d9b84600c51014e9194a03001f5807a0a3604193 2014-06-12 05:48:34
From: Arjen P. de Vries
Date: 2014-06-12 05:48:34
Subject: [PATCH] conclusions "done"

---

diff --git a/mypaper-final.tex b/mypaper-final.tex
index 097ed36ba4598b0cc7c96b266406af01a1a5e4f5..c4ebf2a9fe4dfe3074dd0657ba457406fcab3742 100644
--- a/mypaper-final.tex
+++ b/mypaper-final.tex
@@ -1131,15 +1131,18 @@ circumstances under which vital documents can not be retrieved?
 Cleansing may remove (parts of) the contents of documents, making them
 irretrievable. However, because of the introduction of false positives,
 gaining recall by filtering the raw corpus instead of the
-cleansed one and developing richer entity profiles, does not necessarily translate to overall
-performance gains. The overall conclusion on this is mixed in the
+cleansed one, as well as developing richer entity profiles, does not necessarily translate to overall
+performance gains. The conclusion is mixed in the
 sense that cleansing has helped to improve the recall on
 vital documents and Wikipedia entities, but at the same time
 reduces the recall on Twitter entities and the relative
 category of relevance ranking. Vital and relevant documents
 show a difference in retrieval performance, where vital
 documents appear to be easier to filter than
-relevant ones. Notice that in the context of the CCR task, the vital documents are
-most important.
+relevant ones. (Notice that in the context of the CCR task, the vital documents are
+most important.) The bottom line is that improving the filtering step
+with respect to recall has shown that current entity oriented
+retrieval approaches need to be improved to better classify and rank the ``new''
+documents that make it into the working set.

 Despite an exhaustive attempt to identify as many vital-relevant