HCDA/cikm-paper Changeset - b3d9215866be · Centrum Wiskunde & Informatica (CWI)

Arjen de Vries (arjen) - 11 years ago 2014-06-12 02:08:49
arjen.de.vries@cwi.nl

a few more minor abstract improvements

1 file changed with 8 insertions and 7 deletions:

mypaper-final.tex

@@ -102,35 +102,36 @@
 % Just remember to make sure that the TOTAL number of authors
 % is the number that will appear on the first page PLUS the
 % number that will appear in the \additionalauthors section.
 \maketitle
 \begin{abstract}
 Cumulative citation recommendation refers to the problem faced by
 knowledge base curators, who need to continuously screen the media for
 updates regarding the knowledge base entries they manage. Automatic
 system support for this entity-centric information processing problem
 requires complex pipe\-lines involving both natural language
-processing and information retrieval components. The default pipeline
+processing and information retrieval components. The pipeline
 encountered in a variety of systems that approach this problem
 involves four stages: filtering, classification, ranking (or scoring),
-and evaluation. Filtering is an initial step, that reduces the
+and evaluation. Filtering is only an initial step, that reduces the
 web-scale corpus of news and other relevant information sources that
 may contain entity mentions into a working set of documents that should
 be more manageable for the subsequent stages.
-This step has a large impact on the recall that can be achieved.
-Keeping the subsequent steps constant, we therefore zoom in into the
-filtering stage, and conduct an in-depth analysis of the main design
-decisions here:
-cleansing noisy web data, the methods to create entity profiles, the
+Nevertheless, this step has a large impact on the recall that can be
+maximally attained! Therefore, in this study, we have focused on just
+this filtering stage and conduct an in-depth analysis of the main design
+decisions here: how to cleans the noisy text obtained online,
+the methods to create entity profiles, the
 types of entities of interest, document type, and the grade of
 relevance of the document-entity pair under consideration.
 We analyze how these factors (and the design choices made in their
 corresponding system components) affect filtering performance.
 We identify and characterize the relevant documents that do not pass
 the filtering stage by examing their contents. This way, we give
 estimate a practical upper-bound of recall for entity-centric stream
 filtering.
 \end{abstract}
 % A category with the (minimum) three required fields
 \category{H.4}{Information Filtering}{Miscellaneous}
