Changeset - 3eb20e9cca3d
Arjen de Vries (arjen) - 11 years ago 2014-06-12 03:57:35
arjen.de.vries@cwi.nl
and another one
1 file changed with 3 insertions and 3 deletions:
0 comments (0 inline, 0 general)
mypaper-final.tex
 
@@ -423,27 +423,27 @@ Redirect  &49 \\
 
The collection contains a total of 121 Wikipedia entities.
 
Every entity has a corresponding DBpedia label.  Only 82 entities have
 
a name string and only 49 entities have redirect strings. (Most of the
 
entities have only one string, except for a few cases with multiple
 
redirect strings; Buddy\_MacKay has the highest number (12) of
 
redirect strings.) 
 

	
 
We combine the different name variants we extracted to form a set of
 
strings for each KB entity. For Twitter entities, we used the display
 
names that we collected. 
 

	
 
We consider the names of the entities that
 
-are part of the URL as canonical. For example in
 
-http://en.wikipedia.org/wiki/Benjamin\_Bronfman, Benjamin Bronfman is
 
-a canonical name of the entity. From the combined name variants and
 
+are part of the URL as canonical. For example in entity\\
 
+\url{http://en.wikipedia.org/wiki/Benjamin_Bronfman}\\
 
+Benjamin Bronfman is a canonical name of the entity. From the combined name variants and
 
the canonical names, we created four sets of profiles for each
 
entity: canonical (cano), canonical partial (cano-part), all name
 
variants combined (all), and partial names of all name
 
variants (all-part). We refer to the last two profiles as name-variant
 
and name-variant partial. The names in parentheses are used in table
 
captions.
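 
As a rough formalisation of the four profiles, let $c(e)$ denote the canonical name of entity $e$, $V(e)$ the combined set of name variants of $e$ (assumed here to include $c(e)$), and $T(s)$ the set of whitespace-delimited tokens of a string $s$. The symbols $c$, $V$ and $T$ are notation introduced only for this sketch, and reading a partial name as a single token of a full name is an assumption. Under that reading, the profiles could be written as
\[
\begin{array}{lcl}
\mbox{cano} & : & \{\, c(e) \,\} \\
\mbox{cano-part} & : & T(c(e)) \\
\mbox{all} & : & V(e) \\
\mbox{all-part} & : & \bigcup_{v \in V(e)} T(v)
\end{array}
\]
For the entity Benjamin\_Bronfman, for instance, cano would then hold the single string Benjamin Bronfman, while cano-part would hold the two tokens Benjamin and Bronfman.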
 

	
 
\subsection{Annotation Corpus}
 
 
The annotation set is a combination of the annotations from before the Training Time Range (TTR) and Evaluation Time Range (ETR) and consists of 68405 annotations. Its breakdown into training and test sets is shown in Table~\ref{tab:breakdown}.
 
 