Changeset - 4342a47f2ca8
Gebrekirstos Gebremeskel - 11 years ago 2014-06-12 03:59:32
destinycome@gmail.com
updated
1 file changed with 24 insertions and 0 deletions:
mypaper-final.tex
 
@@ -418,6 +418,30 @@ Redirect  &49 \\
 
We have a total of 121 Wikipedia entities. Every entity has a DBpedia label. Only 82 entities have a name string and only 49 entities have redirect strings. Most of the entities have only one string, but some have several redirect strings. One entity, Buddy\_MacKay, has the highest number of redirect strings (12). Six entities have birth names, one entity has a nickname, one entity has an alias, and four entities have alternative names.
 
 
We combined the different name variants we extracted to form a set of strings for each KB entity. For Twitter entities, we used the display names that we collected. We consider the name of an entity that is part of its URL to be canonical. For example, in http://en.wikipedia.org/wiki/Benjamin\_Bronfman, Benjamin Bronfman is the canonical name of the entity. From the combined name variants and the canonical names, we created four profiles for each entity: canonical (cano), canonical partial (cano-part), all name variants combined (all), and partial names of all name variants (all-part). We refer to the last two profiles as name-variant and name-variant partial. The names in parentheses are used in table captions.
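
The construction of the four profiles described above can be sketched in Python. This is only an illustration under stated assumptions, not the code used in our experiments: the helper names (\texttt{canonical\_name}, \texttt{partial\_names}, \texttt{build\_profiles}) are hypothetical, partial names are assumed to be obtained by splitting on whitespace, and duplicates are assumed to be dropped while keeping first-seen order.

```python
from urllib.parse import unquote, urlparse


def canonical_name(url):
    """Return the last URL path segment with underscores turned into spaces."""
    segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return unquote(segment).replace("_", " ")


def partial_names(names):
    """Split each name on whitespace (an assumption), dropping duplicates
    while keeping the order in which tokens are first seen."""
    seen = []
    for name in names:
        for token in name.split():
            if token not in seen:
                seen.append(token)
    return seen


def build_profiles(url, name_variants):
    """Build the four profiles (cano, cano-part, all, all-part) for one entity."""
    cano = [canonical_name(url)]
    return {
        "cano": cano,
        "cano-part": partial_names(cano),
        "all": list(name_variants),
        "all-part": partial_names(name_variants),
    }


profiles = build_profiles(
    "http://en.wikipedia.org/wiki/Benjamin_Bronfman",
    ["Ben Brewer", "Benjamin Zachary Bronfman"],
)
```

For the Benjamin\_Bronfman example, this sketch yields the same four string lists that appear in the Wikipedia column of the example profile table.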
 
 
\begin{table*}
 
\caption{Example entity profiles (left column: Wikipedia, right column: Twitter)}
 
\begin{center}
 
\begin{tabular}{l*{3}{c}}
 
 &Wikipedia&Twitter \\
 
\hline
 
 
 &Benjamin\_Bronfman& roryscovel\\
 
  cano&[Benjamin Bronfman] &[roryscovel]\\
 
  cano-part &[Benjamin, Bronfman]&[roryscovel]\\
 
  all&[Ben Brewer, Benjamin Zachary Bronfman] &[Rory Scovel] \\
 
  all-part& [Ben, Brewer, Benjamin, Zachary, Bronfman]&[Rory, Scovel]\\
 
			   
 
                  
 
   \hline                      
 
\end{tabular}
 
\end{center}
 
\label{tab:breakdown}
 
\end{table*}
 
 
 
 
 
\subsection{Annotation Corpus}
 
 
The annotation set is a combination of the annotations from before the Training Time Range (TTR) and Evaluation Time Range (ETR) and consists of 68405 annotations. Its breakdown into training and test sets is shown in Table~\ref{tab:breakdown}.