

REACTION
Mário J. Silva

Task 1: Mining Resources
Progress & Plans

Information Mining
•  Development of robust linguistic resources to process different types and genres of texts
   –  knowledge resources about media personalities: recognizing and resolving references to named entities;
   –  sentiment lexicons and grammars: detecting the polarity of opinions about relevant personalities;
   –  annotated corpora: training different text classifiers and evaluating classification procedures.

Mining Resources
•  POWER - Political Ontology for Web Entity Retrieval (enriched)
•  SentiLex-PT - Sentiment Lexicon (refined and enlarged with new sentiment words and expressions from sports)
•  SentiCorpus-PT - A sentiment-annotated corpus of tweets targeting the Portuguese football players at Euro 2012 (in progress)

POWER enrichment
Population of the ontology with political actors from news using Voxx
(SAPO)
•  For each individual in POWER that corresponds to a Voxx name, we query the Voxx web service and extract the alternative forms (nicknames and ergonyms) used to mention that entity in the media.
•  The names recognized by Voxx are matched against the names in POWER using the Jaccard similarity coefficient (Tan, Steinbach & Kumar 2005). Matches with over 25% similarity are considered candidate associations.
•  We apply a statistical model to select the most likely candidate. Only the most likely association obtained in this process is added to the POWER knowledge base.
Precision and recall of 97%. The chi-square test of independence indicates a significance level of 0.1% (p-value < 0.001).

SentiLex enlargement
autogolo,autogolo.PoS=N;FLEX=ms;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP
autogolos,autogolo.PoS=N;FLEX=mp;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP
balázio,balázio.PoS=N;FLEX=ms;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
balázios,balázio.PoS=N;FLEX=mp;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
boa exibição,boa exibição.PoS=N;FLEX=fs;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
boa jogada,boa jogada.PoS=N;FLEX=fs;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
boas jogadas,boa jogada.PoS=N;FLEX=fp;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
cartão amarelo,cartão amarelo.PoS=N;FLEX=ms;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP
coxo,coxo.PoS=Adj;FLEX=ms;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP
coxos,coxo.PoS=Adj;FLEX=mp;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP
craque,craque.PoS=Adj;FLEX=ms;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
craques,craque.PoS=Adj;FLEX=mp;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
entrega-se ao jogo,entregar-se ao jogo.PoS=IDIOM;FLEX=P2s|P4s|P3s|Y2s;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
fez falta,fazer falta.PoS=IDIOM.FLEX=J4s|P3s;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP
fez uma cueca,fazer uma cueca.PoS=IDIOM.FLEX=J4s|P3s;TG=HUM:N0;POL:N0=1;ANOT=MAN;DOM=DESP
falhou o penalti,falhar o penalti.PoS=IDIOM.FLEX=J:4s|P:3s;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP

SentiLex refinement (work in progress)
Human Attachment Index - degree of appropriateness and relevance of an adjective to a human entity.
General assumptions:
•  The greater the probability that an adjective ends a clause or a sentence, the greater the probability that it is a relevant human modifier.
•  Relevant human adjectives mostly appear in post-adnominal position, within an attributive construction.
He is crazy. vs. He is crazy about soccer.
Um jogador perfeito (a perfect player) vs. Um perfeito idiota (a perfect idiot)
Main goal: identifying the sentiment predicates that are more relevant (and less noisy) for OM applications.

SentiCorpus-PT
•  Manually annotated corpus of tweets targeting politicians
   –  3000 tweets (2 annotators)
   –  Inter-annotator agreement (IAA) > 80%
   –  More accurate guidelines
•  Manually annotated corpus of tweets targeting football players (work in progress)
   –  1400 tweets (June 2 PT-Turkey)
Characterization of both collections, particularly concerning the expression of sentiment in each domain.

Next steps
POWER
•  Creation of an interface allowing the visualization of data in an integrated manner:
   –  Citations from politicians in newspapers (Voxx)
   –  NER in both conventional and social media
   –  Political opinion mining in social media, particularly on Twitter
SentiCorpus
•  New release (#Euro 2012)
•  Contrastive study (politics vs. football)
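
The name-matching step described under POWER enrichment (Jaccard similarity with a 25% candidate threshold) could be sketched roughly as follows. This is an illustration only: the slides do not say which sets are compared, so the sketch assumes lowercased word tokens, and the names, the `tokens` helper and the `candidate_matches` function are hypothetical. The statistical model that picks the final association is not specified in the slides and is not reproduced here.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def tokens(name):
    # Assumption: a name is compared as a set of lowercased word tokens.
    return set(name.lower().split())


def candidate_matches(voxx_name, power_names, threshold=0.25):
    """Return (power_name, score) pairs scoring above the threshold, best first."""
    scored = [(p, jaccard(tokens(voxx_name), tokens(p))) for p in power_names]
    kept = [(p, s) for p, s in scored if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical names, for illustration only.
matches = candidate_matches("Ana Silva", ["Ana Maria Silva", "Rui Costa", "Ana Costa"])
print([name for name, _ in matches])  # -> ['Ana Maria Silva', 'Ana Costa']
```

Every name scoring above the threshold is kept as a candidate; choosing among the candidates is left to the later selection step.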

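
The lexicon entries shown for the SentiLex enlargement follow a `form,lemma.ATTR=VALUE;...` layout, which can be read into a dictionary with a few lines of code. The sketch below is an assumption-laden illustration, not the project's own tooling: `parse_entry` is a hypothetical helper, and it treats both ";" and the "." that appears before `FLEX` in some idiom entries as attribute separators.

```python
import re

# Split before each KEY= attribute. The slide uses ";" and, in some idiom
# entries, "." as the separator (assumed here to be equivalent).
_ATTR_SEP = re.compile(r"[;.](?=[A-Za-z0-9:]+=)")


def parse_entry(line):
    """Parse one 'form,lemma.PoS=...;FLEX=...;...' line into a dict."""
    form, rest = line.strip().split(",", 1)
    lemma, attrs = rest.split(".PoS=", 1)
    fields = {"form": form, "lemma": lemma}
    for item in _ATTR_SEP.split("PoS=" + attrs):
        key, _, value = item.partition("=")
        fields[key] = value
    # FLEX may list several inflection codes separated by "|".
    fields["FLEX"] = fields.get("FLEX", "").split("|")
    # Polarity is written as an integer string such as "-1" or "1".
    fields["POL:N0"] = int(fields["POL:N0"])
    return fields


entry = parse_entry(
    "fez falta,fazer falta.PoS=IDIOM.FLEX=J4s|P3s;TG=HUM:N0;POL:N0=-1;ANOT=MAN;DOM=DESP"
)
print(entry["lemma"], entry["PoS"], entry["FLEX"], entry["POL:N0"])
```

Keeping the attribute names exactly as they appear in the entries (`PoS`, `FLEX`, `TG`, `POL:N0`, `ANOT`, `DOM`) makes it easy to filter by domain or polarity afterwards.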