Exploring Onto.PT

Hugo Gonçalo Oliveira, Leticia Antón Pérez & Paulo Gomes
Cognitive & Media Systems Group
CISUC, University of Coimbra, Portugal
Hugo Gonçalo Oliveira is supported by FCT, grant SFRH/BD/44955/2008, co-funded by FSE.
Introduction
- Public lexical ontology for Portuguese
- Freely available from http://ontopt.dei.uc.pt/
- Modelled after Princeton WordNet
- Populated automatically by exploiting public lexical resources
OntoBusca interface
- Retrieves the synsets containing a queried lexical item
- Expands the relations in which those synsets are involved
- Shows a cloud of the most frequent searches
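The lookup behaviour described for OntoBusca can be pictured with a small sketch. This is not the OntoBusca implementation: the data structures, the function names (`build_index`, `search`) and the three toy synsets are assumptions made only for illustration.

```python
from collections import defaultdict

# Toy data invented for illustration; not taken from Onto.PT.
SYNSETS = {
    1: {"carro", "automóvel"},
    2: {"veículo"},
    3: {"roda"},
}
# (first synset id, relation name, second synset id)
RELATIONS = [
    (2, "hiperonimoDe", 1),  # veículo is a hypernym of carro/automóvel
    (3, "parteDe", 1),       # roda is part of carro/automóvel
]

def build_index(synsets):
    """Map each lexical item to the ids of the synsets that contain it."""
    index = defaultdict(set)
    for sid, lemmas in synsets.items():
        for lemma in lemmas:
            index[lemma].add(sid)
    return index

def search(item, synsets=SYNSETS, relations=RELATIONS):
    """Return every synset containing `item`, with the relations it takes part in."""
    index = build_index(synsets)
    results = []
    for sid in sorted(index.get(item, ())):
        involved = [t for t in relations if sid in (t[0], t[2])]
        results.append((sid, sorted(synsets[sid]), involved))
    return results

print(search("carro"))
```

A query for an item that is in no synset simply returns an empty list.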
Automatic construction steps:
1. Semantic relations, connecting lemmas, extracted from dictionaries [1]
2. Attachment of synonymy relations to suitable synsets in a thesaurus, and discovery of new synsets in the unattached synonymy relations [2]
3. Assignment of the lemmas in relations to the most suitable synsets [3]
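To make step 3 concrete, here is a toy sketch of turning a lemma-level triple into a synset-level one. The selection rule is a deliberately naive stand-in, not the method of [3]: among the synsets containing a lemma, it picks the one whose members most often hold the same relation with the other argument. All names and data are hypothetical.

```python
# Toy thesaurus and lemma-level triples, invented for illustration.
SYNSETS = {
    1: {"planta", "vegetal"},
    2: {"planta", "mapa"},
    3: {"árvore"},
}
LEMMA_TRIPLES = {
    ("planta", "hiperonimoDe", "árvore"),
    ("vegetal", "hiperonimoDe", "árvore"),
}

def candidates(lemma, synsets):
    """Ids of all synsets that contain the lemma."""
    return [sid for sid, lemmas in synsets.items() if lemma in lemmas]

def ontologise(triple, synsets, lemma_triples):
    """Map (lemma, relation, lemma) to (synset id, relation, synset id)."""
    a, rel, b = triple

    def pick(lemma, other, lemma_is_first):
        def support(sid):
            # Count synset members that hold `rel` with the other argument.
            if lemma_is_first:
                return sum((m, rel, other) in lemma_triples for m in synsets[sid])
            return sum((other, rel, m) in lemma_triples for m in synsets[sid])
        # Highest support wins; ties go to the lowest synset id (assumption).
        return max(candidates(lemma, synsets),
                   key=lambda sid: (support(sid), -sid),
                   default=None)

    return pick(a, b, True), rel, pick(b, a, False)

print(ontologise(("planta", "hiperonimoDe", "árvore"), SYNSETS, LEMMA_TRIPLES))
```

In this toy case the ambiguous lemma "planta" is attached to the synset shared with "vegetal", because both members are recorded as hypernyms of "árvore".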
Integrated resources
- Relations from dictionaries (CARTÃO [1])
  - PAPEL [4] (v.3.0)
  - Dicionário Aberto [5]
  - Wiktionary.PT (a)
- Handcrafted thesauri
  - TeP 2.0 [6]
  - OpenThesaurus.PT (b)
Current contents (v.0.3.1)
- ≈160,000 lexical items
- ≈110,000 synsets
  - Noun (64,865)
  - Verb (25,342)
  - Adjective (17,952)
  - Adverb (1,993)
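As a quick consistency check, the per-category synset counts can be summed and compared with the rounded total of ≈110,000. The snippet below only restates the figures listed above; the variable names are illustrative.

```python
# Synset counts per part of speech, copied from the list above.
SYNSETS_BY_POS = {
    "noun": 64_865,
    "verb": 25_342,
    "adjective": 17_952,
    "adverb": 1_993,
}

total = sum(SYNSETS_BY_POS.values())
# Share of each part of speech, in percent, rounded to one decimal place.
shares = {pos: round(100 * n / total, 1) for pos, n in SYNSETS_BY_POS.items()}

print(total)  # 110152
print(shares)
```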
RDF/OWL model
- Based on the W3C WordNet RDF/OWL Basic [7]

≈170,000 instances of semantic relations
- Relation types from PAPEL 3.0
Group           Relation                        Instances
Hypernym        n hiperonimoDe n                   83,552
Part            n parteDe n                         3,672
                n parteDeAlgoComProp adj            4,911
                adj propDeAlgoParteDe n                91
Member          n membroDe n                        5,847
                n membroDeAlgoComProp adj             106
                adj propDeAlgoMembroDe n              909
Contains        n contidoEm n                         355
                n contidoEmAlgoComProp adj            264
Material        n materialDe n                        835
Causation       n causadorDe n                      1,347
                n causadorDeAlgoComProp adj            26
                adj propDeAlgoQueCausa n              619
                n causadorDaAccao v                    56
                v accaoQueCausa n                   8,052
Place           n localOrigemDe n                   1,293
Antonym         adj antonimoAdjDe adj                 538
Producer        n produtorDe n                      1,718
                n produtorDeAlgoComProp adj            88
                adj propDeAlgoProdutorDe n            529
Purpose         n fazSeCom n                        6,551
                n fazSeComAlgoComProp adj              79
                v finalidadeDe n                    7,271
                v finalidadeDeAlgoComProp adj         322
Quality         n temQualidade n                      934
                n devidoAQualidade adj              1,059
State           n temEstado n                         327
                n devidoAEstado adj                   197
Manner          adv maneiraPorMeioDe n              1,833
                adv maneiraComProp adj              1,561
Manner without  adv maneiraSem n                      216
                adv maneiraSemAccao v                  14
Property        adj dizSeSobre n                    9,145
                adj dizSeDoQue v                   25,014
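Each relation signature in the table reads as "part of speech of the first argument, relation name, part of speech of the second argument". The sketch below parses the rows into that shape and sums the instance counts, which lands close to the ≈170,000 quoted above. The pipe-separated layout and the names `TABLE`, `parse_row` and `by_group` are assumptions made for this example; only the figures come from the table.

```python
from collections import defaultdict

# Rows copied from the table above: group | relation signature | instances.
TABLE = """\
Hypernym|n hiperonimoDe n|83,552
Part|n parteDe n|3,672
Part|n parteDeAlgoComProp adj|4,911
Part|adj propDeAlgoParteDe n|91
Member|n membroDe n|5,847
Member|n membroDeAlgoComProp adj|106
Member|adj propDeAlgoMembroDe n|909
Contains|n contidoEm n|355
Contains|n contidoEmAlgoComProp adj|264
Material|n materialDe n|835
Causation|n causadorDe n|1,347
Causation|n causadorDeAlgoComProp adj|26
Causation|adj propDeAlgoQueCausa n|619
Causation|n causadorDaAccao v|56
Causation|v accaoQueCausa n|8,052
Place|n localOrigemDe n|1,293
Antonym|adj antonimoAdjDe adj|538
Producer|n produtorDe n|1,718
Producer|n produtorDeAlgoComProp adj|88
Producer|adj propDeAlgoProdutorDe n|529
Purpose|n fazSeCom n|6,551
Purpose|n fazSeComAlgoComProp adj|79
Purpose|v finalidadeDe n|7,271
Purpose|v finalidadeDeAlgoComProp adj|322
Quality|n temQualidade n|934
Quality|n devidoAQualidade adj|1,059
State|n temEstado n|327
State|n devidoAEstado adj|197
Manner|adv maneiraPorMeioDe n|1,833
Manner|adv maneiraComProp adj|1,561
Manner without|adv maneiraSem n|216
Manner without|adv maneiraSemAccao v|14
Property|adj dizSeSobre n|9,145
Property|adj dizSeDoQue v|25,014
"""

def parse_row(line):
    """Split one row into (group, arg1 POS, relation name, arg2 POS, count)."""
    group, signature, count = line.split("|")
    arg1_pos, name, arg2_pos = signature.split()
    return group, arg1_pos, name, arg2_pos, int(count.replace(",", ""))

rows = [parse_row(line) for line in TABLE.splitlines() if line]

# Instance counts aggregated per relation group.
by_group = defaultdict(int)
for group, _, _, _, count in rows:
    by_group[group] += count

total = sum(by_group.values())
print(total)  # 169331
```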
(a) http://pt.wiktionary.org/
(b) http://openthesaurus.caixamagica.pt/
References
[1] H. Gonçalo Oliveira, L. Antón Pérez, H. Costa, and P. Gomes, “Uma rede léxico-semântica de grandes dimensões para o português, extraída a partir de dicionários electrónicos,” Linguamática, vol. 3, pp. 23–38, December 2011.
[2] H. Gonçalo Oliveira and P. Gomes, “Automatically enriching a thesaurus with information from dictionaries,” in Proc. 15th Portuguese Conference on Artificial Intelligence (EPIA 2011), vol. 7026 of LNCS, pp. 462–475, Springer, October 2011.
[3] H. Gonçalo Oliveira and P. Gomes, “Ontologising relational triples into a Portuguese thesaurus,” in Proc. 15th Portuguese Conference on Artificial Intelligence (EPIA 2011), (Lisbon, Portugal), pp. 803–817, APPIA, October 2011.
[4] H. Gonçalo Oliveira, D. Santos, and P. Gomes, “Extracção de relações semânticas entre palavras a partir de um dicionário: o PAPEL e sua avaliação,” Linguamática, vol. 2, pp. 77–93, May 2010.
[5] A. Simões, A. I. Sanromán, and J. J. Almeida, “Dicionário-aberto: A source of resources for the Portuguese language processing,” in Proc. of PROPOR 2012, vol. 7243 of LNCS, (Coimbra, Portugal), pp. 121–127, Springer, April 2012.
[6] E. G. Maziero, T. A. Pardo, A. Di Felippo, and B. C. Dias-da-Silva, “A Base de Dados Lexical e a Interface Web do TeP 2.0 - Thesaurus Eletrônico para o Português do Brasil,” in VI TIL, pp. 390–392, 2008.
[7] M. van Assem, A. Gangemi, and G. Schreiber, “RDF/OWL representation of WordNet,” working draft, W3C, June 2006.
Demo Session of 10th International Conference on Computational Processing of the Portuguese Language (PROPOR), 17-20 April 2012, Coimbra, Portugal
