DBpedia @ GSoC14 deadline is approaching


Sebastian Hellmann
Dear all,

The GSoC deadline is in three days (March 21st) [1] and there is still
time to apply.
The DBpedia GSoC students are quite active this year too [2] but we can
certainly handle more :)
Please forward our ideas page [3] to students (Bachelor's, Master's or PhD)
working on Semantic Web & Linked Data.

Best regards,
Sebastian and Dimitris

[1] https://www.google-melange.com/gsoc/events/google/gsoc2014
[2] http://sourceforge.net/p/dbpedia/mailman/dbpedia-gsoc/?limit=250
[3] http://wiki.dbpedia.org/gsoc2014/ideas

Sebastian Hellmann
AKSW/NLP2RDF research group
Institute for Applied Informatics (InfAI) affiliated with DBpedia
* *21st March, 2014*: LD4LT Kick-Off
@European Data Forum
* *Sept. 1-5, 2014* Conference Week in Leipzig, including
** *Sept 2nd*, MLODE 2014
** *Sept 3rd*, 2nd DBpedia Community Meeting
** *Sept 4th-5th*, SEMANTiCS (formerly i-SEMANTICS) <http://semantics.cc/>
Come to Germany as a PhD student: http://bis.informatik.uni-leipzig.de/csf
Projects: http://dbpedia.org, http://nlp2rdf.org,
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt 
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org
Wiktionary-l mailing list
[hidden email]
Re: DBpedia @ GSoC14 deadline is approaching

Federico Leva (Nemo)
Thanks for the reminder! I guess the most relevant idea here is
<http://wiki.dbpedia.org/gsoc2014/ideas#h359-15> i.e.:


Wiktionary is a large-scale, multilingual, crowd-sourced dictionary. It
features 18,689,141 articles in 171 languages maintained by 4184 active
users. Dictionary entries may contain definitions and examples, part of
speech, idioms and proverbs, synonyms, antonyms, hyperonyms and
hyponyms, related terms, phonological information in IPA notation or as a
sound file, word formation, inflection tables, etymology, images, as well as
translations into other languages. Wiktionary is an invaluable source of
dictionary data.

To make further use of the data, it should be transferred from its
current semi-structured document format to a semantic data format like
RDF. This can be achieved by already existing transformation software
[2] maintained by the DBpedia project. However, the structure of every
single language Wiktionary is different. Articles contain a varying
degree of information in varying forms. That's why the conversion
software allows for mapping the structure of Wiktionary articles to the
final RDF structure via custom mappings. At the moment, these mappings
exist for English, German, French, Russian, Greek, and Vietnamese. This
means that 165 language mappings representing over 60% of the articles
are still missing.

Mappings are written in XML, using a simple regular expression syntax to
match the wiki markup. Up to this point, they have been developed by native
speakers who are also versed in XML and programming.
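
To illustrate the idea of an XML mapping whose rules are regular
expressions over wiki markup, here is a minimal sketch in Python. It is
not the actual DBpedia Wiktionary mapping schema: the element names
(mapping, rule), the attributes (property, pattern), the property names
and the sample entry are all assumptions made up for this example. The
wikitext only imitates the style of an English Wiktionary article.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping file. The real DBpedia Wiktionary mappings use
# their own schema; this one only illustrates "XML + regex" rules.
MAPPING_XML = r"""
<mapping language="en">
  <rule property="hasPartOfSpeech"
        pattern="^===\s*(Noun|Verb|Adjective)\s*===$"/>
  <rule property="hasTranslation"
        pattern="\{\{t\+?\|(\w+)\|([^}|]+)"/>
</mapping>
"""

# Made-up article text in the style of English Wiktionary markup.
WIKITEXT = """==English==
===Noun===
# A building for living in.
====Translations====
* German: {{t+|de|Haus|n}}
* French: {{t+|fr|maison|f}}
"""


def load_rules(xml_text):
    """Parse the mapping and compile one regex per <rule> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        (rule.get("property"), re.compile(rule.get("pattern"), re.MULTILINE))
        for rule in root.iter("rule")
    ]


def extract_triples(entry, wikitext, rules):
    """Apply every rule to the wikitext and collect
    (entry, property, captured groups) tuples."""
    triples = []
    for prop, regex in rules:
        for match in regex.finditer(wikitext):
            triples.append((entry, prop, match.groups()))
    return triples


if __name__ == "__main__":
    for triple in extract_triples("house", WIKITEXT, load_rules(MAPPING_XML)):
        print(triple)
```

Each tuple stands in for one RDF statement about the entry. A real
converter would additionally mint URIs and serialize the result, and a
mapping for another language edition would swap in different patterns
(for example, different section headings) without touching the code.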

To make the mapping approach more scalable and allow for better
maintenance of existing mappings, the student responsible for this task
needs to develop a system that makes writing mappings easy and takes the
diverse languages into account. This system might be a community project
like a mapping wiki, a mapping pipeline, a GUI or a combination thereof.
As proof of concept, a few new mappings, especially in the European
languages, should also be developed.

[1] http://meta.wikimedia.org/wiki/Wiktionary#List_of_Wiktionaries
[2] http://dbpedia.org/Wiktionary
Mentors: Kyungtae Lim, Jim O’Regan (co-mentor)


