Supervised Project
Generation of simplified texts | Loria - Synalp | Claire Gardent
Summary
NLG Task:
This paper describes Language Agnostic Delexicalisation (LAD), an approach for multilingual RDF-to-text generation.
Training Data:
The training data is provided by the organizers of the WebNLG Challenge 2020. The Russian part of the data was created by translating the English data with a machine translation system; the output was then post-edited through crowdsourcing and spell-checked. The data can be found here:
https://gitlab.com/shimorina/webnlg-dataset/-/tree/master/release_v3.0
Model Description:
The architecture is a transformer, implemented with the fairseq toolkit. There are 2 encoders and 1 decoder, with 4 layers, a hidden size of 256, and a feed-forward layer size of 3072. The model is trained with 0.4 dropout and 0.1 attention dropout.
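The hyperparameters above can be collected in one place, as in the minimal sketch below. The class and field names are hypothetical and are not fairseq's API. Applying the 4 layers to each encoder and decoder stack is an assumption, since the description does not say. The helper only illustrates how the hidden size and feed-forward size determine the size of one feed-forward block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Hyperparameters from the model description (field names are hypothetical)."""
    num_encoders: int = 2
    num_decoders: int = 1
    layers: int = 4  # assumption: the same depth for every encoder and decoder stack
    hidden_size: int = 256
    ffn_size: int = 3072
    dropout: float = 0.4
    attention_dropout: float = 0.1


def ffn_params_per_layer(cfg: TransformerConfig) -> int:
    """Count weights and biases of one position-wise feed-forward block."""
    # Projection from hidden_size up to ffn_size, plus its bias.
    up = cfg.hidden_size * cfg.ffn_size + cfg.ffn_size
    # Projection from ffn_size back down to hidden_size, plus its bias.
    down = cfg.ffn_size * cfg.hidden_size + cfg.hidden_size
    return up + down


config = TransformerConfig()
print(ffn_params_per_layer(config))
```

With a hidden size of 256 and a feed-forward size of 3072, the feed-forward block is 12 times wider than the hidden representation, so most of the parameters of each layer sit in these two projections.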
Key Contribution:
The LAD approach outperforms other solutions for English, but it could not be tested for Russian because the authors did not have enough data.
Result: