Supervised Project
Generation of simplified texts | Loria - Synalp | Claire Gardent
Category
URL
Summary
NLG Task:
WebNLG+ offers two challenges:
(i) mapping sets of RDF triples to English or Russian text (generation) and
(ii) converting English or Russian text to sets of RDF triples (semantic parsing) (15 groups)
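To make the generation direction concrete, here is a minimal sketch of how a set of RDF triples might be flattened into a single input string for a text generator. The `<S>`, `<P>` and `<O>` markers, the `linearize` and `clean` helpers, and the example triples are illustrative assumptions, not part of the WebNLG+ specification or of any participating system.

```python
from typing import List, Tuple

# One RDF triple as (subject, predicate, object).
Triple = Tuple[str, str, str]


def clean(entity: str) -> str:
    """Turn a DBpedia-style identifier into plain words (hypothetical helper)."""
    return entity.strip('"').replace("_", " ")


def linearize(triples: List[Triple]) -> str:
    """Flatten a triple set into one string, one marked segment per triple."""
    parts = []
    for subj, pred, obj in triples:
        # <S>/<P>/<O> are assumed delimiters chosen for this sketch only.
        parts.append(f"<S> {clean(subj)} <P> {pred} <O> {clean(obj)}")
    return " ".join(parts)


if __name__ == "__main__":
    # Illustrative triples written in the style of the Astronaut category.
    example = [
        ("Alan_Bean", "occupation", "Test_pilot"),
        ("Alan_Bean", "nationality", "United_States"),
    ]
    print(linearize(example))
```

A generation system would map such a string to an English or Russian sentence; a semantic parser would go the other way, from the sentence back to the triple set.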
Training Data:
The English challenge data uses version 3.0 of the WebNLG corpus (Gardent et al., 2017a).
Russian WebNLG was translated from English WebNLG for nine DBpedia categories: Airport, Astronaut, Building, CelestialBody, ComicsCharacter, Food, Monument, SportsTeam, and University.
Model Description:
Blinov (2020) focuses on generation into Russian. The author uses the pre-trained Russian GPT language model (Radford et al., 2019), augmented with a classification head and fine-tuned on the WebNLG+ RDF-to-Russian dataset, and experiments with various sampling methods and with data augmentation. For data augmentation, the Baidu SKE dataset (194,747 RDF/Chinese text pairs) is used, with its text part automatically translated into Russian.
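The summary does not say which sampling methods were tried. As a rough illustration of what sampling from a language model involves, the sketch below implements two common strategies, top-k and nucleus (top-p) sampling, over a toy next-token distribution. All function names and the toy logits are assumptions made for this example; this is not Blinov's implementation.

```python
import math
import random
from typing import Dict


def softmax(logits: Dict[str, float], temperature: float = 1.0) -> Dict[str, float]:
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp((score - top) / temperature) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_k_filter(probs: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep the k most probable tokens and renormalise."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs: Dict[str, float], p: float) -> Dict[str, float]:
    """Keep the smallest set of top tokens whose mass reaches p, then renormalise."""
    kept: Dict[str, float] = {}
    total = 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = prob
        total += prob
        if total >= p:
            break
    return {tok: prob / total for tok, prob in kept.items()}


def sample(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one token according to the given probabilities."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}  # made-up scores
    probs = softmax(toy_logits)
    rng = random.Random(0)
    print(sample(top_k_filter(probs, 2), rng))
    print(sample(top_p_filter(probs, 0.9), rng))
```

In a real system the probabilities would come from the fine-tuned model at each decoding step, and the filter would be applied before every draw.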
Key Contribution:
Neural vs. rule-based approaches
Rule-based models seem to generate text comparable in quality with human texts in terms of adequacy, i.e., the generated texts express exactly the communicative goals contained in the input tripleset. Neural approaches, on the other hand, produce text comparable to human texts in terms of fluency.
Results:
Update