Supervised Project
Generation of simplified texts
Loria - Synalp
Claire Gardent
Summary
NLG Task
Produce texts in multiple languages that accurately realise the abstract semantic information given in the input Meaning Representation (MR).
Training Data
– WOZ 2.0: English, German, Italian
– MultiWOZ + CrossWOZ: English, Chinese
– WebNLG 17 / WebNLG 20: English, Russian
Model Description
They adapt the universal encoder-decoder framework. The input and output are first delexicalised using pretrained language-independent embeddings and (optionally) ordered. The multilingual generation model is trained on the delexicalised training data, and the output is relexicalised using automatic value post-editing to ensure that the values fit the context.
To match MR values with the corresponding words in the text, the system maps MR values to word n-grams based on the similarity of their representations. Specifically, it calculates the similarity between a value v and all word n-grams w_i ... w_j in the text.
The use of generic placeholders creates a problem for relexicalisation: it becomes unclear which input value should replace which placeholder. They address this by ordering the model's input based on the graph formed by its RDF triples, again following Trisedya et al. (2018): they traverse every edge in the graph, starting from the node with the fewest incoming edges (ties broken randomly), and visit all nodes via breadth-first search (BFS).
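The value-to-n-gram matching step can be sketched as follows. This is a hypothetical illustration, assuming cosine similarity between a value's embedding and the averaged embeddings of each candidate n-gram; the paper's exact scoring function may differ.

```python
import numpy as np

def best_ngram_match(value_vec, token_vecs, max_n=4):
    """Return the (start, end) span of the word n-gram whose averaged
    embedding is most similar to the MR value's embedding.

    Sketch only: scores every n-gram up to length max_n with cosine
    similarity; the actual system may use a different representation
    or scoring detail.
    """
    best_score, best_span = -1.0, None
    for n in range(1, max_n + 1):
        for i in range(len(token_vecs) - n + 1):
            # represent the n-gram w_i ... w_j by its mean embedding
            ngram_vec = np.mean(token_vecs[i:i + n], axis=0)
            denom = np.linalg.norm(value_vec) * np.linalg.norm(ngram_vec) + 1e-9
            score = float(np.dot(value_vec, ngram_vec) / denom)
            if score > best_score:
                best_score, best_span = score, (i, i + n)
    return best_span, best_score
```

The winning span is then replaced by the value's placeholder during delexicalisation, and the mapping is reused at relexicalisation time.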
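The triple-ordering step described above can be sketched as a BFS over the RDF graph. A minimal sketch, assuming triples are (subject, predicate, object) tuples; ties for the start node are broken by first occurrence here rather than randomly as in the paper.

```python
from collections import deque

def order_triples(triples):
    """Order RDF triples by BFS from the node with the fewest
    incoming edges, per the Trisedya et al. (2018)-style traversal
    the notes describe. Hypothetical helper, not the authors' code.
    """
    indeg, adj = {}, {}
    for s, p, o in triples:
        indeg.setdefault(s, 0)
        indeg[o] = indeg.get(o, 0) + 1
        adj.setdefault(s, []).append((s, p, o))
    # start node: fewest incoming edges (ties: first occurrence)
    start = min(indeg, key=indeg.get)
    ordered, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in adj.get(node, []):
            ordered.append((s, p, o))
            if o not in seen:
                seen.add(o)
                queue.append(o)
    # keep triples from components unreachable from the start node
    for t in triples:
        if t not in ordered:
            ordered.append(t)
    return ordered
```

Feeding the model input in this fixed order makes the placeholder-to-value correspondence recoverable at relexicalisation time.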
Key Contribution
Overcomes the requirement, common in delexicalisation approaches, that input values appear verbatim in the output text. Achieves state-of-the-art results, with improvements of up to 29 BLEU points over competitive baselines on unseen cases.