Supervised Project
Generation of simplified texts
Loria - Synalp
Claire Gardent
Category
URL
Summary
NLG Task
The task is to use a resource-rich monolingual Abstractive Sentence Summarization (ASSUM) system to teach a low-resource cross-lingual summarization system both summary word generation and attention. https://github.com/KelleyYin/Cross-lingual-Summarization
Training Data
A Chinese-to-English summarization system is built, which takes a Chinese sentence as input and outputs an English abstractive summary. Evaluation sets for this task are built by manually translating the English sentences of the existing English evaluation sets into Chinese inputs. The Gigaword corpus and DUC-2004 are used as additional English datasets for testing only. Data is also collected from the Chinese microblogging website Sina Weibo, with 2.4M sentence-summary pairs for training and 725 pairs for testing.
Model Description
A Transformer is employed. Six layers are stacked in both the encoder and the decoder, and the dimensions of the embedding vectors and all hidden vectors are set to 512. Eight heads are used in the multi-head attention. The source embedding, the target embedding and the linear sublayer are shared in the teacher networks, but not shared in the student networks. Byte-pair encoding is employed with a vocabulary of about 32k tokens on the English side and the Chinese side respectively.
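A rough sketch of these hyperparameters using standard PyTorch modules (not the authors' implementation; the feed-forward width and the exact weight-sharing scheme are assumptions on top of the description above):

```python
import torch.nn as nn

D_MODEL = 512          # embedding / hidden size reported above
N_HEADS = 8            # multi-head attention heads
N_LAYERS = 6           # encoder and decoder depth
VOCAB_SIZE = 32_000    # ~32k BPE tokens per language (approximate)

src_embed = nn.Embedding(VOCAB_SIZE, D_MODEL)   # source-side embeddings
tgt_embed = nn.Embedding(VOCAB_SIZE, D_MODEL)   # summary-side embeddings

transformer = nn.Transformer(
    d_model=D_MODEL,
    nhead=N_HEADS,
    num_encoder_layers=N_LAYERS,
    num_decoder_layers=N_LAYERS,
    dim_feedforward=2048,   # assumption: standard Transformer-base FFN width
)

# Projection from hidden states to the output vocabulary. In the monolingual
# teacher the embeddings and this linear layer are shared; in the cross-lingual
# student they are kept separate, per the notes above.
generator = nn.Linear(D_MODEL, VOCAB_SIZE)
```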
Genuine summaries are paired with generated pseudo sources to train the cross-lingual summarization system. A teacher-student framework is used, in which the monolingual summarization system is the teacher and the cross-lingual summarization system is the student. The teacher makes the student simulate both the summary word distribution and the attention weights of the teacher network. A back-translation procedure generates pseudo source sentences paired with the true summaries to build a training corpus for the cross-lingual ASSUM, as in the sketch below.
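A minimal sketch of the back-translation step, assuming a pretrained English-to-Chinese translation function `en2zh_translate` (hypothetical helper; the actual pipeline is in the linked repository):

```python
def build_pseudo_pairs(english_pairs, en2zh_translate):
    """Build (Chinese pseudo-source, English summary) training pairs.

    english_pairs:   iterable of (english_source, english_summary) tuples
                     from the monolingual ASSUM corpus.
    en2zh_translate: hypothetical English-to-Chinese translation function
                     used for back-translation.
    """
    cross_lingual_pairs = []
    for en_source, en_summary in english_pairs:
        zh_pseudo_source = en2zh_translate(en_source)  # pseudo Chinese input
        # The genuine English summary stays as the target, so the student
        # learns Chinese-to-English summarization from pseudo sources.
        cross_lingual_pairs.append((zh_pseudo_source, en_summary))
    return cross_lingual_pairs
```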
Key Contribution
Proposed a new loss function on the generative probability distribution and the attention. It performs significantly better (around 2 points on ROUGE-1 and ROUGE-2) than several baselines, and significantly reduces the performance gap between the cross-lingual ASSUM and the monolingual ASSUM on the benchmark datasets.
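An illustrative sketch of what such a combined objective could look like: a KL term pulling the student's summary word distribution toward the teacher's, plus a term matching attention weights. The exact formulation, weighting, and the cross-lingual alignment of attention positions are the paper's; the function and its `attn_weight` parameter below are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def teacher_student_loss(student_logits, teacher_logits,
                         student_attn, teacher_attn,
                         attn_weight=1.0):
    """Illustrative distillation objective on generation and attention.

    student_logits / teacher_logits: (batch, tgt_len, vocab) pre-softmax scores
    student_attn   / teacher_attn:   (batch, tgt_len, src_len) attention weights
    attn_weight:                     assumed trade-off hyperparameter
    """
    # Word-generation term: KL between teacher and student output distributions.
    gen_loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    # Attention term: make the student's attention over source positions match
    # the teacher's (here via KL; assumes the two attention matrices are already
    # aligned to the same source length, which the paper handles explicitly).
    attn_loss = F.kl_div(
        torch.log(student_attn + 1e-9),
        teacher_attn,
        reduction="batchmean",
    )
    return gen_loss + attn_weight * attn_loss
```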
Results
Gigaword: ROUGE-1 30.1, ROUGE-2 12.2, ROUGE-L 27.7
DUC-2004: ROUGE-1 26.0, ROUGE-2 8.0, ROUGE-L 23.1
Update