Today we're looking at the work done within the group which was reported in EMNLP2018: "A Deep Neural Network Sentence Level Classification Method with Context Information", authored by Xingyi Song, Johann Petrak and Angus Roberts, all of the University of Sheffield.
Xingyi, S., Petrak, J. & Roberts, A. A Deep Neural Network Sentence Level Classification Method with Context Information. in EMNLP2018 – 2018 Conference on Empirical Methods in Natural Language Processing 00, 0-000 (2018).
Understanding complex bodies of text is a difficult task,
especially those in which the context of a statement can greatly influence its
meaning. While methods exist that examine the context surrounding a phrase, the authors present a new approach that makes use of much larger contexts than these. This allows for greater confidence in the results of such a method, especially when dealing with complicated subject matter. Medical records are one such area in which complex judgements on
appropriate treatments are made across several sentences. It is vital therefore
to fully understand the context of each individual statement to be able to
collate meaning and accurately understand the sentiment of the entire body of
text and the conclusion that should be drawn from it
Although grounded in its use in the medical domain, this new technique can be demonstrated to be more widely applicable. An evaluation of the technique in non-medical domains showed
a solid improvement of over six percentage points over its nearest competitor
technique despite requiring 33% less training time.
But how does it work? At its core, this novel method analyses not only the target sentence but also an amount of text on either side of it. This context is encoded using an adapted Fixed-size Ordinally Forgetting Encoding (FOFE), turning it from a variable length context into a fixed length embedding. This is processed along with the target, before being concatenated and post-processed to produce an output.
Experimentation on this new technique was then performed, in comparison to peer techniques. These results showed markedly improved performance compared to LSTM-CNN methods, despite taking almost the same amount of time. The performance of this new Context-LSTM-CNN technique even surpassed an L-LSTM-CNN method despite a substantial reduction in required time.
Average test accuracy and training time. Best values are marked as bold, standard deviations in parentheses |
In conclusion, a new technique is presented, Context-LSTM-CNN, that combines the strength
of LSTM and CNN with the lightweight context
encoding algorithm, FOFE. The model shows a
consistent improvement over either a non-context
based model and a LSTM context encoded model,
for the sentence classification task.