2017, Article in monograph or in proceedings (Proceedings of the 13th International Conference on Semantic Systems) In this short paper, we address the interpretability of hidden layer representations in deep text mining: deep neural networks applied to text mining tasks. Following earlier work predating deep learning methods, we exploit the internal neural network activation (latent) space as a source for performing k-nearest neighbor search, looking for representative, explanatory training data examples with similar neural layer activations as test inputs. We deploy an additional semantic document similarity metric for establishing document similarity between the textual representations of these nearest neighbors and the test inputs. We argue that the statistical analysis of the output of this measure provides insight to engineers training the networks, and that nearest neighbor search in latent space combined with semantic document similarity measures offers a mechanism for presenting explanatory, intelligible examples to users.
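The retrieval idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy activation vectors, the Euclidean distance, and the Jaccard token overlap (used here as a simple stand-in for the unspecified semantic document similarity metric) are all assumptions made for illustration.

```python
import math


def nearest_in_latent_space(train_acts, test_act, k=3):
    """Return indices of the k training activations closest to test_act.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    ranked = sorted(
        range(len(train_acts)),
        key=lambda i: math.dist(train_acts[i], test_act),
    )
    return ranked[:k]


def jaccard_similarity(doc_a, doc_b):
    """Token-overlap stand-in for a semantic document similarity metric."""
    a = set(doc_a.lower().split())
    b = set(doc_b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical hidden-layer activations and the training texts they came from.
train_acts = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
train_docs = [
    "the drug reduced pain",
    "the drug caused nausea",
    "weather was sunny",
]

# Hypothetical activation and text for one test input.
test_act = [0.9, 1.2]
test_doc = "the drug caused mild nausea"

# Step 1: find training examples with similar layer activations.
neighbors = nearest_in_latent_space(train_acts, test_act, k=2)

# Step 2: check how similar the neighbors' texts are to the test text.
scores = [jaccard_similarity(train_docs[i], test_doc) for i in neighbors]

for idx, score in zip(neighbors, scores):
    print(f"neighbor {idx}: {train_docs[idx]!r} (similarity {score:.2f})")
```

In a real setting the activations would be read from a chosen hidden layer of the trained network, and the distribution of similarity scores over many test inputs would be the quantity to analyze.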
2017, Article in monograph or in proceedings (2nd International Workshop on Extraction and Processing of Rich Semantics from Medical Texts) We present a multilingual, open source system for cancer forum thread analysis, equipped with a biomedical entity tagger and a module for textual summarization. This system allows users to investigate textual co-occurrences of biomedical entities in forum posts, and to browse through summaries of long discussions. It is applied to a number of online cancer patient fora, including a gastro-intestinal cancer forum and a breast cancer forum. We propose that the system can serve as an extra source of information for medical hypothesis formulation, and as a facility for boosting patient empowerment.
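Entity co-occurrence analysis of the kind mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the tiny lexicon, the dictionary-lookup tagger, and the example posts are hypothetical, and the actual system's biomedical entity tagger is not described here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity lexicon; a real tagger would be far richer.
ENTITY_LEXICON = {"tamoxifen", "nausea", "fatigue", "chemotherapy"}


def tag_entities(post):
    """Return the set of lexicon entities mentioned in a post (naive lookup)."""
    tokens = {token.strip(".,;:!?").lower() for token in post.split()}
    return tokens & ENTITY_LEXICON


def cooccurrence_counts(posts):
    """Count how many posts mention each unordered pair of entities."""
    counts = Counter()
    for post in posts:
        # Sorting makes each pair appear in one canonical order.
        for pair in combinations(sorted(tag_entities(post)), 2):
            counts[pair] += 1
    return counts


# Invented example posts, for illustration only.
posts = [
    "Tamoxifen gave me nausea and fatigue.",
    "Fatigue after chemotherapy is common.",
    "Anyone else get nausea on tamoxifen?",
]

counts = cooccurrence_counts(posts)
for pair, n in counts.most_common():
    print(pair, n)
```

Counting at the post level keeps the sketch simple; counting per sentence or per thread would be an equally plausible design choice.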