Dmytro Lande, Zijiang Yang, Shiwei Zhu, Jianping Guo, Moji Wei
Information Technology and Security. Proceedings of the XVIII International Scientific and Practical Conference ITB-2018. Kyiv: LLC "Engineering", 2018, pp. 255-271.
Chinese legal information automatic summarization
This article presents a method for automatic summarization of legal texts written in Chinese. It considers the structure of the summary and the model used to build it, and proposes two approaches. The first determines sentence importance from the weights of individual characters (hieroglyphs), rather than words, in the documents and their summaries. The second models the document as a network of sentences and identifies the most important sentences from the parameters of this network. Several automatic summarization methods are implemented and tested. A cosine measure and Jensen-Shannon divergence serve as two estimates of summary quality that require no expert involvement. Compared with the other summarization methods, the one based on the proposed network model performed best on both the cosine measure and Jensen-Shannon divergence for summaries longer than two sentences. With minimal modification, the proposed approach can be applied to scientific, technical, or news texts on any subject.
Keywords: Automatic text summarization, Legal information, Chinese language, Cosine measure, Jensen-Shannon divergence
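
The two expert-free quality estimates named in the abstract can be illustrated with a minimal Python sketch that compares the character-level distribution of a summary with that of its source document. The abstract does not give the exact weighting scheme, so the details here are assumptions: the function names (`char_distribution`, `cosine_measure`, `js_divergence`) are hypothetical, character weights are plain relative frequencies, only characters in the basic CJK Unified Ideographs range are counted, and the divergence uses base-2 logarithms.

```python
import math
from collections import Counter


def char_distribution(text):
    """Relative frequency of each Chinese character in `text`.

    Assumption: only the basic CJK Unified Ideographs block (U+4E00..U+9FFF)
    is counted; punctuation, digits, and Latin letters are ignored.
    """
    chars = [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    total = len(chars)
    if total == 0:
        return {}
    return {ch: n / total for ch, n in Counter(chars).items()}


def cosine_measure(p, q):
    """Cosine of the angle between two sparse frequency vectors (0..1)."""
    dot = sum(v * q.get(ch, 0.0) for ch, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with no characters in common."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2.
    m = {ch: 0.5 * (p.get(ch, 0.0) + q.get(ch, 0.0)) for ch in keys}

    def kl(a):
        # Kullback-Leibler divergence KL(a || m); zero-probability terms vanish.
        return sum(v * math.log2(v / m[ch]) for ch, v in a.items() if v > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Illustrative (made-up) document and candidate summary.
document = "法院依法审理合同纠纷案件。被告应当承担违约责任。"
summary = "被告应当承担违约责任。"

doc_dist = char_distribution(document)
sum_dist = char_distribution(summary)
print("cosine measure:", cosine_measure(doc_dist, sum_dist))
print("JS divergence: ", js_divergence(doc_dist, sum_dist))
```

Under these assumptions, a better summary has a higher cosine measure and a lower Jensen-Shannon divergence with respect to the source document, so candidate summaries produced by different methods can be ranked without expert judgment.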