應用門控機制與多層卷積深度學習模型於中文命名實體辨識之研究

Multi-Stack Convolution with Gating Mechanism for Chinese Named Entity Recognition

Author: 張智皓

Publish Year: 2019-07

Update by: March 27, 2025

Abstract

Traditional machine-learning approaches to Chinese Named Entity Recognition usually rely on large numbers of hand-crafted features, and even on entity-specific dictionaries compiled by experts, and then use linear regression and statistical models to capture important features and Chinese semantic rules. Two obvious flaws can be observed. First, extracting features from Chinese text is extremely time-consuming and complicated. Second, the usefulness of such models depends entirely on how well the hand-crafted features support recognition; as a result, accuracy is difficult to improve in the face of the semantic ambiguity characteristic of Chinese and of out-of-vocabulary words. English uses spaces to delimit words, whereas Chinese has no comparable word delimiter. Moreover, Chinese words are highly interdependent and take on different meanings (homographs, polysemy) depending on the context. Recognizing Chinese named entities in large corpora is therefore both a great challenge and an opportunity. To address the challenge and the flaws described above, this study uses a deep learning architecture for Chinese Named Entity Recognition. First, the deep learning model is combined with unsupervised learning to pre-train embeddings for a large vocabulary of words. The vocabulary is then used to convert words into numerical form, and multi-stack convolution extracts textual features from them. A gating mechanism is incorporated between layers to generalize features, so that features are extracted automatically without feature engineering. The aim is to reduce the dependence of Named Entity Recognition on hand-crafted features, including hand-crafted features specific to Chinese. The method can be applied effectively to different types of entities. The training data consist of documents from SIGHAN Bakeoff-3 and internet articles collected with custom crawler programs.
Electronic newspaper articles are used as test data and serve as the benchmark for evaluating the performance of different models. The results show that the proposed model achieves an overall F1-measure of 90.76% on SIGHAN and 90.42% on the electronic newspaper articles.
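
To make the architecture described in the abstract more concrete, the following is a minimal sketch of a stack of gated convolution layers applied to a sequence of embedded characters. It is an illustration under stated assumptions, not the thesis implementation: the gate is assumed to be a gated linear unit of the form (X*W + b) ⊙ σ(X*V + c), the weights are random rather than trained, and all names and sizes (`gated_conv_layer`, `tag_scores`, three layers, kernel width 3, seven tags) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conv1d_same(x, w, b):
    """1-D convolution over the sequence axis with zero padding.

    x: list of T vectors (d_in floats each)
    w: nested list of shape k x d_in x d_out, b: list of d_out floats
    Returns T vectors of d_out floats, so the sequence length is preserved.
    """
    k, d_in, d_out = len(w), len(w[0]), len(b)
    left = (k - 1) // 2
    zero = [0.0] * d_in
    padded = [zero] * left + x + [zero] * (k - 1 - left)
    out = []
    for t in range(len(x)):
        row = list(b)
        for i in range(k):
            for j in range(d_in):
                xv = padded[t + i][j]
                for o in range(d_out):
                    row[o] += xv * w[i][j][o]
        out.append(row)
    return out


def gated_conv_layer(x, layer):
    """Assumed gating: one convolution gives candidate features, a second
    convolution passed through a sigmoid decides how much of each passes."""
    w, b, v, c = layer
    features = conv1d_same(x, w, b)
    gates = conv1d_same(x, v, c)
    return [[f * sigmoid(g) for f, g in zip(fr, gr)]
            for fr, gr in zip(features, gates)]


def random_tensor(rng, *shape):
    """Nested lists of small Gaussian values (stand-in for trained weights)."""
    if len(shape) == 1:
        return [rng.gauss(0.0, 0.1) for _ in range(shape[0])]
    return [random_tensor(rng, *shape[1:]) for _ in range(shape[0])]


def init_layer(rng, k, d_in, d_out):
    return (random_tensor(rng, k, d_in, d_out), [0.0] * d_out,
            random_tensor(rng, k, d_in, d_out), [0.0] * d_out)


def tag_scores(char_ids, embeddings, layers, w_out):
    """Embed characters, apply the gated convolution stack, project to tags."""
    h = [embeddings[i] for i in char_ids]
    for layer in layers:
        h = gated_conv_layer(h, layer)
    # A width-1 convolution is a per-character linear projection.
    return conv1d_same(h, [w_out], [0.0] * len(w_out[0]))


# Hypothetical sizes chosen only for the demonstration.
rng = random.Random(0)
vocab_size, emb_dim, hidden, n_tags, kernel = 50, 8, 16, 7, 3
embeddings = random_tensor(rng, vocab_size, emb_dim)  # stands in for pre-trained embeddings
layers = [init_layer(rng, kernel, emb_dim, hidden),
          init_layer(rng, kernel, hidden, hidden),
          init_layer(rng, kernel, hidden, hidden)]
w_out = random_tensor(rng, hidden, n_tags)

char_ids = [3, 17, 4, 29, 8]  # a five-character sentence, already numeralized
scores = tag_scores(char_ids, embeddings, layers, w_out)
predicted = [max(range(n_tags), key=row.__getitem__) for row in scores]
```

Each character receives one score per tag, and `predicted` holds the index of the highest-scoring tag at each position. Because the sigmoid gate lies strictly between 0 and 1, each layer can only attenuate its candidate features, which is one way a gate can act as a learned feature selector between layers.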