ISSN 0201-7385. ISSN 2074-6636
Developing a research corpus in China: from a frequency dictionary to a linguistic corpus


In China, frequency dictionaries were published to popularize “civic education.” At the end of the 20th century, computers reduced the time needed to collect and organize language material of various types, and in the same period a machine-readable corpus of Chinese came into being. Today, China has three large corpora, each containing more than 100 million items of authentic language material. The ability to work with a corpus is considered one of the most important skills for language research.



Received: 04/01/2019

Accepted: 05/01/2019

Accepted date: 30.06.2019

Keywords: corpus, frequency dictionary, computational linguistics

Available online since: 30.03.2019

Issue 2, 2019