statistics / hellog～英語史ブログ

Last updated: 2026-07-15 01:27

2017-11-23 Thu

■ #3132. Cryptology and Linguistics (2) [cryptology][linguistics][statistics][chaos_theory][information_theory]

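Recently I have been reading a little about linguistics and chaos theory (chaos_theory), and I learned that the mathematician Benoît Mandelbrot (1924--2010), known for the fractal figure called the "Mandelbrot set," also wrote on information theory and linguistics (on the Mandelbrot set, see "#3123. Chaos and Fractals" ([2017-11-14-1])).
In that essay Mandelbrot also touches on the point of contact between cryptology and linguistics. This blog considered the close relationship between the two fields in "#2699. Cryptology and Linguistics" ([2016-09-16-1]), so I would like to take up the topic again here. What I wrote in the second paragraph of that post agrees well with the following passage from Mandelbrot (552).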

. . . let us grant for the moment that the encoding and decoding machines may be as complicated as the designer may wish, and that the memory of the human links---using the common sense of the word "memory"---is unbounded. Under those ideal circumstances, it is obvious that any improvement of our understanding of the structure of language and of discourse will bring a possibility of improvement of the performance of the cryptographer or stenographer. For example, a knowledge of the rules of grammar will show that a given phrase will never be encountered in grammatically correct discourse; thus, if his employer were to speak only grammatical English, a stenographer would not need any special set of signs to designate the incorrect sentences. Similarly, a knowledge of the statistics of discourse will suggest that the "cliché" be represented by special short signs; in this way, the stenogram will be shortened and---since deciphering is very much helped by cliché---the code will be strengthened. That is, the ideal cryptographer and stenographer should make the utmost use of any available linguistic information.

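The more a cipher maker knows about the properties of language, the better the ciphertext can be designed to outwit those properties; conversely, the more a codebreaker knows about the properties of language, the less likely the codebreaker is to be outwitted in that way. In this sense, cryptology can be called linguistics played out on a different stage.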
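Mandelbrot's remark that clichés should be "represented by special short signs" can be illustrated with a small sketch. The code below uses Huffman coding, which is my own choice of technique and is not named in the quoted passage; the phrases and their counts are invented for illustration, and `huffman_code` is a hypothetical helper. It shows only the shortening effect (frequent phrases receive shorter codes), not the claimed strengthening of the cipher.

```python
import heapq
from collections import Counter


def huffman_code(freqs):
    """Build a prefix code (symbol -> bit string) from a frequency table."""
    if len(freqs) == 1:
        # A lone symbol still needs a one-bit code.
        return {next(iter(freqs)): "0"}
    # Heap entries: (total weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical dictation: each item is one phrase uttered by the "employer".
phrases = (
    ["yours sincerely"] * 8
    + ["with reference to your letter"] * 4
    + ["please find enclosed"] * 2
    + ["the shipment was delayed by fog"] * 1
    + ["kindly remit the balance"] * 1
)
freqs = Counter(phrases)
code = huffman_code(freqs)

# Compare against a fixed-length code: 5 distinct phrases need 3 bits each.
total_bits = sum(len(code[p]) * n for p, n in freqs.items())
fixed_bits = 3 * len(phrases)

for phrase, n in freqs.most_common():
    print(f"{n:2d}  {code[phrase]:<5} {phrase}")
print(total_bits, "bits with the frequency-based code;", fixed_bits, "bits at fixed length")
```

With these made-up counts, the most frequent phrase receives a one-bit sign and the whole "stenogram" shrinks from 48 bits to 30. The better the statistics of discourse are known, the closer such a code can come to the ideal Mandelbrot describes.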

　・ Mandelbrot, Benoît. "Information Theory and Psycholinguistics." Scientific Psychology: Principles and Approaches. Ed. Benjamin B. Wolman and Ernest Nagel. New York: Basic Books, 1965. 550--62.


Rank	Language	Primary Country	Countries	Speakers in millions (20th ed., 2017)	Speakers in millions (16th ed., 2009)	Speakers in millions (13th ed., 1996)
1	Chinese	China	37	1,284	1,213	1,123
2	Spanish	Spain	31	437	329	266
3	English	United Kingdom	106	372	328	322
4	Arabic	Saudi Arabia	57	295	221	202
5	Hindi	India	5	260	182 (242.6 with Urdu)	(236 with Urdu)
6	Bengali	Bangladesh	4	242	181	189
7	Portuguese	Portugal	13	219	178	170
8	Russian	Russian Federation	19	154	144	288
9	Japanese	Japan	2	128	122	125
10	Lahnda	Pakistan	6	119	78.3
11	Javanese	Indonesia	3	84.4	84.6
12	Korean	Korea	7	77.2	66.3
13	German	Germany	27	76.8	90.3	98
14	French	France	53	76.1	67.8	72
15	Telugu	India	2	74.2	69.8
16	Marathi	India	1	71.8	68.1
17	Turkish	Turkey	8	71.1	50.8
18	Urdu	Pakistan	6	69.1	60.6
19	Vietnamese	Viet Nam	3	68.1	68.6
20	Tamil	India	7	68.0	65.7
21	Italian	Italy	13	63.4	61.7	63
22	Persian	Iran	30	61.9
23	Malay	Malaysia	16	60.8	39.1	47

	GSL	CELEX2
1%	47.05%	69.36%
0.1%	14.60%	43.57%

■ #3132. Cryptology and Linguistics (2) [cryptology][linguistics][statistics][chaos_theory][information_theory]

■ #3062. Samuel Pepys's Record of the Plague of 1665 [black_death][pepys][literature][history][demography][statistics]

■ #3041. The Rise and Fall of the Semicolon in the Modern Period [punctuation][statistics]

■ #3009. The World's Top 25 Languages by Number of Native Speakers (2017 Edition) [statistics][world_languages][demography][japanese]

■ #2966. The Cosmopolitan Vocabulary of English (2) [lexicology][loan_word][borrowing][statistics][link]

■ #2876. Inequality in the Frequency Distribution of English Vocabulary: The Share of the Top 1% [lexicology][statistics][frequency][corpus]

■ #2875. Viewing Inequality in the Frequency Distribution of English Vocabulary through the Gini Coefficient and the Lorenz Curve [lexicology][statistics][frequency][zipfs_law][corpus]

■ #2783. What Is the Most "Popular" Language in the World? [world_languages][demography][statistics]

■ #2705. A Caesar Cipher Machine (hellog Version) [cryptology][grammatology][cgi][web_service][statistics]

■ #2699. Cryptology and Linguistics [cryptology][linguistics][statistics][grammatology]

■ #2693. Statistics on Old Norse Loanwords [lexicology][statistics][old_norse][french][loan_word][contact]

■ #2690. N-gram Tool [cgi][n-gram][statistics][corpus][web_service][frequency][cgi]

■ #2667. 10--15% of the Vocabulary Chaucer Used Consists of French Loanwords [chaucer][french][loan_word][statistics][popular_passage][me_text]

■ #2661. The 200 Words Swadesh (1952) Selected for Glottochronology [glottochronology][lexicology][frequency][statistics]

■ #2660. Glottochronology and Basic Vocabulary [glottochronology][lexicology][statistics][history_of_linguistics][frequency][anthropology]

■ #2659. Glottochronology and Lexicostatistics [glottochronology][lexicology][statistics][terminology][speed_of_change][frequency]

■ #2646. Statistics on Dutch Loanwords [loan_word][borrowing][dutch][flemish][afrikaans][statistics]

■ #2621. German's Substantial Contribution to English Dates from the 19th Century [loan_word][borrowing][statistics][german]

■ #2618. How Many Languages Have No Writing System? (2) [world_languages][writing][statistics][language_planning][language_myth][medium]

■ #2615. The Cosmopolitan Vocabulary of English [lexicology][loan_word][borrowing][statistics][link]