frequency / hellog～英語史ブログ

Last updated: 2026-02-23 11:18

2025-01-18 Sat

■ #5745. Letter frequencies of the alphabet [corpus][link][alphabet][frequency][statistics][letter_frequency][bnc][morse_code]

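Which letters of the alphabet, from A to Z, are the most frequent, and which the least? On this topic of letter frequency (letter_frequency), rankings have been compiled for various languages and corpora, as shown by the link to Letter Frequencies (rankings for various languages) given at the bottom of "#308. Lists of the most frequent words in Present-Day English" ([2010-03-01-1]). For example, the BNC yields the ranking "etaoinsrhldcumfpgwybvkxjqz".
Crystal (277) gives a letter-frequency table based on a corpus consisting of the full text of The Cambridge Encyclopedia (1st ed.), 1.5 million words. Besides the cumulative frequency ranking (Cumulative), it also shows the frequencies for each of five themes (literature, religion, politics, physics, and chemistry) and for Morse code (morse_code). In the graph below, the letters are arranged along the X axis in cumulative frequency order (= "eatinorslhdcmufpgbywvkxjq"), and the Y axis shows each letter's share within each theme as a percentage (for the frequency table, see the source HTML).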

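Comparing each theme against the cumulative frequency order, politics is the most standard. Literature and religion follow. Moving progressively further from the standard are chemistry, physics, and then Morse code.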
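Looking at individual letters, there are many points of interest. What does it mean that, relatively speaking, religion has many <h>s (holy?) and few <l>s, and that literature has many <w>s? Physics and chemistry may show letter frequencies slightly different from the rest because they contain many words of Latin and Greek origin. The artificial Morse code can be seen to trace a line visibly different from those of the other themes.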
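
As a rough illustration of how a ranking string like the BNC one quoted above, or the per-theme percentages plotted in the graph, could be derived from raw text, here is a minimal Python sketch. It is not the procedure used by Crystal or by the BNC-based list; the function names `letter_ranking` and `letter_percentages` are hypothetical, and the sketch assumes that only the 26 unaccented letters a-z are counted, with upper and lower case merged.

```python
from collections import Counter


def _letter_counts(text: str) -> Counter:
    """Count the letters a-z in `text`, ignoring case and everything else."""
    # Assumption: only unaccented a-z count as letters; digits, punctuation,
    # spaces and accented characters are skipped.
    return Counter(ch for ch in text.lower() if "a" <= ch <= "z")


def letter_ranking(text: str) -> str:
    """Return the letters occurring in `text`, most frequent first."""
    counts = _letter_counts(text)
    # Sort by descending count; ties are broken alphabetically so that the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "".join(letter for letter, _ in ranked)


def letter_percentages(text: str) -> dict[str, float]:
    """Return each letter's share of all counted letters, as a percentage."""
    counts = _letter_counts(text)
    total = sum(counts.values())
    # An empty text yields an empty dict, so no division by zero occurs.
    return {letter: 100 * n / total for letter, n in counts.items()}


print(letter_ranking("Hello, World!"))  # prints "lodehrw"
```

Applied to a subcorpus for each theme, `letter_percentages` would give one line per theme of the kind plotted in the graph. Letters that never occur in the text are simply absent from the ranking, so a short sample will not produce all 26 letters.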

　・ Crystal, D. The Cambridge Encyclopedia of the English Language. 3rd ed. CUP, 2018.


Period	Type: be	Type: have	Token: be	Token: have
OE	16% (11)	84% (57)	21% (18)	79% (85)
EME	11% (12)	89% (92)	24% (69)	76% (214)
LME	11% (9)	89% (70)	11% (12)	89% (96)
EModE	8% (10)	92% (115)	4% (13)	96% (319)
19th C	3% (8)	97% (311)	4% (38)	96% (839)

Commas	47
Full stops	45
Dashes	2
Parentheses	2
Semi-colons	2
Question marks	1
Colons	1
Exclamation marks	1

	Phonological route	Lexical route
Converts written units	To phonemes	To meanings
Also known as	Assembled phonology	Addressed phonology
Needs	Mental rules	Mental lexicon of items
Works by	Correspondence rules	Matching
Can handle	Any novel combination	Only familiar symbols
Used with	Any words	High frequency words

14th	15th	16th	17th	18th	19th	20th
knowe	suppose	know	know	think	think	think
witen	trust	think	think	believe	suppose	know
thinke (p)	trow	trow	find	suppose	know	suppose
seme	understand	trust/wot	believe	know	believe	believe
wene	wot	believe	suppose	guess	guess	guess
trowe	hope	wene	fancy
thinke (i)	know	suppose	guess
understonde	deme/think/wene	guess	trust
deme	deme
mene	doubt
trust	believe
hope	guess
gessen
leve
undertake
suppose
beleven

	Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
Top_100	1.0	2.0	3.0	3.1	4.0	5.0
Top_200	1.00	3.00	4.00	3.77	4.00	10.00
Top_500	1.000	4.000	4.000	4.498	5.000	10.000
Top_1K	1.000	4.000	5.000	4.968	6.000	15.000
Top_2K	1.000	4.000	5.000	5.406	7.000	15.000
Top_5K	1.000	5.000	6.000	6.014	7.000	16.000
Top_10K	1.000	5.000	6.000	6.488	8.000	16.000
Top_20K	1.000	5.000	7.000	6.954	8.000	17.000
Top_50K	1.000	6.000	7.000	7.622	9.000	20.000

■ #5745. Letter frequencies of the alphabet [corpus][link][alphabet][frequency][statistics][letter_frequency][bnc][morse_code]

■ #5557. Minoji Akimoto, 『増補 文法化とイディオム化』 (Grammaticalisation and Idiomatisation, enlarged ed.) (Hituzi Syobo, 2014) [toc][grammaticalisation][idiomatisation][idiom][composite_predicate][syntax][lexicology][frequency][lexicalisation][preposition][phrasal_verb][voicy][heldio]

■ #5379. Blends are a significant phenomenon for morphological and prosodic theory as well [blend][morphology][frequency][word_formation][prosody][analogy]

■ #5312. The latest episode of 「ゆる言語学ラジオ」 (Yuru Gengogaku Radio): "Why do irregular verbs exist?" [yurugengogakuradio][verb][inflection][conjugation][sobokunagimon][frequency][voicy][heldio][youtube][link][notice][numeral][suppletion][analogy]

■ #4752. which vs that --- the register lurking behind the choice of relative pronoun [relative_pronoun][frequency][corpus][youtube][syntax][genre][ame_bre]

■ #4679. Clustering phenomena and fluctuation in language [complex_system][computational_linguistics][statistics][frequency][1/f][terminology][keyword]

■ #4678. Clustering phenomena and long-range correlation in language [complex_system][computational_linguistics][statistics][frequency][information_structure][article][terminology]

■ #4479. Past-tense forms of irregular verbs are stored directly in memory [frequency][suppletion][verb][inflection][be][preterite]

■ #4478. The history of the decline of the be perfect, seen through frequency [perfect][be][verb][aspect][tense][auxiliary_verb][frequency]

■ #4273. the --- the most important word in English [article][frequency][hellog_entry_set][definiteness]

■ #4245. Frequency and the asymptotic hyperbolic curve (A-curve) [lexical_diffusion][zipfs_law][frequency][statistics][language_change][uniformitarian_principle]

■ #3891. Frequency of use of various punctuation marks in Present-Day English [punctuation][alphabet][diacritical_mark][net_speak][brown][corpus][frequency][statistics][exclamation_mark]

■ #3884. Contrasting the "two routes" of decoding written symbols [spelling][grammatology][alphabet][reading][writing][psycholinguistics][kanji][frequency]

■ #3859. Why are there irregular phenomena in language? [sobokunagimon][frequency][suppletion]

■ #3662. "Recency Illusion" and "Frequency Illusion" [language_myth][language_change][frequency]

■ #3562. Productivity of the optative may construction [optative][productivity][frequency][bnc][auxiliary_verb][may]

■ #3512. Diachronic change in the types and frequency of epistemic verbs [frequency][verb][comment_clause][semantic_field]

■ #3254. The reducing effect and the conserving effect of high frequency [frequency][grammaticalisation][auxiliary_verb][suppletion][zipfs_law]

■ #3180. French and Latin loanwords that have gradually joined the ranks of high-frequency words [french][latin][loan_word][borrowing][frequency][statistics][lexicology][hc][bnc]

■ #3174. High-frequency words have shorter spellings (2) [frequency][spelling][orthography][zipfs_law][statistics][lexicology][corpus]
