Kloka Maja och andra berättelser by Frans Hedberg: Difficulty Assessment for Swedish Learners

How difficult is Kloka Maja och andra berättelser for Swedish learners? We ran multiple tests on its full text (freely available here) of approximately 41,165 words, crunched the numbers, and present the results below.

Difficulty Assessment Summary

We estimate Kloka Maja och andra berättelser to have an overall difficulty score of 58. Here are its scores:

Measure Score (1 = easy, 100 = difficult)
Overall Difficulty 58
Vocabulary Difficulty 72
Grammatical Difficulty 44

Vocabulary Difficulty: Breakdown

Vocabulary difficulty: 72%

This score is calculated from frequency vocabulary (the most frequently used words in Swedish). It combines several measures of the text of Kloka Maja och andra berättelser: a plain vocabulary score, a frequency-weighted vocabulary score, banded frequency vocabulary scores based on how much of the text's vocabulary falls within the top 1,000 or 2,000 most frequent words, etc. Here is a further breakdown of how often the most frequently used words in Swedish appear in the full text of Kloka Maja och andra berättelser:

[Chart: vocabulary difficulty breakdown for Kloka Maja och andra berättelser, a test for Swedish top frequency vocabulary]

We have also calculated the following approximate data on the vocabulary in Kloka Maja och andra berättelser:

Measure Score
Number of words 41,165
Number of unique words 6,606
Number of recognized words for names/places/other entities 1,583
Number of very rare non-entity words 1,723
Number of sentences 5,138
Average number of words/sentence 8

Some research suggests that you need to know about 98% of a text's vocabulary in order to infer the meaning of unknown words when reading. If true, this means that you would need to know around 6,473 words in Swedish (where each form of a word is counted as a separate unique word) to be able to read Kloka Maja och andra berättelser without a dictionary and fully understand it.
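
The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and rounding down is an assumption that happens to reproduce the figure of 6,473 given above.

```python
import math

def words_needed(unique_words: int, coverage: float = 0.98) -> int:
    """Unique word forms a reader must know to reach the given coverage.

    Rounding down is an assumption; it matches the figure quoted in the text.
    """
    return math.floor(unique_words * coverage)

# 6,606 unique words are reported for this text.
print(words_needed(6606))  # prints 6473
```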

Grammatical Difficulty: Breakdown

Grammatical difficulty: 44%

Here is a further breakdown of the grammatical measures for this text. You can find an explanation of all these scores below.

Measure Score
Automated Readability Index 2
Coleman-Liau Index 5
Type/Token Ratio (TTR) 0.160476
Root type/Token Ratio (RTTR) 0.00000389836
Corrected type/Token Ratio (CTTR) 0.00000194918
MTLD Index 57
HDD Index 63
Yule's I Index 67
Lexical Diversity Index (MTLD + HD-D + Yule's I) 62

The type-token ratio (TTR) of Kloka Maja och andra berättelser is 0.160476. The TTR is the most basic measure of lexical diversity. To calculate it, we divide the number of unique words by the number of words in the text. For this text, the number of unique words is 6,606 and the number of words is 41,165, so the TTR is 6,606 / 41,165 = 0.160476. However, the TTR is a very crude measure, as it depends heavily on text length. The longer the text, the lower the TTR usually is, since common words repeat often. Because this text contains far more than 1,000 words, its TTR is unlikely to be an accurate measure of lexical diversity.

The root type-token ratio (RTTR) and corrected type-token ratio (CTTR) are measures suggested by researchers to partially address the TTR's dependence on text length. The figures on this page divide the number of unique words by the square of the number of words for the RTTR (6,606 / (41,165 * 41,165) = 0.00000389836) and by twice that square for the CTTR (6,606 / (2 * (41,165 * 41,165)) = 0.00000194918). Note that the standard definitions use a square root rather than a square: RTTR = unique words / sqrt(words) and CTTR = unique words / sqrt(2 * words), which would give about 32.56 and 23.02 for this text. These measures are not as easy to read, and a growing body of research asserts that the CTTR and RTTR do not effectively address the problem of text length. Therefore, while we do provide the full text's TTR, RTTR and CTTR on this page, these figures do not form part of our final calculations.
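
As a minimal sketch, the three ratios can be computed in Python as follows. The function names are illustrative. The RTTR and CTTR here follow the standard square-root definitions (Guiraud's and Carroll's), which is an assumption about the intended measures and yields different values from those in the table above.

```python
import math

def ttr(unique_words: int, words: int) -> float:
    """Type-token ratio: unique words divided by all words."""
    return unique_words / words

def rttr(unique_words: int, words: int) -> float:
    """Root TTR (standard definition): unique words / sqrt(words)."""
    return unique_words / math.sqrt(words)

def cttr(unique_words: int, words: int) -> float:
    """Corrected TTR (standard definition): unique words / sqrt(2 * words)."""
    return unique_words / math.sqrt(2 * words)

# Counts reported for this text: 6,606 unique words out of 41,165 words.
print(round(ttr(6606, 41165), 6))   # prints 0.160476
print(round(rttr(6606, 41165), 2))  # prints 32.56
print(round(cttr(6606, 41165), 2))  # prints 23.02
```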

The Automated Readability Index (ARI) is one readability measure that has been developed by researchers over the years. The formula for calculating the ARI is as follows:
ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

The ARI is meant to approximate the grade level a reader needs in order to understand the text (assuming the reader is in formal education). For example, a value of 1 corresponds to kindergarten level, a value of 12 or 13 to the last year of school, and 14 to a college sophomore. The ARI of this text is 2, which makes it understandable for second-grade students at their expected level of education.
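
The following Python sketch shows one way to compute the ARI with the standard published formula, 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43. The tokenization is an assumption made for illustration (words are runs of letters or digits, sentences end at ".", "!" or "?"); the way this page actually splits words and sentences is not documented, so results may differ from the table.

```python
import re

def automated_readability_index(text: str) -> float:
    """Standard ARI formula applied with a simple, assumed tokenization."""
    words = re.findall(r"\w+", text)  # \w also matches letters such as å, ä, ö
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    characters = sum(len(word) for word in words)
    return (4.71 * (characters / len(words))
            + 0.5 * (len(words) / len(sentences))
            - 21.43)

# Very short, simple samples can score below 1.
sample = "Jag ser en katt. Den är svart."
print(round(automated_readability_index(sample), 2))
```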

The Coleman-Liau Index (CLI) is a similar index designed by Meri Coleman and T. L. Liau. It also estimates the grade level a reader needs (for example, college sophomore material would be around grade 14, or year 14 of formal education, while kindergarten or primary school material would be close to grade 1). The CLI is usually slightly higher than the ARI. The CLI is computed with this formula:
CLI = 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
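
A comparable Python sketch for the CLI uses the standard published formula, 0.0588 * L - 0.296 * S - 15.8, where L is the number of letters per 100 words and S the number of sentences per 100 words. As before, the word and sentence splitting is an assumption for illustration and not necessarily what this page uses.

```python
import re

def coleman_liau_index(text: str) -> float:
    """Standard Coleman-Liau formula with a simple, assumed tokenization."""
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    letters = sum(ch.isalpha() for ch in text)  # letters only, no digits
    letters_per_100 = 100 * letters / len(words)
    sentences_per_100 = 100 * len(sentences) / len(words)
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

sample = "Jag ser en katt. Den är svart."
print(round(coleman_liau_index(sample), 2))
```

Because neither formula counts syllables, both can be applied to Swedish text without a language-specific syllable model.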

Other readability indexes exist, such as the Flesch Reading Ease, the Flesch-Kincaid Grade Level and the Gunning fog index, but we have chosen not to include them: unlike the ARI and CLI, they are based on syllable counts and therefore arguably work only for English and not for Swedish.

We also compute a compound lexical diversity index, which should range from 1 to 100 (with an average value of around 50 and a standard deviation of around 10); it is 62 in the present case. The compound lexical diversity index is the average of the following indexes (also provided in the table above):

  • the Measure of Textual Lexical Diversity (MTLD) index: a measure based on computing the TTR over an increasingly large stretch of the text until the TTR drops below a threshold (around 0.7 in our case), at which point the TTR is reset and a counter is increased; at the end, the number of words in the text is divided by the counter; as a result, the MTLD does not vary significantly with text length;
  • the Yule's I index (the inverse of Yule's characteristic K): an index based on the work of the statistician G. U. Yule, who introduced his characteristic in his work "The Statistical Study of Literary Vocabulary"; Yule's I takes into account the number of words in the text and a summed measure of word frequencies;
  • the Hypergeometric Distribution D (HD-D) index (based on vocd): an index which assesses the contribution of each word to the diversity of the text; to calculate these contributions, a hypergeometric distribution is used to compute the probability of each word appearing in word samples drawn from the text; these probabilities are then divided by the sample size and added up.
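
The three indexes in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used for this page: the function names are hypothetical, the MTLD averages a forward and a backward pass and credits an unfinished stretch as a partial factor, Yule's I is taken as N² / (Σ f² − N) with N the number of words and f the frequency of each unique word, and the HD-D sample size of 42 is the conventional value. The functions return raw values, whereas the table above reports scores normalized to a 1 to 100 scale by a method this page does not describe.

```python
import math
from collections import Counter

def _mtld_one_pass(tokens, threshold):
    """Words per factor for a single pass over the tokens."""
    factors = 0.0
    seen, count = set(), 0
    for token in tokens:
        seen.add(token)
        count += 1
        if len(seen) / count < threshold:
            factors += 1            # the running TTR fell below the threshold
            seen, count = set(), 0  # reset the running TTR
    if count:
        # Credit the unfinished stretch as a partial factor.
        factors += (1 - len(seen) / count) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")

def mtld(tokens, threshold=0.7):
    """MTLD as the mean of a forward and a backward pass (threshold < 1)."""
    forward = _mtld_one_pass(tokens, threshold)
    backward = _mtld_one_pass(tokens[::-1], threshold)
    return (forward + backward) / 2

def yules_i(tokens):
    """Yule's I, taken here as the inverse of Yule's K without its scaling."""
    n = len(tokens)
    sum_sq = sum(f * f for f in Counter(tokens).values())
    return n * n / (sum_sq - n) if sum_sq > n else float("inf")

def hdd(tokens, sample_size=42):
    """HD-D: summed per-word probabilities of appearing in a random sample."""
    n = len(tokens)
    if n < sample_size:
        raise ValueError("text is shorter than the sample size")
    all_samples = math.comb(n, sample_size)
    score = 0.0
    for f in Counter(tokens).values():
        # Probability that the word occurs at least once in the sample.
        p_present = 1 - math.comb(n - f, sample_size) / all_samples
        score += p_present / sample_size
    return score

tokens = "det var en gång en gumma och en gubbe och en katt".split()
print(mtld(tokens), yules_i(tokens), hdd(tokens, sample_size=5))
```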

Our overall measure of grammatical difficulty is based on a combination of the compound lexical diversity index (which includes the MTLD, Yule's I and HD-D indexes), the ARI and the CLI, each normalized and given a certain weight. The score should normally range from 1 to 100. In this case, the score is 44.

Other Information about Kloka Maja och andra berättelser by Frans Hedberg

We provide a sample of the text below; the full text of Kloka Maja och andra berättelser is also available free of charge on our website.

Sample of text:

Hade inte det där stockholmsspektakle" kommi emellan, så hade han väl allt varit mästerlots länge sen — men si lotschefen var så arg, så han ville rakt köra borfen med en gång!» »Håhå! Det fick väl inte gå utan dom och rannsakning ändå, tänker jag! Nog ä’ di styfva af sig, ämbets-männema, men ser du, den allmänna meningen —» »Nej, hör du, bror Holm! Är det meningen att vi ska gräla igen! Annars tror jag det är klokast att vi håller oss ifrån poletiken! Och vill du som jag, så bryter ...

Top most frequently used words in Kloka Maja och andra berättelser by Frans Hedberg*

Position Word Repetitions Part of all words
1 och 1,581 3.84%
2 det 998 2.42%
3 att 782 1.9%
4 jag 690 1.68%
5 som 670 1.63%
6 644 1.56%
7 en 584 1.42%
8 han 526 1.28%
9 520 1.26%
10 inte 482 1.17%
11 med 456 1.11%
12 den 411 1%
13 för 409 0.99%
14 var 378 0.92%
15 sig 363 0.88%
16 hon 356 0.86%
17 om 323 0.78%
18 till 280 0.68%
19 af 275 0.67%
20 du 271 0.66%
21 de 260 0.63%
22 hade 222 0.54%
23 där 207 0.5%
24 ett 201 0.49%
25 när 200 0.49%
26 men 198 0.48%
27 nu 196 0.48%
28 173 0.42%
29 mig 169 0.41%
30 honom 151 0.37%
31 skulle 151 0.37%
32 har 147 0.36%
33 ut 145 0.35%
34 kan 144 0.35%
35 är 141 0.34%
36 min 141 0.34%
37 väl 121 0.29%
38 nog 117 0.28%
39 sin 111 0.27%
40 te 110 0.27%
41 henne 109 0.26%
42 ska 108 0.26%
43 ni 105 0.26%
44 åt 104 0.25%
45 än 98 0.24%
46 vara 96 0.23%
47 vi 93 0.23%
48 man 92 0.22%
49 vid 90 0.22%
50 far 90 0.22%
51 Ja 89 0.22%
52 öfver 87 0.21%
53 kunde 87 0.21%
54 86 0.21%
55 gumman 84 0.2%
56 från 84 0.2%
57 kom 83 0.2%
58 efter 82 0.2%
59 ha 79 0.19%
60 här 79 0.19%
61 svarade 77 0.19%
62 igen 76 0.18%
63 mycket 76 0.18%
64 såg 75 0.18%
65 upp 74 0.18%
66 di 73 0.18%
67 mor 72 0.17%
68 eller 71 0.17%
69 gick 71 0.17%
70 hvad 70 0.17%
71 blef 70 0.17%
72 se 69 0.17%
73 också 69 0.17%
74 ju 69 0.17%
75 fäll 68 0.17%
76 sedan 67 0.16%
77 in 65 0.16%
78 några 65 0.16%
79 hur 65 0.16%
80 själf 64 0.16%
81 dig 64 0.16%
82 utan 64 0.16%
83 alla 63 0.15%
84 sade 62 0.15%
85 får 61 0.15%
86 allt 61 0.15%
87 aldrig 60 0.15%
88 vet 60 0.15%
89 varit 59 0.14%
90 litet 58 0.14%
91 56 0.14%
92 hennes 55 0.13%
93 sitt 54 0.13%
94 något 54 0.13%
95 ville 54 0.13%
96 Magnus 53 0.13%
97 frågade 53 0.13%
98 sina 52 0.13%
99 dem 52 0.13%
100 gamla 52 0.13%
101 bara 52 0.13%
102 hem 51 0.12%
103 under 50 0.12%
104 me 50 0.12%
105 gång 49 0.12%
106 ännu 49 0.12%
107 vill 48 0.12%
108 fick 48 0.12%
109 ner 48 0.12%
110 andra 47 0.11%
111 fram 47 0.11%
112 va 47 0.11%
113 Maja 46 0.11%
114 säga 46 0.11%
115 Österman 46 0.11%
116 icke 45 0.11%
117 Edla 45 0.11%
118 bli 44 0.11%
119 ropade 44 0.11%
120 alltid 44 0.11%
121 er 43 0.1%
122 någon 43 0.1%
123 Lina 43 0.1%
124 si 43 0.1%
125 åf 43 0.1%
126 ute 42 0.1%
127 kommer 41 0.1%
128 bra 41 0.1%
129 någe 40 0.1%
130 hans 40 0.1%
131 lilla 39 0.09%
132 tog 39 0.09%
133 ta 38 0.09%
134 komma 38 0.09%
135 bort 38 0.09%
136 Jonsson 38 0.09%
137 tyckte 38 0.09%
138 Anders 38 0.09%
139 Nej 37 0.09%
140 därför 37 0.09%
141 unga 37 0.09%
142 oss 36 0.09%
143 fått 36 0.09%
144 emot 36 0.09%
145 gjorde 36 0.09%
146 ifrån 35 0.09%
147 både 35 0.09%
148 ändå 35 0.09%
149 borta 35 0.09%
150 mej 34 0.08%
151 satt 34 0.08%
152 låg 34 0.08%
153 par 34 0.08%
154 göra 33 0.08%
155 tror 33 0.08%
156 heller 33 0.08%
157 Augusta 33 0.08%
158 går 33 0.08%
159 Pålson 33 0.08%
160 ty 32 0.08%
161 gammal 32 0.08%
162 Lotta 32 0.08%
163 tillbaka 32 0.08%
164 hela 31 0.08%
165 medan 31 0.08%
166 kära 31 0.08%
167 nästan 31 0.08%
168 dej 31 0.08%
169 mera 30 0.07%
170 god 30 0.07%
171 hand 30 0.07%
172 mitt 30 0.07%
173 voro 30 0.07%
174 riktigt 30 0.07%
175 omkring 30 0.07%
176 Stina 30 0.07%
177 blir 30 0.07%
178 mer 30 0.07%
179 opp 29 0.07%
180 samma 29 0.07%
181 helt 29 0.07%
182 stora 29 0.07%
183 tänker 29 0.07%
184 just 29 0.07%
185 tycker 29 0.07%
186 lika 28 0.07%
187 ena 28 0.07%

* This list excludes punctuation and single-letter words, as well as some different-case repeats of the same word.
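
A frequency list of this kind can be sketched in Python as below. The function name and tokenization are assumptions for illustration. The sketch skips punctuation and single-letter words as described above, but it does not merge different-case repeats, since the exact rule used for this list is not stated. The share is computed against all words in the text, which matches the table (for example, 1,581 / 41,165 = 3.84% for "och").

```python
import re
from collections import Counter

def top_words(text: str, limit: int = 10):
    """Return (word, repetitions, percent of all words) for the top words."""
    tokens = re.findall(r"\w+", text)  # drops punctuation
    total = len(tokens)
    counts = Counter(t for t in tokens if len(t) > 1)  # skip single letters
    return [(word, count, round(100 * count / total, 2))
            for word, count in counts.most_common(limit)]

print(top_words("och det och i och det att", limit=2))
# prints [('och', 3, 42.86), ('det', 2, 28.57)]
```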

If you think the text would be accessible to you, you can read it on our site (click on the cover to access):

Cover of Kloka Maja och andra berättelser by Frans Hedberg

Other resources and languages

If you like this analysis, you should have a look at our lists of Swedish short stories and Swedish books.

If you like literature as a means of learning languages, please take a look at our project Interlinear Books. We even have a Swedish Interlinear book available for purchase.