Hennerson. Historien om en gårdskarl by Mikael Lybeck: Difficulty Assessment for Swedish Learners

How difficult is Hennerson. Historien om en gårdskarl for Swedish learners? We ran several tests on its full text (freely available here) of approximately 19,485 words, crunched the numbers, and present the results below.

Difficulty Assessment Summary

We estimate that Hennerson. Historien om en gårdskarl has an overall difficulty score of 69. Its scores are:

Measure Score (1 = easy, 100 = difficult)
Overall Difficulty 69
Vocabulary Difficulty 80
Grammatical Difficulty 58

Vocabulary Difficulty: Breakdown

Vocabulary difficulty: 80%

This score is based on frequency vocabulary (the most frequently used words in Swedish). It combines several measures of the text of Hennerson. Historien om en gårdskarl: a plain vocabulary score, a frequency-weighted vocabulary score, banded scores based on how much of the text's vocabulary falls within the top 1,000 or 2,000 most frequent words, and so on. Here is a further breakdown of how often the most frequently used Swedish words appear in the full text of Hennerson. Historien om en gårdskarl:

Vocabulary difficulty breakdown for Hennerson. Historien om en gårdskarl: a test for Swedish top frequency vocabulary

We have also calculated the following approximate data on the vocabulary in Hennerson. Historien om en gårdskarl:

Measure Score
Number of words 19,485
Number of unique words 5,122
Number of recognized words for names/places/other entities 961
Number of very rare non-entity words 1,008
Number of sentences 3,445
Average number of words/sentence 6

There is some research suggesting that you need to know about 98% of a text's vocabulary in order to infer the meaning of unknown words when reading. If true, this means that you would need to know around 5,019 of the 5,122 unique words in this text (with every form of a word counted as a separate unique word) to be able to read Hennerson. Historien om en gårdskarl without a dictionary and fully understand it.
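The two derived figures above (average sentence length and the 98% estimate) can be reproduced from the table with a few lines of Python. This is only a sketch of the arithmetic: the function names are illustrative, and rounding the estimate down is an assumption made to match the figure quoted on this page.

```python
import math

TOTAL_WORDS = 19_485   # "Number of words" in the table above
UNIQUE_WORDS = 5_122   # "Number of unique words"
SENTENCES = 3_445      # "Number of sentences"
COVERAGE = 0.98        # coverage threshold cited in the text


def words_needed(unique_words: int, coverage: float) -> int:
    """Unique word forms a reader should know (rounded down: an assumption)."""
    return math.floor(unique_words * coverage)


def words_per_sentence(total_words: int, sentences: int) -> float:
    """Average sentence length in words."""
    return total_words / sentences


print(words_needed(UNIQUE_WORDS, COVERAGE))
print(round(words_per_sentence(TOTAL_WORDS, SENTENCES)))
```

Note that the threshold is applied here to unique word forms, as this page does, rather than to running words.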

Grammatical Difficulty: Breakdown

Grammatical difficulty: 58%

Here is a further grammatical breakdown of this text. You can find an explanation of all these scores below.

Measure Score
Automated Readability Index 4
Coleman-Liau Index 7
Type/Token Ratio (TTR) 0.262869
Root type/Token Ratio (RTTR) 0.0000134908
Corrected type/Token Ratio (CTTR) 0.00000674542
MTLD Index 84
HDD Index 69
Yule's I Index 84
Lexical Diversity Index (MTLD + HD-D + Yule's I) 79

The type-token ratio (TTR) of Hennerson. Historien om en gårdskarl is 0.262869. The TTR is the most basic measure of lexical diversity. To calculate it, we divide the number of unique words by the number of words in the text. For this text, the number of unique words is 5,122 and the number of words is 19,485, so the TTR is 5,122 / 19,485 = 0.262869. However, the TTR is a very crude measure, as it is extremely dependent on text length. The longer the text, the lower the TTR usually is, since common words tend to repeat often. Because this text is far longer than 1,000 words, its TTR is unlikely to be an accurate measure.
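As a minimal sketch, the TTR calculation described above looks like this in Python. The function names are illustrative and not taken from this site's code; the token-based variant assumes the text has already been split into words.

```python
def ttr_from_counts(unique_words: int, total_words: int) -> float:
    """Type-token ratio from precomputed counts."""
    return unique_words / total_words


def ttr_from_tokens(tokens: list[str]) -> float:
    """Type-token ratio computed directly from a list of word tokens."""
    return len(set(tokens)) / len(tokens)


# Figures from the table above.
print(round(ttr_from_counts(5_122, 19_485), 6))

# A tiny example: 2 unique words among 5 tokens.
print(ttr_from_tokens("och det och det och".split()))
```

Doubling the tiny example to ten tokens would halve its TTR without adding any new vocabulary, which is the length dependence mentioned above.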

The root type-token ratio (RTTR) and corrected type-token ratio (CTTR) are measures suggested by researchers to partially address the TTR's dependence on text length. For the RTTR figure in the table, the number of unique words is divided by the square of the number of words (5,122 / (19,485 * 19,485) = 0.0000134908), while for the CTTR figure it is divided by twice the square of the number of words (5,122 / (2 * 19,485 * 19,485) = 0.00000674542). Note that the definitions usually given in the literature divide by the square root of the number of words (RTTR) and by the square root of twice the number of words (CTTR) instead, which for this text would give roughly 36.7 and 25.9. These measures are not as easily readable, and a growing body of research asserts that the CTTR and RTTR do not effectively address the problem of text length. Therefore, while we do provide the full text's TTR, RTTR and CTTR on this page, these figures do not form part of our final calculations.
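As a rough sketch of the arithmetic, the Python functions below compute both the squared variant that reproduces the figures in the table and the square-root variant commonly attributed to Guiraud (RTTR) and Carroll (CTTR). The function names are illustrative and not taken from this site's code.

```python
import math


def rttr_page(types: int, tokens: int) -> float:
    """Variant used for the table figure: types / tokens squared."""
    return types / tokens ** 2


def cttr_page(types: int, tokens: int) -> float:
    """Variant used for the table figure: types / (2 * tokens squared)."""
    return types / (2 * tokens ** 2)


def rttr_standard(types: int, tokens: int) -> float:
    """Root TTR as usually defined: types / sqrt(tokens)."""
    return types / math.sqrt(tokens)


def cttr_standard(types: int, tokens: int) -> float:
    """Corrected TTR as usually defined: types / sqrt(2 * tokens)."""
    return types / math.sqrt(2 * tokens)


TYPES, TOKENS = 5_122, 19_485  # unique words and total words from the table
for fn in (rttr_page, cttr_page, rttr_standard, cttr_standard):
    print(fn.__name__, fn(TYPES, TOKENS))
```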

The Automated Readability Index (ARI) is one readability measure that has been developed by researchers over the years. The formula for calculating the ARI is as follows:
ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

The ARI is meant to yield a reading level that approximately corresponds to the reader's grade level (assuming the reader undertakes formal education). For example, a value of 1 is kindergarten level, a value of 12 or 13 is the last year of school, and 14 is a sophomore at college. The ARI of this text is 4, which suggests that it is understandable to fourth-grade students at their expected level of education.
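A minimal Python sketch of the standard ARI formula follows. The tokenization (words as runs of letters or digits, sentences split on full stops, question marks and exclamation marks) is an assumption, and this page does not report the character count for the full text, so the sketch is run on two short sentences from the sample rather than reproducing the score of 4.

```python
import re


def count_units(text: str) -> tuple[int, int, int]:
    """Return (characters, words, sentences) under a simple tokenization."""
    words = re.findall(r"\w+", text)
    characters = sum(len(word) for word in words)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return characters, len(words), len(sentences)


def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Standard ARI: 4.71 * chars/word + 0.5 * words/sentence - 21.43."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


sample = "Bibeln väntade. Han satt vid bordet."
print(automated_readability_index(*count_units(sample)))
```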

The Coleman-Liau Index (CLI) is a similar index designed by Meri Coleman and T. L. Liau. It is also meant to compute the grade level of the reader (for example, sophomore-level material would be around grade 14, or year 14 of formal education, while kindergarten or primary school material would be close to grade 1). The CLI is usually slightly higher than the ARI. The CLI is computed with this formula:
CLI = 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
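The standard Coleman-Liau formula can be sketched in Python as below. The function name is illustrative, and the example counts (29 letters, 6 words, 2 sentences) are for the two short sentences "Bibeln väntade. Han satt vid bordet." from the sample text, not for the whole book.

```python
def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Standard CLI: 0.0588 * L - 0.296 * S - 15.8."""
    letters_per_100 = letters / words * 100      # L in the formula
    sentences_per_100 = sentences / words * 100  # S in the formula
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


print(coleman_liau_index(letters=29, words=6, sentences=2))
```

Like the ARI, the formula needs only letter, word and sentence counts, with no syllable counting.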

Other indexes exist, such as the Flesch Reading Ease and Flesch-Kincaid Grade Level tests, the Gunning fog index, and others. We have chosen not to include them because, unlike the ARI and CLI, they are based on a syllable count and therefore arguably only work for English and not Swedish.

We also compute a compound lexical diversity index, which should range from 1 to 100 (with an average value of around 50 and a standard deviation of around 10). It is 79 in the present case. The compound lexical diversity index is the average of the following indexes (also provided in the table above):

  • the Measure of Textual Lexical Diversity (MTLD) index - a measure based on computing the TTR for increasingly larger parts of the text until the TTR drops below a certain threshold (around 0.7 in our case), at which point the TTR is reset and a counter is increased; at the end, the number of words in the text is divided by the counter; as a result, the MTLD does not vary significantly with text length;
  • the Yule's I index (based on Yule's K characteristic inverted) - an index based on the work of the statistician G.U. Yule, who introduced his characteristic in his book "The Statistical Study of Literary Vocabulary"; Yule's I takes into account the number of words in the text and a compound summed measure of word frequency;
  • the Hypergeometric Distribution D (HD-D) index (based on vocd) - an index which assesses the contribution of each word to the diversity of the text; to calculate these contributions, a hypergeometric distribution is used to compute the probability of each word appearing in word samples extracted from the text; these probabilities are then divided by the sample size and added up.
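The sketch below illustrates two of the indexes in the list above, plus the final averaging step, in Python. It rests on several assumptions: MTLD is taken as the mean of a forward and a backward pass with a partial credit for the unfinished remainder (the commonly published procedure), Yule's I is taken as M1^2 / (M2 - M1) with M1 the number of tokens and M2 the sum of squared word frequencies, and the compound index is a plain average of already normalized scores. How this site normalizes raw values to the 1-100 scale is not stated, so that step is not shown, and HD-D is omitted.

```python
from collections import Counter


def _mtld_pass(tokens, threshold):
    """One directional pass: divide the word count by the number of factors."""
    factors = 0.0
    types, count = set(), 0
    for token in tokens:
        count += 1
        types.add(token)
        if len(types) / count < threshold:
            factors += 1              # TTR fell below the threshold
            types, count = set(), 0   # reset and start a new segment
    if count:                         # partial credit for the remainder
        ttr = len(types) / count
        factors += (1 - ttr) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")


def mtld(tokens, threshold=0.7):
    """Mean of forward and backward passes; 0.7 follows the text above."""
    tokens = list(tokens)
    forward = _mtld_pass(tokens, threshold)
    backward = _mtld_pass(tokens[::-1], threshold)
    return (forward + backward) / 2


def yules_i(tokens):
    """Yule's I under the assumed definition M1^2 / (M2 - M1)."""
    freqs = Counter(tokens)
    m1 = sum(freqs.values())                 # total number of tokens
    m2 = sum(f * f for f in freqs.values())  # sum of squared frequencies
    return m1 * m1 / (m2 - m1) if m2 > m1 else float("inf")


def compound_index(normalized_scores):
    """Plain average of normalized index scores, rounded to an integer."""
    return round(sum(normalized_scores) / len(normalized_scores))


# MTLD, HD-D and Yule's I scores from the table above.
print(compound_index([84, 69, 84]))
```

Averaging the three table scores gives the compound value of 79 reported on this page.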

Our overall measure of grammatical difficulty is based on a combination of the compound lexical diversity index (which includes the MTLD, Yule's I and HD-D indexes), the ARI and the CLI, all normalized and given certain weights. The score should normally range from 1 to 100. In this case, the score is 58.

Other Information about Hennerson. Historien om en gårdskarl by Mikael Lybeck

We provide a sample of the text below. The full text of Hennerson. Historien om en gårdskarl is also available free of charge on our website.

Sample of text:

Sedan tog han på sig en ren, hvit skjorta, trädde ett par snörsmala ringar af gummi öfver ärmarna, för att dessa icke skulle glida ned, bytte om kläder och skodon — först då var psalmsjungandets efterlängtade stund inne. Kvällsmålet förtärdes under isande tystnad, och så drog han sig åter tillbaka i ensamheten. Bibeln väntade. Han satt vid bordet med hopknäppta händer, ljudlöst formande textens ord med läpparna, syrendoften strömmade emot honom genom det öppna fönstret. Stundom glömde han läsningen och höll hufvudet upplyft, länge, länge, blicken i de bruna ögonen beslöjad. ...

Top most frequently used words in Hennerson. Historien om en gårdskarl by Mikael Lybeck*

Position Word Repetitions Part of all words
1 och 490 2.51%
2 det 379 1.95%
3 jag 326 1.67%
4 en 307 1.58%
5 han 289 1.48%
6 att 284 1.46%
7 237 1.22%
8 är 204 1.05%
9 sig 198 1.02%
10 som 188 0.96%
11 till 176 0.9%
12 med 174 0.89%
13 för 163 0.84%
14 157 0.81%
15 hon 155 0.8%
16 ett 154 0.79%
17 den 152 0.78%
18 af 145 0.74%
19 inte 144 0.74%
20 var 135 0.69%
21 mig 115 0.59%
22 honom 101 0.52%
23 har 99 0.51%
24 om 97 0.5%
25 Hennerson 90 0.46%
26 Frida 90 0.46%
27 hade 84 0.43%
28 de 68 0.35%
29 men 67 0.34%
30 sin 65 0.33%
31 du 62 0.32%
32 Lackau 58 0.3%
33 nu 55 0.28%
34 hvad 50 0.26%
35 vid 49 0.25%
36 där 47 0.24%
37 öfver 44 0.23%
38 Gud 44 0.23%
39 dem 43 0.22%
40 henne 43 0.22%
41 vara 42 0.22%
42 min 42 0.22%
43 icke 42 0.22%
44 man 42 0.22%
45 än 41 0.21%
46 efter 40 0.21%
47 utan 39 0.2%
48 skulle 38 0.2%
49 när 38 0.2%
50 under 37 0.19%
51 ja 37 0.19%
52 ju 37 0.19%
53 ha 36 0.18%
54 kunde 36 0.18%
55 själf 35 0.18%
56 syster 35 0.18%
57 vi 35 0.18%
58 Siri 33 0.17%
59 bara 33 0.17%
60 mycket 33 0.17%
61 eller 33 0.17%
62 baronen 32 0.16%
63 från 31 0.16%
64 ut 31 0.16%
65 sitt 30 0.15%
66 hans 30 0.15%
67 denna 29 0.15%
68 29 0.15%
69 åt 28 0.14%
70 kan 28 0.14%
71 baron 28 0.14%
72 dig 28 0.14%
73 Samuel 27 0.14%
74 nej 27 0.14%
75 här 27 0.14%
76 Cornelius 26 0.13%
77 ingen 25 0.13%
78 också 25 0.13%
79 allt 25 0.13%
80 fröken 24 0.12%
81 hur 24 0.12%
82 mer 24 0.12%
83 mot 24 0.12%
84 sina 24 0.12%
85 mitt 23 0.12%
86 ord 23 0.12%
87 vill 23 0.12%
88 något 22 0.11%
89 vet 22 0.11%
90 hennes 21 0.11%
91 blef 21 0.11%
92 upp 21 0.11%
93 kanske 21 0.11%
94 gång 21 0.11%
95 väl 21 0.11%
96 just 21 0.11%
97 aldrig 21 0.11%
98 gjorde 20 0.1%
99 oss 20 0.1%
100 måste 19 0.1%
101 alltid 19 0.1%
102 någon 19 0.1%
103 19 0.1%
104 andra 19 0.1%
105 helt 18 0.09%
106 Klas 18 0.09%
107 hvar 18 0.09%
108 någonting 18 0.09%
109 nog 17 0.09%
110 länge 17 0.09%
111 år 17 0.09%
112 kom 17 0.09%
113 satt 17 0.09%
114 ögonen 17 0.09%
115 igen 17 0.09%
116 gaf 17 0.09%
117 Helena 16 0.08%
118 all 16 0.08%
119 tog 16 0.08%
120 varit 16 0.08%
121 liksom 16 0.08%
122 ändå 16 0.08%
123 nästan 16 0.08%
124 såg 16 0.08%
125 Uggelberg 16 0.08%
126 ofta 16 0.08%
127 alla 16 0.08%
128 mellan 16 0.08%
129 två 16 0.08%
130 många 16 0.08%
131 Dagmar 15 0.08%
132 gick 15 0.08%
133 endast 15 0.08%
134 ska 15 0.08%
135 in 15 0.08%
136 ens 15 0.08%
137 alldeles 15 0.08%
138 genom 15 0.08%
139 Herren 15 0.08%
140 gör 15 0.08%
141 15 0.08%
142 rätt 15 0.08%
143 lilla 14 0.07%
144 par 14 0.07%
145 riktigt 14 0.07%
146 annat 14 0.07%
147 stund 13 0.07%
148 lät 13 0.07%
149 stod 13 0.07%
150 namn 13 0.07%
151 Guds 13 0.07%
152 skull 13 0.07%
153 får 13 0.07%
154 menar 13 0.07%
155 strax 13 0.07%
156 göra 13 0.07%
157 ta 13 0.07%
158 känner 13 0.07%
159 säga 13 0.07%
160 litet 13 0.07%
161 redan 12 0.06%
162 hos 12 0.06%
163 ena 12 0.06%
164 del 12 0.06%
165 stor 12 0.06%
166 lite 12 0.06%
167 ingenting 12 0.06%
168 lycklig 12 0.06%
169 Hildegard 12 0.06%
170 tid 12 0.06%
171 hela 12 0.06%
172 fram 12 0.06%
173 Emellertid 12 0.06%
174 genast 12 0.06%
175 säkert 12 0.06%
176 djupt 12 0.06%
177 tack 11 0.06%
178 hufvudet 11 0.06%
179 ned 11 0.06%

* This list excludes punctuation and single-letter words, as well as some different-case repeats of the same word.
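A table like the one above can be approximated with a short Python sketch. It assumes that words are runs of letters or digits, that all words are lowercased before counting (a simplification: the list above keeps capitalized names), and that the percentage is taken over all words, including the single-letter words left out of the list. The function names are illustrative.

```python
import re
from collections import Counter


def share_of_text(repetitions: int, total_words: int) -> float:
    """Percentage of all words, rounded to two decimals as in the table."""
    return round(100 * repetitions / total_words, 2)


def top_words(text: str, n: int = 10) -> list[tuple[str, int, float]]:
    """Return (word, repetitions, share) rows, skipping single-letter words."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    rows = [(word, count, share_of_text(count, total))
            for word, count in counts.most_common() if len(word) > 1]
    return rows[:n]


# First row of the table: "och" appears 490 times in 19,485 words.
print(share_of_text(490, 19_485))
print(top_words("Och det var och det är och i", n=2))
```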

If you think the text would be accessible to you, you can read it on our site.

Other resources and languages

If you like this analysis, you should have a look at our lists of Swedish short stories and Swedish books.

If you like literature as a means to learn languages, please take a look at our project Interlinear Books. We even have a Swedish Interlinear book available for purchase.