Lifvets seger. Berättelse by Helena Westermarck: Difficulty Assessment for Swedish Learners

How difficult is Lifvets seger. Berättelse for Swedish learners? We ran several tests on its full text of approximately 24,211 words (freely available here), crunched the numbers, and present the results below.

Difficulty Assessment Summary

We estimate Lifvets seger. Berättelse to have an overall difficulty score of 65. Its scores are as follows:

Measure Score (1 = easy, 100 = difficult)
Overall Difficulty 65
Vocabulary Difficulty 73
Grammatical Difficulty 58

Vocabulary Difficulty: Breakdown

Vocabulary difficulty: 73%

This score has been calculated based on frequency vocabulary (the top most frequently used words in Swedish). It combines various measures of Lifvets seger. Berättelse's text analyzed in terms of frequency vocabulary: a plain vocabulary score, frequency-weighted vocabulary score, banded frequency vocabulary scores based on vocabulary of the text falling in the top 1,000 or 2,000 most frequent words, etc. Here's a further breakdown of how often the top most frequently used words in Swedish appear in the full text of Lifvets seger. Berättelse:

Vocabulary difficulty breakdown for Lifvets seger. Berättelse: a test for Swedish top frequency vocabulary
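A banded frequency score of the kind described above can be sketched as the share of a text's words that fall within the top N entries of a frequency list. This is only an illustrative sketch: the function name `band_coverage` and the toy frequency list are assumptions, and the page does not state how its own banded scores are computed or weighted.

```python
def band_coverage(tokens, ranked_words, band_size):
    """Share of tokens found among the `band_size` most frequent words.

    `ranked_words` is assumed to be a frequency list sorted from most
    to least frequent (a hypothetical input, not the page's own list).
    """
    if not tokens:
        return 0.0
    band = set(ranked_words[:band_size])
    return sum(1 for token in tokens if token in band) / len(tokens)


# Toy example: a four-word "frequency list" and a four-word "text".
ranked = ["och", "att", "som", "den"]
tokens = ["och", "hon", "att", "byn"]
print(band_coverage(tokens, ranked, 2))  # 0.5
```

With a real frequency list, calling the function with `band_size=1000` and `band_size=2000` would give the two bands mentioned in the text.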

We have also calculated the following approximate data on the vocabulary in Lifvets seger. Berättelse:

Measure Score
Number of words 24,211
Number of unique words 5,698
Number of recognized words for names/places/other entities 925
Number of very rare non-entity words 1,288
Number of sentences 3,797
Average number of words/sentence 6

Some research suggests that you need to know about 98% of a text's vocabulary in order to infer the meaning of unknown words when reading. If true, this means that you would need to know around 5,584 words in Swedish (98% of the 5,698 unique words, where each form of a word is counted separately) to be able to read Lifvets seger. Berättelse without a dictionary and fully understand it.
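The arithmetic behind the 5,584 figure can be checked in a couple of lines. The function name `words_needed` is purely illustrative, and the 98% threshold is the one quoted above.

```python
def words_needed(unique_words, coverage=0.98):
    """Number of unique word forms to know for a given coverage share."""
    return round(unique_words * coverage)


print(words_needed(5_698))  # 5584
```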

Grammatical Difficulty: Breakdown

Grammatical difficulty: 58%

Here is a further grammatical breakdown of this text. You can find an explanation of all these scores below.

Measure Score
Automated Readability Index 5
Coleman-Liau Index 9
Type/Token Ratio (TTR) 0.235348
Root type/Token Ratio (RTTR) 0.00000972069
Corrected type/Token Ratio (CTTR) 0.00000486034
MTLD Index 77
HDD Index 67
Yule's I Index 78
Lexical Diversity Index (MTLD + HD-D + Yule's I) 74

The type-token ratio (TTR) of Lifvets seger. Berättelse is 0.235348. The TTR is the most basic measure of lexical diversity. To calculate it, we divide the number of unique words by the number of words in the text. For this text, the number of unique words is 5,698 and the number of words is 24,211, so the TTR is 5,698 / 24,211 = 0.235348. However, the TTR is a very crude measure, as it depends heavily on text length. The longer the text, the lower the TTR usually is, since common words tend to repeat. Because this text is far longer than 1,000 words, its TTR is unlikely to be an accurate measure.
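The TTR calculation can be reproduced with a short sketch. The tokenizer below (lowercasing and splitting on runs of word characters) is an assumption; the page does not say how it splits text into words, so results on real text may differ slightly from the figures above.

```python
import re


def tokenize(text):
    """Naive tokenizer: lowercase, then take runs of word characters."""
    return re.findall(r"\w+", text.lower())


def type_token_ratio(tokens):
    """Unique words divided by total words."""
    return len(set(tokens)) / len(tokens)


# Tiny example: 3 unique words out of 4.
print(type_token_ratio(tokenize("Och hon och han")))  # 0.75

# The figure quoted for this text, from the counts given on the page.
print(round(5_698 / 24_211, 6))  # 0.235348
```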

The root type-token ratio (RTTR) and corrected type-token ratio (CTTR) are measures which researchers suggested to partially address the TTR's dependence on text length. As usually defined in the literature, the RTTR divides the number of unique words by the square root of the number of words, and the CTTR divides it by the square root of twice the number of words. The figures reported on this page were instead obtained by dividing by the square of the number of words (5,698 / (24,211 * 24,211) = 0.00000972069 for the RTTR) and by twice that square (5,698 / (2 * 24,211 * 24,211) = 0.00000486034 for the CTTR). These measures are not as easily readable as the TTR, and a growing body of research asserts that the CTTR and RTTR do not effectively address the problems of text length. Therefore, while we do provide the full text's TTR, RTTR and CTTR on this page, these figures do not form part of our final calculations.
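For comparison, here is a minimal sketch of the RTTR and CTTR as they are usually defined in the literature (Guiraud's root TTR and Carroll's corrected TTR), which use the square root of the word count rather than its square. Under that assumption, the counts for this text give values that differ from the ones tabulated on this page.

```python
import math


def rttr(types, tokens):
    """Root TTR: unique words / sqrt(total words)."""
    return types / math.sqrt(tokens)


def cttr(types, tokens):
    """Corrected TTR: unique words / sqrt(2 * total words)."""
    return types / math.sqrt(2 * tokens)


print(round(rttr(5_698, 24_211), 2))  # 36.62
print(round(cttr(5_698, 24_211), 2))  # 25.89
```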

The Automated Readability Index (ARI) is one readability measure that has been developed by researchers over the years. The formula for calculating the ARI is as follows:
Formula for calculating the Automated Readability Index

The ARI should compute a reading level approximately corresponding to the reader's grade level (assuming the reader undertakes formal education). Thus, for example, a value of 1 is kindergarten level, while a value of 12 or 13 is the last year of school, and 14 is a sophomore at college. The ARI of this text is 5, making it understandable for fifth-grade students at their expected level of education.
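The standard published ARI formula is 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43. The sketch below assumes this page uses that standard form; the counting helper (word characters as "characters", runs of ., ! or ? as sentence ends) is a simplification introduced here for illustration.

```python
import re


def ari(characters, words, sentences):
    """Automated Readability Index from raw counts (standard coefficients)."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def ari_from_text(text):
    """Naive counting: an assumption, not the page's own tokenizer."""
    words = re.findall(r"\w+", text)
    characters = sum(len(word) for word in words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return ari(characters, len(words), sentences)


print(round(ari(100, 20, 2), 2))  # 7.12
```

Very short, simple sentences can produce a negative value, which is why the index is usually applied to longer passages.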

The Coleman-Liau Index (CLI) is a similar index designed by Meri Coleman and T. L. Liau. It is also meant to compute the grade level of the reader (thus, for example, sophomore-level material would be around grade 14, or year 14 of formal education, while kindergarten or primary school level material would be close to grade 1). The CLI is usually slightly higher than the ARI. The CLI is computed with this formula:
Formula for calculating the Coleman-Liau Readability Index
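The standard published CLI formula is 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. A minimal sketch, assuming the page uses that standard form:

```python
def coleman_liau(letters, words, sentences):
    """Coleman-Liau Index from raw counts (standard coefficients)."""
    letters_per_100 = 100 * letters / words      # L
    sentences_per_100 = 100 * sentences / words  # S
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


print(round(coleman_liau(500, 100, 5), 2))  # 12.12
```

Like the ARI, this needs only letter, word and sentence counts, so no syllable counting is involved.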

Other readability indexes exist, such as the Flesch-Kincaid and Gunning fog indexes, but we have chosen not to include them: unlike the ARI and CLI, they are based on syllable counts and therefore arguably only work for English and not Swedish.

We compute a further compound lexical diversity index, which should range from 1 to 100 (with an average value of around 50 and a standard deviation of around 10) - it is 74 in the present case. The compound lexical diversity index is the average of the following indexes (also provided in the table above):

  • the Measure of Textual Lexical Diversity (MTLD) index - a measure which is based on computing the TTR for increasingly larger parts of the text until the TTR drops below a certain threshold point (around 0.7 in our case), at which point the TTR is reset and a counter is increased; at the end, the number of words in the text is divided by the counter; as a result, the MTLD does not vary significantly with text length;
  • the Yule's I index (based on Yule's K characteristic inverted) - an index based on the work of the statistician G.U. Yule, who published his index in his work "The statistical study of literary vocabulary"; Yule's I takes into account the number of words in the text, and a compound summed measure of word frequency;
  • the Hypergeometric Distribution D (HD-D) index (based on vocd) - an index which assesses the contribution of each word to the diversity of the text; to calculate such contributions, a hypergeometric distribution is used to compute the probability of each word appearing in word samples extracted from the text; these probabilities are then divided by the sample size and added up.
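The three indexes in the list above can be sketched as follows. This is an illustrative reading of the commonly published definitions, not this page's implementation: the 0.7 threshold follows the text, the 42-word sample size for HD-D is the value commonly used in the literature and is an assumption here, and the raw values these functions return are not on the normalized 1 to 100 scale shown in the table (the normalization is not described on this page).

```python
import math
from collections import Counter


def mtld_one_pass(tokens, threshold=0.7):
    """One directional pass: number of words divided by the factor count."""
    factors, seen, length = 0.0, set(), 0
    for token in tokens:
        seen.add(token)
        length += 1
        if len(seen) / length < threshold:
            factors += 1.0  # TTR fell below the threshold: count a factor
            seen.clear()    # and reset the running TTR
            length = 0
    if length:  # partial factor for the leftover segment
        factors += (1.0 - len(seen) / length) / (1.0 - threshold)
    return len(tokens) / factors if factors else math.inf


def mtld(tokens, threshold=0.7):
    """Average of a forward and a backward pass over the text."""
    forward = mtld_one_pass(tokens, threshold)
    backward = mtld_one_pass(tokens[::-1], threshold)
    return (forward + backward) / 2


def yules_i(tokens):
    """Yule's I: N^2 / (sum of squared word frequencies - N)."""
    n = len(tokens)
    sum_sq = sum(f * f for f in Counter(tokens).values())
    return n * n / (sum_sq - n) if sum_sq > n else math.inf


def hdd(tokens, sample_size=42):
    """HD-D: summed chance of each word appearing in a random sample,
    each divided by the sample size."""
    n = len(tokens)
    k = min(sample_size, n)
    total = math.comb(n, k)
    score = 0.0
    for freq in Counter(tokens).values():
        p_absent = math.comb(n - freq, k) / total  # word missing from sample
        score += (1.0 - p_absent) / k
    return score


# The compound index is the plain average of the three tabulated scores.
page_scores = {"MTLD": 77, "HD-D": 67, "Yule's I": 78}
compound = round(sum(page_scores.values()) / len(page_scores))
print(compound)  # 74
```

All three functions take a list of word tokens, so they can share whatever tokenizer is used for the TTR.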

Our overall measure of grammatical difficulty is based on a combination of the compound lexical diversity index (which includes the MTLD, Yule's I and HD-D indexes), the ARI and the CLI, all normalized and weighted. The score should normally range from 1 to 100. In this case, the score is 58.

Other Information about Lifvets seger. Berättelse by Helena Westermarck

We provide a sample of the text below. The full text of Lifvets seger. Berättelse is also available free of charge on our website.

Sample of text:

Arbetet med barnen hade väkt hennes själfkänsla, som nu fått träda i den forna obetingade undergifvenhetens ställe. Hon kunde både befalla och fordra, och när det nedärfda heta krigarsinnet rann på henne, blef tiden mellan ord och handling sällan lång. Men denna maktutöfning skedde städse under fullaste öfvertygelse om att alla åtgärderna i den något gammaldags handgripliga uppfostran vidtogos till Guds ära och den gamla släktens fortbestånd. I början utförde hon sitt värf endast af omutlig pliktkänsla; mången gång vred hon sig under uppgiftens tyngd och bad mången bön till ...

Top most frequently used words in Lifvets seger. Berättelse by Helena Westermarck*

Position Word Repetitions Part of all words
1 och 802 3.31%
2 att 456 1.88%
3 som 433 1.79%
4 den 401 1.66%
5 hon 353 1.46%
6 det 348 1.44%
7 med 331 1.37%
8 en 323 1.33%
9 259 1.07%
10 af 256 1.06%
11 till 243 1%
12 var 239 0.99%
13 hade 238 0.98%
14 för 236 0.97%
15 sig 231 0.95%
16 icke 203 0.84%
17 de 192 0.79%
18 han 173 0.71%
19 om 162 0.67%
20 ett 145 0.6%
21 fröken 124 0.51%
22 skulle 112 0.46%
23 henne 104 0.43%
24 101 0.42%
25 Henriette 100 0.41%
26 när 95 0.39%
27 gamla 95 0.39%
28 hennes 95 0.39%
29 sin 88 0.36%
30 kunde 88 0.36%
31 där 87 0.36%
32 Men 85 0.35%
33 från 74 0.31%
34 upp 73 0.3%
35 öfver 72 0.3%
36 ut 64 0.26%
37 eller 63 0.26%
38 mot 62 0.26%
39 under 62 0.26%
40 Mathias 61 0.25%
41 alla 60 0.25%
42 vid 59 0.24%
43 kom 56 0.23%
44 Marusja 55 0.23%
45 stora 54 0.22%
46 honom 54 0.22%
47 sina 53 0.22%
48 allt 53 0.22%
49 ty 52 0.21%
50 man 50 0.21%
51 varit 49 0.2%
52 ned 48 0.2%
53 hans 47 0.19%
54 liksom 47 0.19%
55 sitt 47 0.19%
56 gång 46 0.19%
57 unga 46 0.19%
58 fru 46 0.19%
59 sedan 45 0.19%
60 denna 45 0.19%
61 genom 43 0.18%
62 nu 43 0.18%
63 43 0.18%
64 såg 42 0.17%
65 än 41 0.17%
66 gick 40 0.17%
67 efter 40 0.17%
68 någon 40 0.17%
69 fram 39 0.16%
70 något 38 0.16%
71 måste 38 0.16%
72 blef 38 0.16%
73 åter 38 0.16%
74 aldrig 37 0.15%
75 ord 36 0.15%
76 lilla 36 0.15%
77 hela 35 0.14%
78 in 35 0.14%
79 blifvit 35 0.14%
80 assessorn 34 0.14%
81 stod 34 0.14%
82 endast 33 0.14%
83 komma 33 0.14%
84 jag 33 0.14%
85 byn 33 0.14%
86 är 33 0.14%
87 dem 32 0.13%
88 voro 32 0.13%
89 stranden 32 0.13%
90 se 32 0.13%
91 andra 31 0.13%
92 sade 31 0.13%
93 ingen 31 0.13%
94 här 31 0.13%
95 ju 30 0.12%
96 två 30 0.12%
97 huru 30 0.12%
98 barnen 30 0.12%
99 frun 29 0.12%
100 hvad 29 0.12%
101 gjorde 29 0.12%
102 detta 28 0.12%
103 kommit 28 0.12%
104 dessa 28 0.12%
105 Rosendahl 28 0.12%
106 barn 27 0.11%
107 själf 27 0.11%
108 Mina 27 0.11%
109 mycket 27 0.11%
110 taga 26 0.11%
111 små 26 0.11%
112 både 25 0.1%
113 äfven 25 0.1%
114 ute 24 0.1%
115 samma 24 0.1%
116 löjtnanten 24 0.1%
117 medan 24 0.1%
118 ännu 24 0.1%
119 ögon 24 0.1%
120 du 23 0.09%
121 tid 23 0.09%
122 vackra 22 0.09%
123 göra 22 0.09%
124 22 0.09%
125 mera 22 0.09%
126 såsom 22 0.09%
127 väl 22 0.09%
128 vara 22 0.09%
129 dag 21 0.09%
130 sådan 21 0.09%
131 kände 21 0.09%
132 hörde 21 0.09%
133 lif 21 0.09%
134 har 21 0.09%
135 visste 20 0.08%
136 plötsligt 20 0.08%
137 utan 20 0.08%
138 annat 19 0.08%
139 blifva 19 0.08%
140 ville 19 0.08%
141 faster 19 0.08%
142 bara 19 0.08%
143 19 0.08%
144 hafvet 19 0.08%
145 långa 19 0.08%
146 åt 18 0.07%
147 Skärnäset 18 0.07%
148 bort 18 0.07%
149 också 18 0.07%
150 ur 18 0.07%
151 stormen 17 0.07%
152 lifvet 17 0.07%
153 dess 17 0.07%
154 låg 17 0.07%
155 Vigströmskan 17 0.07%
156 emedan 17 0.07%
157 kan 17 0.07%
158 många 17 0.07%
159 herre 17 0.07%
160 hvarje 17 0.07%
161 annan 17 0.07%
162 kring 17 0.07%
163 alltid 17 0.07%
164 folket 16 0.07%
165 redan 16 0.07%
166 ehuru 16 0.07%
167 dagen 16 0.07%
168 hos 16 0.07%
169 igen 16 0.07%
170 fick 16 0.07%
171 gått 16 0.07%
172 talade 16 0.07%
173 år 16 0.07%
174 hem 16 0.07%
175 hand 16 0.07%
176 lik 15 0.06%
177 första 15 0.06%
178 numera 15 0.06%
179 tala 15 0.06%
180 några 15 0.06%
181 damen 15 0.06%
182 ro 15 0.06%
183 skall 15 0.06%
184 länge 15 0.06%

* This list excludes punctuation, single-letter words, and some different-case repeats of the same word.
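A frequency table like the one above can be produced with a short sketch. The details here are assumptions: the page's own tokenizer and case handling are not described, so this version simply lowercases everything, drops single-letter words from the ranking, and reports each word's share of all words in the text.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return (word, repetitions, percent of all words) for the top n words."""
    tokens = re.findall(r"\w+", text.lower())
    total = len(tokens)  # shares are relative to all words in the text
    counts = Counter(token for token in tokens if len(token) > 1)
    return [
        (word, count, round(100 * count / total, 2))
        for word, count in counts.most_common(n)
    ]


print(top_words("Och hon och han i byn", 1))  # [('och', 2, 33.33)]

# Cross-check of the first table row: 802 repetitions out of 24,211 words.
print(round(100 * 802 / 24_211, 2))  # 3.31
```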

If you think the text would be accessible to you, you can read it on our site (click on the cover to access):

Cover of Lifvets seger. Berättelse by Helena Westermarck

Other resources and languages

If you like this analysis, you should have a look at our lists of Swedish short stories and Swedish books.

If you like literature as a means to learn languages - please take a look at our project Interlinear Books. We even have a Swedish Interlinear book available for purchase.