Sigrid Persdotter Bjurcrona. En släktroman by Ernst Didring: Difficulty Assessment for Swedish Learners

How difficult is Sigrid Persdotter Bjurcrona. En släktroman for Swedish learners? We have run multiple tests on its full text (freely available here) of approximately 78,768 words, crunched all the numbers for you, and present the results below.

Difficulty Assessment Summary

We have estimated Sigrid Persdotter Bjurcrona. En släktroman to have a difficulty score of 50. Its scores are as follows:

Measure Score (1 - 100, from easy to difficult)
Overall Difficulty 50
Vocabulary Difficulty 53
Grammatical Difficulty 47

Vocabulary Difficulty: Breakdown

Vocabulary difficulty: 53%

This score has been calculated based on frequency vocabulary (the top most frequently used words in Swedish). It combines various measures of Sigrid Persdotter Bjurcrona. En släktroman's text analyzed in terms of frequency vocabulary: a plain vocabulary score, frequency-weighted vocabulary score, banded frequency vocabulary scores based on vocabulary of the text falling in the top 1,000 or 2,000 most frequent words, etc. Here's a further breakdown of how often the top most frequently used words in Swedish appear in the full text of Sigrid Persdotter Bjurcrona. En släktroman:

Vocabulary difficulty breakdown for Sigrid Persdotter Bjurcrona. En släktroman: a test for Swedish top frequency vocabulary

We have also calculated the following approximate data on the vocabulary in Sigrid Persdotter Bjurcrona. En släktroman:

Measure Score
Number of words 78,768
Number of unique words 9,934
Number of recognized words for names/places/other entities 4,151
Number of very rare non-entity words 1,263
Number of sentences 13,688
Average number of words/sentence 6

There is some research suggesting that you need to know about 98% of a text's vocabulary in order to be able to infer the meaning of unknown words when reading. If true, this means that you would need to know around 9,735 words (98% of the 9,934 unique words, where each form of a word is still counted as a unique word) in Swedish to be able to read Sigrid Persdotter Bjurcrona. En släktroman without a dictionary and fully understand it.
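The arithmetic behind that estimate can be sketched in a few lines. This is only an illustration: the function name `words_needed` is ours, and it applies the 98% share directly to the count of unique word forms, as the paragraph above does.

```python
def words_needed(unique_words: int, coverage: float = 0.98) -> int:
    """Unique word forms a reader must know to reach the given coverage share."""
    return round(unique_words * coverage)


# 9,934 is the number of unique words reported in the table above.
print(words_needed(9_934))
```

Changing the `coverage` argument shows how sensitive the estimate is to the assumed threshold.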

Grammatical Difficulty: Breakdown

Grammatical difficulty: 47%

Here is a further grammatical breakdown of this text. You can find an explanation of all these scores below.

Measure Score
Automated Readability Index 3
Coleman-Liau Index 6
Type/Token Ratio (TTR) 0.126117
Root type/Token Ratio (RTTR) 0.00000160112
Corrected type/Token Ratio (CTTR) 0.000000800561
MTLD Index 58
HDD Index 63
Yule's I Index 69
Lexical Diversity Index (MTLD + HD-D + Yule's I) 63

The type-token ratio (TTR) of Sigrid Persdotter Bjurcrona. En släktroman is 0.126117. The TTR is the most basic measure of lexical diversity. To calculate it, we divide the number of unique words by the number of words in the text. For this text, the number of unique words is 9,934, while the number of words is 78,768, so the TTR is 9,934 / 78,768 = 0.126117. However, the TTR is a very crude measure, as it is extremely dependent on text length. The longer the text, the lower the TTR usually is, since common words tend to repeat often. Because this text has far more than 1,000 words, the TTR is unlikely to be an accurate measure here.

The root type-token ratio (RTTR) and corrected type-token ratio (CTTR) are measures which were suggested by researchers to partially address the TTR's dependence on text length. As usually defined, the RTTR divides the number of unique words by the square root of the number of words, while the CTTR divides it by the square root of twice the number of words. The figures in the table above were instead computed with the square of the number of words: 9,934 / (78,768 * 78,768) = 0.00000160112 for the RTTR, and 9,934 / (2 * 78,768 * 78,768) = 0.000000800561 for the CTTR. With the square-root definitions, the values would be about 35.4 and 25.0 respectively. These measures are not as easily readable, and there is a growing body of research asserting that CTTR and RTTR do not effectively address the problems of text length. Therefore, while we do provide the full text's TTR, RTTR and CTTR on this page, these figures do not form part of our final calculations.
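A minimal sketch of the three ratios follows. It assumes the definitions usually given in the literature, in which RTTR (Guiraud) and CTTR (Carroll) divide by a square root of the word count rather than by its square. The function names are illustrative and are not taken from the tool that produced the table.

```python
import math


def ttr(types: int, tokens: int) -> float:
    """Type-token ratio: unique words divided by all words."""
    return types / tokens


def rttr(types: int, tokens: int) -> float:
    """Root TTR: unique words divided by the square root of all words."""
    return types / math.sqrt(tokens)


def cttr(types: int, tokens: int) -> float:
    """Corrected TTR: unique words divided by the square root of twice all words."""
    return types / math.sqrt(2 * tokens)


# Counts taken from the vocabulary table on this page.
unique_words, total_words = 9_934, 78_768
print(round(ttr(unique_words, total_words), 6))
print(round(rttr(unique_words, total_words), 2))
print(round(cttr(unique_words, total_words), 2))
```

Under these definitions the RTTR is always the CTTR multiplied by the square root of 2, so the two carry the same information on different scales.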

The Automated Readability Index (ARI) is one readability measure that has been developed by researchers over the years. The formula for calculating the ARI is as follows:
ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

The ARI should compute a reading level approximately corresponding to the reader's grade level (assuming the reader undertakes formal education). Thus, for example, a value of 1 is kindergarten level, while a value of 12 or 13 is the last year of school, and 14 is a sophomore at college. The ARI of this text is 3, making it understandable for third-grade students at their expected level of education.
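As a rough illustration, the standard ARI formula, 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43, can be computed as below. The word and sentence counts come from the table on this page, but the character count is not reported here, so the value used is a made-up placeholder; the function name is likewise ours.

```python
def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Standard ARI: weighted characters-per-word and words-per-sentence."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# words and sentences are from the table above; characters=360_000 is a
# HYPOTHETICAL placeholder, since the page does not report a character count.
score = automated_readability_index(characters=360_000, words=78_768, sentences=13_688)
print(round(score))
```

Because the formula only needs character, word and sentence counts, it does not depend on language-specific syllable rules.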

The Coleman-Liau Index (CLI) is a similar index designed by Meri Coleman and T. L. Liau, and it is supposed to compute the grade level of the reader (thus, for example, sophomore level material would be around grade 14, or year 14 of formal education, while kindergarten / primary school level material would be close to grade 1 in the CLI). The CLI is usually slightly higher than the ARI. The CLI is computed with this formula:
CLI = 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
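A small sketch of the standard Coleman-Liau calculation, 0.0588 * L - 0.296 * S - 15.8 with L letters and S sentences per 100 words, is given below. The function name and the toy counts in the usage line are illustrative assumptions, not figures from this page.

```python
def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Standard CLI from raw letter, word and sentence counts."""
    letters_per_100 = letters / words * 100      # L in the formula
    sentences_per_100 = sentences / words * 100  # S in the formula
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


# Toy counts for illustration only: 500 letters, 100 words, 10 sentences.
print(round(coleman_liau_index(letters=500, words=100, sentences=10), 2))
```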

Note that other indexes exist, such as the Flesch-Kincaid readability tests and the Gunning fog index, but we have chosen not to include them: unlike the ARI and CLI, they are based on a syllable count and therefore arguably only work for English and not Swedish.

We compute a further compound lexical diversity index, which should range from 1 to 100 (with a standard deviation of around 10 and an average value of around 50). It is 63 in the present case. The compound lexical diversity index consists of the following indexes, averaged out (and also provided in the table above):

  • the Measure of Textual Lexical Diversity (MTLD) index - a measure based on computing the TTR for increasingly larger parts of the text until the TTR drops below a certain threshold (around 0.7 in our case), at which point the TTR is reset and a counter is increased; at the end, the number of words in the text is divided by the counter; as a result, the MTLD does not vary significantly with text length;
  • the Yule's I index (based on Yule's K characteristic inverted) - an index based on the work of the statistician G.U. Yule, who published his index of Frequency Vocabulary in his paper "The statistical study of literary vocabulary"; Yule's I takes into account the number of words in the text, and a compound summed measure of word frequency;
  • the Hypergeometric Distribution D (HD-D) index (based on vocd) - an index which assesses the contribution of each word to the diversity of the text; to calculate such contributions, a hypergeometric distribution is used to compute the probability of each word appearing in word samples extracted from the text; these probabilities are then divided by the sample size and added up.
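The three indexes in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation behind the table: the MTLD follows the published approach of dividing the word count by the number of completed TTR segments and averaging a forward and a backward pass; Yule's I is taken as 10^4 divided by Yule's K; and the HD-D sample size of 42 is the value commonly used in the literature, which this page does not state. All function names are ours, and the raw values are not normalized to the 1 to 100 scale used in the table.

```python
import math
from collections import Counter


def mtld_one_way(tokens, threshold=0.7):
    """Words per completed TTR segment ('factor'), reading in one direction."""
    factors = 0.0
    seen = set()
    count = 0
    for token in tokens:
        count += 1
        seen.add(token)
        if len(seen) / count < threshold:
            factors += 1  # TTR dropped below the threshold: close the segment
            seen.clear()
            count = 0
    if count:  # partial credit for the unfinished final segment
        factors += (1 - len(seen) / count) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")


def mtld(tokens, threshold=0.7):
    """Average of a forward and a backward pass over the token list."""
    return (mtld_one_way(tokens, threshold) + mtld_one_way(tokens[::-1], threshold)) / 2


def yule_i(tokens):
    """Assumed convention: I = 10^4 / K = N^2 / (sum of squared frequencies - N)."""
    n = len(tokens)
    sum_sq = sum(freq * freq for freq in Counter(tokens).values())
    return n * n / (sum_sq - n) if sum_sq > n else float("inf")


def hdd(tokens, sample_size=42):
    """Sum over word types of P(type appears in a random sample) / sample size."""
    n = len(tokens)
    if n < sample_size:
        raise ValueError("text is shorter than the sample size")
    all_samples = math.comb(n, sample_size)
    score = 0.0
    for freq in Counter(tokens).values():
        # Hypergeometric probability that the type is absent from the sample.
        p_absent = math.comb(n - freq, sample_size) / all_samples
        score += (1 - p_absent) / sample_size
    return score
```

Each function expects a list of word tokens, for example `mtld(text.lower().split())`. Higher values mean a more varied vocabulary in all three cases.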

Our overall measure of grammatical difficulty is based on a combination of the compound lexical diversity index (which includes the MTLD, Yule's I and HD-D indexes), the ARI and the CLI, all normalized and weighted. The score should normally range from 1 to 100. In this case, the score is 47.

Other Information about Sigrid Persdotter Bjurcrona. En släktroman by Ernst Didring

We provide a sample of the text below. The full text of Sigrid Persdotter Bjurcrona. En släktroman is also available free of charge on our website.

Sample of text:

»Alltså, mina damer.» Erik bugade sig för hennes nåd. Han kunde icke undgå att få hennes hand mot sina läppar. »Så intressant det där var», sade hennes nåd. Hon följde honom ut ur salongen, nedför trapporna, genom vestibulen och ut på den stora trappan. Hon ville veta mer om hjärtat. Erik gav henne några flera fakta. Hela tiden upprepade hennes nåd — åh, så intressant! eller — tänk, så märkvärdigt! Mycket gjorde nu också till att han bjöd henne armen utför trappan till nedre våningen. Det var synd om den tunga människan. ...

Top most frequently used words in Sigrid Persdotter Bjurcrona. En släktroman by Ernst Didring*

Position Word Repetitions Part of all words
1 och 2,238 2.84%
2 att 1,844 2.34%
3 det 1,476 1.87%
4 1,316 1.67%
5 var 1,263 1.6%
6 hon 1,125 1.43%
7 icke 991 1.26%
8 han 983 1.25%
9 som 934 1.19%
10 en 833 1.06%
11 sig 823 1.04%
12 med 788 1%
13 hade 763 0.97%
14 för 703 0.89%
15 till 692 0.88%
16 Sigrid 668 0.85%
17 om 659 0.84%
18 den 536 0.68%
19 jag 533 0.68%
20 är 531 0.67%
21 500 0.63%
22 av 466 0.59%
23 skulle 428 0.54%
24 honom 413 0.52%
25 de 405 0.51%
26 henne 349 0.44%
27 ett 334 0.42%
28 men 333 0.42%
29 du 313 0.4%
30 sade 303 0.38%
31 har 295 0.37%
32 när 272 0.35%
33 Johan 267 0.34%
34 hennes 265 0.34%
35 kunde 265 0.34%
36 där 258 0.33%
37 ut 244 0.31%
38 något 242 0.31%
39 såg 237 0.3%
40 upp 223 0.28%
41 kan 212 0.27%
42 mig 211 0.27%
43 mycket 199 0.25%
44 gick 197 0.25%
45 Boström 196 0.25%
46 skall 194 0.25%
47 Gustaf 190 0.24%
48 över 190 0.24%
49 kom 190 0.24%
50 vad 181 0.23%
51 fröken 180 0.23%
52 från 176 0.22%
53 ville 175 0.22%
54 hans 169 0.21%
55 man 168 0.21%
56 sin 165 0.21%
57 vid 165 0.21%
58 allt 162 0.21%
59 Bjurnäs 162 0.21%
60 nu 156 0.2%
61 dem 156 0.2%
62 än 155 0.2%
63 in 153 0.19%
64 blev 151 0.19%
65 149 0.19%
66 vi 148 0.19%
67 efter 147 0.19%
68 nåd 147 0.19%
69 också 144 0.18%
70 inte 144 0.18%
71 måste 141 0.18%
72 själv 138 0.18%
73 fram 138 0.18%
74 eller 135 0.17%
75 vara 134 0.17%
76 mot 133 0.17%
77 frågade 125 0.16%
78 aldrig 124 0.16%
79 fick 121 0.15%
80 ha 120 0.15%
81 bli 118 0.15%
82 bara 117 0.15%
83 några 114 0.14%
84 mer 113 0.14%
85 utan 112 0.14%
86 tog 112 0.14%
87 ni 110 0.14%
88 se 110 0.14%
89 hur 110 0.14%
90 ned 110 0.14%
91 Erik 108 0.14%
92 varit 107 0.14%
93 säga 106 0.13%
94 Eva 104 0.13%
95 bort 103 0.13%
96 102 0.13%
97 99 0.13%
98 ännu 99 0.13%
99 hos 98 0.12%
100 Hansson 97 0.12%
101 Bjurcrona 97 0.12%
102 Ja 96 0.12%
103 alla 96 0.12%
104 här 96 0.12%
105 någon 96 0.12%
106 andra 95 0.12%
107 stod 95 0.12%
108 vet 94 0.12%
109 Per 92 0.12%
110 började 92 0.12%
111 vill 91 0.12%
112 gjorde 91 0.12%
113 sedan 91 0.12%
114 dig 91 0.12%
115 får 86 0.11%
116 sina 84 0.11%
117 min 83 0.11%
118 Nej 83 0.11%
119 ta 83 0.11%
120 satt 83 0.11%
121 alltid 82 0.1%
122 Greta 82 0.1%
123 stora 81 0.1%
124 göra 81 0.1%
125 farbror 81 0.1%
126 sitt 80 0.1%
127 voro 79 0.1%
128 nog 78 0.1%
129 patron 76 0.1%
130 tänkte 75 0.1%
131 ingen 74 0.09%
132 kanske 73 0.09%
133 åt 73 0.09%
134 ord 72 0.09%
135 redan 71 0.09%
136 Alby 71 0.09%
137 fått 70 0.09%
138 sagt 70 0.09%
139 låg 70 0.09%
140 kunna 69 0.09%
141 ur 68 0.09%
142 er 68 0.09%
143 kände 68 0.09%
144 annat 67 0.09%
145 Varför 67 0.09%
146 tala 67 0.09%
147 gott 66 0.08%
148 ingenting 66 0.08%
149 gården 65 0.08%
150 detta 65 0.08%
151 Torsell 65 0.08%
152 länge 64 0.08%
153 komma 63 0.08%
154 genom 63 0.08%
155 riktigt 63 0.08%
156 fanns 63 0.08%
157 gång 62 0.08%
158 Freja 62 0.08%
159 hela 62 0.08%
160 svarade 61 0.08%
161 kommer 59 0.07%
162 mamma 58 0.07%
163 igen 58 0.07%
164 många 58 0.07%
165 hand 56 0.07%
166 genast 56 0.07%
167 tillbaka 56 0.07%
168 år 56 0.07%
169 oss 56 0.07%
170 pappa 56 0.07%
171 bra 55 0.07%
172 annan 55 0.07%
173 god 54 0.07%
174 fru 54 0.07%
175 tyckte 54 0.07%
176 gamla 54 0.07%
177 ändå 53 0.07%
178 visst 53 0.07%
179 svårt 53 0.07%

* This list excludes punctuation and single-letter words, as well as some different-case repeats of the same word.
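A frequency list of this kind can be approximated with a short script. The sketch below is an assumption about the general approach, not the code used for the table: it lowercases every word (merging all different-case repeats, not just some), drops single-letter words from the ranking, and expresses each count as a share of all words including the single-letter ones. The function name `top_words` is ours.

```python
import re
from collections import Counter


def top_words(text: str, limit: int = 10):
    """Return (word, repetitions, percent of all words) for the most frequent words."""
    # [^\W\d_]+ matches runs of letters only, including å, ä and ö.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    total = len(tokens)
    counts = Counter(token for token in tokens if len(token) > 1)
    return [
        (word, count, round(100 * count / total, 2))
        for word, count in counts.most_common(limit)
    ]


sample = "Hon och han och hon i en"
for position, (word, count, share) in enumerate(top_words(sample, limit=2), start=1):
    print(position, word, count, f"{share}%")
```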

If you think the text would be accessible to you, you can read it on our site.

Other resources and languages

If you like this analysis, you should have a look at our lists of Swedish short stories and Swedish books.

If you like literature as a means to learn languages, please take a look at our project Interlinear Books. We even have a Swedish Interlinear book available for purchase.