Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling by Alfred Nathorst: Difficulty Assessment for Swedish Learners

How difficult is Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling for Swedish learners? We have run multiple tests on its full text (freely available here) of approximately 79,675 words, crunched the numbers, and present the results below.

Difficulty Assessment Summary

We have estimated Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling to have a difficulty score of 71. Here are its scores:

Measure Score (1 = easy, 100 = difficult)
Overall Difficulty 71
Vocabulary Difficulty 82
Grammatical Difficulty 61

Vocabulary Difficulty: Breakdown

Vocabulary difficulty: 82%

This score has been calculated based on frequency vocabulary (the top most frequently used words in Swedish). It combines various measures of Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling's text analyzed in terms of frequency vocabulary: a plain vocabulary score, frequency-weighted vocabulary score, banded frequency vocabulary scores based on vocabulary of the text falling in the top 1,000 or 2,000 most frequent words, etc. Here's a further breakdown of how often the top most frequently used words in Swedish appear in the full text of Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling:

Vocabulary difficulty breakdown for Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling: a test for Swedish top frequency vocabulary

We have also calculated the following approximate data on the vocabulary in Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling:

Measure Score
Number of words 79,675
Number of unique words 11,873
Number of recognized words for names/places/other entities 3,748
Number of very rare non-entity words 6,950
Number of sentences 12,300
Average number of words/sentence 6

Some research suggests that you need to know about 98% of a text's vocabulary to be able to infer the meaning of unknown words when reading. If so, you would need to know around 11,635 words in Swedish (about 98% of the text's 11,873 unique words, with every form of a word counted separately) to be able to read Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling without a dictionary and fully understand it.
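The figure above applies the 98% threshold to the count of unique words. As a rough illustration of a related, token-based view, the sketch below counts how many of the most frequent word forms are needed to cover a given share of the running words. The function name and the toy word list are assumptions made for illustration; this is not necessarily how the numbers on this page were produced.

```python
from collections import Counter


def types_needed_for_coverage(tokens, target=0.98):
    """Return how many of the most frequent word forms cover `target` of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    # Walk through word forms from most to least frequent.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target * total:
            return rank
    return len(counts)


# Toy example with made-up frequencies (not taken from the book).
toy_tokens = ["och"] * 90 + ["af"] * 8 + ["isen"] + ["ön"]
print(types_needed_for_coverage(toy_tokens, target=0.95))  # prints 2
```

On a real text, the token list would come from whatever tokenizer is used, and the result depends heavily on how inflected forms and names are counted.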

Grammatical Difficulty: Breakdown

Grammatical difficulty: 61%

Here is a further grammatical breakdown of this text. You can find an explanation of all these scores below.

Measure Score
Automated Readability Index 6
Coleman-Liau Index 10
Type/Token Ratio (TTR) 0.149018
Root type/Token Ratio (RTTR) 0.00000187032
Corrected type/Token Ratio (CTTR) 0.000000935161
MTLD Index 72
HDD Index 70
Yule's I Index 86
Lexical Diversity Index (MTLD + HD-D + Yule's I) 76

The type-token ratio (TTR) of Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling is 0.149018. The TTR is the most basic measure of lexical diversity. To calculate it, we divide the number of unique words by the total number of words in the text. For this text, the number of unique words is 11,873 and the number of words is 79,675, so the TTR is 11,873 / 79,675 = 0.149018. However, the TTR is a very crude measure because it depends heavily on text length: the longer the text, the lower the TTR usually is, since common words repeat often. Because this text is far longer than 1,000 words, its TTR is unlikely to be an accurate measure of lexical diversity.
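The TTR calculation can be sketched in a few lines. The function name and the toy sentence are illustrative assumptions; the two counts are the ones reported on this page.

```python
def type_token_ratio(tokens):
    """Unique word forms divided by total words; 0.0 for an empty text."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# The figure reported above, recomputed from the two counts given in the text.
print(round(11_873 / 79_675, 6))  # 0.149018

# Toy illustration of the length effect: the same two words, repeated.
sample = "och vi och vi och vi".split()
print(type_token_ratio(sample[:2]))  # 1.0
print(type_token_ratio(sample))      # 2 / 6, roughly 0.33
```

The toy sample shows why longer texts tend to score lower: the word count keeps growing while the number of distinct words does not.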

The root type-token ratio (RTTR) and corrected type-token ratio (CTTR) were proposed by researchers to partially address the TTR's dependence on text length. In the standard definitions, the RTTR divides the number of unique words by the square root of the number of words (11,873 / √79,675 ≈ 42.06), while the CTTR divides it by the square root of twice the number of words (11,873 / √(2 × 79,675) ≈ 29.74). The figures in the table above were instead obtained by dividing by the square of the number of words (11,873 / (79,675 × 79,675) = 0.00000187032) and by twice that square (11,873 / (2 × 79,675 × 79,675) = 0.000000935161). These measures are not as easy to read, and a growing body of research holds that the CTTR and RTTR do not effectively address the problem of text length. Therefore, while we do provide the full text's TTR, RTTR and CTTR on this page, these figures do not form part of our final calculations.
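As a cross-check, the commonly cited definitions of these two measures (Guiraud's RTTR and Carroll's CTTR) can be computed from the two counts reported on this page. This is a minimal sketch with illustrative function names. Note that these definitions use a square root, so the output differs from the RTTR and CTTR figures in the table above.

```python
import math


def root_ttr(n_types, n_tokens):
    """Guiraud's root TTR: types / sqrt(tokens)."""
    return n_types / math.sqrt(n_tokens)


def corrected_ttr(n_types, n_tokens):
    """Carroll's corrected TTR: types / sqrt(2 * tokens)."""
    return n_types / math.sqrt(2 * n_tokens)


# Counts reported on this page: 11,873 unique words out of 79,675 words.
print(round(root_ttr(11_873, 79_675), 2))
print(round(corrected_ttr(11_873, 79_675), 2))
```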

The Automated Readability Index (ARI) is one readability measure that has been developed by researchers over the years. The formula for calculating the ARI is as follows:
ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43

The ARI is meant to give a reading level that roughly corresponds to the reader's grade level (assuming the reader is in formal education). For example, a value of 1 is kindergarten level, a value of 12 or 13 is the last year of school, and 14 is a sophomore at college. The ARI of this text is 6, which should make it understandable for sixth-grade students at their expected level of education.
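For readers who want to try the calculation, here is a minimal sketch of the standard ARI formula, 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The function name and the input counts are hypothetical. The character count of this text is not reported on this page, so the sketch does not attempt to reproduce the score of 6.

```python
def automated_readability_index(characters, words, sentences):
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Hypothetical counts for illustration only (not measured from the book).
ari = automated_readability_index(characters=500, words=100, sentences=10)
print(round(ari, 2))  # 7.12
```

Because the formula uses only character, word and sentence counts, it needs no syllable counting or language-specific dictionary.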

The Coleman-Liau Index (CLI) is a similar index designed by Meri Coleman and T. L. Liau, and it is also meant to compute the grade level of the reader (for example, sophomore-level material would be around grade 14, or year 14 of formal education, while kindergarten or primary-school material would be close to grade 1). The CLI is usually slightly higher than the ARI. The CLI is computed with this formula:
CLI = 0.0588 × L − 0.296 × S − 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
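The standard Coleman-Liau formula, CLI = 0.0588 × L − 0.296 × S − 15.8 (with L the letters per 100 words and S the sentences per 100 words), can be sketched as follows. The function name and the input counts are hypothetical, since the letter count of this text is not given on this page.

```python
def coleman_liau_index(letters, words, sentences):
    """CLI = 0.0588 * L - 0.296 * S - 15.8, with L and S measured per 100 words."""
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


# Hypothetical counts for illustration only (not measured from the book).
cli = coleman_liau_index(letters=500, words=100, sentences=10)
print(round(cli, 2))  # 10.64
```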

Other indexes exist, such as the Flesch Reading Ease, the Flesch-Kincaid Grade Level and the Gunning Fog Index, but we have chosen not to include them: unlike the ARI and CLI, they are based on a syllable count and therefore arguably only work for English and not for Swedish.

We also compute a compound lexical diversity index, which should range from 1 to 100 (with an average of around 50 and a standard deviation of around 10). In the present case it is 76. The compound lexical diversity index is the average of the following indexes (also provided in the table above):

  • the Measure of Textual Lexical Diversity (MTLD) index - a measure based on computing the TTR for increasingly larger parts of the text until the TTR drops below a certain threshold (around 0.7 in our case), at which point the TTR is reset and a counter is increased; at the end, the number of words in the text is divided by the counter; as a result, the MTLD does not vary significantly with text length;
  • the Yule's I index (based on Yule's K characteristic inverted) - an index based on the work of the statistician G.U. Yule, who published his characteristic K in his book "The statistical study of literary vocabulary"; Yule's I takes into account the number of words in the text and a compound summed measure of word frequency;
  • the Hypergeometric Distribution D (HD-D) index (based on vocd) - an index which assesses the contribution of each word to the diversity of the text; to calculate these contributions, a hypergeometric distribution is used to compute the probability of each word appearing in word samples extracted from the text; each probability is then divided by the sample size, and the results are added up.
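The MTLD procedure in the list above can be sketched as below. This is a simplified illustration under stated assumptions, not our production code: the threshold defaults to the 0.7 mentioned above (published descriptions of MTLD commonly use 0.72), a leftover segment is counted as a partial factor, the word count is divided by the number of factors, and a forward and a backward pass are averaged. It returns a raw MTLD value rather than the normalized score shown in the table, and all names are hypothetical.

```python
def _mtld_one_pass(tokens, threshold):
    """Count TTR 'factors' in one direction and return words per factor."""
    factors = 0.0
    seen = set()
    length = 0
    for token in tokens:
        length += 1
        seen.add(token)
        # When the running TTR reaches the threshold, close a factor and reset.
        if len(seen) / length <= threshold:
            factors += 1
            seen.clear()
            length = 0
    if length > 0:
        # Leftover words count as a partial factor, scaled by how far the TTR fell.
        factors += (1 - len(seen) / length) / (1 - threshold)
    if factors == 0:
        return float("inf")  # the TTR never dropped at all
    return len(tokens) / factors


def mtld(tokens, threshold=0.7):
    """Average of a forward and a backward pass over a list of tokens."""
    if not tokens:
        return 0.0
    forward = _mtld_one_pass(tokens, threshold)
    backward = _mtld_one_pass(tokens[::-1], threshold)
    return (forward + backward) / 2


print(mtld(["a", "b", "a", "b"]))  # 4.0
```

A higher value means the text runs longer, on average, before its vocabulary starts repeating enough to pull the TTR down to the threshold.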

Our overall measure of grammatical difficulty combines the compound lexical diversity index (which includes the MTLD, Yule's I and HD-D indexes), the ARI and the CLI, all normalized and weighted. The score should normally range from 1 to 100. In this case, the score is 61.
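The compound lexical diversity figure in the table is consistent with a plain average of the three component scores, as this short check shows. The variable names are illustrative. The weights used for the overall grammatical score are not given on this page, so the sketch stops at the average.

```python
# Component scores as listed in the table above.
scores = {"MTLD": 72, "HD-D": 70, "Yule's I": 86}

# Plain (unweighted) average of the three indexes.
compound = round(sum(scores.values()) / len(scores))
print(compound)  # 76
```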

Other Information about Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling by Alfred Nathorst

We provide a sample of the text below; the full text of Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling is also available free of charge on our website.

Sample of text:

Solen hade märkvärdigt hastigt smält undan den nyfallna snön på bergsluttningarne på västra sidan af fjorden, och det såg nu vida mindre vinterlikt ut än då vi kommo hit. Jag ämnade gå till Wijdebay för att landstiga vid Greyhook, och i vackert väder ångade vi förbi Verlegen Hook. När vi sedan passerade Mosselbay, spejade jag ifrigt genom kikaren efter det svenska huset från Nordenskiölds öfvervintring 1872—73, och när jag upptäckt det, lät jag hälsa med flaggan och kallade upp gunrumspersonalen på däck för att genom ett kraftigt fyrfaldigt hurrarop betyga vår aktning för ett svenskt manligt och trofast arbete i forskningens tjänst. När vi på eftermiddagen voro i Wijdebay öster om Greyhook, sattes en båt ut för att föra Hamberg och J. G. ...

Top most frequently used words in Två Somrar i Norra Ishafvet, Första delen - Kung Karls land, Spetsbergens kringsegling by Alfred Nathorst*

Position Word Repetitions Part of all words
1 och 2,275 2.86%
2 af 1,802 2.26%
3 1,459 1.83%
4 att 1,454 1.82%
5 en 1,168 1.47%
6 den 991 1.24%
7 som 942 1.18%
8 till 847 1.06%
9 det 810 1.02%
10 med 800 1%
11 vi 761 0.96%
12 var 738 0.93%
13 för 715 0.9%
14 de 664 0.83%
15 jag 658 0.83%
16 hade 544 0.68%
17 ett 515 0.65%
18 från 412 0.52%
19 vid 387 0.49%
20 ej 385 0.48%
21 378 0.47%
22 om 372 0.47%
23 är 372 0.47%
24 sig 348 0.44%
25 men 330 0.41%
26 land 285 0.36%
27 där 256 0.32%
28 skulle 255 0.32%
29 252 0.32%
30 under 249 0.31%
31 äfven 246 0.31%
32 mot 237 0.3%
33 man 234 0.29%
34 nu 227 0.28%
35 öfver 227 0.28%
36 icke 210 0.26%
37 kunde 206 0.26%
38 såsom 197 0.25%
39 oss 189 0.24%
40 voro 183 0.23%
41 denna 179 0.22%
42 efter 178 0.22%
43 mig 177 0.22%
44 eller 176 0.22%
45 samt 162 0.2%
46 mycket 162 0.2%
47 här 158 0.2%
48 än 154 0.19%
49 detta 152 0.19%
50 hvilken 150 0.19%
51 ty 147 0.18%
52 genom 146 0.18%
53 upp 145 0.18%
54 han 145 0.18%
55 har 145 0.18%
56 hvilka 142 0.18%
57 dock 139 0.17%
58 Bay 138 0.17%
59 sedan 135 0.17%
60 något 132 0.17%
61 berg 130 0.16%
62 vara 129 0.16%
63 blef 127 0.16%
64 Kolthoff 125 0.16%
65 andra 125 0.16%
66 Karls 123 0.15%
67 ut 123 0.15%
68 Kung 123 0.15%
69 några 122 0.15%
70 varit 121 0.15%
71 åter 118 0.15%
72 äro 114 0.14%
73 Andersson 113 0.14%
74 113 0.14%
75 måste 112 0.14%
76 Antarctic 112 0.14%
77 ombord 111 0.14%
78 Kap 108 0.14%
79 ön 108 0.14%
80 sin 107 0.13%
81 någon 107 0.13%
82 hvilket 106 0.13%
83 dessa 105 0.13%
84 stranden 104 0.13%
85 hafva 104 0.13%
86 Spetsbergen 104 0.13%
87 blifvit 103 0.13%
88 Hamberg 102 0.13%
89 utan 102 0.13%
90 1898 101 0.13%
91 samma 100 0.13%
92 långt 98 0.12%
93 ännu 97 0.12%
94 medan 97 0.12%
95 endast 97 0.12%
96 stora 96 0.12%
97 redan 94 0.12%
98 sidan 89 0.11%
99 ganska 88 0.11%
100 kan 88 0.11%
101 se 86 0.11%
102 klockan 86 0.11%
103 två 86 0.11%
104 stor 86 0.11%
105 fotografi 86 0.11%
106 när 85 0.11%
107 alla 84 0.11%
108 ned 81 0.1%
109 kunna 81 0.1%
110 in 81 0.1%
111 kom 81 0.1%
112 båten 80 0.1%
113 densamma 80 0.1%
114 dess 80 0.1%
115 Kjellström 79 0.1%
116 isen 78 0.1%
117 första 77 0.1%
118 komma 77 0.1%
119 mellan 76 0.1%
120 morgonen 76 0.1%
121 min 75 0.09%
122 arbeten 75 0.09%
123 Van 74 0.09%
124 längre 73 0.09%
125 förut 73 0.09%
126 flere 73 0.09%
127 båda 72 0.09%
128 mindre 71 0.09%
129 Beeren 70 0.09%
130 hvarför 70 0.09%
131 geologiska 70 0.09%
132 dag 70 0.09%
133 sida 70 0.09%
134 därför 69 0.09%
135 östra 68 0.09%
136 fick 68 0.09%
137 allt 68 0.09%
138 Svenska 67 0.08%
139 del 67 0.08%
140 hela 66 0.08%
141 vår 66 0.08%
142 ju 65 0.08%
143 par 65 0.08%
144 fram 64 0.08%
145 fartyget 63 0.08%
146 nog 63 0.08%
147 väl 62 0.08%
148 utanför 61 0.08%
149 grund 61 0.08%
150 åt 61 0.08%
151 gick 60 0.08%
152 nämligen 60 0.08%
153 Eiland 60 0.08%
154 dem 60 0.08%
155 is 59 0.07%
156 tre 59 0.07%
157 större 59 0.07%
158 meter 58 0.07%
159 syntes 58 0.07%
160 helt 58 0.07%
161 först 58 0.07%
162 norra 56 0.07%
163 sina 56 0.07%
164 nära 56 0.07%
165 Recherche 54 0.07%
166 expeditionen 54 0.07%
167 augusti 53 0.07%
168 sitt 53 0.07%
169 deras 52 0.07%
170 mera 52 0.07%
171 låg 52 0.07%
172 kapten 51 0.06%
173 västra 51 0.06%
174 olika 51 0.06%
175 kommo 51 0.06%
176 snart 51 0.06%
177 51 0.06%
178 gjorde 51 0.06%

* This list excludes punctuation and single-letter words, as well as some different-case repeats of the same words.

If you think the text would be accessible to you, you can read it on our site.

Other resources and languages

If you like this analysis, you should have a look at our lists of Swedish short stories and Swedish books.

If you like literature as a means to learn languages, please take a look at our project Interlinear Books. We even have a Swedish Interlinear book available for purchase.