Under kampen för bröd by J. L. Saxon: Difficulty Assessment for Swedish Learners

How difficult is Under kampen för bröd for Swedish learners? We ran multiple tests on its full text of approximately 93,409 words (freely available on our site), crunched the numbers, and present the results below.

Difficulty Assessment Summary

We estimate Under kampen för bröd to have an overall difficulty score of 56 out of 100. Its scores are as follows:

Measure	Score (1 = easy, 100 = difficult)
Overall Difficulty	56
Vocabulary Difficulty	65
Grammatical Difficulty	47

Vocabulary Difficulty: Breakdown

Vocabulary difficulty: 65%

This score is calculated from frequency vocabulary (the most frequently used words in Swedish). It combines several measures of the text of Under kampen för bröd: a plain vocabulary score, a frequency-weighted vocabulary score, banded scores based on how much of the text's vocabulary falls within the top 1,000 or 2,000 most frequent words, and so on. Here is a further breakdown of how often the most frequently used words in Swedish appear in the full text of Under kampen för bröd:

Vocabulary difficulty breakdown for Under kampen för bröd: a test for Swedish top frequency vocabulary

We have also calculated the following approximate data on the vocabulary in Under kampen för bröd:

Measure	Score
Number of words	93,409
Number of unique words	13,311
Number of recognized words for names/places/other entities	4,074
Number of very rare non-entity words	2,752
Number of sentences	14,485
Average number of words/sentence	6

Some research suggests that you need to know about 98% of a text's vocabulary to be able to infer the meaning of unknown words when reading. If true, this means that you would need to know around 13,044 words in Swedish (98% of the 13,311 unique words in the text, with each form of a word counted separately) to be able to read Under kampen för bröd without a dictionary and fully understand it.
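The arithmetic behind this estimate can be sketched in a few lines. The function name `words_needed` is ours, chosen for illustration, and the sketch assumes the result is rounded down to a whole word.

```python
def words_needed(unique_words: int, coverage_percent: int = 98) -> int:
    """Unique word forms needed to reach the given coverage, rounded down.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    return unique_words * coverage_percent // 100


# 13,311 unique words at 98% coverage
print(words_needed(13_311))  # 13044
```

Note that this treats every unique word form as equally important; it does not weight words by how often they occur.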

Grammatical Difficulty: Breakdown

Grammatical difficulty: 47%

Here is a further breakdown of the grammatical measures for this text. An explanation of each score follows the table.

Measure	Score
Automated Readability Index	2
Coleman-Liau Index	5
Type/Token Ratio (TTR)	0.142502
Root type/Token Ratio (RTTR)	43.55
Corrected type/Token Ratio (CTTR)	30.80
MTLD Index	61
HD-D Index	66
Yule's I Index	74
Lexical Diversity Index (MTLD + HD-D + Yule's I)	67

The type-token ratio (TTR) of Under kampen för bröd is 0.142502. The TTR is the most basic measure of lexical diversity. To calculate it, we divide the number of unique words by the number of words in the text. For this text, the number of unique words is 13,311 and the number of words is 93,409, so the TTR is 13,311 / 93,409 = 0.142502. However, the TTR is a very crude measure, as it depends heavily on text length. The longer the text, the lower the TTR usually is, since common words tend to repeat often. Because this text is far longer than 1,000 words, its TTR is unlikely to be an accurate measure.

The root type-token ratio (RTTR) and corrected type-token ratio (CTTR) are measures that researchers proposed to partially address the TTR's dependence on text length. In the RTTR, the number of unique words is divided by the square root of the number of words (here, 13,311 / √93,409 ≈ 43.55), while in the CTTR it is divided by the square root of twice the number of words (13,311 / √(2 × 93,409) ≈ 30.80). However, these measures are not as easy to interpret, and a growing body of research asserts that CTTR and RTTR do not effectively address the problem of text length. Therefore, while we do provide the text's TTR, RTTR and CTTR on this page, these figures do not form part of our final calculations.
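A minimal sketch of the three ratios follows. It assumes the commonly cited definitions, in which the RTTR (Guiraud) divides the number of unique words by the square root of the word count and the CTTR (Carroll) divides it by the square root of twice the word count. The function names are ours, for illustration only.

```python
import math


def ttr(types: int, tokens: int) -> float:
    """Type-token ratio: unique words divided by total words."""
    return types / tokens


def rttr(types: int, tokens: int) -> float:
    """Root TTR (assumed definition): types / sqrt(tokens)."""
    return types / math.sqrt(tokens)


def cttr(types: int, tokens: int) -> float:
    """Corrected TTR (assumed definition): types / sqrt(2 * tokens)."""
    return types / math.sqrt(2 * tokens)


types, tokens = 13_311, 93_409  # counts reported for this text
print(round(ttr(types, tokens), 6))   # 0.142502
print(round(rttr(types, tokens), 2))  # 43.55
print(round(cttr(types, tokens), 2))  # 30.8
```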

The Automated Readability Index (ARI) is one readability measure that has been developed by researchers over the years. The formula for calculating the ARI is as follows:
\[ \mathrm{ARI} = 4.71 \cdot \frac{\text{characters}}{\text{words}} + 0.5 \cdot \frac{\text{words}}{\text{sentences}} - 21.43 \]

The ARI computes a reading level that approximately corresponds to the reader's grade level (assuming the reader is in formal education). Thus, for example, a value of 1 is kindergarten level, a value of 12 or 13 is the last year of school, and 14 is a sophomore at college. The ARI of this text is 2, which suggests it is understandable to second-grade students at their expected level of education.
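As a rough illustration, the ARI can be computed with the standard published coefficients (4.71, 0.5 and 21.43). The tokenization below is an assumption made for this sketch: words are runs of word characters, and sentences end at a period, exclamation mark or question mark. The tokenizer actually used for this page is not documented here, so results may differ.

```python
import re


def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    # Assumption: a word is a run of letters, digits or underscores.
    words = re.findall(r"\w+", text)
    # Assumption: a sentence ends at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word")
    characters = sum(len(w) for w in words)
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)


# Two short sentences taken from the sample of the text
sample = "Men ingenting hjälpte. Intet svar."
print(round(automated_readability_index(sample), 2))
```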

The Coleman-Liau Index (CLI) is a similar index designed by Meri Coleman and T. L. Liau. It is also meant to compute the grade level of the reader (thus, for example, college sophomore material would be around grade 14, or year 14 of formal education, while kindergarten or early primary school material would be close to grade 1). The CLI is usually slightly higher than the ARI. The CLI is computed with this formula:
\[ \mathrm{CLI} = 0.0588\,L - 0.296\,S - 15.8 \]
where L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
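A comparable sketch for the CLI uses the standard published coefficients: 0.0588 times the letters per 100 words, minus 0.296 times the sentences per 100 words, minus 15.8. The word and sentence splitting rules are assumptions made for this example, not a description of how this page tokenizes text.

```python
import re


def coleman_liau_index(text: str) -> float:
    """CLI = 0.0588 * L - 0.296 * S - 15.8 (L, S are per 100 words)."""
    # Assumption: words are runs of word characters; sentences end at . ! ?
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words:
        raise ValueError("text must contain at least one word")
    letters = sum(ch.isalpha() for ch in text)
    letters_per_100 = 100 * letters / len(words)
    sentences_per_100 = 100 * len(sentences) / len(words)
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


sample = "Men ingenting hjälpte. Intet svar."
print(round(coleman_liau_index(sample), 2))
```

Because both indexes rely only on character, word and sentence counts, neither needs a syllable counter.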

Other indexes exist, such as the Flesch Reading Ease, the Flesch-Kincaid Grade Level and the Gunning fog index, but we have chosen not to include them. Unlike the ARI and CLI, they are based on a syllable count and therefore arguably only work for English and not for Swedish.

We also compute a compound lexical diversity index, which should range from 1 to 100 (with an average of around 50 and a standard deviation of around 10); it is 67 in the present case. The compound index is the average of the following indexes (also provided in the table above):

the Measure of Textual Lexical Diversity (MTLD) index: a measure based on computing the TTR over a growing stretch of the text until the TTR drops below a threshold (around 0.7 in our case), at which point the TTR is reset and a counter is increased; at the end, the number of words in the text is divided by the counter; as a result, the MTLD does not vary significantly with text length;
the Yule's I index (the inverse of Yule's K characteristic): an index based on the work of the statistician G. U. Yule, who published his characteristic in "The Statistical Study of Literary Vocabulary"; Yule's I takes into account the number of words in the text and a compound summed measure of word frequency;
the Hypergeometric Distribution D (HD-D) index (based on vocd): an index that assesses the contribution of each word to the diversity of the text; a hypergeometric distribution is used to compute the probability of each word appearing in a fixed-size sample of words drawn from the text; these probabilities are then divided by the sample size and added up.
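The three component indexes can be sketched as below. This is an illustration under stated assumptions rather than the exact code behind this page: the MTLD threshold defaults to the commonly used 0.72 and averages a forward and a backward pass, Yule's I is taken as N² / (Σf² − N) with N the total word count and f each word's frequency, and HD-D uses the conventional sample size of 42. How each raw value is normalized to the 1 to 100 scale shown in the table is not covered here.

```python
import math
from collections import Counter


def _mtld_pass(tokens, threshold):
    """One directional pass: count resets of the running TTR."""
    factors, seen, count = 0.0, set(), 0
    for token in tokens:
        seen.add(token)
        count += 1
        if len(seen) / count < threshold:
            factors += 1            # running TTR dropped below the threshold
            seen, count = set(), 0  # reset the running TTR
    if count:  # credit the unfinished final stretch proportionally
        factors += (1 - len(seen) / count) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")


def mtld(tokens, threshold=0.72):
    """MTLD: words per factor, averaged over forward and backward passes."""
    tokens = list(tokens)
    return (_mtld_pass(tokens, threshold)
            + _mtld_pass(tokens[::-1], threshold)) / 2


def yules_i(tokens):
    """Yule's I (assumed form): N^2 / (sum of f^2 - N)."""
    freqs = list(Counter(tokens).values())
    m1 = sum(freqs)                 # total number of words
    m2 = sum(f * f for f in freqs)  # summed squared frequencies
    return m1 * m1 / (m2 - m1) if m2 > m1 else float("inf")


def hdd(tokens, sample_size=42):
    """HD-D: summed chance of each word appearing in a random sample,
    divided by the sample size."""
    tokens = list(tokens)
    n = len(tokens)
    if n < sample_size:
        raise ValueError("text is shorter than the sample size")
    total = 0.0
    for f in Counter(tokens).values():
        # Hypergeometric probability that the word is absent from the sample
        p_absent = math.comb(n - f, sample_size) / math.comb(n, sample_size)
        total += (1 - p_absent) / sample_size
    return total


words = "det var en gång en man som var så glad".split()
print(mtld(words), yules_i(words), hdd(words, sample_size=5))
```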

Our overall measure of grammatical difficulty combines the compound lexical diversity index (which includes the MTLD, Yule's I and HD-D indexes) with the ARI and CLI, each normalized and given a certain weight. The score should normally range from 1 to 100. In this case, the score is 47.

Other Information about Under kampen för bröd by J. L. Saxon

We provide a sample of the text below. The full text of Under kampen för bröd is also available free of charge on our website.

Sample of text:

» Hon talade bevekande till honom som aldrig förr. Men ingenting hjälpte. Vid midnattstid väcktes huset av att Andersson höll ett förfärligt liv utanför sin dörr. De grövsta eder växlade med de förfärligaste hotelser, om hustrun icke genast öppnade. Någon av grannarne sade: »Om fru Andersson öppnar, skola vi svara för, att ingenting ont skall hända henne.» Intet svar. »Hon har förmodligen gått ut», sade en annan av grannarne. Ingen ville hysa Andersson under natten, och så anstäldes en rad försök med olika nycklar för att få upp dörren. ...

Top most frequently used words in Under kampen för bröd by J. L. Saxon*

Position	Word	Repetitions	Part of all words
1	och	2,386	2.55%
2	att	2,212	2.37%
3	det	1,855	1.99%
4	en	1,258	1.35%
5	han	1,240	1.33%
6	som	1,224	1.31%
7	jag	1,174	1.26%
8	på	1,120	1.2%
9	så	1,089	1.17%
10	för	1,043	1.12%
11	var	982	1.05%
12	inte	887	0.95%
13	med	881	0.94%
14	är	873	0.93%
15	till	832	0.89%
16	den	699	0.75%
17	av	691	0.74%
18	sig	686	0.73%
19	hade	684	0.73%
20	de	614	0.66%
21	om	596	0.64%
22	du	586	0.63%
23	hon	568	0.61%
24	ett	514	0.55%
25	mig	506	0.54%
26	har	441	0.47%
27	Men	385	0.41%
28	skulle	383	0.41%
29	honom	352	0.38%
30	då	350	0.37%
31	där	333	0.36%
32	man	292	0.31%
33	än	289	0.31%
34	ej	281	0.3%
35	ut	277	0.3%
36	ha	272	0.29%
37	få	271	0.29%
38	henne	266	0.28%
39	dig	266	0.28%
40	skall	266	0.28%
41	något	254	0.27%
42	sade	249	0.27%
43	Erik	244	0.26%
44	sin	241	0.26%
45	nu	238	0.25%
46	icke	227	0.24%
47	kan	224	0.24%
48	vara	218	0.23%
49	hans	215	0.23%
50	upp	212	0.23%
51	ni	211	0.23%
52	vad	209	0.22%
53	kom	207	0.22%
54	ju	205	0.22%
55	kunde	200	0.21%
56	min	199	0.21%
57	över	191	0.2%
58	också	187	0.2%
59	mycket	184	0.2%
60	Ja	176	0.19%
61	när	175	0.19%
62	vi	173	0.19%
63	bli	170	0.18%
64	såg	169	0.18%
65	väl	166	0.18%
66	efter	161	0.17%
67	alt	160	0.17%
68	här	160	0.17%
69	vid	157	0.17%
70	fick	157	0.17%
71	åt	155	0.17%
72	från	154	0.16%
73	dem	151	0.16%
74	gick	149	0.16%
75	in	147	0.16%
76	varit	143	0.15%
77	gå	142	0.15%
78	utan	140	0.15%
79	Per	140	0.15%
80	er	140	0.15%
81	alla	138	0.15%
82	någon	136	0.15%
83	aldrig	133	0.14%
84	nog	131	0.14%
85	göra	131	0.14%
86	BRÖD	126	0.13%
87	ville	125	0.13%
88	hennes	125	0.13%
89	eller	125	0.13%
90	får	125	0.13%
91	sedan	122	0.13%
92	KAMPEN	122	0.13%
93	själv	121	0.13%
94	vill	119	0.13%
95	UNDER	116	0.12%
96	år	115	0.12%
97	detta	114	0.12%
98	se	113	0.12%
99	svarade	110	0.12%
100	sitt	108	0.12%
101	fått	108	0.12%
102	Nej	107	0.11%
103	mer	105	0.11%
104	komma	102	0.11%
105	andra	102	0.11%
106	säga	101	0.11%
107	ingen	100	0.11%
108	annat	96	0.1%
109	staden	93	0.1%
110	sina	92	0.1%
111	blott	92	0.1%
112	mot	92	0.1%
113	hos	91	0.1%
114	blev	91	0.1%
115	kunna	91	0.1%
116	gång	90	0.1%
117	voro	90	0.1%
118	alltid	88	0.09%
119	stod	87	0.09%
120	hem	86	0.09%
121	oss	86	0.09%
122	fram	86	0.09%
123	bara	85	0.09%
124	visste	84	0.09%
125	äro	84	0.09%
126	huru	84	0.09%
127	mitt	83	0.09%
128	ta	80	0.09%
129	vet	80	0.09%
130	blir	77	0.08%
131	denna	77	0.08%
132	hela	77	0.08%
133	blivit	76	0.08%
134	alldeles	76	0.08%
135	genom	75	0.08%
136	dag	75	0.08%
137	far	75	0.08%
138	frågade	74	0.08%
139	går	74	0.08%
140	tänkte	74	0.08%
141	din	73	0.08%
142	vart	73	0.08%
143	Ty	72	0.08%
144	många	69	0.07%
145	ingenting	69	0.07%
146	bra	68	0.07%
147	SAXON	68	0.07%
148	bättre	68	0.07%
149	gjorde	66	0.07%
150	gården	66	0.07%
151	gått	66	0.07%
152	därför	66	0.07%
153	tog	64	0.07%
154	några	61	0.07%
155	gjort	60	0.06%
156	tyckte	60	0.06%
157	kommer	59	0.06%
158	måste	59	0.06%
159	ur	59	0.06%
160	annan	59	0.06%
161	snart	58	0.06%
162	satt	58	0.06%
163	ögon	57	0.06%
164	Jo	57	0.06%
165	mor	57	0.06%
166	SAX	57	0.06%
167	kommit	56	0.06%
168	AT	56	0.06%
169	dock	56	0.06%
170	liv	56	0.06%
171	igen	55	0.06%
172	tala	55	0.06%
173	gott	54	0.06%

* This list excludes punctuation and single-letter words, as well as some different-case repeats of the same word.
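A frequency list of this kind can be produced with a short script. The sketch below makes its own assumptions: words are runs of word characters, single-letter words are left out of the ranking, case is kept as it appears in the text, and the percentage is taken over all words in the text. The helper name `top_words` is hypothetical.

```python
import re
from collections import Counter


def top_words(text: str, n: int = 10):
    """Return (word, repetitions, percent of all words) for the n most frequent words."""
    all_words = re.findall(r"\w+", text)
    total = len(all_words)
    # Single-letter words are excluded from the ranking but still counted in the total.
    counts = Counter(w for w in all_words if len(w) > 1)
    return [(word, reps, round(100 * reps / total, 2))
            for word, reps in counts.most_common(n)]


for word, reps, share in top_words("och att och det och att", n=2):
    print(f"{word}\t{reps}\t{share}%")
```

The same division gives the figures in the table above: for example, 2,386 repetitions of "och" out of 93,409 words is about 2.55%.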

If you think the text would be accessible to you, you can read it on our site.

Other resources and languages

If you like this analysis, you should have a look at our lists of Swedish short stories and Swedish books.

If you like literature as a means of learning languages, please take a look at our project Interlinear Books. We even have a Swedish Interlinear book available for purchase.