"- **ASL** = Average Sentence Length = (number of words) / (number of sentences)\n",
"- **ASW** = Average Syllables per Word = (number of syllables) / (number of words)\n",
"\n",
"📊 **Interpretation**:\n",
"- 90–100: Very easy\n",
"- 80–90: Easy\n",
"- 70–80: Fairly easy\n",
"- 60–70: Standard\n",
"- 50–60: Fairly difficult\n",
"- 30–50: Difficult\n",
"- < 30: Very difficult\n",
"\n",
"---\n",
"\n",
"#### 2. 🟨 **LIX Index**\n",
"\n",
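"Originally developed for Swedish by Carl-Hugo Björnsson, LIX counts characters rather than syllables, so it transfers readily to French and other European languages. It combines average sentence length with the percentage of long words.\n",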
"\n",
"$\\text{LIX} = \\frac{\\text{number of words}}{\\text{number of sentences}} + \\frac{100 \\times \\text{number of long words (≥7 chars)}}{\\text{number of words}}$\n",
"\n",
"📊 **Interpretation**:\n",
"- $<$ 30: Easy\n",
"- 30–40: Medium\n",
"- 40–50: Fairly difficult\n",
"- $>$ 50: Difficult\n",
"\n",
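"As a worked example, assume a hypothetical passage (illustrative numbers only) with 120 words, 8 sentences, and 24 long words:\n",
"\n",
"$\\text{LIX} = \\frac{120}{8} + \\frac{100 \\times 24}{120} = 15 + 20 = 35$\n",
"\n",
"A score of 35 falls in the 30–40 band, so this passage would be rated medium.\n",
"\n",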
"---\n",
"\n",
"#### 3. 🟥 **Kandel–Moles Index**\n",
"\n",
"A linear formula proposed for French readability:\n",
"\n",
"$\\text{Kandel–Moles} = 0.1935 \\times \\text{number of words} + 0.1672 \\times \\text{number of syllables} - 1.779$\n",
"\n",
"📊 **Interpretation**:\n",
"- Higher values indicate more complex texts.\n",
"\n",
"---\n",
"\n",
"These formulas help estimate how easily a French reader can understand a given passage. The metrics can be used to analyze textbooks, articles, instructional materials, etc."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "b9052dc2-ce45-4af4-a0a0-46c60a13da12",
"metadata": {},
"outputs": [],
"source": [
"# Rewriting the readability metric functions here, without relying on downloading external resources\n",
"\n",
"import re\n",
"\n",
"# Naive sentence splitter: split on runs of '.', '!' or '?' and drop empty fragments\n",
"def naive_sentence_tokenize(text):\n",
"    return [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]\n",
"\n",
"# Naive word tokenizer (splits on whitespace and punctuation)\n",