What are Derivations?
Derivation is the process of generating a string from a grammar by repeatedly applying production rules.
Simple Analogy
Think of derivation like cooking from a recipe:
- Recipe: Grammar rules
- Ingredients: Variables and terminals
- Cooking process: Apply rules step by step
- Final dish: The string generated
Each step replaces a variable with the right side of a rule!
Types of Derivations
1. Leftmost Derivation
Leftmost Derivation: Always replace the leftmost variable in each step
Notation: S ⇒ₗₘ α
Example
Grammar:
S → AB
A → aA | ε
B → bB | ε
Derive: “aabb”
Leftmost derivation:
S ⇒ₗₘ AB (apply S → AB)
⇒ₗₘ aAB (expand A, leftmost variable)
⇒ₗₘ aaAB (expand A again)
⇒ₗₘ aaB (apply A → ε)
⇒ₗₘ aabB (expand B)
⇒ₗₘ aabbB (expand B)
⇒ₗₘ aabb (apply B → ε)
Each step expands the leftmost variable!
2. Rightmost Derivation
Rightmost Derivation: Always replace the rightmost variable in each step
Notation: S ⇒ᵣₘ α
Also called: Canonical derivation
Example
Same grammar, derive: “aabb”
Rightmost derivation:
S ⇒ᵣₘ AB (apply S → AB)
⇒ᵣₘ AbB (expand B, rightmost variable)
⇒ᵣₘ AbbB (expand B again)
⇒ᵣₘ Abb (apply B → ε)
⇒ᵣₘ aAbb (expand A)
⇒ᵣₘ aaAbb (expand A)
⇒ᵣₘ aabb (apply A → ε)
Each step expands the rightmost variable!
3. Arbitrary Derivation
Can expand any variable at any step
Not restricted to left or right
Example: Same grammar
S ⇒ AB
⇒ aAB (expand A - left)
⇒ aAbB (expand B - right)
⇒ aaAbB (expand A - middle left)
⇒ aabB (expand A - middle left)
⇒ aabbB (expand B - right)
⇒ aabb (expand B - right)
Key Observations
Important:
- Different derivation orders can produce the same string
- Leftmost and rightmost derivations are canonical (unique order)
- Arbitrary derivations are flexible but not standard
Parse Trees (Derivation Trees)
Parse Tree is a graphical representation of a derivation.
Structure
- Root: Start symbol S
- Internal nodes: Variables (non-terminals)
- Leaves: Terminals (or ε)
- Edges: Applications of production rules
How to Construct
Given derivation: Build tree step by step
Example: S → AB, A → aA | ε, B → b
Derive: “aab”
Derivation:
S ⇒ AB ⇒ aAB ⇒ aaAB ⇒ aaB ⇒ aab
Parse Tree:
S
/ \
A B
/| |
a A b
/|
a A
|
ε
Reading leaves left-to-right gives: a, a, ε, b → “aab” ✓
Properties of Parse Trees
1. Unique representation: One parse tree can have multiple derivations (leftmost, rightmost, arbitrary)
2. Yield: Reading leaves left-to-right gives the derived string
3. Structure: Shows hierarchical relationship of grammar rules
4. Ambiguity: Multiple parse trees for same string means ambiguous grammar
Example: Arithmetic Expression
Grammar:
E → E + E | E * E | (E) | id
Derive: id + id * id
Parse Tree 1 (+ at root):
E
/||\
E + E
| /|\
id E * E
| |
id id
Meaning: (id + id) * id ❌ Wrong!
Parse Tree 2 (* at root):
E
/|\
E + E
| /|\
id E * E
| |
id id
Actually, let me fix the tree structure…
Parse Tree 1 (+ at root, then *):
E
/|\
E + E
| /|\
id E * E
| |
id id
Meaning: id + (id * id) ✓ Correct!
Parse Tree 2 (* at root, then +):
E
/|\
E * E
/|\ |
E + E id
| |
id id
Meaning: (id + id) * id ❌ Wrong precedence!
Problem: Grammar is ambiguous - multiple parse trees possible!
Ambiguity
Definition
A grammar G is ambiguous if there exists a string w ∈ L(G) that has:
- Two or more distinct leftmost derivations, OR
- Two or more distinct rightmost derivations, OR
- Two or more distinct parse trees
All three definitions are equivalent!
Why Ambiguity is a Problem
In compilers:
- Ambiguity leads to multiple meanings
- Same expression could evaluate differently
- Unpredictable behavior
Example: id + id * id
- Interpretation 1: (id + id) _ id = (2 + 3) _ 4 = 20
- Interpretation 2: id + (id _ id) = 2 + (3 _ 4) = 14
Different results! 🚨
Detecting Ambiguity
No algorithm exists to determine if arbitrary grammar is ambiguous!
Ambiguity is undecidable for CFGs
Must check manually:
- Find a string
- Try to construct multiple parse trees
- If successful → ambiguous
Examples of Ambiguous Grammars
Example 1: Arithmetic Expressions
Ambiguous Grammar:
E → E + E | E * E | id
String: id + id * id
Parse Tree 1: (id + id) _ id Parse Tree 2: id + (id _ id)
Ambiguous! ✗
Example 2: If-Then-Else
Ambiguous Grammar:
S → if E then S
| if E then S else S
| other
String: if E1 then if E2 then S1 else S2
Parse Tree 1: else matches second if
if E1 then (if E2 then S1 else S2)
Parse Tree 2: else matches first if
if E1 then (if E2 then S1) else S2
Dangling else problem! ✗
Example 3: String Concatenation
Ambiguous Grammar:
S → SS | a
String: aaa
Parse Tree 1:
S
/ \
S S
| / \
a S S
| |
a a
Parse Tree 2:
S
/ \
S S
/ \ |
S S a
| |
a a
Ambiguous! ✗
Removing Ambiguity
Goal: Rewrite grammar to be unambiguous
Note: Not all ambiguous grammars can be rewritten! Some languages are inherently ambiguous.
Strategy 1: Introduce Precedence
Problem: E → E + E | E * E | id (ambiguous)
Solution: Use multiple variables for precedence levels
Unambiguous Grammar:
E → E + T | T (addition, lowest precedence)
T → T * F | F (multiplication, higher precedence)
F → (E) | id (parentheses, highest precedence)
Why it works:
- Must build * first (deeper in tree)
- Then + at higher level
- Parentheses can override
Parse Tree for id + id * id:
E
/|\
E + T
| /|\
T T * F
| | |
F F id
| |
id id
Meaning: id + (id * id) ✓ Correct!
Only one parse tree possible! ✓
Strategy 2: Enforce Associativity
Left Associative (+ and - evaluate left-to-right):
E → E + T | T
T → T * F | F
F → (E) | id
Right Associative (e.g., exponentiation):
E → T ^ E | T
T → (E) | id
Example: 2 ^ 3 ^ 4
- Left associative: (2 ^ 3) ^ 4 = 8 ^ 4 = 4096
- Right associative: 2 ^ (3 ^ 4) = 2 ^ 81 = huge!
Exponentiation is naturally right associative ✓
Strategy 3: If-Then-Else (Dangling Else)
Ambiguous:
S → if E then S
| if E then S else S
| other
Solution 1: Match else with nearest if
S → if E then S else S | U
U → if E then S | if E then U else U | other
Solution 2: Use explicit delimiters
S → if E then S fi
| if E then S else S fi
| other
Solution 3: Require begin-end blocks (like Pascal)
S → if E then begin S end
| if E then begin S end else begin S end
| other
Strategy 4: Make Operations Non-Associative
Problem: S → SS | a (ambiguous)
Solution: Force left or right association
Left-associative:
S → SA | A
A → a
Right-associative:
S → AS | A
A → a
Inherently Ambiguous Languages
Definition: A language is inherently ambiguous if every grammar for it is ambiguous.
Example: Classic Inherently Ambiguous Language
L = {aⁿbⁿcᵐdᵐ | n, m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ | n, m ≥ 1}
Meaning:
- Either n a’s match n b’s AND m c’s match m d’s
- OR n a’s match n d’s AND m b’s match m c’s
When n = m: Ambiguous!
Example: a²b²c²d²
- Could be: n=2, m=2 from first pattern (a²b² c²d²)
- Could be: n=2, m=2 from second pattern (a² b²c² d²)
Every grammar for this language must be ambiguous!
Comparing Derivations and Parse Trees
One Parse Tree, Multiple Derivations
Key Insight: Same parse tree can correspond to different derivation orders
Example:
Parse Tree:
S
/ \
A B
| |
a b
Leftmost derivation:
S ⇒ₗₘ AB ⇒ₗₘ aB ⇒ₗₘ ab
Rightmost derivation:
S ⇒ᵣₘ AB ⇒ᵣₘ Ab ⇒ᵣₘ ab
Same tree, different order!
Unambiguous Grammar Properties
For unambiguous grammar and string w:
- Exactly one parse tree
- Exactly one leftmost derivation
- Exactly one rightmost derivation
- All three correspond to each other
For ambiguous grammar and string w:
- Multiple parse trees possible
- Multiple leftmost derivations possible
- Multiple rightmost derivations possible
Practical Examples
Example 1: Programming Language Statement
Grammar:
Stmt → if (Expr) Stmt
| if (Expr) Stmt else Stmt
| Assign
| Block
Block → { StmtList }
StmtList → Stmt | Stmt StmtList
Assign → id = Expr ;
This is ambiguous! (dangling else)
Example 2: Expression Grammar (Unambiguous)
Grammar:
Expr → Expr + Term | Expr - Term | Term
Term → Term * Factor | Term / Factor | Factor
Factor → ( Expr ) | id | number
Features:
- ✓ Precedence: * / before + -
- ✓ Left associative: 1+2+3 = (1+2)+3
- ✓ Unambiguous
Example 3: JSON Structure
Grammar:
Value → Object | Array | String | Number | true | false | null
Object → { } | { Members }
Members → Pair | Pair , Members
Pair → String : Value
Array → [ ] | [ Elements ]
Elements → Value | Value , Elements
Unambiguous! ✓
Applications
1. Compiler Parsing
Parse tree is used for:
- Syntax checking
- Semantic analysis
- Code generation
Must be unambiguous for correct compilation!
2. Natural Language Processing
Parse trees show sentence structure:
Sentence → NP VP
NP → Det Noun
VP → Verb NP
"The cat chased the mouse"
Sentence
/ \
NP VP
/ \ / \
Det Noun Verb NP
| | | / \
The cat chased Det Noun
| |
the mouse
3. Mathematical Expressions
Parse tree determines evaluation order:
2 + 3 * 4
+
/ \
2 *
/ \
3 4
Evaluate: 3*4 = 12, then 2+12 = 14 ✓
4. XML/HTML Parsing
Nested structure requires parse trees:
<html>
<body>
<h1>Title</h1>
<p>Text</p>
</body>
</html>
Parse tree validates nesting!
Important Exam Points
- ✓ Leftmost derivation: Always expand leftmost variable
- ✓ Rightmost derivation: Always expand rightmost variable
- ✓ Parse tree: Graphical representation of derivation
- ✓ Yield: Leaves of parse tree (left-to-right) give derived string
- ✓ Ambiguous grammar: Multiple parse trees for same string
- ✓ Ambiguity undecidable: No algorithm to detect in general
- ✓ Remove ambiguity: Use precedence, associativity, restructuring
- ✓ Inherently ambiguous: Some languages have no unambiguous grammar
- ✓ Applications: Compilers, parsers, syntax analysis
- ✓ One tree, many derivations: Same parse tree, different orders
Common Mistakes to Avoid
✗ Mistake 1: Thinking ambiguity is always fixable
✓ Correct: Some languages are inherently ambiguous
✗ Mistake 2: Confusing derivation order with parse tree
✓ Correct: Same parse tree can have multiple derivation orders
✗ Mistake 3: Not expanding leftmost/rightmost variable
✓ Correct: Follow the rule strictly in leftmost/rightmost derivation
✗ Mistake 4: Assuming all grammars are unambiguous
✓ Correct: Many practical grammars are ambiguous (need fixing)
✗ Mistake 5: Reading parse tree leaves randomly
✓ Correct: Always read left-to-right for yield
✗ Mistake 6: Thinking precedence and associativity are same
✓ Correct: Precedence (which first), associativity (which direction)
Practice Questions
Q1: What’s the difference between leftmost and rightmost derivation?
Answer: Leftmost expands leftmost variable first, rightmost expands rightmost variable first.
Q2: Can one parse tree have multiple derivations?
Answer: Yes! Same parse tree can have leftmost, rightmost, and arbitrary derivations.
Q3: How to detect ambiguity?
Answer: Find a string with multiple parse trees (or multiple leftmost/rightmost derivations).
Q4: Is E → E + E | E * E | id ambiguous?
Answer: Yes! String “id + id * id” has multiple parse trees.
Q5: What’s the dangling else problem?
Answer: Ambiguity in “if E1 then if E2 then S1 else S2” - which if does else match?
Q6: Can all ambiguous grammars be rewritten?
Answer: No! Some languages are inherently ambiguous.
Q7: What’s the yield of a parse tree?
Answer: The string obtained by reading leaf nodes left-to-right.
Q8: Why is ambiguity bad for compilers?
Answer: Multiple parse trees = multiple meanings = unpredictable behavior.
Summary
- Derivations: Process of generating strings using grammar rules
- Leftmost/Rightmost: Order of variable expansion (canonical forms)
- Parse trees: Graphical representation showing structure
- Yield: Reading leaves gives derived string
- Ambiguity: Multiple parse trees for same string
- Problem: Undecidable in general, causes unpredictable behavior
- Solutions: Introduce precedence, enforce associativity, restructure grammar
- Inherent ambiguity: Some languages have no unambiguous grammar
- Applications: Compilers (syntax analysis), NLP, expression evaluation
- Key insight: Parse tree structure determines meaning
Understanding derivations, parse trees, and ambiguity is crucial for parser design and compiler construction!