What is Greibach Normal Form?
Greibach Normal Form (GNF) is another restricted form of context-free grammar where every production starts with a terminal followed by zero or more variables.
Formal Definition
A CFG is in Greibach Normal Form if every production is of the form:
A → aα
where:
- a is a terminal
- α is a (possibly empty) string of variables
Exception: S → ε is allowed if S doesn’t appear on any right side
Simple Analogy
Think of GNF like sentences starting with action words:
- CNF: “Noun Noun” or “Word” (binary structure)
- GNF: “Action [details]” (terminal first, then variables)
- Every derivation step produces one terminal immediately
Example: “eat [food] [quickly]” - action first!
GNF vs CNF Comparison
| Feature | CNF | GNF |
|---|---|---|
| Form | A → BC or A → a | A → aα |
| First symbol | Variable or terminal | Always terminal |
| Terminals | At end of derivation | One per step |
| Left recursion | Allowed | Not allowed |
| Parse tree | Binary | n-ary |
| Parsing | Bottom-up (CYK) | Top-down |
| String length n | 2n-1 steps | n steps |
GNF Examples
Valid GNF Productions
S → a ✓ (terminal only)
S → aAB ✓ (terminal + variables)
A → bBCD ✓ (terminal + multiple variables)
B → c ✓ (single terminal)
NOT Valid GNF
S → AB ✗ (no terminal at start)
S → a ✓ (this is OK)
S → ε ✗ (epsilon - unless S is start and not on right)
A → Aa ✗ (terminal not first)
B → abc ✗ (multiple terminals)
Example Grammar in GNF
Language: {aⁿbⁿ | n ≥ 1}
S → aSB | aB
B → b
Check GNF:
- S → aSB: starts with ‘a’ ✓
- S → aB: starts with ‘a’ ✓
- B → b: terminal only ✓
All in GNF! ✓
Derivation for “aabb”:
S ⇒ aSB
⇒ aaBB (S → aB)
⇒ aabB (first B → b)
⇒ aabb (second B → b)
Notice: Each step produces exactly one terminal!
Why GNF?
Advantages
- One terminal per step: Predictable derivation length
- Top-down parsing: Natural for recursive descent parsers
- No left recursion: Eliminates infinite loops in top-down parsing
- PDA construction: Easy to convert to pushdown automaton
- Derivation length = string length: For string of length n, exactly n steps
Applications
- Top-down parsers: Recursive descent
- PDA construction: Direct conversion from GNF to PDA
- Theoretical analysis: Simplifies some proofs
- Parser generators: Some tools use GNF internally
Key Property: No Left Recursion
Left recursion: A ⇒⁺ Aα (variable derives itself on left)
Example of Left Recursion
S → Sa | b
Problem: S → Sa → Saa → Saaa → … (infinite loop!)
Why GNF Prevents It
In GNF: Every production starts with terminal
A → aα (a is terminal)
So: A ⇒ aα (immediately produces terminal, can’t be A…)
Result: Left recursion impossible! ✓
Conversion to GNF
Converting any CFG to GNF is complex!
Multiple steps required:
Overview of Conversion Algorithm
1. Convert to CNF (Chomsky Normal Form)
2. Order variables: A₁, A₂, ..., Aₙ
3. Eliminate left recursion
4. Substitute variables to get terminals first
5. Clean up productions
Note: This is significantly more complex than CNF conversion!
Step 1: Convert to CNF
First, get grammar in Chomsky Normal Form:
- A → BC or A → a
Why?: Simplifies subsequent steps
Step 2: Order Variables
Assign order to variables: A₁, A₂, …, Aₙ
Example:
- A₁ = S
- A₂ = A
- A₃ = B
Step 3: Eliminate Left Recursion
Goal: Ensure Aᵢ → Aⱼα only if j > i
Method: For each pair i, j where j ≤ i:
- Replace Aᵢ → Aⱼα with Aᵢ → β₁α | β₂α | …
- where Aⱼ → β₁ | β₂ | …
Example: Eliminating Left Recursion
Grammar:
A₁ → A₁a | A₂b
A₂ → c
Problem: A₁ → A₁a (left recursive)
Solution: Use new variable A₁’
A₁ → A₂bA₁' | A₂b
A₁' → aA₁' | a
A₂ → c
Now: No left recursion! ✓
Step 4: Substitute to Get Terminals First
Goal: Make productions start with terminals
Method: Substitute variables on right side
Example
After eliminating left recursion:
A₁ → A₂bA₁' | A₂b
A₁' → aA₁' | a
A₂ → c
Substitute A₂ in A₁:
A₁ → cbA₁' | cb
A₁' → aA₁' | a
A₂ → c
Now in GNF! ✓
Detailed Example 1: Simple Conversion
Original Grammar:
S → AB
A → a
B → b
Step 1: Already in CNF ✓
Step 2: Order: S = A₁, A = A₂, B = A₃
Step 3: No left recursion ✓
Step 4: Substitute
For S → AB:
- Substitute A → a: S → aB
- But B is variable, not terminal!
- Can’t leave as is in GNF
Need: S → a followed by variables only
Actually, S → aB IS valid GNF! (a is terminal, B is variable)
So substitute A:
S → aB
For B → b: Already GNF ✓
Final GNF:
S → aB
A → a
B → b
Detailed Example 2: Eliminating Left Recursion
Original Grammar:
S → Sa | A
A → b
Problem: S → Sa (left recursion!)
Step 1: Eliminate left recursion
Use technique: A → Aα | β becomes A → βA’ and A’ → αA’ | α
Here: S → Sa | A
- α = a
- β = A
Transform:
S → AS' (S → β S')
S' → aS' | a (S' → α S' | α)
A → b
Step 2: Substitute A in S
S → bS'
S' → aS' | a
A → b
Final GNF:
S → bS'
S' → aS' | a
A → b
All productions start with terminal! ✓
Detailed Example 3: More Complex
Original Grammar (in CNF):
S → AB | a
A → SA | b
B → c
Order: S = A₁, A = A₂, B = A₃
Problem: A → SA (A₂ → A₁…)
- j = 1 < 2 = i (violates ordering)
Step 1: Fix A → SA
Substitute S:
- S → AB | a
- So A → SA becomes A → ABA | aA
New:
S → AB | a
A → ABA | aA | b
B → c
Still problem: A → ABA (left recursive!)
Step 2: Eliminate A → ABA | aA | b
Use: A → Aα | β becomes A → βA’ and A’ → αA’ | α
- α₁ = BA, α₂ = (none from aA)
- β₁ = aA, β₂ = b
Wait, this is tricky! Let me reconsider:
A → ABA | aA | b
Left-recursive: ABA Non-left-recursive: aA, b
Transform:
A → aAA' | bA' | aA | b
A' → BAA' | BA
Hmm, getting complex! Let me simplify:
Actually:
A → aAA' | bA'
A' → BAA' | BA
Now substitute B → c:
A → aAA' | bA'
A' → cAA' | cA
And S → AB: Need to substitute A:
S → aAA'B | bA'B | a
Wait, this is getting messy! The full GNF conversion is quite involved.
Simplified Final Form (concept):
S → a... (productions starting with a)
A → a... or b...
A' → c...
B → c
Systematic GNF Conversion Algorithm
For practical conversion, use this systematic approach:
Algorithm
Input: CFG G
Output: Equivalent G' in GNF
1. Convert G to CNF
2. Order variables A₁, A₂, ..., Aₙ
3. For i = 1 to n:
a. For j = 1 to i-1:
Replace Aᵢ → Aⱼα with Aᵢ → β₁α | β₂α | ...
where Aⱼ → β₁ | β₂ | ... (and each βₖ starts with Aₖ where k > j)
b. Eliminate direct left recursion for Aᵢ
4. For i = n down to 1:
Substitute variables to ensure productions start with terminals
5. Clean up and simplify
Note: This is complex! Most exam questions use simpler grammars.
Properties of GNF
1. Derivation Length
For string of length n: Exactly n steps in derivation
Why?: Each step produces exactly one terminal
Example: “aabb” (length 4) → 4 derivation steps
2. No Left Recursion
Guaranteed: No A ⇒⁺ Aα
Why?: Every production starts with terminal
3. Parse Tree Structure
Not binary (unlike CNF)
Each node can have variable number of children
4. Unique Terminal per Step
Each derivation step produces exactly one new terminal
Predictable: Step i produces the i-th terminal
GNF and Pushdown Automata
Important connection: Easy to convert GNF grammar to PDA!
Conversion: GNF → PDA
For grammar in GNF:
A → aα
Create PDA transition:
δ(q, a, A) = (q, α)
Meaning:
- Read input symbol ‘a’
- Pop variable A from stack
- Push string α onto stack
Example: {aⁿbⁿ}
Grammar in GNF:
S → aSB | aB
B → b
PDA transitions:
δ(q, a, S) = {(q, SB), (q, B)}
δ(q, b, B) = {(q, ε)}
Start configuration: (q, w, S) where w is input
Accept: When stack empty and input consumed
Advantages and Disadvantages
Advantages ✓
- Predictable derivation: Length = string length
- No left recursion: Safe for top-down parsing
- PDA conversion: Direct and easy
- One terminal per step: Clear structure
- Top-down parsing: Natural fit
Disadvantages ✗
- Complex conversion: Much harder than CNF
- Larger grammar: More productions than CNF
- Less readable: Unnatural structure
- Not unique: Multiple GNF forms possible
- Harder to understand: More abstract than original
GNF vs CNF: When to Use?
Use CNF when:
- Need bottom-up parsing (CYK)
- Want uniform binary structure
- Easier conversion acceptable
- Theoretical analysis (pumping lemma)
Use GNF when:
- Building top-down parsers
- Converting to PDA
- Need to eliminate left recursion
- Want predictable derivation length
In Practice:
- CNF more common (easier to work with)
- GNF mostly theoretical (conversion complex)
- Both important for understanding CFLs
Example: Complete Conversion
Original Grammar: {aⁿbⁿ | n ≥ 1}
S → aSb | ab
Step 1: Convert to CNF
S → AC | UV
C → SB
A → a
B → b
U → a
V → b
Step 2: Order variables S = A₁, C = A₂, A = A₃, B = A₄, U = A₅, V = A₆
Step 3-4: Eliminate left recursion and substitute
After complex transformations:
S → aSB | aB
B → b
Wait, we’re back to simple form!
Actually, this simple form IS already GNF! ✓
The conversion can sometimes simplify things!
Important Exam Points
- ✓ GNF form: A → aα (terminal first, then variables)
- ✓ Exception: S → ε if S is start and not on right
- ✓ Derivation length: n steps for string of length n
- ✓ No left recursion: Guaranteed by GNF structure
- ✓ One terminal per step: Predictable generation
- ✓ PDA conversion: Direct from GNF productions
- ✓ Parse tree: n-ary, not binary (unlike CNF)
- ✓ Top-down parsing: Natural fit for recursive descent
- ✓ Conversion complex: Much harder than CNF conversion
- ✓ Every CFL: Has equivalent GNF (except ε handling)
Common Mistakes to Avoid
✗ Mistake 1: Allowing A → BC (no terminal first)
✓ Correct: Must have A → aBC or similar
✗ Mistake 2: Multiple terminals: A → abc
✓ Correct: Only one terminal: A → aBC
✗ Mistake 3: Terminal not first: A → Aa
✓ Correct: Terminal must be first: A → aA
✗ Mistake 4: Forgetting to eliminate left recursion
✓ Correct: Use systematic algorithm
✗ Mistake 5: Confusing with CNF (A → BC)
✓ Correct: GNF is A → aα (terminal first)
✗ Mistake 6: Allowing empty right side
✓ Correct: Need at least one terminal (except S → ε)
Practice Questions
Q1: Is S → aBC in GNF?
Answer: Yes! Starts with terminal ‘a’, followed by variables B, C.
Q2: Is A → BC in GNF?
Answer: No, doesn’t start with terminal.
Q3: How many steps to derive “aaabbb” in GNF?
Answer: 6 steps (one per character).
Q4: Can GNF have left recursion?
Answer: No, impossible because every production starts with terminal.
Q5: Convert S → AB, A → a, B → b to GNF
Answer: S → aB, A → a, B → b
Q6: What’s easier to convert: CNF or GNF?
Answer: CNF is much easier; GNF conversion is complex.
Q7: Why is GNF useful for PDA?
Answer: Direct conversion: A → aα becomes δ(q, a, A) = (q, α).
Q8: Derivation length for string “abc” in GNF?
Answer: 3 steps (one per terminal).
Summary
- Greibach Normal Form: Productions of form A → aα
- Structure: Terminal first, followed by zero or more variables
- Exception: S → ε allowed if S not on right side
- Key property: No left recursion possible
- Derivation length: Exactly n steps for string length n
- Applications: Top-down parsing, PDA conversion
- Advantages: Predictable derivation, no left recursion, PDA connection
- Disadvantages: Complex conversion, larger grammar
- vs CNF: GNF for top-down, CNF for bottom-up
- Every CFL: Has equivalent GNF grammar
- Conversion: Complex multi-step process
- Key insight: Trade simplicity for parsing properties
GNF is particularly important for understanding the connection between CFGs and pushdown automata!