Greibach Normal Form

What is Greibach Normal Form?

Greibach Normal Form (GNF) is another restricted form of context-free grammar where every production starts with a terminal followed by zero or more variables.

Formal Definition

A CFG is in Greibach Normal Form if every production is of the form:

A → aα

where:

  • a is a terminal
  • α is a (possibly empty) string of variables

Exception: S → ε is allowed if S doesn’t appear on any right side

Simple Analogy

Think of GNF like sentences starting with action words:

  • CNF: “Noun Noun” or “Word” (binary structure)
  • GNF: “Action [details]” (terminal first, then variables)
  • Every derivation step produces one terminal immediately

Example: “eat [food] [quickly]” - action first!

GNF vs CNF Comparison

FeatureCNFGNF
FormA → BC or A → aA → aα
First symbolVariable or terminalAlways terminal
TerminalsAt end of derivationOne per step
Left recursionAllowedNot allowed
Parse treeBinaryn-ary
ParsingBottom-up (CYK)Top-down
String length n2n-1 stepsn steps

GNF Examples

Valid GNF Productions

S → a         ✓ (terminal only)
S → aAB       ✓ (terminal + variables)
A → bBCD      ✓ (terminal + multiple variables)
B → c         ✓ (single terminal)

NOT Valid GNF

S → AB        ✗ (no terminal at start)
S → a         ✓ (this is OK)
S → ε         ✗ (epsilon - unless S is start and not on right)
A → Aa        ✗ (terminal not first)
B → abc       ✗ (multiple terminals)

Example Grammar in GNF

Language: {aⁿbⁿ | n ≥ 1}

S → aSB | aB
B → b

Check GNF:

  • S → aSB: starts with ‘a’ ✓
  • S → aB: starts with ‘a’ ✓
  • B → b: terminal only ✓

All in GNF!

Derivation for “aabb”:

S ⇒ aSB
  ⇒ aaBB     (S → aB)
  ⇒ aabB     (first B → b)
  ⇒ aabb     (second B → b)

Notice: Each step produces exactly one terminal!

Why GNF?

Advantages

  1. One terminal per step: Predictable derivation length
  2. Top-down parsing: Natural for recursive descent parsers
  3. No left recursion: Eliminates infinite loops in top-down parsing
  4. PDA construction: Easy to convert to pushdown automaton
  5. Derivation length = string length: For string of length n, exactly n steps

Applications

  • Top-down parsers: Recursive descent
  • PDA construction: Direct conversion from GNF to PDA
  • Theoretical analysis: Simplifies some proofs
  • Parser generators: Some tools use GNF internally

Key Property: No Left Recursion

Left recursion: A ⇒⁺ Aα (variable derives itself on left)

Example of Left Recursion

S → Sa | b

Problem: S → Sa → Saa → Saaa → … (infinite loop!)

Why GNF Prevents It

In GNF: Every production starts with terminal

A → aα (a is terminal)

So: A ⇒ aα (immediately produces terminal, can’t be A…)

Result: Left recursion impossible! ✓

Conversion to GNF

Converting any CFG to GNF is complex!

Multiple steps required:

Overview of Conversion Algorithm

1. Convert to CNF (Chomsky Normal Form)
2. Order variables: A₁, A₂, ..., Aₙ
3. Eliminate left recursion
4. Substitute variables to get terminals first
5. Clean up productions

Note: This is significantly more complex than CNF conversion!

Step 1: Convert to CNF

First, get grammar in Chomsky Normal Form:

  • A → BC or A → a

Why?: Simplifies subsequent steps

Step 2: Order Variables

Assign order to variables: A₁, A₂, …, Aₙ

Example:

  • A₁ = S
  • A₂ = A
  • A₃ = B

Step 3: Eliminate Left Recursion

Goal: Ensure Aᵢ → Aⱼα only if j > i

Method: For each pair i, j where j ≤ i:

  • Replace Aᵢ → Aⱼα with Aᵢ → β₁α | β₂α | …
  • where Aⱼ → β₁ | β₂ | …

Example: Eliminating Left Recursion

Grammar:

A₁ → A₁a | A₂b
A₂ → c

Problem: A₁ → A₁a (left recursive)

Solution: Use new variable A₁’

A₁ → A₂bA₁' | A₂b
A₁' → aA₁' | a
A₂ → c

Now: No left recursion! ✓

Step 4: Substitute to Get Terminals First

Goal: Make productions start with terminals

Method: Substitute variables on right side

Example

After eliminating left recursion:

A₁ → A₂bA₁' | A₂b
A₁' → aA₁' | a
A₂ → c

Substitute A₂ in A₁:

A₁ → cbA₁' | cb
A₁' → aA₁' | a
A₂ → c

Now in GNF!

Detailed Example 1: Simple Conversion

Original Grammar:

S → AB
A → a
B → b

Step 1: Already in CNF ✓

Step 2: Order: S = A₁, A = A₂, B = A₃

Step 3: No left recursion ✓

Step 4: Substitute

For S → AB:

  • Substitute A → a: S → aB
  • But B is variable, not terminal!
  • Can’t leave as is in GNF

Need: S → a followed by variables only

Actually, S → aB IS valid GNF! (a is terminal, B is variable)

So substitute A:

S → aB

For B → b: Already GNF ✓

Final GNF:

S → aB
A → a
B → b

Detailed Example 2: Eliminating Left Recursion

Original Grammar:

S → Sa | A
A → b

Problem: S → Sa (left recursion!)

Step 1: Eliminate left recursion

Use technique: A → Aα | β becomes A → βA’ and A’ → αA’ | α

Here: S → Sa | A

  • α = a
  • β = A

Transform:

S → AS'       (S → β S')
S' → aS' | a  (S' → α S' | α)
A → b

Step 2: Substitute A in S

S → bS'
S' → aS' | a
A → b

Final GNF:

S → bS'
S' → aS' | a
A → b

All productions start with terminal!

Detailed Example 3: More Complex

Original Grammar (in CNF):

S → AB | a
A → SA | b
B → c

Order: S = A₁, A = A₂, B = A₃

Problem: A → SA (A₂ → A₁…)

  • j = 1 < 2 = i (violates ordering)

Step 1: Fix A → SA

Substitute S:

  • S → AB | a
  • So A → SA becomes A → ABA | aA

New:

S → AB | a
A → ABA | aA | b
B → c

Still problem: A → ABA (left recursive!)

Step 2: Eliminate A → ABA | aA | b

Use: A → Aα | β becomes A → βA’ and A’ → αA’ | α

  • α₁ = BA, α₂ = (none from aA)
  • β₁ = aA, β₂ = b

Wait, this is tricky! Let me reconsider:

A → ABA | aA | b

Left-recursive: ABA Non-left-recursive: aA, b

Transform:

A → aAA' | bA' | aA | b
A' → BAA' | BA

Hmm, getting complex! Let me simplify:

Actually:

A → aAA' | bA'
A' → BAA' | BA

Now substitute B → c:

A → aAA' | bA'
A' → cAA' | cA

And S → AB: Need to substitute A:

S → aAA'B | bA'B | a

Wait, this is getting messy! The full GNF conversion is quite involved.

Simplified Final Form (concept):

S → a... (productions starting with a)
A → a... or b...
A' → c...
B → c

Systematic GNF Conversion Algorithm

For practical conversion, use this systematic approach:

Algorithm

Input: CFG G
Output: Equivalent G' in GNF

1. Convert G to CNF
2. Order variables A₁, A₂, ..., Aₙ
3. For i = 1 to n:
   a. For j = 1 to i-1:
      Replace Aᵢ → Aⱼα with Aᵢ → β₁α | β₂α | ...
      where Aⱼ → β₁ | β₂ | ... (and each βₖ starts with Aₖ where k > j)
   b. Eliminate direct left recursion for Aᵢ
4. For i = n down to 1:
   Substitute variables to ensure productions start with terminals
5. Clean up and simplify

Note: This is complex! Most exam questions use simpler grammars.

Properties of GNF

1. Derivation Length

For string of length n: Exactly n steps in derivation

Why?: Each step produces exactly one terminal

Example: “aabb” (length 4) → 4 derivation steps

2. No Left Recursion

Guaranteed: No A ⇒⁺ Aα

Why?: Every production starts with terminal

3. Parse Tree Structure

Not binary (unlike CNF)

Each node can have variable number of children

4. Unique Terminal per Step

Each derivation step produces exactly one new terminal

Predictable: Step i produces the i-th terminal

GNF and Pushdown Automata

Important connection: Easy to convert GNF grammar to PDA!

Conversion: GNF → PDA

For grammar in GNF:

A → aα

Create PDA transition:

δ(q, a, A) = (q, α)

Meaning:

  • Read input symbol ‘a’
  • Pop variable A from stack
  • Push string α onto stack

Example: {aⁿbⁿ}

Grammar in GNF:

S → aSB | aB
B → b

PDA transitions:

δ(q, a, S) = {(q, SB), (q, B)}
δ(q, b, B) = {(q, ε)}

Start configuration: (q, w, S) where w is input

Accept: When stack empty and input consumed

Advantages and Disadvantages

Advantages ✓

  1. Predictable derivation: Length = string length
  2. No left recursion: Safe for top-down parsing
  3. PDA conversion: Direct and easy
  4. One terminal per step: Clear structure
  5. Top-down parsing: Natural fit

Disadvantages ✗

  1. Complex conversion: Much harder than CNF
  2. Larger grammar: More productions than CNF
  3. Less readable: Unnatural structure
  4. Not unique: Multiple GNF forms possible
  5. Harder to understand: More abstract than original

GNF vs CNF: When to Use?

Use CNF when:

  • Need bottom-up parsing (CYK)
  • Want uniform binary structure
  • Easier conversion acceptable
  • Theoretical analysis (pumping lemma)

Use GNF when:

  • Building top-down parsers
  • Converting to PDA
  • Need to eliminate left recursion
  • Want predictable derivation length

In Practice:

  • CNF more common (easier to work with)
  • GNF mostly theoretical (conversion complex)
  • Both important for understanding CFLs

Example: Complete Conversion

Original Grammar: {aⁿbⁿ | n ≥ 1}

S → aSb | ab

Step 1: Convert to CNF

S → AC | UV
C → SB
A → a
B → b
U → a
V → b

Step 2: Order variables S = A₁, C = A₂, A = A₃, B = A₄, U = A₅, V = A₆

Step 3-4: Eliminate left recursion and substitute

After complex transformations:

S → aSB | aB
B → b

Wait, we’re back to simple form!

Actually, this simple form IS already GNF! ✓

The conversion can sometimes simplify things!

Important Exam Points

  1. GNF form: A → aα (terminal first, then variables)
  2. Exception: S → ε if S is start and not on right
  3. Derivation length: n steps for string of length n
  4. No left recursion: Guaranteed by GNF structure
  5. One terminal per step: Predictable generation
  6. PDA conversion: Direct from GNF productions
  7. Parse tree: n-ary, not binary (unlike CNF)
  8. Top-down parsing: Natural fit for recursive descent
  9. Conversion complex: Much harder than CNF conversion
  10. Every CFL: Has equivalent GNF (except ε handling)

Common Mistakes to Avoid

Mistake 1: Allowing A → BC (no terminal first)
Correct: Must have A → aBC or similar

Mistake 2: Multiple terminals: A → abc
Correct: Only one terminal: A → aBC

Mistake 3: Terminal not first: A → Aa
Correct: Terminal must be first: A → aA

Mistake 4: Forgetting to eliminate left recursion
Correct: Use systematic algorithm

Mistake 5: Confusing with CNF (A → BC)
Correct: GNF is A → aα (terminal first)

Mistake 6: Allowing empty right side
Correct: Need at least one terminal (except S → ε)

Practice Questions

Q1: Is S → aBC in GNF?

Answer: Yes! Starts with terminal ‘a’, followed by variables B, C.

Q2: Is A → BC in GNF?

Answer: No, doesn’t start with terminal.

Q3: How many steps to derive “aaabbb” in GNF?

Answer: 6 steps (one per character).

Q4: Can GNF have left recursion?

Answer: No, impossible because every production starts with terminal.

Q5: Convert S → AB, A → a, B → b to GNF

Answer: S → aB, A → a, B → b

Q6: What’s easier to convert: CNF or GNF?

Answer: CNF is much easier; GNF conversion is complex.

Q7: Why is GNF useful for PDA?

Answer: Direct conversion: A → aα becomes δ(q, a, A) = (q, α).

Q8: Derivation length for string “abc” in GNF?

Answer: 3 steps (one per terminal).

Summary

  • Greibach Normal Form: Productions of form A → aα
  • Structure: Terminal first, followed by zero or more variables
  • Exception: S → ε allowed if S not on right side
  • Key property: No left recursion possible
  • Derivation length: Exactly n steps for string length n
  • Applications: Top-down parsing, PDA conversion
  • Advantages: Predictable derivation, no left recursion, PDA connection
  • Disadvantages: Complex conversion, larger grammar
  • vs CNF: GNF for top-down, CNF for bottom-up
  • Every CFL: Has equivalent GNF grammar
  • Conversion: Complex multi-step process
  • Key insight: Trade simplicity for parsing properties

GNF is particularly important for understanding the connection between CFGs and pushdown automata!