Properties of Context-Free Languages

Overview of CFL Properties

Context-Free Languages have important properties:

  1. Closure Properties: Operations that preserve CFLs
  2. Decision Properties: Problems that are decidable/undecidable
  3. Structural Properties: Pumping lemma, normal forms

Simple Analogy

Think of CFLs like a family business:

  • Closure: What happens when family members marry (union, concatenation)
  • Decision: Questions we can answer about the family (membership, emptiness)
  • Structural: Common traits all members share (pumping lemma)

Closure Properties Overview

CFL is closed under: ✓

  • Union
  • Concatenation
  • Kleene star
  • Reversal
  • Homomorphism
  • Inverse homomorphism

CFL is NOT closed under: ✗

  • Intersection
  • Complement
  • Difference

Special case: ✓

  • Intersection with regular language

Decision Properties Overview

Decidable (we can determine): ✓

  • Membership: Is w ∈ L(G)?
  • Emptiness: Is L(G) = ∅?
  • Finiteness: Is L(G) finite?

Undecidable (we cannot determine): ✗

  • Equivalence: Is L(G₁) = L(G₂)?
  • Universality: Is L(G) = Σ*?
  • Disjointness: Is L(G₁) ∩ L(G₂) = ∅?
  • Containment: Is L(G₁) ⊆ L(G₂)?
  • Ambiguity: Is G ambiguous?
  • Regularity: Is L(G) regular?

Membership Problem

Problem: Given CFG G and string w, is w ∈ L(G)?

Answer: YES (decidable!)

CYK Algorithm

Requirements: Grammar in CNF (Chomsky Normal Form)

Time complexity: O(n³|G|) where n = |w|

Method: Dynamic programming

How CYK Works (Brief)

1. Convert G to CNF
2. Build table T[i,j] = variables that derive substring of length j starting at position i
3. Fill table bottom-up:
   - Base case: T[i,1] = {A | A → wᵢ}
   - Recursive: T[i,j] = {A | A → BC, B ∈ T[i,k], C ∈ T[i+k,j-k] for some k}
4. Accept if S ∈ T[1,n]

Example: Checking “aabb” ∈ {aⁿbⁿ}

Grammar in CNF:

S → AB | AC
A → a
B → SB | b
C → b

CYK table construction (simplified):

Length 1: T[1,1] = {A}, T[2,1] = {A}, T[3,1] = {C}, T[4,1] = {C}
Length 2: T[1,2] = {...}, T[2,2] = {...}, T[3,2] = {...}
...
Length 4: T[1,4] = {S}  ✓ (S can derive whole string!)

Result: “aabb” ∈ L(G) ✓

Decidability

Theorem: Membership problem for CFLs is decidable in O(n³) time

Proof: CYK algorithm

Emptiness Problem

Problem: Given CFG G, is L(G) = ∅?

Answer: YES (decidable!)

Algorithm

1. Find all variables that can derive terminal strings:
   - Mark all terminals as "productive"
   - Repeat: If A → α and all symbols in α are productive, mark A productive
2. Check if start symbol S is productive
3. If S productive, L(G) ≠ ∅
   If S not productive, L(G) = ∅

Example

Grammar 1:

S → AB
A → a
B → bB

Analysis:

  • A productive (A → a)
  • B not productive (infinite recursion, no terminal base)
  • S not productive (needs B)
  • L(G) = ∅

Grammar 2:

S → AB
A → a
B → b

Analysis:

  • A productive (A → a)
  • B productive (B → b)
  • S productive (S → AB with both productive)
  • L(G) ≠ ∅

Complexity

Time: O(n²) where n = |G|

Space: O(n)

Theorem: Emptiness problem for CFLs is decidable in polynomial time

Finiteness Problem

Problem: Given CFG G, is L(G) finite?

Answer: YES (decidable!)

Algorithm

1. Check if L(G) = ∅ (if so, finite!)
2. Check for cycles in derivation graph:
   - If A ⇒⁺ αAβ where α, β generate terminals
   - Then can derive infinite strings
3. If cycles exist, L(G) infinite
   If no cycles, L(G) finite

Alternative: Pumping Length

1. Convert G to CNF
2. Let p = 2|V| (pumping length)
3. Check if any string of length ≥ p is in L(G)
4. If yes, L(G) infinite (by pumping lemma)
   If no, L(G) finite

Example

Grammar 1:

S → AB
A → a
B → b

Analysis:

  • No cycles (A, B can’t derive themselves)
  • L(G) = {ab} (finite!) ✓

Grammar 2:

S → aSb | ab

Analysis:

  • S ⇒ aSb ⇒ aaSbb ⇒ … (cycle!)
  • Can generate arbitrarily long strings
  • L(G) = {aⁿbⁿ | n ≥ 1} (infinite!) ✗

Theorem: Finiteness problem for CFLs is decidable

Equivalence Problem (Undecidable!)

Problem: Given CFGs G₁ and G₂, is L(G₁) = L(G₂)?

Answer: NO (undecidable!) ✗

Why Undecidable?

Proof sketch (by reduction from Post Correspondence Problem):

  1. Can encode PCP instance as two CFGs
  2. PCP has solution ⟺ L(G₁) = L(G₂)
  3. Since PCP undecidable, equivalence undecidable

Implications

Cannot automatically:

  • Verify two grammars generate same language
  • Optimize grammars by checking equivalence
  • Prove grammar correctness automatically

Must use: Manual proofs, testing, approximations

Special Cases

Decidable special cases:

  • If one grammar is regular (decidable!)
  • If both grammars are regular (decidable!)

Example: Can decide if L(G) = L(R) for CFG G and regex R

Universality Problem (Undecidable!)

Problem: Given CFG G, is L(G) = Σ*?

Answer: NO (undecidable!) ✗

Why Undecidable?

Proof: Reduction from complement problem

  • If universality decidable, can decide L(G) = Σ*
  • Equivalent to deciding if complement L̄(G) = ∅
  • But complement of CFL may not be CFL (not closed!)

Implications

Cannot determine:

  • If grammar accepts all strings
  • If grammar is “complete”
  • If language is unrestricted

Intersection and Complement (Not Closed!)

Key insight: CFLs are NOT closed under intersection or complement

Intersection: NOT Closed

Counterexample:

L₁ = {aⁿbⁿcᵐ | n, m ≥ 1}  (CFL)
L₂ = {aᵐbⁿcⁿ | n, m ≥ 1}  (CFL)

L₁ ∩ L₂ = {aⁿbⁿcⁿ | n ≥ 1}  (NOT CFL!)

Proof that {aⁿbⁿcⁿ} is not CFL: Use pumping lemma

Complement: NOT Closed

Proof: If CFLs closed under complement, then:

  • L₁ is CFL → L̄₁ is CFL
  • L₂ is CFL → L̄₂ is CFL
  • L̄₁ ∪ L̄₂ is CFL (closed under union)
  • L₁ ∩ L₂ = L̄₁ ∪ L̄₂ is CFL
  • But we know intersection can be non-CFL! ✗

Contradiction! So CFLs not closed under complement.

Special Case: Intersection with Regular

Theorem: If L is CFL and R is regular, then L ∩ R is CFL

Proof: Construct PDA for L ∩ R

  • Simulate PDA for L and DFA for R in parallel
  • Accept when both accept

Applications:

  • Can filter CFL with regular patterns
  • Useful for parsing with constraints

Other Decision Problems

Disjointness (Undecidable)

Problem: Given G₁, G₂, is L(G₁) ∩ L(G₂) = ∅?

Result: Undecidable ✗

Why: Would allow deciding equivalence

Containment (Undecidable)

Problem: Given G₁, G₂, is L(G₁) ⊆ L(G₂)?

Result: Undecidable ✗

Proof: Equivalence is special case (L₁ = L₂ iff L₁ ⊆ L₂ and L₂ ⊆ L₁)

Ambiguity (Undecidable)

Problem: Given CFG G, is G ambiguous?

Result: Undecidable ✗

Inherently ambiguous: Some languages have no unambiguous grammar

Example: {aⁿbⁿcᵐdᵐ} ∪ {aⁿbᵐcᵐdⁿ} is inherently ambiguous

Regularity (Undecidable)

Problem: Given CFG G, is L(G) regular?

Result: Undecidable ✗

Would be useful: Could optimize to DFA if possible

Summary Table: Decision Problems

ProblemInputQuestionDecidable?
MembershipG, ww ∈ L(G)?✓ Yes (O(n³))
EmptinessGL(G) = ∅?✓ Yes (O(n²))
FinitenessGL(G) finite?✓ Yes
EquivalenceG₁, G₂L(G₁) = L(G₂)?✗ No
UniversalityGL(G) = Σ*?✗ No
DisjointnessG₁, G₂L(G₁) ∩ L(G₂) = ∅?✗ No
ContainmentG₁, G₂L(G₁) ⊆ L(G₂)?✗ No
AmbiguityGG ambiguous?✗ No
RegularityGL(G) regular?✗ No

Summary Table: Closure Properties

OperationRegularCFLResult
Union✓ Closed
Concatenation✓ Closed
Kleene Star✓ Closed
Intersection✗ Not closed
Complement✗ Not closed
Difference✗ Not closed
Reversal✓ Closed
Homomorphism✓ Closed
Inverse Hom.✓ Closed
Inter. w/ Reg✓ Closed

Practical Implications

For Compiler Design

Can do ✓:

  • Check if string belongs to language (membership)
  • Verify language is non-empty (emptiness)
  • Parse in O(n³) time (CYK)
  • Combine languages (union, concat, star)

Cannot do ✗:

  • Automatically prove grammar correct (equivalence)
  • Verify grammar accepts all valid programs (universality)
  • Automatically detect ambiguity

For Language Theory

Understand:

  • CFLs more powerful than regular
  • But still have limitations (not closed under intersection)
  • Some problems fundamentally unsolvable

For Practical Parsing

Use:

  • CYK for general CFGs (O(n³))
  • Specialized algorithms for subclasses (LL, LR)
  • Testing instead of formal verification

Important Exam Points

  1. Membership: Decidable in O(n³) using CYK
  2. Emptiness: Decidable by checking productive variables
  3. Finiteness: Decidable by detecting cycles
  4. Equivalence: Undecidable (cannot compare two CFGs)
  5. Universality: Undecidable (cannot check L = Σ*)
  6. Closed under: Union, concat, star, reversal, homomorphism
  7. NOT closed under: Intersection, complement, difference
  8. Special: Closed under intersection with regular
  9. Ambiguity: Undecidable to detect
  10. Regularity: Undecidable if CFL is actually regular

Common Mistakes to Avoid

Mistake 1: Thinking CFLs closed under intersection
Correct: NOT closed! {aⁿbⁿcᵐ} ∩ {aᵐbⁿcⁿ} = {aⁿbⁿcⁿ} not CFL

Mistake 2: Thinking equivalence is decidable
Correct: Undecidable! Cannot automatically compare grammars

Mistake 3: Confusing membership with emptiness
Correct: Membership checks if specific w ∈ L; emptiness checks if any w ∈ L

Mistake 4: Thinking ambiguity is decidable
Correct: Undecidable! Some languages inherently ambiguous

Mistake 5: Forgetting special case of intersection with regular
Correct: CFL ∩ Regular = CFL (this IS closed!)

Mistake 6: Thinking finiteness is undecidable
Correct: Decidable! Check for cycles or use pumping

Practice Questions

Q1: Is membership problem decidable for CFLs?

Answer: Yes, decidable in O(n³) using CYK algorithm.

Q2: Can we determine if L(G₁) = L(G₂)?

Answer: No, equivalence problem is undecidable.

Q3: Are CFLs closed under intersection?

Answer: No! {aⁿbⁿcᵐ} ∩ {aᵐbⁿcⁿ} = {aⁿbⁿcⁿ} is not CFL.

Q4: How to check if L(G) = ∅?

Answer: Check if start symbol is productive (can derive terminals).

Q5: Is CFL ∩ Regular always a CFL?

Answer: Yes! This special case IS closed.

Q6: Can we detect if grammar is ambiguous?

Answer: No, ambiguity detection is undecidable.

Q7: Time complexity of CYK algorithm?

Answer: O(n³|G|) where n = string length, |G| = grammar size.

Q8: Are CFLs closed under union?

Answer: Yes! Use S → S₁ | S₂.

Summary

  • Decidable: Membership (O(n³)), emptiness, finiteness
  • Undecidable: Equivalence, universality, disjointness, containment, ambiguity, regularity
  • Closed under: Union, concatenation, star, reversal, homomorphism, intersection with regular
  • NOT closed under: Intersection, complement, difference
  • CYK algorithm: O(n³) for membership in CNF
  • Emptiness: Check productive variables
  • Finiteness: Detect cycles in derivations
  • Equivalence: Undecidable (major limitation)
  • Intersection: NOT closed ({aⁿbⁿcⁿ} counterexample)
  • Special case: CFL ∩ Regular = CFL
  • Practical: Can parse, but can’t verify equivalence
  • Key insight: CFLs powerful but have fundamental limitations

Understanding CFL properties is crucial for language theory and practical parsing!