Pumping Lemma for Context-Free Languages

What is the Pumping Lemma?

Pumping Lemma is a powerful tool to prove that certain languages are NOT context-free.

Simple Analogy

Think of pumping lemma like a weakness detector:

  • Regular languages: Can’t count unboundedly (a^nb^n fails)
  • Context-free languages: Can’t count TWO things independently (a^nb^nc^n fails)
  • Pumping lemma: Exposes this weakness!

What It Cannot Do

❌ Cannot prove language IS context-free
✓ Can only prove language is NOT context-free

To prove CFL: Construct grammar or PDA

The Statement

Pumping Lemma for CFLs: If L is a context-free language, then there exists a pumping length p such that:

For any string w ∈ L where |w| ≥ p, we can write w = uvxyz where:

  1. |vxy| ≤ p (middle part not too long)
  2. |vy| ≥ 1 (at least one of v, y is non-empty)
  3. uv^ixy^iz ∈ L for all i ≥ 0 (can pump both v and y)

Five Pieces

w = u v x y z
    ↑ ↑ ↑ ↑ ↑
    │ │ │ │ └─ Suffix (unchanged)
    │ │ │ └─── Second pumping part
    │ │ └───── Middle (unchanged)
    │ └─────── First pumping part
    └───────── Prefix (unchanged)

Pump v and y together: uv²xy²z, uv³xy³z, …, uv^ixy^iz

Key Differences: Regular vs CFL

FeatureRegular Pumping LemmaCFL Pumping Lemma
String splitw = xyzw = uvxyz
Pumpyv and y (both!)
Length constraint|xy| ≤ p|vxy| ≤ p
Non-empty|y| ≥ 1|vy| ≥ 1
What failsa^nb^na^nb^nc^n

Why It Works (Intuition)

From CNF grammar:

Parse tree for long string:
       S
      / \
     A   B
    / \   \
   A   C   ...
  / \
 ...

If string long enough: Some variable repeats on path from root to leaf

Repeated variable: Creates subtree that can be repeated

Pumping: Repeat that subtree 0, 1, 2, … times

Example Visualization

Original tree:           Pumped once more:
     A                        A
    / \                      / \
   v   A         →          v   A
      / \                      / \
     x   y                    v   A
                                 / \
                                x   y

Generates: vxy           Generates: vvxyy = v²xy²

Using the Pumping Lemma

Standard proof strategy (prove L is NOT CFL):

1. Assume L is CFL (for contradiction)
2. Let p be the pumping length
3. Choose a specific string w ∈ L with |w| ≥ p
4. Consider ALL possible ways to split w = uvxyz
5. For EACH split, find i such that uv^ixy^iz ∉ L
6. Contradiction! Therefore L is not CFL

Key: Must consider ALL possible splits, not just one!

Example 1: {a^nb^nc^n | n ≥ 1}

Claim: L = {a^nb^nc^n | n ≥ 1} is NOT a CFL

Proof:

Step 1: Assume L is CFL

Step 2: Let p be the pumping length

Step 3: Choose w = a^pb^pc^p ∈ L (clearly |w| = 3p ≥ p)

Step 4: By pumping lemma, w = uvxyz where:

  • |vxy| ≤ p
  • |vy| ≥ 1
  • uv^ixy^iz ∈ L for all i ≥ 0

Step 5: Key insight: Since |vxy| ≤ p, vxy contains at most 2 types of symbols

Case 1: vxy contains only a’s
Then v = a^k, y = a^m for some k, m with k+m ≥ 1
Pump i = 0: uxz = a^(p-k-m)b^pc^p
Fewer a’s than b’s and c’s → NOT in L ✗

Case 2: vxy contains only b’s
Similar: Pumping changes b count but not a, c counts ✗

Case 3: vxy contains only c’s
Similar: Pumping changes c count ✗

Case 4: vxy contains a’s and b’s
Then v and y contain some a’s and/or b’s
Pump i = 2: uv²xy²z has more a’s and b’s, but same c’s
Cannot maintain a^nb^nc^n pattern ✗

Case 5: vxy contains b’s and c’s
Similar to case 4 ✗

Case 6: vxy contains a’s and c’s
Impossible! Since |vxy| ≤ p and we have p b’s in the middle, vxy cannot span from a’s to c’s while including b’s.

Step 6: In all cases, pumping violates the constraint. Contradiction!

Conclusion: L is NOT a CFL ✗

Example 2: {ww | w ∈ {a,b}*}

Claim: L = {ww | w ∈ {a,b}*} is NOT a CFL

Proof:

Choose: w = a^pb^pa^pb^p ∈ L (this is ww where w = a^pb^p)

Split: w = uvxyz with |vxy| ≤ p

Key: vxy is in first half, or spans middle, or in second half

Case 1: vxy in first a^pb^p

  • Pump i = 2: uv²xy²z has more symbols in first part
  • First half ≠ second half ✗

Case 2: vxy spans the middle (between two halves)

  • |vxy| ≤ p, so vxy contains part of first b^p and part of second a^p
  • Pump i = 2: Changes the middle structure
  • No longer ww form ✗

Case 3: vxy in second a^pb^p

  • Pump i = 0: uxz has fewer symbols in second part
  • First half ≠ second half ✗

Conclusion: L is NOT a CFL ✗

Note: {ww^R} IS a CFL! The reversal makes it pumpable.

Example 3: {a^(n²) | n ≥ 0}

Claim: L = {a^(n²) | n ≥ 0} is NOT a CFL

Proof:

Choose: w = a^(p²) ∈ L

Split: w = uvxyz with |vxy| ≤ p, |vy| ≥ 1

So: 1 ≤ |vy| ≤ p

Pump i = 2: uv²xy²z = a^(p² + |vy|)

Check: Is p² + |vy| a perfect square?

Between: p² < p² + |vy| ≤ p² + p < p² + 2p + 1 = (p+1)²

So: p² < p² + |vy| < (p+1)²

Not a perfect square!

Conclusion: L is NOT a CFL ✗

Example 4: {a^(2^n) | n ≥ 0}

Claim: L = {a^(2^n) | n ≥ 0} = {a, aa, aaaa, aaaaaaaa, …} is NOT a CFL

Proof:

Choose: w = a^(2^p) ∈ L

Split: w = uvxyz with 1 ≤ |vy| ≤ p

Pump i = 2: |uv²xy²z| = 2^p + |vy|

Check: Is 2^p + |vy| a power of 2?

We have: 2^p < 2^p + |vy| ≤ 2^p + p < 2^p + 2^p = 2^(p+1)

For large p: p < 2^p, so 2^p + p < 2^(p+1)

Not a power of 2!

Conclusion: L is NOT a CFL ✗

Example 5: {a^ib^jc^k | 0 ≤ i ≤ j ≤ k}

Claim: L is NOT a CFL

Proof:

Choose: w = a^pb^pc^p ∈ L (i = j = k = p)

Split: w = uvxyz with |vxy| ≤ p, |vy| ≥ 1

Consider cases (similar to Example 1):

If vxy in a’s and b’s:
Pump i = 0: uxz = a^(p-|v|)b^(p-|y|)c^p
Now: i < p, j < p, k = p
But we need i ≤ j ≤ k
If |v| > 0 and |y| = 0, then i < j = k ✓ (this is OK)
If |v| = 0 and |y| > 0, then i = j < k, but also i < j? Need careful analysis…

Better choice: w = a^pb^(2p)c^(3p) ∈ L (i = p ≤ j = 2p ≤ k = 3p)

Now:

  • vxy contains at most 2 symbol types (since |vxy| ≤ p)
  • Pumping will increase some counts, but maintain others
  • Will violate i ≤ j ≤ k constraint ✗

Conclusion: L is NOT a CFL ✗

Example 6: {a^ib^jc^k | i = j or j = k}

This IS a CFL! (Union of two CFLs)

Grammar:

S → S₁ | S₂
S₁ → AB    (for i = j)
A → aAb | ε
B → cB | ε

S₂ → AB    (for j = k)
A → aA | ε
B → bBc | ε

Cannot use pumping lemma to prove it’s not CFL (because it IS!).

Example 7: {ww | w ∈ {a}*} = {a^(2n) | n ≥ 0}

This IS a CFL! (Even powers of a)

Grammar: S → aaSS | ε

Better: S → aSa | ε

This generates all even-length strings of a’s = {ww | w ∈ {a}*} ✓

Note: {ww | w ∈ {a,b}} is NOT CFL, but {ww | w ∈ {a}} IS!

Common Pitfalls

Pitfall 1: Wrong String Choice

Bad: Choose w = abc (too short!) ✓ Good: Choose w depending on p (like a^pb^pc^p)

Pitfall 2: Considering Only One Split

Bad: “Let v = a^k and y = b^m, then…” ✓ Good: “For ANY split into uvxyz…”

Must consider ALL possible splits!

Pitfall 3: Wrong Pumping Count

Bad: Pump i = 1 (that’s just original string!) ✓ Good: Try i = 0, 2, or other values

Pitfall 4: Not Using Constraints

Bad: Ignoring |vxy| ≤ p ✓ Good: Use |vxy| ≤ p to limit where v, y can be

Pitfall 5: Proving CFL with Pumping Lemma

Bad: “Pumping works, so it’s CFL” ✓ Good: Construct grammar or PDA to prove CFL

Pumping Length p

Where does p come from?

For CNF grammar: p = 2^|V|

  • |V| variables
  • Parse tree height ≥ log₂(n) for string length n
  • If n ≥ 2^|V|, some variable repeats on path

Don’t need to know exact p!

  • Just know it exists
  • Choose string based on arbitrary p

Strategy Summary

To prove L is NOT CFL:

1. Assume L is CFL
2. Let p be pumping length
3. Choose w ∈ L carefully:
   - Depends on p
   - Long enough (|w| ≥ p)
   - Shows structural weakness
4. Consider ALL splits w = uvxyz
5. Use constraints:
   - |vxy| ≤ p
   - |vy| ≥ 1
6. Find i where uv^ixy^iz ∉ L
7. Contradiction → L not CFL

Common string choices:

  • a^pb^pc^p (for three-way counting)
  • a^pb^pa^pb^p (for ww)
  • a^(p²) (for quadratic growth)
  • a^pb^(2p)c^(3p) (for inequalities)

Languages: CFL or Not?

LanguageCFL?Why?
{a^nb^n}✓ YesS → aSb | ε
{a^nb^nc^n}✗ NoPumping lemma
{ww^R}✓ YesS → aSa | bSb | ε
{ww}✗ NoPumping lemma
{a^nb^mc^md^n}✓ YesTwo independent counts
{a^nb^mc^nd^m}✗ NoCrossing dependencies
{a^(n²)}✗ NoQuadratic growth
{a^(2^n)}✗ NoExponential growth
{a^p | p prime}✗ NoPrimes (complex!)

Important Exam Points

  1. Statement: w = uvxyz, |vxy| ≤ p, |vy| ≥ 1, uv^ixy^iz ∈ L
  2. Purpose: Prove language NOT CFL (cannot prove it IS)
  3. Strategy: Choose w, consider ALL splits, find failing i
  4. Constraints: |vxy| ≤ p limits where v, y can be
  5. Both pump: Must pump both v and y together (not just one)
  6. Classic example: {a^nb^nc^n} not CFL
  7. Difference from regular: Regular pumps y, CFL pumps v and y
  8. String choice: Must depend on p, must be long
  9. All splits: Must consider every possible way to split
  10. Typical i values: Try i = 0 (shorten) or i = 2 (lengthen)

Common Mistakes to Avoid

Mistake 1: Choosing w independent of p
Correct: w must depend on p (like a^pb^pc^p)

Mistake 2: Considering only one split
Correct: Must handle ALL possible splits of w

Mistake 3: Pumping only v or only y
Correct: Must pump both v and y (uv^ixy^iz)

Mistake 4: Using i = 1 to find contradiction
Correct: i = 1 gives original string, try i = 0 or i = 2

Mistake 5: Trying to prove language IS CFL
Correct: Pumping lemma can only prove NOT CFL

Mistake 6: Ignoring |vxy| ≤ p constraint
Correct: Use this to limit where v, x, y can be

Practice Questions

Q1: Is {a^nb^nc^n} a CFL?

Answer: No, use pumping lemma with w = a^pb^pc^p.

Q2: What’s |vxy| ≤ p used for?

Answer: Limits where v, x, y can be (at most p consecutive symbols).

Q3: Can pumping lemma prove language IS CFL?

Answer: No! Can only prove NOT CFL. Use grammar/PDA to prove CFL.

Q4: Why does {ww} fail but {ww^R} succeed?

Answer: {ww} needs to remember and compare; {ww^R} uses stack symmetry.

Q5: Prove {a^(n²)} not CFL

Answer: Pump w = a^(p²) → get p² + |vy|, not perfect square.

Q6: String for pumping {a^nb^nc^n}?

Answer: w = a^pb^pc^p (all three equal).

Q7: Which i values typically work?

Answer: i = 0 (remove pumped parts) or i = 2 (double them).

Q8: Is {a^nb^(2n)} a CFL?

Answer: Yes! S → aSbb | ε.

Summary

  • Pumping lemma: Tool to prove languages NOT CFL
  • Statement: w = uvxyz, |vxy| ≤ p, |vy| ≥ 1, uv^ixy^iz ∈ L
  • Pump both: v and y together (not separately)
  • Strategy: Choose w, consider ALL splits, find failing i
  • Constraints: |vxy| ≤ p limits location of pumped parts
  • Classic example: {a^nb^nc^n} not CFL (two counts OK, three not)
  • vs Regular: Regular pumps y, CFL pumps v and y
  • Cannot prove CFL: Only proves NOT CFL
  • String choice: Must depend on p, must be long enough
  • Common i values: 0 (shorten), 2 (lengthen)
  • Applications: Showing limits of CFLs, understanding language hierarchy
  • Key insight: CFLs can’t count three independent values

Pumping lemma is essential for understanding the boundaries of context-free languages!