What is the Pumping Lemma?
Pumping Lemma is a powerful tool to prove that certain languages are NOT context-free.
Simple Analogy
Think of pumping lemma like a weakness detector:
- Regular languages: Can’t count unboundedly (a^nb^n fails)
- Context-free languages: Can’t count TWO things independently (a^nb^nc^n fails)
- Pumping lemma: Exposes this weakness!
What It Cannot Do
❌ Cannot prove language IS context-free
✓ Can only prove language is NOT context-free
To prove CFL: Construct grammar or PDA
The Statement
Pumping Lemma for CFLs: If L is a context-free language, then there exists a pumping length p such that:
For any string w ∈ L where |w| ≥ p, we can write w = uvxyz where:
- |vxy| ≤ p (middle part not too long)
- |vy| ≥ 1 (at least one of v, y is non-empty)
- uv^ixy^iz ∈ L for all i ≥ 0 (can pump both v and y)
Five Pieces
w = u v x y z
↑ ↑ ↑ ↑ ↑
│ │ │ │ └─ Suffix (unchanged)
│ │ │ └─── Second pumping part
│ │ └───── Middle (unchanged)
│ └─────── First pumping part
└───────── Prefix (unchanged)
Pump v and y together: uv²xy²z, uv³xy³z, …, uv^ixy^iz
Key Differences: Regular vs CFL
| Feature | Regular Pumping Lemma | CFL Pumping Lemma |
|---|---|---|
| String split | w = xyz | w = uvxyz |
| Pump | y | v and y (both!) |
| Length constraint | |xy| ≤ p | |vxy| ≤ p |
| Non-empty | |y| ≥ 1 | |vy| ≥ 1 |
| What fails | a^nb^n | a^nb^nc^n |
Why It Works (Intuition)
From CNF grammar:
Parse tree for long string:
S
/ \
A B
/ \ \
A C ...
/ \
...
If string long enough: Some variable repeats on path from root to leaf
Repeated variable: Creates subtree that can be repeated
Pumping: Repeat that subtree 0, 1, 2, … times
Example Visualization
Original tree: Pumped once more:
A A
/ \ / \
v A → v A
/ \ / \
x y v A
/ \
x y
Generates: vxy Generates: vvxyy = v²xy²
Using the Pumping Lemma
Standard proof strategy (prove L is NOT CFL):
1. Assume L is CFL (for contradiction)
2. Let p be the pumping length
3. Choose a specific string w ∈ L with |w| ≥ p
4. Consider ALL possible ways to split w = uvxyz
5. For EACH split, find i such that uv^ixy^iz ∉ L
6. Contradiction! Therefore L is not CFL
Key: Must consider ALL possible splits, not just one!
Example 1: {a^nb^nc^n | n ≥ 1}
Claim: L = {a^nb^nc^n | n ≥ 1} is NOT a CFL
Proof:
Step 1: Assume L is CFL
Step 2: Let p be the pumping length
Step 3: Choose w = a^pb^pc^p ∈ L (clearly |w| = 3p ≥ p)
Step 4: By pumping lemma, w = uvxyz where:
- |vxy| ≤ p
- |vy| ≥ 1
- uv^ixy^iz ∈ L for all i ≥ 0
Step 5: Key insight: Since |vxy| ≤ p, vxy contains at most 2 types of symbols
Case 1: vxy contains only a’s
Then v = a^k, y = a^m for some k, m with k+m ≥ 1
Pump i = 0: uxz = a^(p-k-m)b^pc^p
Fewer a’s than b’s and c’s → NOT in L ✗
Case 2: vxy contains only b’s
Similar: Pumping changes b count but not a, c counts ✗
Case 3: vxy contains only c’s
Similar: Pumping changes c count ✗
Case 4: vxy contains a’s and b’s
Then v and y contain some a’s and/or b’s
Pump i = 2: uv²xy²z has more a’s and b’s, but same c’s
Cannot maintain a^nb^nc^n pattern ✗
Case 5: vxy contains b’s and c’s
Similar to case 4 ✗
Case 6: vxy contains a’s and c’s
Impossible! Since |vxy| ≤ p and we have p b’s in the middle, vxy cannot span from a’s to c’s while including b’s.
Step 6: In all cases, pumping violates the constraint. Contradiction!
Conclusion: L is NOT a CFL ✗
Example 2: {ww | w ∈ {a,b}*}
Claim: L = {ww | w ∈ {a,b}*} is NOT a CFL
Proof:
Choose: w = a^pb^pa^pb^p ∈ L (this is ww where w = a^pb^p)
Split: w = uvxyz with |vxy| ≤ p
Key: vxy is in first half, or spans middle, or in second half
Case 1: vxy in first a^pb^p
- Pump i = 2: uv²xy²z has more symbols in first part
- First half ≠ second half ✗
Case 2: vxy spans the middle (between two halves)
- |vxy| ≤ p, so vxy contains part of first b^p and part of second a^p
- Pump i = 2: Changes the middle structure
- No longer ww form ✗
Case 3: vxy in second a^pb^p
- Pump i = 0: uxz has fewer symbols in second part
- First half ≠ second half ✗
Conclusion: L is NOT a CFL ✗
Note: {ww^R} IS a CFL! The reversal makes it pumpable.
Example 3: {a^(n²) | n ≥ 0}
Claim: L = {a^(n²) | n ≥ 0} is NOT a CFL
Proof:
Choose: w = a^(p²) ∈ L
Split: w = uvxyz with |vxy| ≤ p, |vy| ≥ 1
So: 1 ≤ |vy| ≤ p
Pump i = 2: uv²xy²z = a^(p² + |vy|)
Check: Is p² + |vy| a perfect square?
Between: p² < p² + |vy| ≤ p² + p < p² + 2p + 1 = (p+1)²
So: p² < p² + |vy| < (p+1)²
Not a perfect square! ✗
Conclusion: L is NOT a CFL ✗
Example 4: {a^(2^n) | n ≥ 0}
Claim: L = {a^(2^n) | n ≥ 0} = {a, aa, aaaa, aaaaaaaa, …} is NOT a CFL
Proof:
Choose: w = a^(2^p) ∈ L
Split: w = uvxyz with 1 ≤ |vy| ≤ p
Pump i = 2: |uv²xy²z| = 2^p + |vy|
Check: Is 2^p + |vy| a power of 2?
We have: 2^p < 2^p + |vy| ≤ 2^p + p < 2^p + 2^p = 2^(p+1)
For large p: p < 2^p, so 2^p + p < 2^(p+1)
Not a power of 2! ✗
Conclusion: L is NOT a CFL ✗
Example 5: {a^ib^jc^k | 0 ≤ i ≤ j ≤ k}
Claim: L is NOT a CFL
Proof:
Choose: w = a^pb^pc^p ∈ L (i = j = k = p)
Split: w = uvxyz with |vxy| ≤ p, |vy| ≥ 1
Consider cases (similar to Example 1):
If vxy in a’s and b’s:
Pump i = 0: uxz = a^(p-|v|)b^(p-|y|)c^p
Now: i < p, j < p, k = p
But we need i ≤ j ≤ k
If |v| > 0 and |y| = 0, then i < j = k ✓ (this is OK)
If |v| = 0 and |y| > 0, then i = j < k, but also i < j? Need careful analysis…
Better choice: w = a^pb^(2p)c^(3p) ∈ L (i = p ≤ j = 2p ≤ k = 3p)
Now:
- vxy contains at most 2 symbol types (since |vxy| ≤ p)
- Pumping will increase some counts, but maintain others
- Will violate i ≤ j ≤ k constraint ✗
Conclusion: L is NOT a CFL ✗
Example 6: {a^ib^jc^k | i = j or j = k}
This IS a CFL! (Union of two CFLs)
Grammar:
S → S₁ | S₂
S₁ → AB (for i = j)
A → aAb | ε
B → cB | ε
S₂ → AB (for j = k)
A → aA | ε
B → bBc | ε
Cannot use pumping lemma to prove it’s not CFL (because it IS!).
Example 7: {ww | w ∈ {a}*} = {a^(2n) | n ≥ 0}
This IS a CFL! (Even powers of a)
Grammar: S → aaSS | ε
Better: S → aSa | ε
This generates all even-length strings of a’s = {ww | w ∈ {a}*} ✓
Note: {ww | w ∈ {a,b}} is NOT CFL, but {ww | w ∈ {a}} IS!
Common Pitfalls
Pitfall 1: Wrong String Choice
❌ Bad: Choose w = abc (too short!) ✓ Good: Choose w depending on p (like a^pb^pc^p)
Pitfall 2: Considering Only One Split
❌ Bad: “Let v = a^k and y = b^m, then…” ✓ Good: “For ANY split into uvxyz…”
Must consider ALL possible splits!
Pitfall 3: Wrong Pumping Count
❌ Bad: Pump i = 1 (that’s just original string!) ✓ Good: Try i = 0, 2, or other values
Pitfall 4: Not Using Constraints
❌ Bad: Ignoring |vxy| ≤ p ✓ Good: Use |vxy| ≤ p to limit where v, y can be
Pitfall 5: Proving CFL with Pumping Lemma
❌ Bad: “Pumping works, so it’s CFL” ✓ Good: Construct grammar or PDA to prove CFL
Pumping Length p
Where does p come from?
For CNF grammar: p = 2^|V|
- |V| variables
- Parse tree height ≥ log₂(n) for string length n
- If n ≥ 2^|V|, some variable repeats on path
Don’t need to know exact p!
- Just know it exists
- Choose string based on arbitrary p
Strategy Summary
To prove L is NOT CFL:
1. Assume L is CFL
2. Let p be pumping length
3. Choose w ∈ L carefully:
- Depends on p
- Long enough (|w| ≥ p)
- Shows structural weakness
4. Consider ALL splits w = uvxyz
5. Use constraints:
- |vxy| ≤ p
- |vy| ≥ 1
6. Find i where uv^ixy^iz ∉ L
7. Contradiction → L not CFL
Common string choices:
- a^pb^pc^p (for three-way counting)
- a^pb^pa^pb^p (for ww)
- a^(p²) (for quadratic growth)
- a^pb^(2p)c^(3p) (for inequalities)
Languages: CFL or Not?
| Language | CFL? | Why? |
|---|---|---|
| {a^nb^n} | ✓ Yes | S → aSb | ε |
| {a^nb^nc^n} | ✗ No | Pumping lemma |
| {ww^R} | ✓ Yes | S → aSa | bSb | ε |
| {ww} | ✗ No | Pumping lemma |
| {a^nb^mc^md^n} | ✓ Yes | Two independent counts |
| {a^nb^mc^nd^m} | ✗ No | Crossing dependencies |
| {a^(n²)} | ✗ No | Quadratic growth |
| {a^(2^n)} | ✗ No | Exponential growth |
| {a^p | p prime} | ✗ No | Primes (complex!) |
Important Exam Points
- ✓ Statement: w = uvxyz, |vxy| ≤ p, |vy| ≥ 1, uv^ixy^iz ∈ L
- ✓ Purpose: Prove language NOT CFL (cannot prove it IS)
- ✓ Strategy: Choose w, consider ALL splits, find failing i
- ✓ Constraints: |vxy| ≤ p limits where v, y can be
- ✓ Both pump: Must pump both v and y together (not just one)
- ✓ Classic example: {a^nb^nc^n} not CFL
- ✓ Difference from regular: Regular pumps y, CFL pumps v and y
- ✓ String choice: Must depend on p, must be long
- ✓ All splits: Must consider every possible way to split
- ✓ Typical i values: Try i = 0 (shorten) or i = 2 (lengthen)
Common Mistakes to Avoid
✗ Mistake 1: Choosing w independent of p
✓ Correct: w must depend on p (like a^pb^pc^p)
✗ Mistake 2: Considering only one split
✓ Correct: Must handle ALL possible splits of w
✗ Mistake 3: Pumping only v or only y
✓ Correct: Must pump both v and y (uv^ixy^iz)
✗ Mistake 4: Using i = 1 to find contradiction
✓ Correct: i = 1 gives original string, try i = 0 or i = 2
✗ Mistake 5: Trying to prove language IS CFL
✓ Correct: Pumping lemma can only prove NOT CFL
✗ Mistake 6: Ignoring |vxy| ≤ p constraint
✓ Correct: Use this to limit where v, x, y can be
Practice Questions
Q1: Is {a^nb^nc^n} a CFL?
Answer: No, use pumping lemma with w = a^pb^pc^p.
Q2: What’s |vxy| ≤ p used for?
Answer: Limits where v, x, y can be (at most p consecutive symbols).
Q3: Can pumping lemma prove language IS CFL?
Answer: No! Can only prove NOT CFL. Use grammar/PDA to prove CFL.
Q4: Why does {ww} fail but {ww^R} succeed?
Answer: {ww} needs to remember and compare; {ww^R} uses stack symmetry.
Q5: Prove {a^(n²)} not CFL
Answer: Pump w = a^(p²) → get p² + |vy|, not perfect square.
Q6: String for pumping {a^nb^nc^n}?
Answer: w = a^pb^pc^p (all three equal).
Q7: Which i values typically work?
Answer: i = 0 (remove pumped parts) or i = 2 (double them).
Q8: Is {a^nb^(2n)} a CFL?
Answer: Yes! S → aSbb | ε.
Summary
- Pumping lemma: Tool to prove languages NOT CFL
- Statement: w = uvxyz, |vxy| ≤ p, |vy| ≥ 1, uv^ixy^iz ∈ L
- Pump both: v and y together (not separately)
- Strategy: Choose w, consider ALL splits, find failing i
- Constraints: |vxy| ≤ p limits location of pumped parts
- Classic example: {a^nb^nc^n} not CFL (two counts OK, three not)
- vs Regular: Regular pumps y, CFL pumps v and y
- Cannot prove CFL: Only proves NOT CFL
- String choice: Must depend on p, must be long enough
- Common i values: 0 (shorten), 2 (lengthen)
- Applications: Showing limits of CFLs, understanding language hierarchy
- Key insight: CFLs can’t count three independent values
Pumping lemma is essential for understanding the boundaries of context-free languages!