Languages

What is a Language?

A language is a set of strings over an alphabet. While a string is a single sequence, a language is a collection of strings. Languages can be finite (containing a specific number of strings) or infinite (containing unlimited strings).

Formal Definition

A language L over an alphabet Σ is a subset of Σ*.

L ⊆ Σ*

L can be finite or infinite

L can even be empty

Simple Analogy

Alphabet Σ = Letters (a-z)
Strings = All possible sequences of letters
Language = Valid English words (a specific collection)

Just like “cat”, “dog”, “hello” are valid English words but “xyz”, “zzz” might not be, a formal language defines which strings are “valid” or “accepted”.

Examples of Languages

Example 1: Finite Languages

Language 1: L₁ = {0, 1, 01, 10}
Alphabet: Σ = {0, 1}
Size: |L₁| = 4 (contains exactly 4 strings)

Language 2: L₂ = {ε, a, aa, aaa}
Alphabet: Σ = {a, b}
Size: |L₂| = 4

Language 3: L₃ = {cat, dog, bird}
Alphabet: Σ = {a, b, c, …, z}
Size: |L₃| = 3

Example 2: Infinite Languages

Language 1: L₄ = {0, 00, 000, 0000, …}
Description: All strings of 0’s
Formal Notation: L₄ = {0ⁿ | n ≥ 1}

Language 2: L₅ = {ε, 01, 0011, 000111, …}
Description: Strings with equal number of 0’s followed by equal number of 1’s
Formal Notation: L₅ = {0ⁿ1ⁿ | n ≥ 0}

Language 3: L₆ = {all binary numbers divisible by 3}
Examples: {0, 11, 110, 1001, 1100, …}

Example 3: Special Languages

Empty Language: L₇ = ∅ = { }
Description: Contains no strings at all
Note: ∅ ≠ {ε}

Language with Empty String: L₈ = {ε}
Description: Contains only the empty string
Size: |L₈| = 1

Universal Language: L₉ = Σ*
Description: Contains all possible strings
Size: Infinite

Types of Languages

1. Finite Languages

Languages with a finite (countable) number of strings.

Examples:

L = {a, ab, abc}
L = {0, 1}
L = {hello, world}

Properties:

Can be listed completely
Has a definite size |L|
Easy to define explicitly

2. Infinite Languages

Languages with infinite number of strings.

Examples:

L = {aⁿ | n ≥ 0} = {ε, a, aa, aaa, aaaa, ...}
L = {0ⁿ1ⁿ | n ≥ 0} = {ε, 01, 0011, 000111, ...}
L = Σ* = all possible strings

Properties:

Cannot be listed completely
Must be defined using a rule or pattern
Most interesting languages in TOC are infinite

Ways to Define Languages

Method 1: Enumeration (Listing)

For finite languages, simply list all strings.

Example:

L = {0, 1, 00, 11, 000}

Limitation: Only works for finite languages

Method 2: Set Builder Notation

Define a rule that describes which strings belong to the language.

Example 1:

L = {w ∈ {0,1}* | w starts with 0}
Examples: 0, 01, 00, 011, 0001, ...

Example 2:

L = {w ∈ {a,b}* | |w| is even}
Examples: ε, aa, ab, ba, bb, aaa, ...

Example 3:

L = {aⁿbⁿ | n ≥ 0}
Examples: ε, ab, aabb, aaabbb, ...

Method 3: Using Automata or Grammars

Define machines (like DFA, NFA) or grammars (like CFG) that generate or accept the language.

Example: A DFA that accepts all strings ending in ‘01’

We’ll study this in detail later.

Language Operations

Languages can be combined using various operations:

1. Union (L₁ ∪ L₂)

All strings that are in L₁ OR in L₂.

Example:

L₁ = {0, 00}
L₂ = {1, 11}
L₁ ∪ L₂ = {0, 00, 1, 11}

2. Concatenation (L₁ · L₂)

All strings formed by concatenating a string from L₁ with a string from L₂.

Example:

L₁ = {a, b}
L₂ = {0, 1}
L₁ · L₂ = {a0, a1, b0, b1}

3. Kleene Star (L*)

All possible concatenations of strings from L, including ε.

Example:

L = {a}
L* = {ε, a, aa, aaa, aaaa, ...}

L = {0, 1}
L* = {ε, 0, 1, 00, 01, 10, 11, 000, ...} = Σ*

4. Complement (L̄)

All strings in Σ* that are NOT in L.

Example:

Σ = {0, 1}
L = {0, 1}
L̄ = Σ* - L = {ε, 00, 01, 10, 11, 000, ...}

5. Intersection (L₁ ∩ L₂)

All strings that are in BOTH L₁ AND L₂.

Example:

L₁ = {0, 01, 10}
L₂ = {01, 10, 11}
L₁ ∩ L₂ = {01, 10}

We’ll study these operations in detail in the next section.

Special Languages in TOC

1. Regular Languages

Languages that can be recognized by finite automata.

Example: L = {w | w ends with ‘01’}

2. Context-Free Languages

Languages that can be recognized by pushdown automata.

Example: L = {0ⁿ1ⁿ | n ≥ 0}

3. Recursively Enumerable Languages

Languages that can be recognized by Turing machines.

Example: All computable languages

We’ll study these classifications throughout the course.

Empty Language vs Language with Empty String

This is a very important distinction that often confuses students:

Aspect	Empty Language ∅	Language with ε
Notation	∅ or { }	{ε}
Size	\|∅\| = 0	\|{ε}\| = 1
Contains ε?	No	Yes
Examples	No strings	One string (empty)

Key Point:

∅ is a language with no strings
{ε} is a language with one string (which happens to be empty)

Example:

L₁ = ∅
ε ∈ L₁? NO
|L₁| = 0

L₂ = {ε}
ε ∈ L₂? YES
|L₂| = 1

Language Recognition

A language is recognized by a computational model if the model accepts exactly the strings in that language.

Example:

Language: L = {w | w starts with 0}
Recognition: Build a DFA that accepts any string starting with 0
Strings accepted: 0, 01, 00, 011, 0101, ...
Strings rejected: 1, 10, 11, 101, ...

Decidable vs Undecidable Languages

Decidable Language

A language for which we can build an algorithm that always terminates and correctly decides membership.

Example: L = {even numbers in binary}

Undecidable Language

A language for which no such algorithm exists.

Example: Halting problem (we’ll study this later)

Practical Applications

1. Programming Languages

Alphabet: {keywords, identifiers, operators, ...}
Language: All valid programs in Python/Java/C

2. Regular Expressions

Regex: [a-z]+@[a-z]+\.com
Language: Valid email addresses

3. Compilers

Language: All syntactically correct programs
Recognition: Compiler checks if source code is valid

4. Network Protocols

Language: Valid sequences of network packets
Recognition: Protocol validator

Important Exam Points

✓ A language is a set of strings over an alphabet
✓ L ⊆ Σ* (language is subset of all possible strings)
✓ Languages can be finite or infinite
✓ ∅ (empty language) ≠ {ε} (language with empty string)
✓ |∅| = 0, but |{ε}| = 1
✓ Σ* is the universal language (all possible strings)
✓ Infinite languages need a rule or pattern to define them
✓ Set builder notation: L = {w | condition on w}

Common Mistakes to Avoid

❌ Mistake 1: Confusing ∅ with {ε}
✓ Correct: ∅ has 0 strings, {ε} has 1 string

❌ Mistake 2: Thinking all languages are finite
✓ Correct: Most interesting languages are infinite

❌ Mistake 3: Writing L = {0ⁿ1ⁿ} without specifying n
✓ Correct: L = {0ⁿ1ⁿ | n ≥ 0}

❌ Mistake 4: Confusing language with alphabet
✓ Correct: Alphabet is Σ, Language is L ⊆ Σ*

Practice Questions

Q1: Define the language of all binary strings with exactly two 1’s.

Answer: L = {w ∈ {0,1}* | w contains exactly two 1’s}
Examples: 11, 011, 101, 110, 0011, 0101, 0110, 1001, …

Q2: Is ε in the language L = {0ⁿ | n ≥ 1}?

Answer: No, because n ≥ 1, so minimum string is “0”

Q3: What is |L| if L = ∅?

Answer: |L| = 0

Q4: Give an example of an infinite language over Σ = {a}.

Answer: L = {aⁿ | n ≥ 0} = {ε, a, aa, aaa, aaaa, …}

Q5: Define the language of all strings that start and end with the same symbol over Σ = {0, 1}.

Answer: L = {w ∈ {0,1}* | first symbol = last symbol}
Examples: ε, 0, 1, 00, 11, 000, 010, 101, 111, …

Summary

A language is a set of strings: L ⊆ Σ*
Languages can be finite or infinite
Empty language ∅ contains no strings
{ε} is a language containing one string (empty string)
Languages are defined by:
- Enumeration (for finite)
- Set builder notation (for infinite)
- Automata/Grammars (computational models)
Languages can be combined using operations (union, concatenation, etc.)
Different classes: Regular, Context-Free, Recursive, etc.

Languages are the central objects we study in Theory of Computation. Every computational problem can be viewed as a language recognition problem!

Languages

What is a Language?

Formal Definition

Simple Analogy

Examples of Languages

Example 1: Finite Languages

Example 2: Infinite Languages

Example 3: Special Languages

Types of Languages

1. Finite Languages

2. Infinite Languages

Ways to Define Languages

Method 1: Enumeration (Listing)

Method 2: Set Builder Notation

Method 3: Using Automata or Grammars

Language Operations

1. Union (L₁ ∪ L₂)

2. Concatenation (L₁ · L₂)

3. Kleene Star (L*)

4. Complement (L̄)

5. Intersection (L₁ ∩ L₂)

Special Languages in TOC

1. Regular Languages

2. Context-Free Languages

3. Recursively Enumerable Languages

Empty Language vs Language with Empty String

Example:

Language Recognition

Decidable vs Undecidable Languages

Decidable Language

Undecidable Language

Practical Applications

1. Programming Languages

2. Regular Expressions

3. Compilers

4. Network Protocols

Important Exam Points

Common Mistakes to Avoid

Practice Questions

Summary

Install Notes App