Strings

What is a String?

A string (also called a word) is a finite sequence of symbols chosen from an alphabet. If an alphabet provides us with the building blocks, strings are what we build with them.

Formal Definition

A string w over an alphabet Σ is a finite sequence of symbols from Σ.

  • w = a₁a₂a₃…aₙ where each aᵢ ∈ Σ
  • The length of w is n, denoted as |w| = n

Simple Analogy

Think of it this way:

  • Alphabet = Letters of the English language
  • String = Words formed using those letters
  • Just like “cat”, “dog”, “hello” are words formed from English letters, in Theory of Computation, strings are sequences formed from an alphabet.

Examples of Strings

Example 1: Binary Strings

Alphabet: Σ = {0, 1}

Valid Strings:

  • w₁ = 0
  • w₂ = 1
  • w₃ = 01
  • w₄ = 10
  • w₅ = 101
  • w₆ = 111000
  • w₇ = 010101

Example 2: English Letter Strings

Alphabet: Σ = {a, b, c}

Valid Strings:

  • w₁ = a
  • w₂ = ab
  • w₃ = abc
  • w₄ = aabbcc
  • w₅ = cabba
  • w₆ = bcabca

Example 3: Decimal Strings

Alphabet: Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

Valid Strings:

  • w₁ = 123
  • w₂ = 456789
  • w₃ = 0000
  • w₄ = 42

The Empty String (ε or λ)

The empty string is a special string that contains no symbols. It is denoted by:

  • ε (epsilon) - most common notation
  • λ (lambda) - alternative notation

Properties of Empty String

  1. Length: |ε| = 0
  2. Universal: Exists over any alphabet
  3. Unique: There is only one empty string
  4. Identity: For any string w, w·ε = ε·w = w (concatenation identity)

Example:

Σ = {0, 1}
ε = "" (no symbols)
|ε| = 0

Important: Empty string (ε) ≠ Empty set (∅)

  • ε is a string with zero symbols
  • ∅ is a set with zero elements

Length of a String

The length of a string is the number of symbols it contains, denoted by |w|.

Examples:

StringAlphabetLength
w = “010”{0, 1}|w| = 3
w = “aaaa”{a, b}|w| = 4
w = “abc”{a, b, c}|w| = 3
w = εany|w| = 0
w = “1”{0, 1}|w| = 1

Important:

  • Repeated symbols count: “aaa” has length 3, not 1
  • Empty string: Always has length 0
  • Length ≥ 0: Length cannot be negative

Operations on Strings

1. Concatenation

Concatenation is joining two strings together, denoted by · (dot) or simply placing them side by side.

Definition: If w = a₁a₂…aₘ and x = b₁b₂…bₙ, then
w·x = a₁a₂…aₘb₁b₂…bₙ

Examples:

w = "abc", x = "def"
w·x = "abcdef"

w = "01", x = "10"
w·x = "0110"

w = "hello", x = ε
w·x = "hello" (ε is identity)

Properties:

  • Associative: (w·x)·y = w·(x·y)
  • NOT Commutative: w·x ≠ x·w (usually)
  • Identity: w·ε = ε·w = w
  • Length: |w·x| = |w| + |x|

2. Reversal (wᴿ)

The reversal of a string w, denoted wᴿ, is the string written backwards.

Examples:

w = "abc" → wᴿ = "cba"
w = "0110" → wᴿ = "0110" (palindrome)
w = "hello" → wᴿ = "olleh"
w = "a" → wᴿ = "a"
ε ᴿ = ε

Properties:

  • (wᴿ)ᴿ = w (reversing twice gives original)
  • |wᴿ| = |w|
  • (w·x)ᴿ = xᴿ·wᴿ

3. Power of a String (wⁿ)

The power of a string means concatenating it with itself n times.

Definition:

  • w⁰ = ε (by convention)
  • w¹ = w
  • wⁿ = w·w·w·…·w (n times)

Examples:

w = "ab"
w⁰ = ε
w¹ = "ab"
w² = "abab"
w³ = "ababab"
w⁴ = "abababab"

w = "0"
w⁵ = "00000"

4. Substring

A string v is a substring of w if v appears consecutively within w.

Example:

w = "computer"
Substrings: "com", "put", "ter", "compu", "uter", "c", "e", ε, "computer"
Not substrings: "cpt", "mpu" (not consecutive)

5. Prefix and Suffix

Prefix: A substring that starts from the beginning Suffix: A substring that ends at the end

Example:

w = "testing"

Prefixes: ε, "t", "te", "tes", "test", "testi", "testin", "testing"

Suffixes: ε, "g", "ng", "ing", "ting", "sting", "esting", "testing"

Note: Every string is both a prefix and suffix of itself. ε is a prefix and suffix of every string.

Special String Sets

Σ* (Sigma Star)

Σ* is the set of all possible strings over alphabet Σ, including the empty string.

Example:

Σ = {0, 1}
Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, ...}

Properties:

  • Σ* is infinite (even if Σ is finite)
  • ε ∈ Σ* (always includes empty string)
  • Contains strings of all possible lengths

Σ⁺ (Sigma Plus)

Σ⁺ is the set of all non-empty strings over alphabet Σ.

Example:

Σ = {0, 1}
Σ⁺ = {0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, ...}

Properties:

  • Σ⁺ = Σ* - {ε}
  • ε ∉ Σ⁺ (does NOT include empty string)
  • Σ⁺ is also infinite

Relationship:

Σ* = Σ⁺ ∪ {ε}

Strings vs Symbols

AspectSymbolString
DefinitionSingle element from ΣSequence of symbols
Examplea”aaa”
Notationa, b, cw, x, y
LengthAlways 1Can be 0, 1, 2, …
EmptyN/Aε exists

Important: A single symbol can also be viewed as a string of length 1.

Practical Applications

1. Programming

string = "hello"  # String of length 5
alphabet = {'a', 'b', ..., 'z'}  # Alphabet used

2. Pattern Matching

Text: "The quick brown fox"
Pattern: "quick" (substring to find)

3. DNA Analysis

Alphabet: {A, C, G, T}
DNA sequence: "ACGTACGTAACC" (string)

4. Binary Data

Alphabet: {0, 1}
Byte: "01100001" (string of length 8)

Important Exam Points

  1. ✓ A string is a finite sequence of symbols from an alphabet
  2. ✓ Length of a string is denoted by |w|
  3. ✓ Empty string (ε) has length 0
  4. ✓ ε is different from ∅ (empty set)
  5. ✓ Σ* includes ε, but Σ⁺ does not
  6. ✓ Concatenation: w·x simply places x after w
  7. ✓ |w·x| = |w| + |x|
  8. ✓ wⁿ means w repeated n times

Common Mistakes to Avoid

Mistake 1: Confusing ε with 0 or space
Correct: ε is completely empty, not “0” or ” ”

Mistake 2: Thinking |“aaa”| = 1
Correct: |“aaa”| = 3 (count all symbols)

Mistake 3: Assuming w·x = x·w
Correct: Concatenation is NOT commutative

Mistake 4: Forgetting ε ∈ Σ*
Correct: Σ* always includes the empty string

Practice Questions

Q1: Given w = “abc” and x = “de”, find:

  • a) w·x = ?
  • b) |w·x| = ?
  • c) (w·x)ᴿ = ?

Answers:

  • a) w·x = “abcde”
  • b) |w·x| = 5
  • c) (w·x)ᴿ = “edcba”

Q2: If w = “01”, find w³.

Answer: w³ = “010101”

Q3: Is “ab” a substring of “abcabc”?

Answer: Yes, it appears at positions 0 and 3.

Q4: What is the difference between Σ* and Σ⁺?

Answer: Σ* includes the empty string ε, while Σ⁺ does not.

Summary

  • A string is a finite sequence of symbols from an alphabet
  • Length |w| = number of symbols in w
  • Empty string ε has length 0
  • Σ* = all possible strings (including ε)
  • Σ⁺ = all non-empty strings (excluding ε)
  • Operations: concatenation, reversal, power, substring
  • Strings form the basic data that automata and grammars process

Strings are the fundamental units of data in computation. Every input, output, program, and data file is ultimately a string!