Graph connections

Draft

Nondeterministic Finite Automata

Recognize patterns by letting a finite-state machine branch, keeping any path that could still succeed.

concept beginner theory-of-computingautomataregular-languages

Hook problem

We want to recognize binary strings that contain the substring 01. The string 010 should be accepted because the first two symbols are 01. The string 1110 should be rejected because no 0 is immediately followed by 1.

The interesting part is not that this language is hard. It is deliberately small. The goal is to compare two ways of telling the same finite-memory story.

Find the hidden 01The language contains exactly the binary strings that have substring 01 somewhere.
010

pattern found

1110

no 01 yet

1001

pattern found

DFA baseline

A DFA can solve this with one current state. It can remember whether the previous symbol was a promising 0, and once it has seen 01, it can stay accepting forever.

That deterministic view works. An NFA gives an alternate viewpoint: at every 0, keep scanning normally and also start a branch that treats this 0 as a possible beginning of 01.

One state versus many possible statesA DFA can solve the problem; the NFA tells a different branching story.
DFA

One current state remembers whether the previous symbol was a promising 0.

current: saw last 0
NFA

The same prefix can keep q0 alive and also try q1 as a candidate start.

q0scan for a startactiveq1candidate 0active / newly spawnedq2saw 01inactive / accepting state

The branching idea

The smallest new invention is that a transition may return several next states. From q0 on input 0, the machine moves to both q0 and q1:

δ(q0,0)={q0,q1}\delta(q_0, 0) = \{q_0, q_1\}

One branch says “keep looking.” The other branch says “try this 0 as the start of 01.”

The first forkOn a 0, q0 both keeps scanning and starts a candidate branch in q1.
0,1010,1q0scan for a startq0: This branch has not committed to a particular 0 yet; it keeps scanning the whole suffix.q1candidate 0q1: This branch just chose the latest 0 as a possible start of the substring 01.q2saw 01q2: This branch has already seen 01, so any remaining symbols keep it accepting.

One transition can return a set: delta(q0, 0) = {q0,q1}.

What each state means

This NFA has three states:

  • q0: scanning for a possible start.
  • q1: just chose a 0 as the possible start of 01.
  • q2: already saw 01; accept and stay accepting.

These are not three separate machines. They are three possible histories that can be active after reading the same prefix.

Branch meaningsEvery active state names a different possible history of the same prefix.
q0scan for a start

This branch has not committed to a particular 0 yet; it keeps scanning the whole suffix.

q1candidate 0

This branch just chose the latest 0 as a possible start of the substring 01.

q2saw 01

This branch has already seen 01, so any remaining symbols keep it accepting.

Branch death

Missing transitions matter. From q1, reading 1 completes the pattern and goes to q2; reading 0 does not complete 01, so that old candidate branch has nowhere to go:

δ(q1,0)=\delta(q_1, 0) = \emptyset

For input 00, after the first 0 the active set is {q0,q1}. On the second 0, old q1 dies, but old q0 moves to {q0,q1} and immediately spawns a fresh candidate for the second 0. The active set remains {q0,q1}, but the event ledger tells the real story.

A branch can die while another is bornFor 00, old q1 has no 0-transition, while old q0 immediately spawns a new q1.
Second 0 transition events
Old branchEventResult
q0q0 --0--> {q0,q1}old q0 keeps scanning in q0 and also spawns q1 for this 0.
q1q1 --0--> {}q1 has no transition on 0, so that old branch dies.

Before: {q0,q1}. After: {q0,q1}. The set looks unchanged, but the ledger shows one old q1 died and a new q1 was spawned.

q0scan for a startactiveq1candidate 0active / newly spawned / old branch diedq2saw 01inactive / accepting state

Active state set

Instead of drawing every branch separately, we summarize all live branches with an active state set. For input 010, the exact trace is:

{q0} -> {q0,q1} -> {q0,q2} -> {q0,q1,q2}

The row after prefix 01 would accept if the input ended there, because q2 is active. The whole input is accepted only after every symbol has been consumed.

Active-set trace for 010{q0} -> {q0,q1} -> {q0,q2} -> {q0,q1,q2}.
Active-set trace
PrefixReadActive statesWould accept if ended here?Explanation
ε-{q0}noStart with only q0 active: scan for a possible 0 that could begin 01.
00{q0,q1}noReading 0 keeps q0 alive and starts a candidate q1 branch for this 0.
011{q0,q2}yesA q1 branch reads 1 and reaches accepting q2; q0 still keeps scanning.
0100{q0,q1,q2}yesReading 0 keeps q0 alive and starts a candidate q1 branch for this 0.

Interactive visual demo

Step through 010, 00, and 1010. Use the active-set view to see the compact summary, then switch to branch-events view to see exactly which old branch created or lost each state.

View
Current status

Would reject if input ended here.

Active states
{q0}
Consumed prefix
ε
Remaining input
010
Next symbol
0

Start with only q0 active: scan for a possible 0 that could begin 01.

0,1010,1q0scan for a startq1candidate 0q2saw 01
0-1-0-
Active-set ledger
NowPrefixReadActive statesWould accept if input ended here?
nowε-{q0}no
00{q0,q1}no
011{q0,q2}yes
0100{q0,q1,q2}yes
Transition table used by the simulator
State01
q0{q0,q1}{q0}
q1{}{q2}
q2{q2}{q2}

Formal version

An epsilon-free nondeterministic finite automaton is a 5-tuple:

N=(Q,Σ,δ,q0,F)N = (Q, \Sigma, \delta, q_0, F)

Here Q is the finite set of states, \Sigma is the input alphabet, q0 is the start state, and F is the set of accepting states.

The key difference from a DFA is the transition function:

δ:Q×ΣP(Q)\delta: Q \times \Sigma \to P(Q)

P(Q) means the power set of Q: the set of all subsets of states. A transition returns a set of possible next states, not one required next state.

For the contains-01 NFA, Q = { q0, q1, q2 }, \Sigma = { 0, 1 }, q0 = q0, F = { q2 }, and the transition table is the one shown here.

The concrete 5-tupleThe picture becomes states, alphabet, set-valued transition function, start state, and accept set.
Q

{ q0, q1, q2 }

Σ

{ 0, 1 }

δ

Q x Σ -> P(Q)

q0

q0

F

{ q2 }

Set-valued transition table
State01
q0{q0,q1}{q0}
q1{}{q2}
q2{q2}{q2}

Epsilon note

Some NFA definitions also allow epsilon transitions. An epsilon transition consumes no input; it is written with ε\varepsilon.

This node does not use them. The main implementation stays epsilon-free with δ:Q×ΣP(Q)\delta: Q \times \Sigma \to P(Q). Epsilon-NFAs extend the transition domain to include ε\varepsilon and need epsilon-closure bookkeeping, which is a useful later topic but not needed for the branching semantics here.

Epsilon is outside this main exampleThis node uses no epsilon transitions; they are a later compactness tool.
Bounded note

An epsilon transition consumes no input. This node's main machine has none, so delta stays Q x Sigma -> P(Q).

Implementation sketch

The direct simulation stores a set of active states. For each input symbol, ask every currently active state where it can go, then union all answers:

Si+1=qSiδ(q,ai+1)S_{i+1} = \bigcup_{q \in S_i} \delta(q, a_{i+1})

For 010, after reading prefix 0, S_1 = {q0,q1}. The next symbol is 1, so:

S2=δ(q0,1)δ(q1,1)={q0}{q2}={q0,q2}S_2 = \delta(q_0, 1) \cup \delta(q_1, 1) = \{q_0\} \cup \{q_2\} = \{q_0, q_2\}
function accepts(input: string) {
  let activeStates = new Set<NfaState>(["q0"]);

  for (const symbol of input) {
    const nextActiveStates = new Set<NfaState>();
    for (const state of activeStates) {
      for (const next of transitionTable[state][symbol]) {
        nextActiveStates.add(next);
      }
    }
    activeStates = nextActiveStates;
  }

  return activeStates.has("q2");
}
Code-to-active-set mapSimulation unions all next states from the old active states.
let next = new Set<NfaState>();
for (const q of activeStates) {
  for (const r of delta[q][symbol]) {
    next.add(r);
  }
}
Concrete row

From S1 = {q0,q1} on 1:

{q0} union {q2} = {q0,q2}

Prefix invariant

After consuming the first i symbols, S_i contains exactly the states reachable by some branch that reads that prefix. No branch is invented later, and no live branch is forgotten.

That invariant explains the acceptance rule:

accept if SnF\text{accept if } S_n \cap F \ne \emptyset

In words: after the whole input is consumed, accept if at least one active branch is in an accepting state.

Prefix invariantAfter prefix 01, q2 is active, so the prefix would accept if it ended there.
q0scan for a startactiveq1candidate 0inactiveq2saw 01active / accepting state

After prefix 01, S2 = {q0,q2}. Because q2 is active, this prefix would accept if the input ended here.

Complexity

For input length n, direct simulation examines the active states for each symbol. There are at most |Q| active states, so the running time is O(nQ)O(n|Q|) with a table lookup representation. The transition table has |Q||Σ| rows or cells of sets.

A future subset-construction node will show how active sets can be compiled into DFA states. This page stops at direct NFA simulation.

Branching costDirect simulation scans at most |Q| active states per input symbol.
n

input symbols

|Q|

maximum active states scanned per step

|Q||Σ|

transition-table cells

O(n|Q|)

direct simulation time for a fixed alphabet lookup table

Common confusions

A dead branch is not a rejected input. For 00, the old q1 candidate dies, but the q0 branch continues, so the run is still meaningful.

Also, acceptedIfInputEndedHere is not the same field as whole-input accepted. A trace row can say “would accept if the input ended here”; the final answer is checked only at the last row.

Two common mistakesA dead branch is not a failed input, and prefix acceptance is not final acceptance until the input ends.
Dead branch != failed input

In 00, old q1 dies, but old q0 keeps the run alive.

Prefix acceptance is conditional

A trace row says would accept if input ended here; whole-input accepted is checked only at the final row.

Connections in the graph

DFA is the prerequisite: learn the one-current-state semantics first, then NFA branching is a controlled generalization where one prefix can have a set of possible current states.

Future nodes can connect NFA to subset construction, regular expressions, and epsilon-NFA construction. Those edges are not added until those nodes exist.

Exercises

  1. Predict the active sets for 01.
  2. For 00, which old branch dies on the second symbol?
  3. Why does 1010 accept even though it starts with 1?
  4. What changes in the implementation if epsilon transitions are allowed?
Predict before revealingUse the same fixture to practice active-set semantics.
01

Predict the final active set and result.

Reveal answer

{q0,q2} - accepted

00

Predict the final active set and result.

Reveal answer

{q0,q1} - rejected

1010

Predict the final active set and result.

Reveal answer

{q0,q1,q2} - accepted

ε

Predict the final active set and result.

Reveal answer

{q0} - rejected

Graph connections : Nondeterministic Finite Automata