Graph connections

Draft

Deterministic Finite Automata

Recognize simple string patterns with a finite set of states and one deterministic transition per input symbol.

concept beginner theory-of-computingautomataregular-languages

Hook problem

Imagine a signup form that accepts a deliberately tiny email-like pattern: one or more local characters, one @, one or more domain characters, one ., then one or more suffix characters.

For this teaching model, letters and digits are char, @ is at, . is dot, and everything else is other. Dots are banned in the local part. This is not real RFC email validation; it is a small language for learning finite memory.

Toy signup stringsThe validator sees categories, not full email syntax.
ana@cs.ai

Accepted toy pattern

acharin-localncharin-localacharin-local@atneed-domainccharin-domainscharin-domain.dotneed-suffixacharin-suffixicharin-suffix
ana@.ai

Rejected: dot too early

acharin-localncharin-localacharin-local@atneed-domain.dotdeadachardeadichardead

First naive idea

The first scan might use flags: seenAt, seenDot, hasLocal, hasDomain, and hasSuffix. That can be made correct, and it is a fair starting point.

The pain is that correctness becomes hidden inside combinations. hasLocal && seenAt && !hasDomain means “the next useful thing must be a domain character.” hasLocal && seenAt && hasDomain && seenDot && hasSuffix means “accepting if the input ends now.” The flags are not wrong; their meanings are just unnamed.

Flags hide meaningsSome flag combinations are useful, but the reason is buried in booleans.
Useful flag combinations mapped to named states
Boolean combinationNamed stateMeaning made explicit
hasLocal && !seenAtin-locallocal part exists; @ is now allowed
hasLocal && seenAt && !hasDomainneed-domainthe next useful symbol must be a domain char
hasLocal && seenAt && hasDomain && !seenDotin-domaindomain chars are flowing; one dot is now allowed
hasLocal && seenAt && hasDomain && seenDot && !hasSuffixneed-suffixthe dot appeared; suffix must start
hasLocal && seenAt && hasDomain && seenDot && hasSuffixin-suffixaccepting if the input ends now

Where it breaks

Small mistakes expose the hidden cases. ana@.ai has a local part and an @, but it reads dot before any domain character. ana@@cs.ai reads a second @. ana@cs. reaches the dot but never gets a suffix character.

A deterministic finite automaton makes each fragile case explicit: after ana@, the machine is in need-domain; on dot, the only next state is dead.

A missing domain character`ana@.ai` fails exactly when `need-domain` reads a dot.
acharin-localncharin-localacharin-local@atneed-domain.dotdeadachardeadichardead
Deterministic trace
PrefixReadSymbolStateMeaning
ε--need-localNothing useful has been read yet; the next symbol must be a letter or digit.
aacharin-localAt least one local character has appeared; more local characters or one @ can follow.
anncharin-localAt least one local character has appeared; more local characters or one @ can follow.
anaacharin-localAt least one local character has appeared; more local characters or one @ can follow.
ana@@atneed-domainThe @ has appeared; the next symbol must be the first domain character.
ana@..dotdeadA fatal pattern mistake has happened; every later symbol keeps the machine rejected.
ana@.aachardeadA fatal pattern mistake has happened; every later symbol keeps the machine rejected.
ana@.aiichardeadA fatal pattern mistake has happened; every later symbol keeps the machine rejected.

The core invention

A deterministic finite automaton, or DFA, replaces scattered flags with one current state. A state is a named summary of the prefix read so far: what facts matter, and what must still happen next.

For the toy validator, need-local means “no local character yet.” in-domain means “we have a local part, exactly one @, and at least one domain character, so the dot is now allowed.” dead means a fatal mistake has happened and this input cannot be repaired.

State meaningsEach named state says what must still happen next.
need-localneed local char

Nothing useful has been read yet; the next symbol must be a letter or digit.

in-localinside local part

At least one local character has appeared; more local characters or one @ can follow.

need-domainneed domain char

The @ has appeared; the next symbol must be the first domain character.

in-domaininside domain

At least one domain character has appeared; more domain characters or one dot can follow.

need-suffixneed suffix char

The domain dot has appeared; the next symbol must be the first suffix character.

in-suffixinside suffix

At least one suffix character has appeared; this is accepting only if the input ends here.

deaddead / trap state

A fatal pattern mistake has happened; every later symbol keeps the machine rejected.

Visual model

The whole machine is a table: every state has exactly one next state for every input symbol category. That “exactly one” is the deterministic part.

For example, δ(in-local,@)=need-domain\delta(\text{in-local}, @) = \text{need-domain}, while δ(in-local,dot)=dead\delta(\text{in-local}, \text{dot}) = \text{dead}. The transition table is total: no case is left to an implicit default.

Seven states by four symbolsEvery state has exactly one next state for each input category.
char@chardotcharchardot/other@/dot/otherneed-localneed local charneed-local: Nothing useful has been read yet; the next symbol must be a letter or digit.in-localinside local partin-local: At least one local character has appeared; more local characters or one @ can follow.need-domainneed domain charneed-domain: The @ has appeared; the next symbol must be the first domain character.in-domaininside domainin-domain: At least one domain character has appeared; more domain characters or one dot can follow.need-suffixneed suffix charneed-suffix: The domain dot has appeared; the next symbol must be the first suffix character.in-suffixinside suffixin-suffix: At least one suffix character has appeared; this is accepting only if the input ends here.deaddead / trap statedead: A fatal pattern mistake has happened; every later symbol keeps the machine rejected.
Total transition table
Statecharatdotother
need-localin-localdeaddeaddead
in-localin-localneed-domaindeaddead
need-domainin-domaindeaddeaddead
in-domainin-domaindeadneed-suffixdead
need-suffixin-suffixdeaddeaddead
in-suffixin-suffixdeaddeaddead
deaddeaddeaddeaddead

Interactive visual demo

Try stepping through ana@cs.ai, ana@.ai, and ana@cs.ai.. Watch the text status as well as the colors: being in in-suffix is only enough when the input is exhausted.

Current status

Pending: acceptance is checked after all input is consumed.

Current state
need-local (need local char)
Consumed prefix
ε
Remaining input
ana@cs.ai
Next symbol
a -> char

Start before reading input. The machine still needs the first local character.

char@chardotcharchardot/other@/dot/other*need-localneed local charin-localinside local partneed-domainneed domain charin-domaininside domainneed-suffixneed suffix charin-suffixinside suffixdeaddead / trap state
achar-nchar-achar-@at-cchar-schar-.dot-achar-ichar-
Trace ledger
CurrentPrefixReadSymbolStateStatus
nowε--need-localNothing useful has been read yet; the next symbol must be a letter or digit.
aacharin-localAt least one local character has appeared; more local characters or one @ can follow.
anncharin-localAt least one local character has appeared; more local characters or one @ can follow.
anaacharin-localAt least one local character has appeared; more local characters or one @ can follow.
ana@@atneed-domainThe @ has appeared; the next symbol must be the first domain character.
ana@cccharin-domainAt least one domain character has appeared; more domain characters or one dot can follow.
ana@csscharin-domainAt least one domain character has appeared; more domain characters or one dot can follow.
ana@cs..dotneed-suffixThe domain dot has appeared; the next symbol must be the first suffix character.
ana@cs.aacharin-suffixAt least one suffix character has appeared; this is accepting only if the input ends here.
ana@cs.aiicharin-suffixAt least one suffix character has appeared; this is accepting only if the input ends here.
Transition table used by the simulator
Statecharatdotother
need-localin-localdeaddeaddead
in-localin-localneed-domaindeaddead
need-domainin-domaindeaddeaddead
in-domainin-domaindeadneed-suffixdead
need-suffixin-suffixdeaddeaddead
in-suffixin-suffixdeaddeaddead
deaddeaddeaddeaddead

Formal version

After the trace intuition is in place, the mathematical object is compact:

M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_0, F)

Here Q is the finite set of states, Σ is the finite input alphabet, δ is the transition function, q0 is the start state, and F is the set of accepting states.

In this node, Q is the seven named states, Σ = { char, at, dot, other }, q0 = need-local, and F = { in-suffix }. The function δ is exactly the transition table above.

The concrete 5-tupleThe diagram becomes a finite set of states, alphabet, transition function, start state, and accept set.
Q

need-local, in-local, need-domain, in-domain, need-suffix, in-suffix, dead

Σ

char, at, dot, other

δ

the table above

q0

need-local

F

{ in-suffix }

Implementation sketch

The implementation mirrors the visible trace: classify one character, look up one table cell, replace the current state, and check acceptance only after the loop.

type DfaSymbol = "char" | "at" | "dot" | "other";
type DfaState =
  | "need-local"
  | "in-local"
  | "need-domain"
  | "in-domain"
  | "need-suffix"
  | "in-suffix"
  | "dead";

function classify(input: string): DfaSymbol {
  if (/^[A-Za-z0-9]$/.test(input)) return "char";
  if (input === "@") return "at";
  if (input === ".") return "dot";
  return "other";
}

function accepts(input: string) {
  let state: DfaState = "need-local";
  for (const char of input) {
    const symbol = classify(char);
    state = transitionTable[state][symbol];
  }
  return state === "in-suffix";
}
Code-to-trace mapClassifier, table lookup, and final accept check match the visible trace.
1

classify one input character

2

look up transition[state][symbol]

3

replace the current state

end

accept only if final state is in F

Prefix invariant

After each prefix, the current state summarizes everything the rest of the computation needs to know. It does not remember the literal string ana@cs; it remembers the relevant phase: local part is done, @ is done, domain has started, dot has not happened yet.

That is the core invariant: for any two prefixes that land in the same state, every possible remaining suffix will lead to the same accept/reject answer.

Prefix ledgerAfter each prefix, one state summarizes all facts that still matter.
Deterministic trace
PrefixReadSymbolStateMeaning
ε--need-localNothing useful has been read yet; the next symbol must be a letter or digit.
aacharin-localAt least one local character has appeared; more local characters or one @ can follow.
anncharin-localAt least one local character has appeared; more local characters or one @ can follow.
anaacharin-localAt least one local character has appeared; more local characters or one @ can follow.
ana@@atneed-domainThe @ has appeared; the next symbol must be the first domain character.
ana@cccharin-domainAt least one domain character has appeared; more domain characters or one dot can follow.
ana@csscharin-domainAt least one domain character has appeared; more domain characters or one dot can follow.
ana@cs..dotneed-suffixThe domain dot has appeared; the next symbol must be the first suffix character.
ana@cs.aacharin-suffixAt least one suffix character has appeared; this is accepting only if the input ends here.
ana@cs.aiicharin-suffixAt least one suffix character has appeared; this is accepting only if the input ends here.

Complexity

For input length n, the DFA performs one transition per character, so the running time is O(n)O(n). The transition table has QΣ|Q||\Sigma| entries; here that is 7 * 4. The working memory is O(1)O(1) beyond the input because the machine stores only the current state.

One symbol, one lookupRunning the DFA is linear in the input length and uses constant working memory.
input lengthntransition lookupsnO(n)
ninput symbols

An input of length n is read left to right, one symbol at a time.

ntransitions

Each symbol triggers exactly one transition lookup, so time is O(n).

7 * 4table entries

The finite table has |Q||Σ| cells: seven states times four symbol categories.

1current-state variable

The runner stores only the current state beyond the input, so working memory is O(1).

Common confusions

Acceptance is final-state based. During ana@cs.ai., the prefix ana@cs.ai reaches in-suffix, which is accepting at that moment. But the input still has one more .. The machine must read it, and from in-suffix a dot moves to dead, so the whole string is rejected.

Accept only at the end`ana@cs.ai.` passes through an accepting state, then the final dot sends it to dead.
acharin-localncharin-localacharin-local@atneed-domainccharin-domainscharin-domain.dotneed-suffixacharin-suffixicharin-suffix.dotdead

After the prefix ana@cs.ai the state is in-suffix, but the machine still must read the final dot. The last transition decides the whole string.

Deterministic trace
PrefixReadSymbolStateMeaning
ε--need-localNothing useful has been read yet; the next symbol must be a letter or digit.
aacharin-localAt least one local character has appeared; more local characters or one @ can follow.
anncharin-localAt least one local character has appeared; more local characters or one @ can follow.
anaacharin-localAt least one local character has appeared; more local characters or one @ can follow.
ana@@atneed-domainThe @ has appeared; the next symbol must be the first domain character.
ana@cccharin-domainAt least one domain character has appeared; more domain characters or one dot can follow.
ana@csscharin-domainAt least one domain character has appeared; more domain characters or one dot can follow.
ana@cs..dotneed-suffixThe domain dot has appeared; the next symbol must be the first suffix character.
ana@cs.aacharin-suffixAt least one suffix character has appeared; this is accepting only if the input ends here.
ana@cs.aiicharin-suffixAt least one suffix character has appeared; this is accepting only if the input ends here.
ana@cs.ai..dotdeadA fatal pattern mistake has happened; every later symbol keeps the machine rejected.

The dead state is not a punishment; it is a memory shortcut. Once a fatal mistake has happened, no remaining input can make this toy pattern valid, so every symbol loops back to dead.

Dead state loopAfter a fatal mistake, the remaining input cannot repair this toy pattern.
char, @, dot, otherneed-localneed local charneed-local: Nothing useful has been read yet; the next symbol must be a letter or digit.in-localinside local partin-local: At least one local character has appeared; more local characters or one @ can follow.need-domainneed domain charneed-domain: The @ has appeared; the next symbol must be the first domain character.in-domaininside domainin-domain: At least one domain character has appeared; more domain characters or one dot can follow.need-suffixneed suffix charneed-suffix: The domain dot has appeared; the next symbol must be the first suffix character.in-suffixinside suffixin-suffix: At least one suffix character has appeared; this is accepting only if the input ends here.deaddead / trap statedead: A fatal pattern mistake has happened; every later symbol keeps the machine rejected.
Total transition table
Statecharatdotother
need-localin-localdeaddeaddead
in-localin-localneed-domaindeaddead
need-domainin-domaindeaddeaddead
in-domainin-domaindeadneed-suffixdead
need-suffixin-suffixdeaddeaddead
in-suffixin-suffixdeaddeaddead
deaddeaddeaddeaddead

Connections in the graph

Graph basics and DFA diagrams both use circles and arrows, but they model different things. A graph edge can be any relationship among objects. A DFA arrow is a labeled transition between memory states while reading input.

Future nodes can build from here toward designing DFAs, regular languages, NFAs, and pumping-lemma limits. This page stays bounded to deterministic finite memory and trace-by-input acceptance.

Exercises

  1. Predict the final state for a@b.c before checking the trace.
  2. Why does ana.@cs.ai enter dead before it ever sees @?
  3. Which table entry rejects ana@cs!?
  4. What would need to change if local-part dots were allowed?
Prediction promptsUse the frozen traces to predict the next state and final result.
a@b.c

Will the final state accept?

Reveal answer

accept

ana.@cs.ai

Which state appears right after the local-part dot?

Reveal answer

dead

ana@cs!

Which table row rejects the exclamation mark?

Reveal answer

δ(in-domain, other) = dead

Graph connections : Deterministic Finite Automata