Computer scienceAlgorithms and Data StructuresIntro to algorithms and data structuresAutomata theory

Finite-state machine

15 minutes read

Various fields deal with issues related to identifying specific sequences and analyzing data. These include pattern matching in text processing and lexical analysis in compiler creation; both analyze text in search of unique combinations. In pattern matching, it's about finding a specified pattern, and in compiler creation, it's about finding tokens. But how can this be achieved? One method is using a finite-state machine. It's a mathematical model commonly used not only in computer science but also in linguistics, engineering, and beyond. Yes, it's an abstraction, but it's also how the first computers were made. You can think of it as creating a mini machine inside your computer. Given the adaptability of this tool, you're probably ready to dive deeper, so let's start exploring it now!

Definition of Finite-state machine

Let's get down to the nitty-gritty of this concept. Simply put, a finite-state machine (FSM) is a mathematical model, often used in the fields of computer science and engineering. The noteworthy detail about this kind of abstract machine is that it can be in exactly one state, out of a finite set, at any given time. This means, it's a rule-based system that modifies its behavior depending on its present state or situation. It moves from one state to another based on the rules you outline.

You can imagine it like a game where you move from one step to another based on specific actions. For instance, in a simple board game, you might start at the 'start' square (the initial state), then move to the 'play' square (a new state) when you roll the dice (the input). If you land on a snake, you slither down to a previous square (another state). In every instance, where you are at that moment (your current 'state') and what you do (the 'input') dictates where you'll be next (the next 'state'). This principle forms the basis of a finite state machine.

You might be wondering, what are these 'states' we're discussing? Let's explore the main components of the finite-state machine to make every part crystal clear. Simply put, states describe the conditions or situations that the machine can be in. It's critical to remember that the machine can only be in one state at any given moment. For instance, a traffic light might have three states: 'Red', 'Yellow', and 'Green', displaying only one color at a time.

Among the full list of states, there are some special ones that demand attention. One of them is the initial state of the FSM. Just as the name suggests, it is the state in which our machine is situated when it first powers up. Another notable state isn't singular, but a collection of states – the accept states. These states signify the successful completion of a process. Take a vending machine for example, an 'Item Delivered' state would indicate a successful transaction.

Now you may be asking, how do we transition between these states? This where the next component – the transition function – comes into play. It maps a state and an input to a new state. It's essentially the rule that dictates 'if the machine is in this state and it gets this input, it should transition to this state next'.

But what about the input? What does it represent? The set of input symbols, which is the final component of the FSM, is a compilation of all possible inputs that can trigger transitions.

Types of Finite-state machine

Now that you're familiar with the FSM and its components, it's time to delve into the types of the machine. There are two types:

  • Deterministic FSM (DFSM). In this form of FSM, each state has a single transition for every possible input. This indicates that the behavior of the machine is entirely predictable: given a current state and input, there's only one possible next state. DFSMs are often used when a system's behavior is predictable and follows clear rules, such as text string processing, vending machines, and so on.

  • Nondeterministic FSM (NFSM). As you might have guessed, a specific input can lead to one, more than one, or no transition for a particular state. This means that the machine's next state isn't solely determined by its current state and input. NFSMs are often used when a system's behavior is more complex and can't be accurately represented by a deterministic machine or when a system needs to consider multiple probable outcomes simultaneously. Instances include compilers and interpreters, where complex structures or syntax require processing, or search algorithms.

Working of Finite-state machine

Having learned all the necessary terms, let's see how the FSM operates and processes input. Consider the problem of finding patterns in a string, and let's see which stages our machine will pass during its operation. Initially, you need to build the machine. Suppose we aim to find a pattern abcabc in a string. Now note down its possible states. In this type of task, you may consider the number of matching symbols as a state. For instance, if you're considering a string aabbbcaabbbc, the current state will be 2, since only two symbols (bcbc) match. So, there are four possible states: {0,1,2,3}.\{0, 1, 2, 3\}. Note that state 0 is the initial state (as we start the operation of the machine from this point), and state 3 is the accept state, which signals that the desired pattern has been found.

The next step is to create the transitions in a machine. For the stated pattern, the FSM looks like this:

A finite-state machine with four states and input symbols {a, b, c}

Let's focus on transitions starting from state 2 for each possible letter. The transitions for other states can be found in the same way. Being in state 2 means that the letters aa and bb of the pattern have already been matched. There are three possible actions now: the next letter can be either aa, bb or cc. If it's aa, you go back to state 1, since only one letter matches. In case of bb the machine reverts to the initial state (colored green), since no symbols match. Finally, when letter cc appears, the machine progresses to the accept state (colored red) and the pattern is found.

Once the FSM is built, it can begin its operation. The steps can be formally described as follows:

  1. Initialization. At this stage, the FSM is in the initial state and is ready to take input. In the case of pattern matching, the initial state is 0, since no symbols have been compared yet.

  2. Processing input. Now, our FSM starts receiving the input string, one symbol at a time. If the symbol equals the one that is next in our pattern, you transition to the next state. Otherwise, you revert to the initial state or another state representing a shorter partial match, depending on the exact pattern and FSM design. For example, if the next symbol in a pattern is 'C', and you receive 'C' as an input, then the machine goes to the state i+1i + 1, where ii is the current one.

  3. You repeat the second step until you find the given pattern or the input string ends. In the first case, you reach the accept state, which represents a full match of the pattern.

At first, you might find this algorithm baffling, as it is not similar to the ones you have already dealt with, but applying it on a real pattern and string can provide excellent clarification.

Example

So, let's put this into practice. Say we want to find a pattern fsmfsm in a string fmfsmfmfsm. Regarding the components of the FSM, let {f,m,s}\{f, m, s\} be the set of the machine's input symbols, and the states will be {0,1,2,3}\{0, 1, 2, 3\}, each one corresponding to the number of matched symbols. Take note: state 33 is the accept state.

Before passing input to the FSM, you need to construct it. Try it yourself first and then compare with the one shown below!

As you might recall, to create an FSM, you need to jot down all its potential states first. As mentioned above, we have 4 in our case. The next step is to create transitions between them. We have 3 symbols in our set of input symbols, so for each state we have 3 outgoing transitions. After you sketch all these, you may end up with something like this:

A finite-state machine with 4 states and input symbols {f, s, m}

Now, having constructed an FSM, it's time to use it to find the pattern fsmfsm. Let's analyze our string symbol by symbol. The green circle represents the current state of the machine, the red symbols on the left and right represent the current input and the pattern symbol to match it to, the green substring is a match with the pattern, a red transition denotes the one being taken in this step. Also, you'll notice a strange figure in the illustrations that may seem out of place. It's our friendly FSM expert, who will guide us through the transitions, leading from one state to another while we explore the example.

The first input symbol is ff. The current state is the initial one, so we find a transition from it with the letter ff written next to it and move to the state 11.

A finite-state machine after considering the first symbol

Now we repeat this process for the remainder of the symbols in the initial string.

A finite-state machine after considering the second symbol A finite-state machine after considering the third symbol

A finite-state machine after considering the fourth symbol A finite-state machine after considering the fifth symbol

Lastly, we arrive at the accept state:

A finite-state machine after considering the whole text

Celebrations are in order! We've successfully found the given pattern using the FSM.

Applications of Finite-state machine

As for the real-world applications of the FSM, many have already been mentioned in this topic. Let's discuss them in further detail:

  1. Text processing and parsing. FSMs are instrumental in lexical analysis, which is part of parsing text. For instance, they can recognize patterns in text or tokenize a text string into separate words or symbols.

  2. Compilers. They employ FSMs in their frontend to recognize the structure of the programming language being compiled. This process comprises lexical analysis (tokenizing the input) and syntax analysis (creating a parsing tree), both tasks that FSMs can handle.

  3. Networking. FSMs are beneficial in network protocols for facilitating communication between devices. For example, the Transmission Control Protocol (TCP) uses an FSM to manage transitions between states like LISTEN, SYN_RECEIVED, ESTABLISHED, FIN_WAIT, and CLOSED.

  4. User Interfaces. Many user interfaces can be interpreted as FSMs. Each screen or view in the interface can be perceived as a state, and user actions like clicks or key presses can be inputs that trigger transitions.

  5. Game development. FSMs are utilized in game development to control character behavior. Each state might portray a different behavior (like idle, attack, run away), and game events can trigger transitions between states.

  6. Operating systems. FSMs are useful in operating systems to govern processes. Each process can be in one of several states (e.g., running, ready, blocked), and the operating system manages transitions between these states.

  7. Software testing. FSMs are used in software testing to generate test cases. Each state represents a different section of the software, and the transitions represent different actions or events.

  8. Robotics. In robotics, FSMs can be used to regulate the behavior of a robot. Each state might represent a different action or behavior (like moving forward, turning or stopping), and sensor inputs can trigger transitions between states.

  9. Vending machines. A vending machine can be represented as an FSM where each state represents a different stage in the vending process. There could be states indicating waiting for payment, dispensing a product, giving change, etc.

The examples mentioned are only some of the potential applications of finite-state machines, but they already demonstrate the adaptability of this tool in modeling and managing a broad range of systems and processes.

Conclusion

To reiterate the main points about FSM, let's outline them:

  1. It's a mathematical model, frequently used for computation in computer science and engineering.

  2. FSMs are extensively used in text processing, parsing, compilers, networking, operating systems, and many other fields.

  3. There are 5 components of the FSM: its set of states, initial and accept states, transition function, and set of input symbols.

  4. FSMs can be classified into two types: deterministic and nondeterministic. In the former, each state has a singular transition for each possible input. In the latter, an input can lead to one, more than one, or no transition for a particular state.

  5. To find a pattern in a string using an FSM, you have to define potential states, the set of input symbols, and connect states with transitions.

Given that a finite-state machine is an abstract model, it's frequently used as the backbone for some algorithms used in computer science. One such algorithm is the Aho-Corasick algorithm, which will be covered in one of the following topics.

11 learners liked this piece of theory. 1 didn't like it. What about you?
Report a typo