You already know how to read successive space-separated values into successive arguments with the Scan functions from the fmt package. But what if you want to read an entire line, including the whitespace in it, or read input only until you reach a certain character?
In this topic, we will learn about the bufio package: it contains functions that allow us to perform advanced input operations like the ones mentioned above.
Taking input with NewReader
We'll start by creating a Reader with the help of the bufio.NewReader() function. In simple terms, the Reader type wraps an underlying reader and adds a buffer (4096 bytes by default): data is fetched from the source into the buffer, and we then read it from there.
To let our program know we'll be taking data from the standard input or stdin, we'll need to pass as an argument os.Stdin to the bufio.NewReader() function.
After creating a Reader, the most common functions used to read data are the following:
- ReadBytes(): returns a slice of bytes with the data and an error;
- ReadString(): returns a string with the data and an error.
Both functions take a delimiter as an argument, which usually is the newline character '\n', and read data until they reach that delimiter. The delimiter is a single byte, so we can also use other characters such as 'd', like in the example below:
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

func main() {
	reader := bufio.NewReader(os.Stdin)

	b, err := reader.ReadBytes('\n') // Input into `b`: Hello World!\n
	if err != nil {
		log.Fatal(err) // Exit if we have an unexpected error
	}
	fmt.Println(string(b)) // Output: Hello World!\n

	s, err := reader.ReadString('d') // Input into `s`: JetBrains Academy\n
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(s) // Output: JetBrains Acad
}

An important detail regarding the ReadBytes() and ReadString() functions is that the returned slice of bytes or string includes the specified delimiter: Hello World!\n includes the \n, and JetBrains Acad includes the d.
Taking input with NewScanner
Apart from the previously mentioned Reader type, we can also create a Scanner. Just like Reader, the Scanner type wraps a reader and buffers the data it reads. Its buffer starts at 4 kB and grows as needed, up to 64 kB for a single token by default.
We can create a Scanner with the help of the NewScanner() function. We will also need to pass os.Stdin to NewScanner() as an argument — this will let our program know we'll be taking data from the standard input.
The most common usage of a Scanner is to read a certain input line by line, for example:
...
func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		line := scanner.Text() // Input: Sheldon Cooper 100 98 Physics\n
		fmt.Println(line)      // Output: Sheldon Cooper 100 98 Physics
	}
}

In the above example, we use a for loop with the scanner.Scan() function: it scans data line by line with the default ScanLines() function.
In contrast to the Reader functions, the ScanLines() function does not include the newline character \n in the scanned data.
Next, we declare and assign the line variable to scanner.Text(). The Text() helper allows us to access the previously scanned data. Finally, we output the scanned string with the fmt.Println(line) statement.
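Note that the for scanner.Scan() loop will keep scanning until the input ends. When you type the input in a terminal, you can signal the end of input with Ctrl+D on Linux and macOS, or with Ctrl+Z followed by Enter on Windows. Pressing Ctrl+C interrupts the whole program instead.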
Scanning other types of tokens
The key difference between a Scanner and a Reader is that a Scanner reads data as tokens of split lines via the default ScanLines() function. However, a Scanner can also read data as different types of tokens, such as:
- Tokens of space-delimited words with ScanWords();
- Tokens of UTF-8-encoded runes with ScanRunes();
- Tokens of bytes with ScanBytes();
- Or we could even create a custom split function that only reads a certain type of token, depending on our requirements.
Now let's take a look at how we can use a Scanner with space-delimited words only:
...
func main() {
	wordScanner := bufio.NewScanner(os.Stdin)
	// Set the split function to scan for words (space-delimited tokens):
	wordScanner.Split(bufio.ScanWords)
	for wordScanner.Scan() { // Input: Among Us ඞ\n
		fmt.Println(wordScanner.Text())
	}
}

// Output:
// Among
// Us
// ඞ

The above example showcases the wordScanner that uses the ScanWords() function. Take notice that to properly set wordScanner to scan for space-delimited word tokens, we need to pass bufio.ScanWords to the Split() method via the wordScanner.Split(bufio.ScanWords) statement, and we need to do it before the first call to Scan().
Next, we use a for loop with the wordScanner.Scan() function. It scans data word by word with the previously set ScanWords() function.
Finally, we output each one of the scanned words via the fmt.Println(wordScanner.Text()) statement.
Scanner with a custom split function
As previously mentioned, we can also create custom split functions for a Scanner. Let's go ahead and create the ScanBools() function, which validates bool type input only:
// The custom `ScanBools` function validates `bool` type input only:
func ScanBools(data []byte, atEOF bool) (advance int, token []byte, err error) {
	advance, token, err = bufio.ScanWords(data, atEOF)
	if err == nil && token != nil {
		_, err = strconv.ParseBool(string(token))
	}
	return advance, token, err
}

ScanBools() takes two arguments: data, a slice of bytes that contains the data not yet processed, and atEOF, a bool that indicates whether the input has ended and no more data will arrive.
Additionally, it returns three values: advance, the number of bytes the Scanner should move forward in the input; token, a slice of bytes containing the scanned word; and err, any error encountered.
Within the body of ScanBools(), we set advance, token, and err as the return values of the bufio.ScanWords() function and pass data and atEOF as arguments to it. Next, we validate that there aren't any errors and that the scanned token is not nil within the if statement. After passing this validation, we attempt to parse the scanned token as a bool value.
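To see why the token != nil check matters, it can help to call bufio.ScanWords() directly and look at what it returns. The sketch below does that with a hypothetical trySplit helper that only formats the three return values. When the data holds an unfinished word and atEOF is false, ScanWords() returns a nil token to request more data, and there is nothing to validate yet.

```go
package main

import (
	"bufio"
	"fmt"
)

// trySplit is a hypothetical helper: it calls bufio.ScanWords once and
// formats the three return values for printing.
func trySplit(data string, atEOF bool) string {
	advance, token, err := bufio.ScanWords([]byte(data), atEOF)
	return fmt.Sprintf("advance=%d token=%q nil=%t err=%v",
		advance, token, token == nil, err)
}

func main() {
	// A complete word followed by a space: the word is returned, and
	// advance also covers the space after it.
	fmt.Println(trySplit("true false", false))
	// advance=5 token="true" nil=false err=<nil>

	// An unfinished word with more input possibly coming: no token yet.
	fmt.Println(trySplit("false", false))
	// advance=0 token="" nil=true err=<nil>

	// The same data at the end of the input: the final word is returned.
	fmt.Println(trySplit("false", true))
	// advance=5 token="false" nil=false err=<nil>
}
```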
Now let's go ahead and use ScanBools() within our Go program:
...
func main() {
	scanner := bufio.NewScanner(os.Stdin)
	// Set `ScanBools` as the split function for the scanning operation
	scanner.Split(ScanBools)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err) // Exit if the scanned value is not a `bool`
	}
}

// Input: true false Hello World!
// Output:
// true
// false
// 2022/02/24 23:02:04 strconv.ParseBool: parsing "Hello": invalid syntax

After creating a new scanner that takes data from the standard input, the most important part is setting ScanBools as the split function of the scanner.
Finally, the program outputs each scanned bool value and keeps scanning until it meets a token that is not a valid bool. When this happens, Scan() returns false, scanner.Err() reports the parsing error, and log.Fatal() prints it and exits the program.
Summary
In this topic, we have learned how to use the Reader and Scanner types from the bufio package to take input in an advanced way. We've also learned the key differences between a Reader and a Scanner as well as what the most common functions are that we can use to take input with both types, respectively.
To sum up:
- The key difference between a Scanner and a Reader is that a Scanner reads data as tokens of split lines by default, but can also read data as tokens of different types.
- The most common functions that a Reader uses to read data are ReadBytes() and ReadString(); both functions take a specified delimiter as an argument and read data until they reach the specified delimiter.
- Apart from reading data as tokens of split lines, a Scanner can also read data as tokens of space-delimited words with ScanWords(), tokens of UTF-8 runes with ScanRunes(), tokens of bytes with ScanBytes(), and we can even create a custom split function to make our Scanner read and validate only a certain type of token.
This sure was a long topic! However, we're not done yet; it's time to test our knowledge and solve some theory and coding tasks to make sure we've learned how to properly use a Reader and a Scanner, along with their functions included in the bufio package.