
Reading files

Programs often need to work with data that lives outside the codebase, and that data usually comes in the form of files.

In this topic, you'll learn different ways of reading files in Go, in particular how to read text files.

Reading files using the os package

One of the most straightforward ways to read a file in Go is the os.ReadFile() function. It reads the whole file in a single call and closes the file when it is done, so we don't need to worry about leaking file descriptors by forgetting to close the file.

Now, let's go ahead and read the contents of test_file.txt via the os.ReadFile() function:

package main

import (
    "fmt"
    "log"
    "os"
)

func main() {
    data, err := os.ReadFile("test_file.txt") // test_file.txt is inside the local directory
    if err != nil {
        log.Fatal(err) // exit if we have an unexpected error
    }
    fmt.Println(string(data))
}

// Output:
// Hello! This is the first line of a text file.
// This is the second line.

The os.ReadFile() function takes one parameter, name, which is the path to the file we want to read. The path can be absolute or relative to the current working directory. In the example above, we pass "test_file.txt" because the file is expected to be in the current working directory.
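If the path does not point to an existing file, os.ReadFile() returns an error that can be recognized with errors.Is() and fs.ErrNotExist. The following is a minimal sketch of that check; the helper readOrDefault and the file name no_such_file.txt are made up for illustration and are not part of the standard library:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readOrDefault is a hypothetical helper: it returns the file's contents,
// or fallback when the file does not exist. Any other error is returned as is.
func readOrDefault(name, fallback string) (string, error) {
	data, err := os.ReadFile(name)
	if errors.Is(err, fs.ErrNotExist) {
		return fallback, nil
	}
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	// no_such_file.txt is assumed to be absent from the working directory
	text, err := readOrDefault("no_such_file.txt", "(file not found)")
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(text)
}
```

Treating a missing file separately from other errors lets a program fall back to a default instead of exiting, which may or may not be what a given program needs.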

When we read a file via the os.ReadFile() function, the entire content of the file is loaded into memory as a slice of bytes. To see the text contents of the file properly, we convert the byte slice to a string with string(data) on line 14 before printing it.

Because os.ReadFile() loads the whole file into memory at once, we should only use it for small or medium-sized files; reading a very large file this way can exhaust memory and slow the program down.

Reading files line by line

We can also read a text file line by line with the help of the bufio package: the NewScanner() function creates a Scanner, and the Scanner's Scan() and Text() methods return the file one line at a time. Let's read test_file.txt line by line:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
)

func main() {
    // open the file "test_file.txt" in read-only mode
    file, err := os.Open("test_file.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close() // close the file when main returns

    scanner := bufio.NewScanner(file) // create a new Scanner for the file

    for scanner.Scan() {
        fmt.Println(scanner.Text()) // the Text() method returns the current line as a string
    }

    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}

On line 18, the bufio.NewScanner() function creates a Scanner and stores it in the scanner variable. A Scanner provides a convenient interface for reading a file as a sequence of tokens, which by default are newline-delimited lines. In simple terms, the Scanner looks for the newline character \n that ends each line and returns the text before it, without the line ending.

Then, on line 20, the for loop calls scanner.Scan() repeatedly. Each call reads the next line and returns false when there are no more lines or an error occurs, so the program prints the following output:

Hello! This is the first line of a text file.
This is the second line.

Reading files word by word

Another way to read text files without loading them completely into memory is to read the data word by word. To do that, we tell the Scanner to split the input into words with the ScanWords function from the bufio package. Let's go ahead and read the contents of a new file, song.txt, word by word:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
)

func main() {
    file, err := os.Open("song.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)

    scanner.Split(bufio.ScanWords) // split the input into space-separated words

    for scanner.Scan() {
        fmt.Println(scanner.Text())
    }

    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}

In the example above, we call the Scanner's Split() method on line 19 and pass it the bufio.ScanWords function. From then on, every token the Scanner returns is a single word; words are separated by whitespace, which includes spaces, tabs, and newlines.

After executing this program, we get the following word-by-word output:

Work
it
make
it
do
it
makes
us
harder
better
faster
stronger

Reading files in chunks

When working with big text files, a more efficient approach is to read the file in chunks; this means we don't load the whole contents of the file into memory, but instead we load it in small parts or chunks to avoid "out of memory" errors.

We'll use the Read() method of the *os.File value returned by os.Open() to read the contents of the file, and the built-in make() function to create a buffer with a predetermined chunk size in bytes:

package main

import (
    "errors"
    "fmt"
    "io"
    "log"
    "os"
)

const chunkSize = 15

func main() {
    file, err := os.Open("test_file.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    buf := make([]byte, chunkSize) // byte slice buffer of the chunk size defined above

    for {
        readTotal, err := file.Read(buf)
        if err != nil {
            if errors.Is(err, io.EOF) {
                break // after reading the last chunk, break the loop
            }
            log.Fatal(err)
        }
        fmt.Println(string(buf[:readTotal]))
    }
}

The output of this code snippet will be the following:

Hello! This is 
the first line 
of a text file.

This is the s
econd line.

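The key parts of this program are on line 11, where we set chunkSize to 15, and on line 20, where we create the buffer buf via the make() function. Each call to file.Read() fills the buffer with at most chunkSize bytes, so the buffer size determines how the text is cut up: chunks ignore line boundaries, and a line break can land in the middle of a chunk. That is where the blank line in the output comes from: the line break that ends the first line is read at the start of the fourth chunk, and fmt.Println() then adds a newline of its own. The exact position of the split in the second line depends on whether the file uses \n or \r\n line endings. Let's take a look at another output of the program, after increasing chunkSize to 45: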

Hello! This is the first line of a text file.

This is the second line.

Since the first line of test_file.txt is exactly 45 characters long and the buffer now holds 45 bytes, the first chunk is the whole first line. The second chunk contains the line break followed by the entire second line, so both lines are printed intact instead of being split as in the first example, where the chunk size was 15.

Conclusion

In this topic, we've learned different methods of reading a text file's contents in Go. Knowing all of them is useful, since files vary in size and we should be able to adapt to that. Choosing the right method for a big or a small file helps keep our program efficient and avoids performance bottlenecks.

Let's recap all the different ways to read a text file in Go:

Read the entire file via the os.ReadFile() function – recommended only for small or medium-sized files.
Read files line by line, using a for loop to iterate through every line of the file – can be used for files of any size, as long as each line fits in the Scanner's buffer (64 KiB by default).
Read files word by word, using a for loop to iterate through each split word of the file – can be used for any file size.
Read the file in chunks by loading the contents of the file into a sized buffer – recommended for big files.
