Suppose you have a file that contains the entire text of "War and Peace" by Leo Tolstoy. In this file, each sentence of the novel is on its own line.
You want to write a program that preprocesses the text (e.g., splits sentences into words and removes punctuation marks). What is the optimal way to read this file?
Keep in mind that you are likely to perform several operations on each line to preprocess the novel.
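
One possible approach, sketched here under the assumption that the program is written in Python and the file is UTF-8 encoded, is to iterate over the file object so that only one line is held in memory at a time, and to apply all per-line operations inside that loop. The names `preprocess_line` and `read_sentences` are hypothetical helpers introduced for illustration, and the punctuation handling covers only ASCII punctuation.

```python
import string
from typing import Iterator, List

# Translation table that deletes every ASCII punctuation character.
# Assumption: typographic quotes and dashes are not handled here.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess_line(line: str) -> List[str]:
    """Remove ASCII punctuation from one sentence and split it into words."""
    return line.translate(_PUNCT_TABLE).split()


def read_sentences(path: str, encoding: str = "utf-8") -> Iterator[List[str]]:
    """Lazily yield the word list for each non-empty line of the file."""
    with open(path, encoding=encoding) as f:
        # Iterating over the file object reads one line at a time,
        # so the whole novel is never loaded into memory at once.
        for line in f:
            words = preprocess_line(line)
            if words:  # skip blank lines
                yield words
```

Because `read_sentences` is a generator, further per-line steps can be chained onto it (for example `for words in read_sentences(path): ...`) without building an intermediate list of all sentences.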