In our work, we often have to extract the information we need from a large amount of data. If you need to filter the output of console utilities or find specific data in files, grep is a great helper.
Search for a word in a file
Grep (global regular expression print) is a command-line utility that searches for lines matching a regular expression. With grep you can search for text in files, analyze logs, and simplify your work with other command-line utilities. It is a very useful and powerful utility that can save you a lot of time.
In this article we will use the log.txt file, which contains the following lines:
2021-02-15 03:18.57 [info] response code=200 elapsed=329.08
2021-02-15 03:18.64 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.44 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
If you want to search for strings in a file, you can pass the filename directly to grep. It reads the file and prints only the lines that match the given pattern.
└> grep error log.txt
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
Now let's take a closer look at what else we can do with the grep utility!
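Because the search query is a regular expression and not only a fixed word, a single pattern can cover several variants. The sketch below is one way to try this: it recreates the sample log in a temporary directory (the `workdir` variable is a name chosen for illustration, not part of the article's setup) and looks for every response whose code is in the 500 to 599 range.

```shell
# Recreate the article's sample log in a scratch directory.
# workdir is a hypothetical variable name used only in this sketch.
workdir=$(mktemp -d)
cat > "$workdir/log.txt" <<'EOF'
2021-02-15 03:18.57 [info] response code=200 elapsed=329.08
2021-02-15 03:18.64 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.44 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
EOF

# [0-9] matches any single digit, so the pattern covers code=500 through code=599.
# The single quotes keep the shell from interpreting the square brackets.
grep 'code=5[0-9][0-9]' "$workdir/log.txt"
```

With the sample log this prints the two lines that carry code=500, regardless of how the word error is capitalized in them.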
Ignore case
Sometimes we do not know whether a word is capitalized. You can see that the word error appears with different capitalization in our file. To account for all variants of a word, grep has the ignore-case flag -i.
Let's find all the errors in the log.txt file:
└> grep -i error log.txt
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
The output is three lines.
If you run the command without the -i flag, then there will be only two lines because one word, Error, is capitalized:
└> grep error log.txt
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
Show a line number
We already know how to find the lines that contain a given word, but how can we locate these lines in the file? Grep has a special flag for this: -n (line number). It's helpful when you are working with configuration files, which can be very large. If you want to change the value of a certain parameter, you can use grep to find the number of the line with that parameter. Then just open the file and go to that line.
We'll continue working with the same log.txt file and find the number of the line with the 400 error code:
└> grep -n 400 log.txt
4:2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
The 4 at the beginning of the output is the number of the line in the file that contains the string 400.
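Once you have the line number, you can also reuse it on the command line instead of opening an editor. The following is a minimal sketch, assuming the same sample log and a single match of interest. The names `workdir` and `line` are hypothetical, and `cut`, `head` and `sed` are standard utilities that the article itself does not cover.

```shell
# Recreate the article's sample log in a scratch directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
cat > "$workdir/log.txt" <<'EOF'
2021-02-15 03:18.57 [info] response code=200 elapsed=329.08
2021-02-15 03:18.64 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.44 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
EOF

# grep -n prints "NUMBER:LINE"; cut keeps the part before the first colon,
# and head keeps only the first match in case there are several.
line=$(grep -n 400 "$workdir/log.txt" | cut -d: -f1 | head -n 1)
echo "$line"

# sed -n "Np" prints only line N of the file.
sed -n "${line}p" "$workdir/log.txt"
```

Here the first command prints 4, and the second prints the fourth line of the file, the one with code=400.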
Inverted search
It's good to find lines that match a pattern, but how can we invert the search and get only the lines that do not match it? Grep has the invert-match flag -v for that too!
For example, suppose we want to read only the server errors from the log. Let's start by listing all the errors:
└> grep -i error log.txt
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
The result includes both server errors, with codes greater than or equal to 500, and client errors, with codes from 400 to 499.
So we can add the condition "don't show lines that contain 400" to the previous command:
└> grep -i error log.txt | grep -v 400
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
And now the result contains only server errors, i.e. errors with a code greater than or equal to 500.
You can combine several flags at once. For example, the combination of flags -i -n or -in applies case-insensitive search and also prints the line numbers at the same time.
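As a small illustration of combined flags, the sketch below runs a case-insensitive search with line numbers and then pipes the result through an inverted filter. It assumes the same sample log, recreated in a temporary directory under the hypothetical name `workdir`. The inverted filter uses the pattern `code=4[0-9][0-9]` rather than a bare 400, because a bare 400 would also drop any line that happens to contain 400 elsewhere, for instance in an elapsed value.

```shell
# Recreate the article's sample log in a scratch directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
cat > "$workdir/log.txt" <<'EOF'
2021-02-15 03:18.57 [info] response code=200 elapsed=329.08
2021-02-15 03:18.64 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:18.77 [Error] response code=500 elapsed=329.08
2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.44 [info] request json={"make_a_problem": true} url=https://some.site.org/api
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08
EOF

# -in is the same as -i -n: ignore case and prefix each match with its line number.
grep -in error "$workdir/log.txt"

# The line numbers survive the pipe, so the second grep only has to drop
# the client errors, i.e. lines whose code field is in the 400 to 499 range.
grep -in error "$workdir/log.txt" | grep -v 'code=4[0-9][0-9]'
```

The first command reports matches on lines 3, 4 and 6 of the file. The second keeps only lines 3 and 6.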
Filtering output of utilities
In the last example, we used grep twice and passed the output of one command to the input of the other. This is possible because grep reads standard input when no file is given, so we can use it to filter the results of other utilities, including grep itself.
Let's take the familiar echo utility, which prints text to the console. The following command returns the word "something":
└> echo "something" | grep some
something
However, the command below returns nothing:
└> echo "something" | grep -w some
It's all about the -w flag. It makes grep match whole words only, so "some" does not match when it is just a part of "something". We need to specify the whole word "something":
└> echo "something" | grep -w something
something
To search for a multi-word expression, enclose the search query in quotation marks. Both single and double quotation marks work.
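To make the quoting rule concrete, here is a short sketch that searches for a two-word expression. It uses two lines copied from the sample log and stores them in a variable named `sample`, a name chosen only for this illustration.

```shell
# Two lines from the article's sample log (sample is a hypothetical variable name).
sample='2021-02-15 03:18.97 [error] response code=400 elapsed=329.08
2021-02-15 03:19.47 [error] response code=500 elapsed=329.08'

# With quotes, the whole expression is passed to grep as one pattern.
printf '%s\n' "$sample" | grep 'response code=500'
printf '%s\n' "$sample" | grep "response code=500"

# Without quotes, grep would receive "response" as the pattern and
# "code=500" as a file name to search in, which is not what we want.
```

Both commands print the line with code=500. The practical difference between the two kinds of quotes comes from the shell: inside double quotes it still expands things like `$variable`, while single quotes pass the text through unchanged.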
You can combine grep with any other command-line utility you are familiar with. For example, you can pass it the output of commands like ls or tree to find files whose names match a pattern.
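A possible way to try the ls case is sketched below. The directory and the file names are made up for the example, and `demo_dir` is a hypothetical variable name, so nothing here depends on your own files.

```shell
# Scratch directory with a few made-up file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/log.txt" "$demo_dir/notes.md" "$demo_dir/old_log.txt" "$demo_dir/LOG_backup.tar"

# Keep only the names that contain "log", ignoring case.
ls "$demo_dir" | grep -i log

# Keep only the names that end in .txt; the backslash makes the dot literal
# and $ anchors the pattern to the end of the name.
ls "$demo_dir" | grep '\.txt$'
```

The first command lists the three names that contain log in any letter case, and the second lists the two .txt files. The order of the names depends on how ls sorts them on your system.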
Conclusion
In this article we've looked at the basic techniques of working with the grep utility:
searching for strings that contain the desired information
ignoring letter case if necessary
filtering out unnecessary results using the -v flag
searching by full word
filtering the results of other utilities