Introduction to Regexp

Regular expressions can seem a bit scary at first glance. They look like an alien is trying to communicate with you, don't they? But there's no need to worry because you don't have to learn all the characters and their meanings in one go. If you break the learning down into small parts instead, it becomes much easier.

A regular expression is very useful when you want to look for specific patterns in text, or when you're working with a variety of data and need to carry out a detailed search. This tool can be used in different ways, and learning how to apply it can make your life much easier.

Creating a RegExp

You can create your regex with the RegExp constructor using the keyword new or with literal notation using two slashes //.

Examples of both these approaches can be seen below.

The constructor accepts two arguments: the first is a string that defines the pattern you want to match, and the second is an optional string containing any flags you would like to use:

let regex = new RegExp('[a-z]', 'i');

Using literal notation is simpler. You enclose your pattern in slashes (/pattern/) instead of quotation marks ('pattern'):

let regex = /[a-z]/i;

You can also use the constructor with literal notation if you want. So the code below is a hundred percent valid:

let regex = new RegExp(/[a-z]/i);

// This is also valid (a separate name avoids redeclaring regex with let)
let regexWithFlags = new RegExp(/[a-z]/, 'i');

There are only a few minor differences between the two ways of creating regexes (which we will cover in the next section), so in most cases the option you use is a matter of personal preference.
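As a quick check that these forms produce equivalent objects, the sketch below builds the same pattern in three ways and compares the standard source and flags properties. The variable names are illustrative only.

```javascript
// Three ways to build the same case-insensitive pattern.
const fromString = new RegExp('[a-z]', 'i');
const fromLiteral = /[a-z]/i;
const fromBoth = new RegExp(/[a-z]/, 'i');

// `source` holds the pattern text and `flags` holds the flag letters.
const all = [fromString, fromLiteral, fromBoth];
const sameSource = all.every((re) => re.source === '[a-z]');
const sameFlags = all.every((re) => re.flags === 'i');

console.log(sameSource, sameFlags); // true true
```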

Constructor vs. literal

The main difference between the two approaches is, of course, their syntax. But they also differ in a couple of other ways.

In JavaScript strings, the backslash \ starts an escape sequence, such as a line break \n or a quotation mark \'. This means that if you want to use any of the special regex sequences in a string, like \w (which matches any word character), you'll need a second backslash so that the backslash itself reaches the regex: \\w. Let's look at an example to make this clearer:

let regex = new RegExp('\\w');

If you leave out the second backslash, the string '\w' is simply 'w', so the regex matches only the literal letter w:

let regex = new RegExp('\w');

console.log(regex.test('abc')); // false
console.log(regex.test('w')); // true

However, with literal notation, you don't need to escape the backslash:

let regex = /\w/;

So this method is more convenient if you don't want to be concerned with escaping.
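To make the escaping rule concrete, here is a small sketch that inspects the strings before they reach the constructor. The variable names are made up for illustration.

```javascript
// In a normal string, '\w' is not a recognized escape, so the backslash is dropped.
const unescaped = '\w';
const escaped = '\\w';

console.log(unescaped.length); // 1 (just the letter w)
console.log(escaped.length); // 2 (a backslash followed by w)

// Only the escaped string produces the same pattern as the literal /\w/.
const fromEscaped = new RegExp(escaped);
console.log(fromEscaped.source === /\w/.source); // true
console.log(fromEscaped.test('abc')); // true
```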

Another difference is the way flags are specified. When using the constructor, you need to pass the flag (or flags) as the second argument. Take the i flag, for example. It's used to return matches regardless of whether the letters are in upper or lower case:

let regex = new RegExp('\\w', 'i');

With literal notation, you can simply put the flags after the last slash:

let regex = /\w/i;
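Because \w already matches both uppercase and lowercase letters, the effect of the i flag is easier to see with a pattern such as [a-z]. The sketch below is only an illustration with made-up variable names:

```javascript
const caseSensitive = /[a-z]/;
const caseInsensitive = /[a-z]/i;
const fromConstructor = new RegExp('[a-z]', 'i');

console.log(caseSensitive.test('A')); // false
console.log(caseInsensitive.test('A')); // true
console.log(fromConstructor.test('A')); // true

// The ignoreCase property reports whether the i flag is set.
console.log(caseInsensitive.ignoreCase); // true
```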

Now that you know the main differences between these two methods, it's up to you to choose which one you prefer.

Using RegExp

You can use the test() method to check whether a string contains a match for a regex pattern.

Alternatively, match() can be used to return regex pattern matches found in a string. This method will be covered in a later topic.

To use test(), you'll need the regex object and, of course, the string itself. You can see the method's syntax below:

regex.test(string)

The return value will be true if a match is found and false if it isn't.

Let's look at some examples:

let regex = /\d+/;

console.log(regex.test('2021')); // true
console.log(regex.test('199')); // true
console.log(regex.test('a5b')); // true
console.log(regex.test('')); // false

In the above example, the pattern matches any string containing one or more (+) digits (\d). Because test() looks for a match anywhere in the string, a string that mixes digits and letters is still a match.

However, if the string consists only of letters, it won't match, and the method will return false:

let regex = /\d+/;
let strChar = 'hello';

console.log(regex.test(strChar)); // false
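Since test() returns a boolean, it fits naturally into conditions and array methods. As a minimal sketch (the array and variable names are hypothetical), the following keeps only the strings that contain at least one digit:

```javascript
const hasDigit = /\d+/;
const inputs = ['2021', 'a5b', 'hello', ''];

// filter() keeps the items for which the callback returns true.
const withDigits = inputs.filter((text) => hasDigit.test(text));

console.log(withDigits); // [ '2021', 'a5b' ]
```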

The dot character

Regex patterns support many special characters that you can combine, omit, quantify, and so on. One of these special characters is the dot ., which matches a single character of almost any kind (letter, digit, space, etc.). The only exceptions are line break characters such as \n and \r.

Let's say you want to match several versions of JavaScript, such as ES6 and ES7. The dot character can help you in this case, as shown below:

let regex = /JavaScript ES./;

console.log(regex.test('JavaScript ES6')); // true
console.log(regex.test('JavaScript ES7')); // true
console.log(regex.test('JavaScript ES')); // false

The dot character matches any single character, including a digit, so the first two strings match the pattern. But the last one doesn't because there's no character after ES.
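To see that the dot accepts more than digits, the sketch below compares it with \d, which matches digits only. The extra test strings are invented for illustration.

```javascript
const anyChar = /JavaScript ES./;
const digitOnly = /JavaScript ES\d/;

// The dot accepts a letter or a punctuation mark after ES.
console.log(anyChar.test('JavaScript ESX')); // true
console.log(anyChar.test('JavaScript ES!')); // true

// \d accepts only a digit in that position.
console.log(digitOnly.test('JavaScript ESX')); // false
console.log(digitOnly.test('JavaScript ES6')); // true
```

If you only ever want a digit in that position, \d is the stricter choice.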

You can also use it to match spaces and punctuation marks.

Let's look at some examples:

let regex = /JavaScript.is.awesome./;

console.log(regex.test('JavaScript is awesome!')); // true
console.log(regex.test('JavaScript is awesome?')); // true
console.log(regex.test('JavaScript.is.awesome.')); // true
console.log(regex.test('JavaScript is awesome\n')); // false
console.log(regex.test('JavaScript is awesome\r')); // false

You can see that the strings ending in the line break characters \n and \r returned false because the dot doesn't match them.

If you want an exact match of a dot ., you need to escape that character with a backslash \.:

let regex = /JavaScript.is.awesome\./;

console.log(regex.test('JavaScript is awesome!')); // false
console.log(regex.test('JavaScript is awesome.')); // true

The first string wasn't a match because the pattern requires a literal dot right after awesome, and no other character will do.

Don't forget that if you're passing a string to the constructor, you'll also need to escape the backslash:

let regex = new RegExp('JavaScript.is.awesome\\.');
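The following sketch shows what happens when the second backslash is forgotten: the string escape \. collapses to a plain dot, which the regex again treats as "any character". The variable names are illustrative.

```javascript
const escapedDot = new RegExp('JavaScript.is.awesome\\.');
const unescapedDot = new RegExp('JavaScript.is.awesome\.');

console.log(escapedDot.source); // JavaScript.is.awesome\.
console.log(unescapedDot.source); // JavaScript.is.awesome.

// Only the properly escaped version insists on a literal dot at the end.
console.log(escapedDot.test('JavaScript is awesome!')); // false
console.log(unescapedDot.test('JavaScript is awesome!')); // true
```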

The question mark

Sometimes you want a character to be optional, meaning that it can be in the string or not. This is very useful when the string might have several variations.

A question mark will make the preceding character optional. You can think of it as a quantifier: "Match zero or one instance of the previous character."

Let's say you want to match the following sentence:

"I have 1 dog"

And you also want to return a match if the person has more than one dog. So, "I have 3 dogs" should also be a match.

The question mark can be handy in cases like this:

let regex = /I have \d dogs?/;

console.log(regex.test('I have 1 dog')); // true
console.log(regex.test('I have 2 dogs')); // true
console.log(regex.test('I have 3 dogs')); // true

All the strings were matched because the s at the end is optional.
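Keep in mind that ? affects only the character directly before it. As a small sketch with made-up inputs, note that \d still matches exactly one digit, so a two-digit number is not a match here, and the same quantifier handles spelling variants such as color and colour:

```javascript
const dogs = /I have \d dogs?/;

console.log(dogs.test('I have 1 dog')); // true
console.log(dogs.test('I have 10 dogs')); // false (\d matches a single digit)

// The u is optional, so both spellings match.
const colour = /colou?r/;

console.log(colour.test('color')); // true
console.log(colour.test('colour')); // true
console.log(colour.test('colouur')); // false (at most one u is allowed)
```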

The process for matching a question mark ? is the same as the one used to match a dot. You need to escape it \?:

let regex = /Am I learning regex\?/;

console.log(regex.test('Am I learning regex?')); // true
console.log(regex.test('Am I learning regex!')); // false

Conclusion

In this topic, you learned how to create a regex object in JavaScript. There are two ways to do this: with the RegExp constructor or with literal notation /pattern/, and you've discovered how these approaches differ. You also learned how to use the test() method to match a regex pattern with a string, and saw that it returns true when there's a match and false if not.

In addition, you now know about some of the most important regex characters and what they are used for, such as the dot, which matches any character except a line break, and the question mark, which makes the preceding character optional. So, how about putting all this into practice with some exercises?
