Hands on Regular Expressions

Yashod Perera
May 1, 2020

A regular expression is a pattern for validating a string.

Hello everyone. In this tutorial I will cover the fundamentals of regular expressions.

When developing an application, you have to validate user input before submitting it. For example, you may need to check whether the user has entered a valid email address. To do that, you match the entered value against an email pattern. Regular expressions are how you write such patterns.

Format of a regular expression

A regular expression commonly has two parts:

  • pattern: the pattern itself. Most of this tutorial is about writing patterns.
  • modifiers: flags that change how the pattern is applied. For example, a pattern written for lowercase letters also accepts uppercase letters once you add the “i” modifier. You can find more modifiers here.
/pattern/modifiers;
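
Here is a small sketch of what a modifier changes. It assumes JavaScript, since the /pattern/modifiers form matches JavaScript's regex literals, although the tutorial itself does not name a language. The variable names are made up for illustration.

```javascript
// The same pattern, without and with the "i" (ignore case) modifier.
const helloExact = /hello/;
const helloAnyCase = /hello/i;

console.log(helloExact.test("hello world"));   // true
console.log(helloExact.test("HELLO world"));   // false
console.log(helloAnyCase.test("HELLO world")); // true
```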

Let’s start.

For a specific word

It is simple to check whether a specific word appears in a paragraph.

/cat/

The pattern above matches any text that contains “cat”, including longer words such as “concatenate”.

That was simple, and you may want to jump straight to complex patterns. We will get there, but let’s go one step at a time.
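
To see this in action, here is a minimal sketch assuming JavaScript and its built-in test method. The sample sentences are made up.

```javascript
// /cat/ matches wherever the three letters c, a, t appear in a row.
const catPattern = /cat/;

console.log(catPattern.test("The cat sat down")); // true
console.log(catPattern.test("concatenate"));      // true, "cat" is inside the word
console.log(catPattern.test("The dog sat down")); // false
console.log(catPattern.test("Cat"));              // false, matching is case sensitive
```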

For two words

Let’s make a pattern that matches two words, cat and bat.

/[cb]at/

Ranges

Next, let’s match cat, bat, rat and mat.

/[cbrm]at/
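
As a quick check, this sketch (assuming JavaScript, with a made-up word list) filters a list down to the words the pattern accepts.

```javascript
// The square brackets match exactly one of the characters c, b, r or m.
const rhymePattern = /[cbrm]at/;

const candidateWords = ["cat", "bat", "rat", "mat", "hat", "sat"];
const matchedWords = candidateWords.filter((word) => rhymePattern.test(word));

console.log(matchedWords); // [ 'cat', 'bat', 'rat', 'mat' ]
```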

As you can see, you can list any number of alternative characters inside square brackets. If you need to accept any letter of the alphabet, you can either list them all, as in /[abcdefghijklmnopqrstuvwxyz]/, or use a range as follows (this matches a single letter).

/[a-z]/
  • a-z: lowercase letters
  • A-Z: uppercase letters
  • 0-9: digits
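
The ranges can be tried out with a short sketch like this one, assuming JavaScript. The variable names are only for illustration, and the last pattern shows that several ranges can share one pair of brackets.

```javascript
const lowercaseLetter = /[a-z]/;
const uppercaseLetter = /[A-Z]/;
const digit = /[0-9]/;
// Ranges can be combined inside a single pair of square brackets.
const letterOrDigit = /[a-zA-Z0-9]/;

console.log(lowercaseLetter.test("ABC")); // false
console.log(uppercaseLetter.test("ABC")); // true
console.log(digit.test("room 7"));        // true
console.log(letterOrDigit.test("?!"));    // false
```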

Exclude

If you need to exclude some characters, put “^” at the start of the square brackets as follows.

/[^ab]at/

The pattern above matches every “*at” except “aat” and “bat”: any single character other than a or b, followed by “at”.
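
A small sketch of the exclusion, assuming JavaScript. Note that the brackets still have to match one character, so “at” on its own is rejected.

```javascript
// [^ab] matches any single character except a and b.
const notAOrB = /[^ab]at/;

console.log(notAOrB.test("cat")); // true
console.log(notAOrB.test("rat")); // true
console.log(notAOrB.test("bat")); // false
console.log(notAOrB.test("aat")); // false
console.log(notAOrB.test("at"));  // false, there is no character before "at"
```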

Repetition

Suppose you have to validate a mobile number with 10 digits. You indicate the number of repetitions in curly braces as follows.

/[0-9]{10}/

It matches any run of 10 digits. However, this pattern, like all the patterns above, also reports a match when those digits are only part of a longer string. Let me explain.

/[0-9]{10}/ accepts 0771234567, and it also accepts 077123456789012, because that string contains a run of 10 digits. To require that the whole string fits the pattern, wrap it as /^<pattern>$/, where “^” marks the start of the string and “$” marks the end. (Outside square brackets “^” means start of string, not exclusion.)

/^[0-9]{10}$/

Now only strings of exactly 10 digits are accepted.
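
The difference between the two patterns can be sketched like this, assuming JavaScript. The names loosePhone and strictPhone are made up for the example.

```javascript
// Without anchors: true if 10 digits appear anywhere in the string.
const loosePhone = /[0-9]{10}/;
// With anchors: true only if the whole string is exactly 10 digits.
const strictPhone = /^[0-9]{10}$/;

console.log(loosePhone.test("0771234567"));       // true
console.log(loosePhone.test("077123456789012"));  // true, it contains a run of 10 digits
console.log(strictPhone.test("0771234567"));      // true
console.log(strictPhone.test("077123456789012")); // false, too long
console.log(strictPhone.test("077123456"));       // false, only 9 digits
```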

Special characters

There are some important special characters which you should know.

  • \d: a digit; \D: any character that is not a digit
  • \s: a whitespace character; \S: any character that is not whitespace
  • \w: an alphanumeric character or an underscore

Apart from those, there are many more special characters, which you can look up using this.
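
Here is a rough sketch of these shorthands, assuming JavaScript. The uppercase forms are the opposites of the lowercase ones, and the sample strings are made up.

```javascript
console.log(/\d/.test("a1"));  // true, "1" is a digit
console.log(/\D/.test("123")); // false, every character is a digit
console.log(/\s/.test("a b")); // true, there is a space
console.log(/\S/.test("   ")); // false, there is only whitespace

// \w covers letters, digits and the underscore, but not the dash.
const wordCharsOnly = /^\w+$/;
console.log(wordCharsOnly.test("user_name1")); // true
console.log(wordCharsOnly.test("user-name"));  // false
```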

One or more

To indicate one or more characters we use + as follows.

/^a+b$/

It matches ab, aab, aaab, and so on.

Zero or more

To indicate zero or more we use * as follows.

/^a*b$/

It matches b, ab, aab, aaab, and so on.

Optional

To indicate optional characters or patterns we use ? as follows.

/^a?b$/

It matches ab and b.
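
The three quantifiers +, * and ? can be compared side by side. This is a small sketch assuming JavaScript, and the variable names are made up.

```javascript
const oneOrMoreA = /^a+b$/;  // at least one "a", then "b"
const zeroOrMoreA = /^a*b$/; // any number of "a", including none, then "b"
const optionalA = /^a?b$/;   // at most one "a", then "b"

for (const input of ["b", "ab", "aab"]) {
  console.log(input, oneOrMoreA.test(input), zeroOrMoreA.test(input), optionalA.test(input));
}
// b false true true
// ab true true true
// aab true true false
```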

Or

If you need to match either tyre or are, you can use “|” (or) as follows.

/^(ty|a)re$/
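
A short sketch of the alternation, assuming JavaScript. The parentheses limit the choice to “ty” or “a”, and “re” must follow either one.

```javascript
const tyreOrAre = /^(ty|a)re$/;

console.log(tyreOrAre.test("tyre"));  // true
console.log(tyreOrAre.test("are"));   // true
console.log(tyreOrAre.test("re"));    // false, one of the two prefixes is required
console.log(tyreOrAre.test("tyres")); // false, "$" allows nothing after "re"
```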

Let’s validate an email

To write a pattern for anything, you first have to identify its structure.

  • Words with alphanumeric characters, underscores, dots and dashes

For that we have to accept alphanumerics, underscores, dots and dashes. “\w” covers alphanumerics and underscores. Dots and dashes are special characters in regular expressions, so we escape them as “\.” and “\-” (inside square brackets the dot would also work unescaped, but escaping it is harmless). Then “+” allows one or more of these characters. Let’s construct that part as follows.

[\w\.\-]+
  • @ mark
@
  • Words with alphanumeric characters, underscores, dots and dashes
[\w\.\-]+
  • Domain ending, which consists of a dot followed by 2 to 5 letters

Here the dot comes first, followed by 2 to 5 letters, and the whole group can repeat one or more times, as follows.

(\.[a-zA-Z]{2,5})+

Let’s put them all together and construct the pattern for an email.

/^([\w\.\-]+)@([\w\.\-]+)((\.[a-zA-Z]{2,5})+)$/
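
To try the finished pattern, here is a minimal sketch assuming JavaScript. The helper name isValidEmail and the sample addresses are made up for illustration. Keep in mind that this pattern is a simplification: it only checks the overall shape described above, not every rule a real email address must follow.

```javascript
// The email pattern built in this tutorial.
const emailPattern = /^([\w\.\-]+)@([\w\.\-]+)((\.[a-zA-Z]{2,5})+)$/;

// Hypothetical helper: returns true when the value has the expected shape.
function isValidEmail(value) {
  return emailPattern.test(value);
}

console.log(isValidEmail("john.doe@example.com")); // true
console.log(isValidEmail("a@b.co.uk"));            // true, the ending can repeat
console.log(isValidEmail("john@example"));         // false, no dot and letters at the end
console.log(isValidEmail("john.example.com"));     // false, no @ mark
console.log(isValidEmail("john@example.c"));       // false, the ending needs 2 to 5 letters
```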

I hope you now have a better understanding of regular expressions.

If you have found this helpful please hit that 👏 and share it on social media :)
