Javascript Regular Expression Cheat Sheet




These are lookaround assertions in regular expressions in JavaScript:

  • Positive lookahead: (?=«pattern»)
  • Negative lookahead: (?!«pattern»)
  • Positive lookbehind: (?<=«pattern»)
  • Negative lookbehind: (?<!«pattern»)

This blog post shows examples of using them.

Cheat sheet: lookaround assertions #

At the current location in the input string:

  • Lookahead assertions (ECMAScript 3):
    • Positive lookahead: (?=«pattern») matches if pattern matches what comes after the current location.
    • Negative lookahead: (?!«pattern») matches if pattern does not match what comes after the current location.
  • Lookbehind assertions (ECMAScript 2018):
    • Positive lookbehind: (?<=«pattern») matches if pattern matches what comes before the current location.
    • Negative lookbehind: (?<!«pattern») matches if pattern does not match what comes before the current location.
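The four assertions listed above can be tried side by side. The following is a minimal sketch; the sample strings and variable names are illustrative assumptions, not taken from the post:

```js
// Each call replaces only the character that satisfies the assertion.
const posAhead = 'ab ac'.replace(/a(?=b)/g, 'X');   // 'a' followed by 'b'
const negAhead = 'ab ac'.replace(/a(?!b)/g, 'X');   // 'a' not followed by 'b'
const posBehind = 'ab cb'.replace(/(?<=a)b/g, 'X'); // 'b' preceded by 'a'
const negBehind = 'ab cb'.replace(/(?<!a)b/g, 'X'); // 'b' not preceded by 'a'

console.log([posAhead, negAhead, posBehind, negBehind]);
// [ 'Xb ac', 'ab Xc', 'aX cb', 'ab cX' ]
```

Note that the character inside the assertion is never replaced: it is only checked, not consumed.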

For more information, see 'JavaScript for impatient programmers': lookahead assertions, lookbehind assertions.

A word of caution about regular expressions #

Regular expressions are a double-edged sword: powerful and short, but also sloppy and cryptic. Sometimes different, longer approaches (especially proper parsers) may be better, especially for production code.

Another caveat is that lookbehind assertions are a relatively new feature that may not be supported by all JavaScript engines you are targeting.

Example: Specifying what comes before or after a match (positive lookaround) #

In the following interaction, we extract quoted words:
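A minimal sketch of such an extraction, assuming the quote character is a single quote and the words consist of lowercase letters (the variable names are illustrative):

```js
const input = "how 'are' 'you' doing";

// [a-z]+ must be preceded and followed by a quote,
// but the quotes themselves are not part of the match.
const quotedWords = input.match(/(?<=')[a-z]+(?=')/g);

console.log(quotedWords); // [ 'are', 'you' ]
```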

Two lookaround assertions help us here:

  • (?<=') 'must be preceded by a quote'
  • (?=') 'must be followed by a quote'

Lookaround assertions are especially convenient for .match() in its /g mode, which returns whole matches (capture group 0). Whatever the pattern of a lookaround assertion matches is not captured. Without lookaround assertions, the quotes show up in the result:
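For comparison, here is a sketch of the same extraction without lookaround assertions, again assuming single quotes and lowercase words:

```js
const text = "how 'are' 'you' doing";

// The quotes are now consumed by the pattern, so they are part of each match.
const withQuotes = text.match(/'[a-z]+'/g);

console.log(withQuotes); // [ "'are'", "'you'" ]
```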

Example: Specifying what does not come before or after a match (negative lookaround) #

How can we achieve the opposite of what we did in the previous section and extract all unquoted words from a string?

  • Input: "how 'are' 'you' doing"
  • Output: ['how', 'doing']

Our first attempt is to simply convert positive lookaround assertions to negative lookaround assertions. Alas, that fails:
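A sketch of that failed attempt, assuming single quotes as the quote character:

```js
const sentence = "how 'are' 'you' doing";

// Naive negation: "not preceded by a quote" and "not followed by a quote".
const firstAttempt = sentence.match(/(?<!')[a-z]+(?!')/g);

console.log(firstAttempt); // [ 'how', 'r', 'o', 'doing' ]
```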

The problem is that we extract sequences of characters that are not bracketed by quotes. That means that in the string 'are', the 'r' in the middle is considered unquoted, because it is preceded by an 'a' and followed by an 'e'.

We can fix this by stating that prefix and suffix must be neither quote nor letter:
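One way this fix might look, assuming the words consist only of lowercase letters:

```js
const phrase = "how 'are' 'you' doing";

// Prefix and suffix must be neither a quote nor a letter.
const unquotedWords = phrase.match(/(?<!['a-z])[a-z]+(?!['a-z])/g);

console.log(unquotedWords); // [ 'how', 'doing' ]
```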

Another solution is to demand via \b that the sequence of characters [a-z]+ start and end at word boundaries:
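A sketch of the word-boundary variant, under the same assumptions (single quotes, lowercase words):

```js
const sample = "how 'are' 'you' doing";

// \b forces [a-z]+ to cover a whole word, so partial matches
// such as the 'r' in 'are' are no longer possible.
const viaBoundaries = sample.match(/(?<!')\b[a-z]+\b(?!')/g);

console.log(viaBoundaries); // [ 'how', 'doing' ]
```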

One thing that is nice about negative lookbehind and negative lookahead is that they also work at the beginning or end, respectively, of a string – as demonstrated in the example.

There are no simple alternatives to negative lookaround assertions #

Negative lookaround assertions are a powerful tool and difficult to emulate via other (regular expression) means.

If you don't want to use them, you normally have to take a completely different approach. For example, in this case, you could split the string into (quoted and unquoted) words and then filter those:
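A minimal sketch of the split-and-filter approach. The helper isQuoted is a hypothetical name, and the sketch assumes that words are separated by whitespace and quoted with single quotes:

```js
// Hypothetical helper: a word counts as quoted if it starts and ends with a quote.
const isQuoted = (word) =>
  word.length >= 2 && word.startsWith("'") && word.endsWith("'");

const allWords = "how 'are' 'you' doing".split(/\s+/);
const plainWords = allWords.filter((word) => !isQuoted(word));

console.log(plainWords); // [ 'how', 'doing' ]
```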

Benefits of this approach:

  • It works on older engines.
  • It is easy to understand.

Interlude: pointing lookaround assertions inward #

All of the examples we have seen so far have in common that the lookaround assertions dictate what must come before or after the match but without including those characters in the match.

The regular expressions shown in the remainder of this blog post are different: Their lookaround assertions point inward and restrict what's inside the match.

Example: match strings not starting with 'abc' #

Let's assume we want to match all strings that do not start with 'abc'. Our first attempt could be the regular expression /^(?!abc)/.

That works well for .test():
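A quick check with .test(); the input strings are illustrative:

```js
const RE_NOT_ABC = /^(?!abc)/;

const acceptsXyz = RE_NOT_ABC.test('xyz');   // does not start with 'abc'
const acceptsAbcd = RE_NOT_ABC.test('abcd'); // starts with 'abc'

console.log(acceptsXyz, acceptsAbcd); // true false
```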

However, .exec() gives us an empty string:
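A sketch that shows the empty match, using an illustrative input string:

```js
const emptyMatch = /^(?!abc)/.exec('xyz');

// The match succeeds at index 0 but covers no characters.
console.log(emptyMatch[0] === '', emptyMatch.index); // true 0
```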

The problem is that assertions such as lookaround assertions don't expand the matched text. That is, they don't capture input characters, they only make demands about the current location in the input.

Therefore, the solution is to add a pattern that does capture input characters:
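One pattern that does this is sketched below. It assumes we want the match to span the whole string, so .*$ is appended after the guard:

```js
const RE_FULL = /^(?!abc).*$/;

// The guard still only checks; .* is what consumes the characters.
const fullMatch = RE_FULL.exec('xyz');

console.log(fullMatch[0]); // 'xyz'
```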

As desired, this new regular expression rejects strings that are prefixed with 'abc':

And it accepts strings that don't have the full prefix:
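Both behaviors, rejection and acceptance, can be checked in one sketch. The pattern /^(?!abc).*$/ and the sample inputs are assumptions made for illustration:

```js
const RE_NO_ABC_PREFIX = /^(?!abc).*$/;

// Rejected: strings that are prefixed with 'abc'.
const rejected = ['abc', 'abcd'].map((s) => RE_NO_ABC_PREFIX.test(s));

// Accepted: strings that lack the full prefix.
const accepted = ['ab', 'a', ''].map((s) => RE_NO_ABC_PREFIX.test(s));

console.log(rejected); // [ false, false ]
console.log(accepted); // [ true, true, true ]
```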

Example: match substrings that do not contain '.mjs' #

In the following example, we want to find

  import ··· from «module-specifier»;

where module-specifier does not end with '.mjs'.

Here, the lookbehind assertion (?<!\.mjs) acts as a guard and prevents the regular expression from matching strings that contain '.mjs' at this location.
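A sketch of how such a search could look. The sample import statements and variable names are made up for illustration, and the sketch assumes module specifiers are written in single quotes:

```js
const code = [
  "import {transform} from './util';",
  "import {Person} from './person.mjs';",
  "import {zip} from 'lodash';",
].join('\n');

// The guard (?<!\.mjs) sits right before the closing quote:
// at that location, the preceding characters must not be '.mjs'.
const nonMjsImports = code.match(/^import .*? from '[^']*(?<!\.mjs)';$/umg);

console.log(nonMjsImports);
// [ "import {transform} from './util';", "import {zip} from 'lodash';" ]
```

The /m flag lets ^ and $ match at line boundaries, so each line is checked on its own.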

Example: skipping lines with comments #

Scenario: We want to parse lines with settings, while skipping comments. For example:
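A sketch of such a scenario. The sample lines and the settings array are illustrative assumptions; RE_SETTING is the regular expression that the rest of this section builds up:

```js
const RE_SETTING = /^(?!#)([^:]*):(.*)$/;

const lines = [
  'indent: 2',                   // setting
  '# Trim trailing whitespace:', // comment
  'wrap: false',                 // setting
];

const settings = [];
for (const line of lines) {
  const match = RE_SETTING.exec(line);
  if (match) {
    // Group 1 is the key, group 2 is everything after the colon.
    settings.push({ key: match[1], value: match[2] });
  }
}

console.log(settings.map((s) => s.key)); // [ 'indent', 'wrap' ]
```

The captured values still include the space after the colon (' 2', ' false'); trimming them would be a separate step.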

How did we arrive at the regular expression RE_SETTING?

We started with the following regular expression for settings:

Intuitively, it is a sequence of the following parts:

  • Start of the line
  • Non-colons (zero or more)
  • A single colon
  • Any characters (zero or more)
  • The end of line
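The parts listed above can be mapped onto a pattern as follows. This is a sketch; the name RE_BASIC and the two capture groups are assumptions made for illustration:

```js
// ^        start of the line
// ([^:]*)  non-colons (zero or more), captured as group 1
// :        a single colon
// (.*)     any characters (zero or more), captured as group 2
// $        the end of line
const RE_BASIC = /^([^:]*):(.*)$/;

const basicMatch = RE_BASIC.exec('indent: 2');

console.log(basicMatch[1]); // 'indent'
console.log(basicMatch[2]); // ' 2'
```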

This regular expression does reject some comments:

But it accepts others (that have colons in them):
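Both cases can be reproduced with a sketch like the following, assuming the unguarded pattern /^([^:]*):(.*)$/ and two made-up comment lines:

```js
const RE_UNGUARDED = /^([^:]*):(.*)$/;

// A comment without a colon is rejected, because the pattern requires one.
const plainComment = RE_UNGUARDED.test('# Comment');

// A comment that contains a colon is accepted by mistake.
const commentWithColon = RE_UNGUARDED.test('# Comment:');

console.log(plainComment, commentWithColon); // false true
```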

We can fix that by prefixing (?!#) as a guard. Intuitively, it means: 'The current location in the input string must not be followed by the character #.'

The new regular expression works as desired:
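A sketch of that check, with illustrative input lines:

```js
const RE_GUARDED = /^(?!#)([^:]*):(.*)$/;

const guardedComment = RE_GUARDED.test('# Comment:'); // starts with '#'
const guardedSetting = RE_GUARDED.test('indent: 2');  // a real setting

console.log(guardedComment, guardedSetting); // false true
```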

Example: smart quotes #

Let's assume we want to convert pairs of straight double quotes to curly quotes:

  • Input: `"yes" and "no"`
  • Output: `“yes” and “no”`

This is our first attempt:
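A sketch of such a first attempt, assuming the input consists of straight double quotes around the words yes and no:

```js
// Greedy: .* runs from the first straight quote to the last one.
const greedy = '"yes" and "no"'.replace(/"(.*)"/g, '“$1”');

console.log(greedy); // “yes" and "no”
```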

Only the first quote and the last quote are curly. The problem here is that the * quantifier matches greedily (as much as possible).

If we put a question mark after the *, it matches reluctantly:
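The reluctant version, sketched under the same assumption about the input:

```js
// Reluctant: .*? stops at the nearest closing quote.
const reluctant = '"yes" and "no"'.replace(/"(.*?)"/g, '“$1”');

console.log(reluctant); // “yes” and “no”
```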

Supporting escaping via backslashes #

What if we want to allow the escaping of quotes via backslashes? We can do that by using the guard (?<!\\) before the quotes:
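A sketch with the negative lookbehind guard (?<!\\) in front of both quotes. The input string is illustrative, and String.raw is used so that the backslashes appear literally:

```js
// The input contains the characters: \"straight\" and "curly"
const escapedInput = String.raw`\"straight\" and "curly"`;

// A quote only counts if it is not preceded by a backslash.
const guarded = escapedInput.replace(/(?<!\\)"(.*?)(?<!\\)"/g, '“$1”');

console.log(guarded); // \"straight\" and “curly”
```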

As a post-processing step, we would still need to do:
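A sketch of what that post-processing might be, assuming its purpose is to turn each remaining backslash-escaped quote into a plain straight quote:

```js
// The intermediate result contains the characters: \"straight\" and “curly”
const converted = String.raw`\"straight\" and “curly”`;

// Drop the backslash in front of every escaped quote.
const unescaped = converted.replace(/\\"/g, '"');

console.log(unescaped); // "straight" and “curly”
```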

However, this regular expression can fail when there is a backslash-escaped backslash:
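A sketch of the failure, assuming the simple guard (?<!\\) and an illustrative input in which the quoted text is an escaped backslash:

```js
// The input contains the characters: Backslash: "\\"
const backslashInput = String.raw`Backslash: "\\"`;

const unchanged = backslashInput.replace(/(?<!\\)"(.*?)(?<!\\)"/g, '“$1”');

// The closing quote is preceded by a backslash, so nothing is replaced.
console.log(unchanged === backslashInput); // true
```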

The second backslash prevented the quotes from becoming curly.

We can fix that if we make our guard more sophisticated: (?<=[^\\](?:\\\\)*). Here, (?: makes the group non-capturing.

(Credit: @jonasraoni)

The new guard allows pairs of backslashes before quotes:
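A sketch with the guard (?<=[^\\](?:\\\\)*) in front of both quotes; the input string and the name RE_PAIRS are illustrative assumptions:

```js
// Guard: a non-backslash character, followed by zero or more PAIRS of backslashes.
const RE_PAIRS = /(?<=[^\\](?:\\\\)*)"(.*?)(?<=[^\\](?:\\\\)*)"/g;

// The input contains the characters: Backslash: "\\"
const pairInput = String.raw`Backslash: "\\"`;
const pairResult = pairInput.replace(RE_PAIRS, '“$1”');

console.log(pairResult); // Backslash: “\\”
```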

One issue remains. This guard prevents the first quote from being matched if it appears at the beginning of a string:
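A sketch of the remaining issue, assuming the guard (?<=[^\\](?:\\\\)*) in front of both quotes:

```js
const RE_NEEDS_PREFIX = /(?<=[^\\](?:\\\\)*)"(.*?)(?<=[^\\](?:\\\\)*)"/g;

// The opening quote is the very first character, so there is
// no non-backslash character before it for the guard to match.
const atStart = '"abc"'.replace(RE_NEEDS_PREFIX, '“$1”');

console.log(atStart); // "abc" (unchanged)
```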

We can fix that by changing the first guard to: (?<=[^\\](?:\\\\)*|^)
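A sketch with the first guard extended by the alternative |^, so that it also succeeds at the beginning of the string (the name RE_FINAL and the input are illustrative):

```js
const RE_FINAL = /(?<=[^\\](?:\\\\)*|^)"(.*?)(?<=[^\\](?:\\\\)*)"/g;

const fixedAtStart = '"abc"'.replace(RE_FINAL, '“$1”');

console.log(fixedAtStart); // “abc”
```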

Further reading #

  • Chapter 'Regular expressions (RegExp)' in 'JavaScript for impatient programmers'



