Regular expression not category
The simplest regular expression is a single literal character. Two regular expressions can be alternated or concatenated to form a new [...] Matches a sequence of zero or more (possibly different) [...] The operator precedence, from weakest to strongest binding, is first alternation, then concatenation, and finally the [...] Explicit parentheses can be used to force different meanings, [...] The syntax described so far is most of the traditional Unix [...] This subset suffices to describe all regular [...]

Before we dive into specific cases, let's first discuss how the whole expression works.

How The Main Expression Works

To begin our expression, we first allow everything to be matched. This is done with the dot character . which matches any character, followed by the zero-or-more quantifier *. Together they match zero or more of any character:

/.*/

Next, we add a negative lookahead, written in the form (?!abc). The negative lookahead looks ahead into the string to see if the specified expression (abc in this case) is present. It works by only checking whether the abc expression is present, without actually matching or returning it. Note that we place the negative lookahead at the start of the expression to ensure that it is validated before anything else is checked:

/(?!abc).*/

The expression above starts from the first character in the string and, at each position, checks whether abc follows immediately. It will not match at a position where it finds abc. However, when it moves on to the substring starting with the second character, bc, the lookahead no longer finds abc, since bc is not equal to abc. Therefore, the remainder of the string will be matched. To prevent this from happening, we need to provide a start-of-string anchor ^:

/^(?!abc).*/

This anchor forces the matched expression to start at the beginning of the string and ensures that no later substring can be matched.

Finally, the expression above will reject any string starting with abc but will accept any string that starts with a different character followed by abc. In other words, it will accept aabc or xabc. To prevent this from happening, the lookahead must also account for any characters that come before the unwanted expression. To do this, we add another dot character . and zero-or-more quantifier * inside the lookahead, in front of the unwanted expression:

/^(?!.*abc).*/

If we placed this .* in front of the negative lookahead instead, the entire string would be matched before the negative lookahead is even checked.

And this completes the general expression required. We can now tweak it to suit specific use-cases.

To match everything except a specific word, we simply enter the unwanted word inside the negative lookahead. The following expression will not match any string containing the word foo:

/^(?!.*foo).*/

We can list multiple unwanted words by separating them with the OR symbol |. The following expression will ignore strings that contain any of the words dollar, euro, or pound:

/^(?!.*(dollar|euro|pound)).*/

Notice that we need to enclose the list of unwanted words in round brackets ( ) for this to work correctly. Without them, the .* at the front of the negative lookahead will work together with dollar but not with euro or pound, causing sentences that contain other characters before these unwanted words to be matched.

To match everything except a specific character, simply insert the character inside the negative lookahead. This expression will ignore any string containing an a:

/^(?!.*a).*/

If the character you want to exclude is a reserved character in regex (such as ? or *), you need to include a backslash \ in front of the character to escape it. For example, to exclude a question mark:

/^(?!.*\?).*/

For a set of characters, one can include them in square brackets. Note that special characters inside square brackets don't need to be escaped. The following expression will not match any string that contains a vowel:

/^(?!.*[aeiou]).*/
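The negative-lookahead patterns discussed in this article can be tried out with a short Python sketch. This is only an illustration under some assumptions: it uses Python's standard re module (the article itself does not name a language), the surrounding slashes are dropped because Python takes the pattern as a plain string, and the helper name matched_text and the sample strings are made up for the example.

```python
import re


def matched_text(pattern, text):
    """Return the text matched by the first match of pattern, or None.

    Hypothetical helper for this sketch; it simply wraps re.search.
    """
    match = re.search(pattern, text)
    return match.group(0) if match else None


# Building the general expression step by step.
print(matched_text(r".*", "abcdef"))            # everything is matched
print(matched_text(r"(?!abc).*", "abcdef"))     # skips position 0, matches from "bc..."
print(matched_text(r"^(?!abc).*", "abcdef"))    # anchored: no match at all
print(matched_text(r"^(?!abc).*", "xabc"))      # still accepted: abc is not at the start
print(matched_text(r"^(?!.*abc).*", "xabc"))    # rejected: abc anywhere blocks the match

# Use cases: word, list of words, single character, escaped character, character set.
NO_FOO = r"^(?!.*foo).*"
NO_CURRENCY = r"^(?!.*(dollar|euro|pound)).*"
NO_CURRENCY_UNGROUPED = r"^(?!.*dollar|euro|pound).*"  # missing round brackets
NO_A = r"^(?!.*a).*"
NO_QUESTION_MARK = r"^(?!.*\?).*"
NO_VOWEL = r"^(?!.*[aeiou]).*"

for pattern, text in [
    (NO_FOO, "a food truck"),
    (NO_FOO, "a bar"),
    (NO_CURRENCY, "ten euro"),
    (NO_CURRENCY_UNGROUPED, "ten euro"),  # wrongly accepted without the brackets
    (NO_A, "xyz"),
    (NO_QUESTION_MARK, "why?"),
    (NO_VOWEL, "rhythm"),
]:
    print(repr(pattern), repr(text), "->", repr(matched_text(pattern, text)))
```

Note that in Python the dot does not match a newline by default, so the sketch only uses single-line strings.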