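Regular expressions are patterns used to match character combinations in strings. These patterns are used with the exec and test methods of RegExp, and with the match, replace, search, and split methods of String. This chapter describes JavaScript regular expressions.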
Regular expressions are not available in JavaScript 1.1 and earlier.
You construct a regular expression in one of two ways, with an object initializer or by calling the RegExp constructor:
re = /ab+c/
re = new RegExp("ab+c")
You can use the compile method to compile a new regular expression for efficient reuse.
A regular expression pattern is composed of simple characters, such as /abc/, or a combination of simple and special characters, such as /ab*c/ or /Chapter (\d+)\.\d*/. The last example includes parentheses, which are used as a memory device. The match made with this part of the pattern is remembered for later use, as described in "Using Parenthesized Substring Matches".
The pattern /abc/ matches character combinations in strings only when exactly the characters 'abc' occur together and in that order. Such a match would succeed in the strings "Hi, do you know your abc's?" and "The latest airplane designs evolved from slabcraft." In both cases the match is with the substring 'abc'. There is no match in the string "Grab crab" because it does not contain the substring 'abc'.
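The three cases above can be checked with the test method. This is a minimal sketch; the variable names are illustrative and not part of the original text.

```javascript
// /abc/ matches only where 'a', 'b', and 'c' occur together, in that order.
const simple = /abc/;

const inSentence = simple.test("Hi, do you know your abc's?");
const insideWord = simple.test("The latest airplane designs evolved from slabcraft.");
const splitBySpace = simple.test("Grab crab"); // 'ab' and 'c' are separated by a space

console.log(inSentence, insideWord, splitBySpace); // true true false
```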
The pattern /ab*c/ matches any character combination in which a single 'a' is followed by zero or more 'b's (* means 0 or more occurrences of the preceding character) and then immediately followed by 'c'. In the string "cbbabbbbcdebc", the pattern matches the substring 'abbbbc'.
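As a short sketch of the same pattern (names are illustrative), exec reports both the matched substring and where it starts:

```javascript
// '*' applies to the preceding character, so 'b' may appear zero or more times.
const starPattern = /ab*c/;

const found = starPattern.exec("cbbabbbbcdebc");
const matchedText = found[0];   // the substring that matched
const matchedAt = found.index;  // zero-based position of the match
const zeroBs = starPattern.test("ac"); // zero 'b's still matches

console.log(matchedText, matchedAt, zeroBs); // abbbbc 3 true
```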
The following table provides a complete list and description of the special characters that can be used in regular expressions.
Table 4.1 Special characters in regular expressions.
The pattern /Chapter (\d+)\.\d*/ illustrates additional escaped and special characters and indicates that part of the pattern should be remembered. It matches precisely the characters 'Chapter ' followed by one or more numeric characters (\d means any numeric character and + means 1 or more times), followed by a decimal point (which in itself is a special character; preceding the decimal point with \ means the pattern must look for the literal character '.'), followed by any numeric character 0 or more times (\d means numeric character, * means 0 or more times). In addition, parentheses are used to remember the first matched numeric characters.
This pattern is found in "Open Chapter 4.3, paragraph 6" and '4' is remembered. The pattern is not found in "Chapter 3 and 4", because that string does not have a period after the '3'.
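The two strings discussed above can be tried directly. This is only a sketch with illustrative variable names:

```javascript
// Parentheses around \d+ remember the chapter number; \. matches a literal period.
const chapterPattern = /Chapter (\d+)\.\d*/;

const chapterMatch = chapterPattern.exec("Open Chapter 4.3, paragraph 6");
const noMatch = chapterPattern.exec("Chapter 3 and 4"); // no period after the '3'

console.log(chapterMatch[0]); // Chapter 4.3
console.log(chapterMatch[1]); // 4
console.log(noMatch);         // null
```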
Regular expressions are used with the RegExp methods test and exec and with the String methods match, replace, search, and split. These methods are explained in detail in the Core JavaScript Reference.
To find out whether a pattern occurs in a string, use the test or search method; for more information (but slower execution) use the exec or match methods. If you use exec or match and the match succeeds, these methods return an array and update properties of the associated regular expression object and also of the predefined regular expression object, RegExp. If the match fails, the exec method returns null (which converts to false).
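The difference between the four methods can be sketched as follows. The pattern has no g flag here, an assumption made so that repeated calls do not depend on lastIndex; the variable names are illustrative.

```javascript
const pattern = /d(b+)d/;
const text = "cdbbdbsbz";

// test and search only report whether (or where) the pattern occurs.
const isFound = pattern.test(text);    // boolean
const position = text.search(pattern); // index of the match, or -1

// exec and match return an array with the match and remembered substrings.
const viaExec = pattern.exec(text);
const viaMatch = text.match(pattern);

// A failed exec returns null, which converts to false in a condition.
const failed = /xyz/.exec(text);

console.log(isFound, position);        // true 1
console.log(viaExec[0], viaExec[1]);   // dbbd bb
console.log(viaMatch[0], viaMatch[1]); // dbbd bb
console.log(failed);                   // null
```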
In the following example, the script uses the exec method to find a match in a string.
<SCRIPT LANGUAGE="JavaScript1.2">
myRe=/d(b+)d/g;
myArray = myRe.exec("cdbbdbsbz");
</SCRIPT>
If you do not need to access the properties of the regular expression, an alternative way of creating myArray is with this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myArray = /d(b+)d/g.exec("cdbbdbsbz");
</SCRIPT>
If you want to be able to recompile the regular expression, yet another alternative is this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myRe= new RegExp ("d(b+)d", "g");
myArray = myRe.exec("cdbbdbsbz");
</SCRIPT>
With these scripts, the match succeeds and returns the array and updates the properties shown in the following table.
Table 4.2 Results of regular expression execution.
RegExp.leftContext and RegExp.rightContext can be computed from the other values. RegExp.leftContext is equivalent to:
myArray.input.substring(0, myArray.index)
and RegExp.rightContext is equivalent to:
myArray.input.substring(myArray.index + myArray[0].length)
As shown in the second form of this example, you can use a regular expression created with an object initializer without assigning it to a variable. If you do, however, every occurrence is a new regular expression. For this reason, if you use this form without assigning it to a variable, you cannot subsequently access the properties of that regular expression. For example, assume you have this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myRe=/d(b+)d/g;
myArray = myRe.exec("cdbbdbsbz");
document.writeln("The value of lastIndex is " + myRe.lastIndex);
</SCRIPT>
This script displays:
The value of lastIndex is 5
However, if you have this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myArray = /d(b+)d/g.exec("cdbbdbsbz");
document.writeln("The value of lastIndex is " + /d(b+)d/g.lastIndex);
</SCRIPT>
It displays:
The value of lastIndex is 0
The occurrences of /d(b+)d/g in the two statements are different regular expression objects and hence have different values for their lastIndex property. If you need to access the properties of a regular expression created with an object initializer, you should first assign it to a variable.
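The following sketch restates both points in one place: the substring expressions that stand in for the left and right context, and the lastIndex difference between a stored regular expression and a fresh literal. It uses console.log instead of document.writeln so that it runs outside a browser, and the variable names are illustrative.

```javascript
const myRe = /d(b+)d/g;
const myArray = myRe.exec("cdbbdbsbz");

// Text before and after the match, computed from the returned array.
const left = myArray.input.substring(0, myArray.index);
const right = myArray.input.substring(myArray.index + myArray[0].length);

// lastIndex is updated on the object that ran exec...
const storedLastIndex = myRe.lastIndex;
// ...but a second literal is a separate object that has never been executed.
const freshLastIndex = /d(b+)d/g.lastIndex;

console.log(left, right);                     // c bsbz
console.log(storedLastIndex, freshLastIndex); // 5 0
```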
The pattern /a(b)c/ matches the characters 'abc' and remembers 'b'. To recall these parenthesized substring matches, use the RegExp properties $1, ..., $9 or the Array elements [1], ..., [n].
The number of possible parenthesized substrings is unlimited. The predefined RegExp object holds up to the last nine and the returned array holds all that were found. The following examples illustrate how to use parenthesized substring matches.
Example 1. The following script uses the replace method to switch the words in the string. For the replacement text, the script uses the values of the $1 and $2 properties.
<SCRIPT LANGUAGE="JavaScript1.2">
re = /(\w+)\s(\w+)/;
str = "John Smith";
newstr = str.replace(re, "$2, $1");
document.write(newstr)
</SCRIPT>
This prints "Smith, John".
Example 2. In the following example, RegExp.input is set by the Change event. In the getInfo function, the exec method uses the value of RegExp.input as its argument. Note that RegExp must be prepended to its $ properties (because they appear outside the replacement string). (Example 3 is a more efficient, though possibly more cryptic, way to accomplish the same thing.)
<HTML>
<SCRIPT LANGUAGE="JavaScript1.2">
function getInfo(){
re = /(\w+)\s(\d+)/
re.exec();
window.alert(RegExp.$1 + ", your age is " + RegExp.$2);
}
</SCRIPT>
Enter your first name and your age, and then press Enter.
<FORM>
<INPUT TYPE="text" NAME="NameAge" onChange="getInfo(this);">
</FORM>
</HTML>
Example 3. The following example is similar to Example 2. Instead of using RegExp.$1 and RegExp.$2, this example creates an array and uses a[1] and a[2]. It also uses the shortcut notation for using the exec method.
<HTML>
<SCRIPT LANGUAGE="JavaScript1.2">
function getInfo(){
a = /(\w+)\s(\d+)/();
window.alert(a[1] + ", your age is " + a[2]);
}
</SCRIPT>
Enter your first name and your age, and then press Enter.
<FORM>
<INPUT TYPE="text" NAME="NameAge" onChange="getInfo(this);">
</FORM>
</HTML>
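The same ideas can be sketched without a form or the Change event. In this version the input string is passed to exec explicitly rather than taken from RegExp.input, which is an assumption made so the code runs outside a browser; the function name describeAge is hypothetical.

```javascript
// Example 1 restated: $1 and $2 in the replacement text refer to the
// first and second parenthesized substrings.
const swapped = "John Smith".replace(/(\w+)\s(\w+)/, "$2, $1");

// Examples 2 and 3 restated: read the remembered substrings from the
// array that exec returns. describeAge is a hypothetical helper.
function describeAge(input) {
  const a = /(\w+)\s(\d+)/.exec(input);
  return a ? a[1] + ", your age is " + a[2] : null;
}

const described = describeAge("Ann 30");
const rejected = describeAge("no digits here");

console.log(swapped);   // Smith, John
console.log(described); // Ann, your age is 30
console.log(rejected);  // null
```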
To indicate a global search, use the g flag. To indicate a case insensitive search, use the i flag. These flags can be used separately or together in either order, and are included as part of the regular expression.
To include a flag with the regular expression, use this syntax:
re = /pattern/[g|i|gi]
re = new RegExp("pattern", ['g'|'i'|'gi'])
Note that the flags, i and g, are an integral part of a regular expression. They cannot be added or removed later.
For example, re = /\w+\s/g creates a regular expression that looks for one or more characters followed by a space, and it looks for this combination throughout the string.
<SCRIPT LANGUAGE="JavaScript1.2">
re = /\w+\s/g;
str = "fee fi fo fum";
myArray = str.match(re);
document.write(myArray);
</SCRIPT>
This displays ["fee ", "fi ", "fo "]. In this example, you could replace the line:
re = /\w+\s/g;
with:
re = new RegExp("\\w+\\s", "g");
and get the same result.
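A brief sketch of both flags follows, comparing the literal and constructor forms. The variable names and the strings used for the i flag are illustrative additions.

```javascript
const str = "fee fi fo fum";

// The literal and the constructor form describe the same pattern; note the
// doubled backslashes inside the string passed to the constructor.
const fromLiteral = str.match(/\w+\s/g);
const fromConstructor = str.match(new RegExp("\\w+\\s", "g"));

// The i flag ignores case; without it the uppercase pattern does not match.
const withI = /FEE/i.test(str);
const withoutI = /FEE/.test(str);

// Both flags together: every occurrence, regardless of case.
const allCases = "Fee fee FEE".match(/fee/gi);

console.log(fromLiteral);     // [ 'fee ', 'fi ', 'fo ' ]
console.log(fromConstructor); // [ 'fee ', 'fi ', 'fo ' ]
console.log(withI, withoutI); // true false
console.log(allCases);        // [ 'Fee', 'fee', 'FEE' ]
```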
The following example uses string.split() and string.replace(). It cleans a roughly formatted input string containing names (first name first) separated by blanks, tabs, and exactly one semicolon. Finally, it reverses the name order (last name first) and sorts the list.
<SCRIPT LANGUAGE="JavaScript1.2">
// The name string contains multiple spaces and tabs,
// and may have multiple spaces between first and last names.
names = new String ( "Harry Trump ;Fred Barney; Helen Rigby ;\
Bill Abel ;Chris Hand ")
document.write ("---------- Original String" + "<BR>" + "<BR>")
document.write (names + "<BR>" + "<BR>")
// Prepare two regular expression patterns and array storage.
// Split the string into array elements.
// pattern: possible white space then semicolon then possible white space
pattern = /\s*;\s*/
// Break the string into pieces separated by the pattern above
// and store the pieces in an array called nameList
nameList = names.split (pattern)
// new pattern: one or more characters then spaces then characters.
// Use parentheses to "memorize" portions of the pattern.
// The memorized portions are referred to later.
pattern = /(\w+)\s+(\w+)/
// New array for holding names being processed.
bySurnameList = new Array;
// Display the name array and populate the new array
// with comma-separated names, last first.
//
// The replace method removes anything matching the pattern
// and replaces it with the memorized string--second memorized portion
// followed by comma space followed by first memorized portion.
//
// The variables $1 and $2 refer to the portions
// memorized while matching the pattern.
document.write ("---------- After Split by Regular Expression" + "<BR>")
for ( i = 0; i < nameList.length; i++) {
document.write (nameList[i] + "<BR>")
bySurnameList[i] = nameList[i].replace (pattern, "$2, $1")
}
// Display the new array.
document.write ("---------- Names Reversed" + "<BR>")
for ( i = 0; i < bySurnameList.length; i++) {
document.write (bySurnameList[i] + "<BR>")
}
// Sort by last name, then display the sorted array.
bySurnameList.sort()
document.write ("---------- Sorted" + "<BR>")
for ( i = 0; i < bySurnameList.length; i++) {
document.write (bySurnameList[i] + "<BR>")
}
document.write ("---------- End" + "<BR>")
</SCRIPT>
The regular expression looks for zero or one open parenthesis \(?, followed by three digits \d{3}, followed by zero or one close parenthesis \)?, followed by one dash, forward slash, or decimal point and when found, remember the character ([-\/\.]), followed by three digits \d{3}, followed by the remembered match of a dash, forward slash, or decimal point \1, followed by four digits \d{4}.
The Change event activated when the user presses Enter sets the value of RegExp.input.
<HTML>
<SCRIPT LANGUAGE = "JavaScript1.2">
re = /\(?\d{3}\)?([-\/\.])\d{3}\1\d{4}/
function testInfo() {
OK = re.exec()
if (!OK)
window.alert (RegExp.input +
" isn't a phone number with area code!")
else
window.alert ("Thanks, your phone number is " + OK[0])
}
</SCRIPT>
Enter your phone number (with area code) and then press Enter.
<FORM>
<INPUT TYPE="text" NAME="Phone" onChange="testInfo(this);">
</FORM>
</HTML>
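The pattern can also be tried without the form. In this sketch the candidate string is passed in explicitly instead of being read from RegExp.input; the function name isPhoneNumber and the sample numbers are hypothetical.

```javascript
// Same pattern as above: \1 requires the second separator to be the same
// character that ([-\/\.]) remembered as the first separator.
const phonePattern = /\(?\d{3}\)?([-\/\.])\d{3}\1\d{4}/;

// Hypothetical helper: true when the string contains a matching number.
function isPhoneNumber(candidate) {
  return phonePattern.test(candidate);
}

const dashes = isPhoneNumber("408-555-1212");
const parenthesized = isPhoneNumber("(408)-555-1212");
const dots = isPhoneNumber("408.555.1212");
const mixed = isPhoneNumber("408-555.1212"); // separators differ
const tooShort = isPhoneNumber("5551212");   // no area code or separators

console.log(dashes, parenthesized, dots, mixed, tooShort); // true true true false false
```

Because the pattern is not anchored with ^ and $, it accepts any string that contains a matching number, not only strings that consist of one.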
Last Updated: 10/29/98 16:12:04