What Are Regular Expressions?
Regular expressions are a way to describe a set of strings based on common characteristics shared by each string in the set. They can be used to search, edit, or manipulate text and data. You must learn a specific syntax to create regular expressions, one that goes beyond the normal syntax of the Java programming language. Regular expressions vary in complexity, but once you understand the basics of how they're constructed, you'll be able to decipher (or create) any regular expression.
This trail teaches the regular expression syntax supported by the java.util.regex API and presents several working examples to illustrate how the various objects interact. In the world of regular expressions, there are many different flavors to choose from, such as grep, Perl, Tcl, Python, PHP, and awk. The regular expression syntax in the java.util.regex API is most similar to that found in Perl.
How Are Regular Expressions Represented in This Package?
The java.util.regex package primarily consists of three classes: Pattern, Matcher, and PatternSyntaxException.
- A Pattern object is a compiled representation of a regular expression. The Pattern class provides no public constructors. To create a pattern, you must first invoke one of its public static compile methods, which will then return a Pattern object. These methods accept a regular expression as the first argument; the first few lessons of this trail will teach you the required syntax.
- A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher defines no public constructors. You obtain a Matcher object by invoking the matcher method on a Pattern object.
- A PatternSyntaxException object is an unchecked exception that indicates a syntax error in a regular expression pattern.

The last few lessons of this trail explore each class in detail. But first, you must understand how regular expressions are actually constructed. Therefore, the next section introduces a simple test harness that will be used repeatedly to explore their syntax.
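
As a rough illustration of how the three classes in the list above fit together, the following minimal sketch compiles a pattern, obtains a matcher from it, and catches the exception thrown for a malformed expression. The class name RegexSketch and the helper methods findAll and isValid are hypothetical names chosen for this example; they are not part of java.util.regex, and this is not the test harness used later in the trail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical example class; only Pattern, Matcher, and
// PatternSyntaxException come from java.util.regex.
class RegexSketch {

    // Collects every substring of input that the regular expression matches.
    static List<String> findAll(String regex, String input) {
        // Pattern has no public constructor; compile is a static factory method.
        Pattern pattern = Pattern.compile(regex);
        // Matcher has no public constructor either; it is obtained from the Pattern.
        Matcher matcher = pattern.matcher(input);

        List<String> matches = new ArrayList<>();
        while (matcher.find()) {          // advance to the next match, if any
            matches.add(matcher.group()); // text of the current match
        }
        return matches;
    }

    // Reports whether the regular expression compiles without a syntax error.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Unchecked exception signalling a syntax error in the pattern.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(findAll("foo", "foo bar foofoo")); // prints [foo, foo, foo]
        System.out.println(isValid("foo"));                   // prints true
        System.out.println(isValid("(foo"));                  // prints false (unclosed group)
    }
}
```

Because PatternSyntaxException is unchecked, the try/catch in isValid is optional as far as the compiler is concerned; catching it is mainly useful when the expression comes from user input rather than from a literal in the source code.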