Get started with the 1.4 regex package

If you are new to regular expressions, check out Introduction to Regular Expressions by Tom Sgouros.

First, define a regular expression:

   Pattern p = Pattern.compile("hol");

Then you can apply that pattern to any sequence of characters (the CharSequence interface, which String, StringBuffer and CharBuffer implement, was added in JDK 1.4).

To do so, create a Matcher by passing the CharSequence you want to search to the pattern's matcher() method:

 
   Matcher m = p.matcher("I am the great Cornholio! I am a gringo. " +
                         "(heh-heh) I have no bunghole. (heh-heh) Bungholio.");
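Because matcher() accepts any CharSequence, the same compiled Pattern can be applied to a String, a StringBuffer or a CharBuffer without converting between them. The following is a minimal sketch of that idea; the class name CharSequenceDemo, the helper countMatches and the shortened sample text are made up for illustration.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; its name and helper method are illustrative only.
class CharSequenceDemo {
   // Counts how many times the pattern occurs in any CharSequence.
   static int countMatches(Pattern p, CharSequence input) {
      Matcher m = p.matcher(input);
      int count = 0;
      while (m.find()) {
         count++;
      }
      return count;
   }

   public static void main(String[] args) {
      Pattern p = Pattern.compile("hol");
      String text = "Cornholio has no bunghole.";

      // The same pattern applied to three CharSequence implementations.
      System.out.println(countMatches(p, text));                    // 2
      System.out.println(countMatches(p, new StringBuffer(text)));  // 2
      System.out.println(countMatches(p, CharBuffer.wrap(text)));   // 2
   }
}
```

Compiling the Pattern once and reusing it for several inputs avoids recompiling the expression for every search.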

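matches() returns true only when the entire region matches the pattern. find() returns true when some subsequence of the region matches the pattern, and each further call to find() looks for the next match. After a successful match, start() returns the index of the first matched character and end() returns the index just past the last one.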

For example, MatchString.java:

import java.util.regex.*;
 
public class MatchString
{
   public static void main(String []args) throws Exception {
      // Inside a character class '|' is a literal pipe, not alternation, so it is left out.
      Pattern p = Pattern.compile("[A-Za-z]*hol[A-Za-z]*");
 
      String line = "I am the great Cornholio! I am a gringo. (heh-heh) " +
                    "I have no bunghole. (heh-heh) Bungholio.";
      Matcher m = p.matcher(line);
 
      while (m.find()) {
         System.out.print(m.group());
         System.out.println("\t[" + m.start() + ", " + m.end() + "]");
      }
   }
}

outputs:

Cornholio	[15, 24]
bunghole	[61, 69]
Bungholio	[81, 90]
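To see the difference between matches() and find() side by side, here is a small sketch that applies a similar pattern to a single word and to a whole sentence. The class name MatchesVersusFind and its two helper methods are hypothetical and exist only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class contrasting matches() with find().
class MatchesVersusFind {
   // True only if the entire input matches the regex.
   static boolean matchesWhole(String regex, CharSequence input) {
      return Pattern.compile(regex).matcher(input).matches();
   }

   // True if any subsequence of the input matches the regex.
   static boolean findsSomewhere(String regex, CharSequence input) {
      return Pattern.compile(regex).matcher(input).find();
   }

   public static void main(String[] args) {
      String regex = "[A-Za-z]*hol[A-Za-z]*";

      System.out.println(matchesWhole(regex, "bunghole"));             // true
      System.out.println(matchesWhole(regex, "I have no bunghole"));   // false
      System.out.println(findsSomewhere(regex, "I have no bunghole")); // true

      // start() is the index of the first matched character,
      // end() is the index just past the last one.
      Matcher m = Pattern.compile(regex).matcher("I have no bunghole");
      if (m.find()) {
         System.out.println(m.group() + " [" + m.start() + ", " + m.end() + "]");
         // prints: bunghole [10, 18]
      }
   }
}
```

The sentence fails matches() because the spaces are not part of the character class, while find() succeeds as soon as one word containing "hol" is located.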

To experiment with different regular expressions (the Javadoc documentation of Pattern has a summary of regular-expression constructs), you can use the following program. Given an input file and a regular expression, it prints every line that contains a match and marks the first and last character of each match with a caret.

Main.java:

import java.util.regex.*;
import java.io.*;
 
public class Main
{
   public static void main(String []args) throws Exception {
      if (args.length != 2) {
         System.err.println("Usage: java Main filename regex");
         System.exit(1);
      }
 
      Pattern p = Pattern.compile(args[1]);
 
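      BufferedReader br = new BufferedReader(new FileReader(args[0]));
      try {
         String line = null;
         while ((line = br.readLine()) != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
               // Echo the line, then mark the first and last matched characters.
               System.out.println(line);
               indent(m.start());
               System.out.print("^");
               // A match shorter than two characters gets a single caret.
               if (m.end() - m.start() > 1) {
                  indent(m.end()-m.start()-2);
                  System.out.print("^");
               }
               System.out.println();
            }
         }
      } finally {
         // Close the file even if reading fails.
         br.close();
      }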
   }
 
   public static void indent(int h) {
      for (int i=0; i<h; i++) 
         System.out.print(' ');
   }
}

Example:

   c:\jdk1.4\bin\java Main Main.java "[a-z]*fer"

outputs:

      BufferedReader br = new BufferedReader(new FileReader(args[0]));
       ^   ^
      BufferedReader br = new BufferedReader(new FileReader(args[0]));
                               ^   ^
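The caret positions follow directly from start() and end(): the first caret sits under index start(), and the second under index end() - 1, the last matched character. As a rough sketch of that arithmetic, the hypothetical helper below (CaretMarker.marker is not part of Main.java) builds the marker line as a String instead of printing it piece by piece.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating how the caret line is derived from start() and end().
class CaretMarker {
   // Returns a line with '^' under index start and, for matches of two or
   // more characters, a second '^' under index end - 1.
   static String marker(int start, int end) {
      StringBuffer sb = new StringBuffer();
      for (int i = 0; i < start; i++) {
         sb.append(' ');
      }
      sb.append('^');
      if (end - start > 1) {
         // Pad the gap between the two carets: end - start - 2 spaces.
         for (int i = start + 1; i < end - 1; i++) {
            sb.append(' ');
         }
         sb.append('^');
      }
      return sb.toString();
   }

   public static void main(String[] args) {
      String line = "BufferedReader br";
      Matcher m = Pattern.compile("[a-z]*fer").matcher(line);
      while (m.find()) {
         // "uffer" is matched at [1, 6], so the carets land under 'u' and 'r'.
         System.out.println(line);
         System.out.println(marker(m.start(), m.end()));
      }
   }
}
```

Returning the marker as a String keeps the index arithmetic in one place and makes it easy to check for short matches.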