Using a PushbackReader

The PushbackReader class is useful when you need to look ahead in an incoming character stream. It lets you unread characters you have already read, so that the next read returns them again. The size of the pushback buffer is set in the constructor and defaults to one character.
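As a minimal sketch of the read/unread idea, the snippet below looks at the next character without consuming it. The class name PeekDemo and the helper peek are hypothetical names chosen for illustration; only read() and unread(int) come from PushbackReader itself.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

public class PeekDemo {
   // Hypothetical helper: returns the next character without consuming it,
   // or -1 at end of stream.
   static int peek(PushbackReader pr) throws IOException {
      int k = pr.read();
      if (k != -1) {
         pr.unread(k);   // put the character back so the next read sees it again
      }
      return k;
   }

   public static void main(String[] args) throws IOException {
      // Default constructor: pushback buffer of one character
      PushbackReader pr = new PushbackReader(new StringReader("ab"));
      System.out.println((char) peek(pr));    // a
      System.out.println((char) peek(pr));    // a (still not consumed)
      System.out.println((char) pr.read());   // a
      System.out.println((char) pr.read());   // b
      System.out.println(peek(pr));           // -1 (end of stream)
   }
}
```

Because each peek pushes back exactly the one character it read, the default one-character buffer is sufficient here.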

Here’s a simple Parser example that puts it into practice. The main method creates a Parser and passes it a reader for the character stream to be processed, in this case a FileReader wrapped in a BufferedReader. It then reads tokens repeatedly until getNextToken returns null, which happens at EOF or after a parse error. The Parser’s getNextToken method reads one character to decide which parse method to call, then unreads that character so the chosen method can read it again.

Main.java:

import java.io.*;
  
public class Main {   
   public static void main(String[] args) throws Exception {
      if (args.length != 1) { 
         System.out.println("Usage: java Main <file>");
         System.exit(1);
      }
 
      BufferedReader br = new BufferedReader(new FileReader(args[0]));
      Parser parser = new Parser(br);
      String token;
      while ((token = parser.getNextToken()) != null) {
         System.out.println("Found: " + parser.getType() + "\t[" + token + "]");
      }
   }
}
 
class Parser
{
   private final static int NOTOKEN  = 0;
   private final static int VARIABLE = 1;
   private final static int ASSIGN   = 2;
   private final static int SEMICOL  = 3;
 
   PushbackReader pr;
   String token = null;
   int type = NOTOKEN;
 
   public Parser(Reader r) {
      pr = new PushbackReader(r);
   }
 
   public String getType() {
      if (type == VARIABLE) return "VARIABLE";
      else if (type == ASSIGN) return "ASSIGN";
      else if (type == SEMICOL) return "SEMICOL";
  
      return "NOTOKEN";
   }
   
   public String getNextToken() throws IOException {
      token = null;
      type = NOTOKEN;
 
      skipWhitespace();
      int k = pr.read();
      if (k != -1) {
         char c = (char) k;
         pr.unread(k);
         if (Character.isLetter(c)) {
            parseVariable();
         }
         else if (c == ':') {
            parseAssign();
         }
         else if (c == ';') {
            parseSemicol();
         }
         else {
            parseError("Unrecognized character: " + c);
         }
      }
  
      return token;
   }
 
   public void skipWhitespace() throws IOException {
      int k;
      while ((k = pr.read()) > -1) {
         if (!Character.isWhitespace((char) k)) {
            pr.unread(k);
            break;
         }
      }   
   }
  
   private void parseVariable() throws IOException {
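      StringBuilder buffer = new StringBuilder();
      int k;
      type = VARIABLE;
      while ((k = pr.read()) > -1) {
         if (Character.isLetter((char) k) || Character.isDigit((char) k)) {
            buffer.append((char) k);
         }
         else {
            // Push back the character that ends the variable name,
            // otherwise a following ';' or ':' would be lost.
            pr.unread(k);
            break;
         }
      }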
 
      token = buffer.toString();
   }
 
   private void parseAssign() throws IOException {
      pr.read();   // consume ':' (getNextToken already peeked at it)
      int k = pr.read();
      if (k > -1 && ((char) k) == '=') {
         type = ASSIGN;
         token = ":=";
      }
      else {
         parseError("Expected character '=', found "
            + (k > -1 ? "character '" + (char) k + "'" : "end of file"));
      }
   }
 
   private void parseSemicol() throws IOException {
      pr.read();   // consume ';' (getNextToken already peeked at it)
      type = SEMICOL;
      token = ";";
   }
 
   private void parseError(String error) {
      System.err.println("ParseError: " + error);
   }
}

Using it on the file test.pas:

a := b;
var1 := var2;

results in:

Found: VARIABLE	[a]
Found: ASSIGN	[:=]
Found: VARIABLE	[b]
Found: SEMICOL	[;]
Found: VARIABLE	[var1]
Found: ASSIGN	[:=]
Found: VARIABLE	[var2]
Found: SEMICOL	[;]

Using it on the file test2.pas:

a12345 := b0003
var1 = var2;

outputs:

Found: VARIABLE	[a12345]
Found: ASSIGN	[:=]
Found: VARIABLE	[b0003]
Found: VARIABLE	[var1]
ParseError: Unrecognized character: =
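
The parser above only ever pushes back one character at a time, so the default pushback buffer is enough. As a sketch of the larger buffer mentioned at the start, the snippet below uses the two-argument constructor to allow two characters of pushback. The class name LookaheadDemo and the helper matchAssign are hypothetical and not part of the parser above.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

public class LookaheadDemo {
   // Hypothetical helper: consumes ":=" and returns true if the stream starts
   // with it; otherwise pushes back everything it read and returns false.
   static boolean matchAssign(PushbackReader pr) throws IOException {
      int first = pr.read();
      if (first != ':') {
         if (first != -1) pr.unread(first);
         return false;
      }
      int second = pr.read();
      if (second == '=') {
         return true;
      }
      // Push back in reverse order so the characters are re-read
      // in their original order.
      if (second != -1) pr.unread(second);
      pr.unread(first);
      return false;
   }

   public static void main(String[] args) throws IOException {
      // Second constructor argument: pushback buffer of two characters
      PushbackReader pr = new PushbackReader(new StringReader(":=:;"), 2);
      System.out.println(matchAssign(pr));    // true
      System.out.println(matchAssign(pr));    // false
      System.out.println((char) pr.read());   // :
      System.out.println((char) pr.read());   // ;
   }
}
```

With the default one-character buffer, the second unread call in matchAssign would fail with an IOException, which is why the buffer size is given explicitly here.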