Package antlr

Class CodeGenerator

  • Direct Known Subclasses:
    CppCodeGenerator, CSharpCodeGenerator, DiagnosticCodeGenerator, DocBookCodeGenerator, HTMLCodeGenerator, JavaCodeGenerator, PythonCodeGenerator

    public abstract class CodeGenerator
    extends java.lang.Object
    A generic ANTLR code generator. All code generators derive from this class.

    A CodeGenerator knows about a Grammar data structure and a grammar analyzer. The Grammar is walked to generate the appropriate code for both a parser and lexer (if present). This interface may change slightly so that the lexer itself lives inside a Grammar object (in which case, this class generates only one recognizer). The main method to call is gen(), which initiates all code generation.

    The interaction of the code generator with the analyzer is simple: each subrule block calls deterministic() before generating code for the block. Method deterministic() sets lookahead caches in each Alternative object. Technically, a code generator doesn't need the grammar analyzer if all lookahead analysis is done at runtime, but this would result in a slower parser.

    This class provides a set of support utilities to handle argument list parsing and so on.
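
The one-overload-per-element shape of gen() can be pictured as a double dispatch: each grammar element hands itself to the generator, which selects the matching overload. The sketch below is a minimal illustration under that assumption. All type names in it (Element, Generator, TokenRef, ZeroOrMore, TextGenerator) are hypothetical stand-ins, not ANTLR classes, and the emitted text is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch of overload-based dispatch; not ANTLR code.
final class DispatchSketch {
    interface Element {
        void generate(Generator g);
    }

    // Plays the role of the abstract gen(...) overloads on a code generator.
    interface Generator {
        void gen(TokenRef atom);
        void gen(ZeroOrMore blk);
    }

    static final class TokenRef implements Element {
        final String name;
        TokenRef(String name) { this.name = name; }
        public void generate(Generator g) { g.gen(this); } // picks gen(TokenRef)
    }

    static final class ZeroOrMore implements Element {
        final List<Element> body = new ArrayList<>();
        public void generate(Generator g) { g.gen(this); } // picks gen(ZeroOrMore)
    }

    // A toy target "language": emits one-line pseudo-code into a buffer.
    static final class TextGenerator implements Generator {
        final StringBuilder out = new StringBuilder();

        public void gen(TokenRef atom) {
            out.append("match(").append(atom.name).append(");");
        }

        public void gen(ZeroOrMore blk) {
            out.append("while (more) {");
            for (Element e : blk.body) {
                e.generate(this); // walk nested elements
            }
            out.append("}");
        }
    }

    // Builds (ID)* and returns the text the toy generator emits for it.
    static String demo() {
        ZeroOrMore loop = new ZeroOrMore();
        loop.body.add(new TokenRef("ID"));
        TextGenerator g = new TextGenerator();
        loop.generate(g);
        return g.out.toString();
    }
}
```

A language-specific generator would supply one such overload for every element type listed in the method summary.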

    Version:
    2.00a
    Author:
    Terence Parr, John Lilley
    See Also:
    JavaCodeGenerator, DiagnosticCodeGenerator, LLkAnalyzer, Grammar, AlternativeElement, Lookahead
    • Constructor Summary

      Constructors 
      Constructor Description
      CodeGenerator()
      Construct code generator base class
    • Method Summary

      Modifier and Type Method Description
      protected void _print​(java.lang.String s)
      Output a String to the currentOutput stream.
      protected void _printAction​(java.lang.String s)
      Print an action without leading tabs, attempting to preserve the current indentation level for multi-line actions. Ignored if string is null.
      protected void _println​(java.lang.String s)
      Output a String followed by newline, to the currentOutput stream.
      static java.lang.String decodeLexerRuleName​(java.lang.String id)  
      static boolean elementsAreRange​(int[] elems)
      Test if a set element array represents a contiguous range.
      static java.lang.String encodeLexerRuleName​(java.lang.String id)  
      protected java.lang.String extractIdOfAction​(Token t)
      Get the identifier portion of an argument-action token.
      protected java.lang.String extractIdOfAction​(java.lang.String s, int line, int column)
      Get the identifier portion of an argument-action.
      protected java.lang.String extractTypeOfAction​(Token t)
      Get the type string out of an argument-action token.
      protected java.lang.String extractTypeOfAction​(java.lang.String s, int line, int column)
      Get the type portion of an argument-action.
      abstract void gen()
      Generate the code for all grammars
      abstract void gen​(ActionElement action)
      Generate code for the given grammar element.
      abstract void gen​(AlternativeBlock blk)
      Generate code for the given grammar element.
      abstract void gen​(BlockEndElement end)
      Generate code for the given grammar element.
      abstract void gen​(CharLiteralElement atom)
      Generate code for the given grammar element.
      abstract void gen​(CharRangeElement r)
      Generate code for the given grammar element.
      abstract void gen​(LexerGrammar g)
      Generate the code for a lexer
      abstract void gen​(OneOrMoreBlock blk)
      Generate code for the given grammar element.
      abstract void gen​(ParserGrammar g)
      Generate the code for a parser
      abstract void gen​(RuleRefElement rr)
      Generate code for the given grammar element.
      abstract void gen​(StringLiteralElement atom)
      Generate code for the given grammar element.
      abstract void gen​(TokenRangeElement r)
      Generate code for the given grammar element.
      abstract void gen​(TokenRefElement atom)
      Generate code for the given grammar element.
      abstract void gen​(TreeElement t)
      Generate code for the given grammar element.
      abstract void gen​(TreeWalkerGrammar g)
      Generate the code for a tree walker
      abstract void gen​(WildcardElement wc)
      Generate code for the given grammar element.
      abstract void gen​(ZeroOrMoreBlock blk)
      Generate code for the given grammar element.
      protected void genTokenInterchange​(TokenManager tm)
      Generate the token types as a text file for persistence across shared lexer/parser
      abstract java.lang.String getASTCreateString​(Vector v)
      Get a string for an expression to generate creation of an AST subtree.
      abstract java.lang.String getASTCreateString​(GrammarAtom atom, java.lang.String str)
      Get a string for an expression to generate creation of an AST node
      protected java.lang.String getBitsetName​(int index)
      Given the index of a bitset in the bitset list, generate a unique name.
      java.lang.String getFIRSTBitSet​(java.lang.String ruleName, int k)  
      java.lang.String getFOLLOWBitSet​(java.lang.String ruleName, int k)  
      abstract java.lang.String mapTreeId​(java.lang.String id, ActionTransInfo tInfo)
      Map an identifier to its corresponding tree-node variable.
      protected int markBitsetForGen​(BitSet p)
      Add a bitset to the list of bitsets to be generated.
      protected void print​(java.lang.String s)
      Output tab indent followed by a String, to the currentOutput stream.
      protected void printAction​(java.lang.String s)
      Print an action with leading tabs, attempting to preserve the current indentation level for multi-line actions. Ignored if string is null.
      protected void println​(java.lang.String s)
      Output tab indent followed by a String followed by newline, to the currentOutput stream.
      protected void printTabs()
      Output the current tab indentation.
      protected abstract java.lang.String processActionForSpecialSymbols​(java.lang.String actionStr, int line, RuleBlock currentRule, ActionTransInfo tInfo)
      Lexically process $ and # references within the action.
      java.lang.String processStringForASTConstructor​(java.lang.String str)
      Process a string for a simple expression for use in xx/action.g. It is used to cast simple tokens/references to the right type for the generated language.
      protected java.lang.String removeAssignmentFromDeclaration​(java.lang.String d)
      Remove the assignment portion of a declaration, if any.
      static java.lang.String reverseLexerRuleName​(java.lang.String id)  
      void setAnalyzer​(LLkGrammarAnalyzer analyzer_)  
      void setBehavior​(DefineGrammarSymbols behavior_)  
      protected void setGrammar​(Grammar g)
      Set a grammar for the code generator to use
      void setTool​(Tool tool)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • antlrTool

        protected Tool antlrTool
      • tabs

        protected int tabs
        Current tab indentation for code output
      • currentOutput

        protected transient java.io.PrintWriter currentOutput
        Current output Stream
      • grammar

        protected Grammar grammar
        The grammar for which we generate code
      • bitsetsUsed

        protected Vector bitsetsUsed
        List of all bitsets that must be dumped. These are Vectors of BitSet.
      • charFormatter

        protected CharFormatter charFormatter
        Object used to format characters in the target language. Subclasses must initialize this to the language-specific formatter.
      • DEBUG_CODE_GENERATOR

        protected boolean DEBUG_CODE_GENERATOR
        Use option "codeGenDebug" to generate debugging output
      • DEFAULT_MAKE_SWITCH_THRESHOLD

        protected static final int DEFAULT_MAKE_SWITCH_THRESHOLD
        Default values for code-generation thresholds
        See Also:
        Constant Field Values
      • DEFAULT_BITSET_TEST_THRESHOLD

        protected static final int DEFAULT_BITSET_TEST_THRESHOLD
        See Also:
        Constant Field Values
      • BITSET_OPTIMIZE_INIT_THRESHOLD

        protected static final int BITSET_OPTIMIZE_INIT_THRESHOLD
        If there are more than 8 long words to init in a bitset, try to optimize it; e.g., detect runs of -1L and 0L.
        See Also:
        Constant Field Values
      • makeSwitchThreshold

        protected int makeSwitchThreshold
        This is a hint for the language-specific code generator. A switch() or language-specific equivalent is generated instead of a series of if/else statements for blocks whose number of non-predicated LL(1) alternatives is greater than or equal to this value. This is modified by the grammar option "codeGenMakeSwitchThreshold".
      • bitsetTestThreshold

        protected int bitsetTestThreshold
        This is a hint for the language-specific code generator. A bitset membership test is generated instead of an ORed series of LA(k) comparisons for lookahead sets with degree greater than or equal to this value. This is modified by the grammar option "codeGenBitsetTestThreshold".
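
How a language-specific generator might consult the two threshold hints can be sketched as follows. This is an illustration only: the class ThresholdSketch, its method names, and the exact text of the emitted tests (including the bitset name passed in) are assumptions made for the example, not the output of any ANTLR generator.

```java
// Hypothetical stand-alone sketch of applying the two threshold hints; not ANTLR code.
final class ThresholdSketch {
    /** A switch is preferred once the alternative count reaches the threshold. */
    static boolean useSwitch(int nonPredicatedLL1Alts, int makeSwitchThreshold) {
        return nonPredicatedLL1Alts >= makeSwitchThreshold;
    }

    /**
     * Builds a lookahead test for depth k. Sets whose degree reaches the
     * threshold use a bitset membership test; smaller sets use ORed comparisons.
     */
    static String lookaheadTest(int[] tokenTypes, int k,
                                int bitsetTestThreshold, String bitsetName) {
        if (tokenTypes.length >= bitsetTestThreshold) {
            return bitsetName + ".member(LA(" + k + "))";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokenTypes.length; i++) {
            if (i > 0) {
                sb.append(" || ");
            }
            sb.append("LA(").append(k).append(") == ").append(tokenTypes[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative threshold of 4 for the bitset test.
        System.out.println(lookaheadTest(new int[] {4, 5}, 1, 4, "_tokenSet_0"));
        System.out.println(lookaheadTest(new int[] {4, 5, 6, 7}, 1, 4, "_tokenSet_0"));
    }
}
```

Both comparisons use "greater than or equal", matching the wording of the field descriptions above.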
      • TokenTypesFileSuffix

        public static java.lang.String TokenTypesFileSuffix
      • TokenTypesFileExt

        public static java.lang.String TokenTypesFileExt
    • Constructor Detail

      • CodeGenerator

        public CodeGenerator()
        Construct code generator base class
    • Method Detail

      • _print

        protected void _print​(java.lang.String s)
        Output a String to the currentOutput stream. Ignored if string is null.
        Parameters:
        s - The string to output
      • _printAction

        protected void _printAction​(java.lang.String s)
        Print an action without leading tabs, attempting to preserve the current indentation level for multi-line actions. Ignored if string is null.
        Parameters:
        s - The action string to output
      • _println

        protected void _println​(java.lang.String s)
        Output a String followed by newline, to the currentOutput stream. Ignored if string is null.
        Parameters:
        s - The string to output
      • elementsAreRange

        public static boolean elementsAreRange​(int[] elems)
        Test if a set element array represents a contiguous range.
        Parameters:
        elems - The array of elements representing the set, usually from BitSet.toArray().
        Returns:
        true if the elements are a contiguous range (with two or more).
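
A contiguous-range test of this kind can be sketched as below. This follows the description above (two or more elements, each one greater than the previous) and assumes the array is in ascending order. The class name RangeCheckSketch is hypothetical and the body is a re-implementation for illustration, not the ANTLR source, which may treat edge cases differently.

```java
// Hypothetical stand-alone sketch; not the ANTLR implementation.
final class RangeCheckSketch {
    /** Returns true if elems holds two or more values, each exactly one greater than the last. */
    static boolean elementsAreRange(int[] elems) {
        if (elems == null || elems.length < 2) {
            return false; // too few elements to form a range
        }
        for (int i = 1; i < elems.length; i++) {
            if (elems[i] != elems[i - 1] + 1) {
                return false; // gap, duplicate, or out-of-order element
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(elementsAreRange(new int[] {3, 4, 5, 6})); // true
        System.out.println(elementsAreRange(new int[] {3, 5, 6}));    // false
    }
}
```

A generator could use such a check to emit a single range comparison instead of listing every element.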
      • extractIdOfAction

        protected java.lang.String extractIdOfAction​(Token t)
        Get the identifier portion of an argument-action token. The ID of an action is assumed to be a trailing identifier. Specific code-generators may want to override this if the language has unusual declaration syntax.
        Parameters:
        t - The action token
        Returns:
        A string containing the text of the identifier
      • extractIdOfAction

        protected java.lang.String extractIdOfAction​(java.lang.String s,
                                                     int line,
                                                     int column)
        Get the identifier portion of an argument-action. The ID of an action is assumed to be a trailing identifier. Specific code-generators may want to override this if the language has unusual declaration syntax.
        Parameters:
        s - The action text
        line - Line used for error reporting.
        column - Column used for error reporting.
        Returns:
        A string containing the text of the identifier
      • extractTypeOfAction

        protected java.lang.String extractTypeOfAction​(Token t)
        Get the type string out of an argument-action token. The type of an action is assumed to precede a trailing identifier. Specific code-generators may want to override this if the language has unusual declaration syntax.
        Parameters:
        t - The action token
        Returns:
        A string containing the text of the type
      • extractTypeOfAction

        protected java.lang.String extractTypeOfAction​(java.lang.String s,
                                                       int line,
                                                       int column)
        Get the type portion of an argument-action. The type of an action is assumed to precede a trailing identifier. Specific code-generators may want to override this if the language has unusual declaration syntax.
        Parameters:
        s - The action text
        line - Line used for error reporting.
        column - Column used for error reporting.
        Returns:
        A string containing the text of the type
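
The "trailing identifier" convention shared by extractIdOfAction and extractTypeOfAction can be illustrated with a small sketch. It assumes Java-style identifiers and a declaration with no initializer, and it omits the line/column error reporting. The class ArgActionSketch and its helper are hypothetical and do not reproduce the ANTLR implementation.

```java
// Hypothetical stand-alone sketch of splitting "type id"; not the ANTLR implementation.
final class ArgActionSketch {
    /** Index where the trailing identifier starts, or -1 if the text does not end in one. */
    private static int trailingIdStart(String s) {
        int i = s.length();
        while (i > 0 && Character.isJavaIdentifierPart(s.charAt(i - 1))) {
            i--; // walk backwards over identifier characters
        }
        if (i == s.length() || !Character.isJavaIdentifierStart(s.charAt(i))) {
            return -1; // nothing consumed, or the run starts with a digit
        }
        return i;
    }

    /** The trailing identifier, or "" if there is none. */
    static String extractIdOfAction(String s) {
        String t = s.trim();
        int start = trailingIdStart(t);
        return start < 0 ? "" : t.substring(start);
    }

    /** Everything that precedes the trailing identifier, trimmed. */
    static String extractTypeOfAction(String s) {
        String t = s.trim();
        int start = trailingIdStart(t);
        return start < 0 ? t : t.substring(0, start).trim();
    }

    public static void main(String[] args) {
        System.out.println(extractTypeOfAction("String[] names")); // String[]
        System.out.println(extractIdOfAction("String[] names"));   // names
    }
}
```

A target language that writes the name before the type would need different logic, which is the situation the override note above anticipates.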
      • gen

        public abstract void gen()
        Generate the code for all grammars
      • gen

        public abstract void gen​(ActionElement action)
        Generate code for the given grammar element.
        Parameters:
        action - The {...} action to generate
      • gen

        public abstract void gen​(AlternativeBlock blk)
        Generate code for the given grammar element.
        Parameters:
        blk - The "x|y|z|..." block to generate
      • gen

        public abstract void gen​(BlockEndElement end)
        Generate code for the given grammar element.
        Parameters:
        end - The block-end element to generate. Block-end elements are synthesized by the grammar parser to represent the end of a block.
      • gen

        public abstract void gen​(CharLiteralElement atom)
        Generate code for the given grammar element.
        Parameters:
        atom - The character literal reference to generate
      • gen

        public abstract void gen​(CharRangeElement r)
        Generate code for the given grammar element.
        Parameters:
        r - The character-range reference to generate
      • gen

        public abstract void gen​(LexerGrammar g)
                          throws java.io.IOException
        Generate the code for a lexer
        Throws:
        java.io.IOException
      • gen

        public abstract void gen​(OneOrMoreBlock blk)
        Generate code for the given grammar element.
        Parameters:
        blk - The (...)+ block to generate
      • gen

        public abstract void gen​(ParserGrammar g)
                          throws java.io.IOException
        Generate the code for a parser
        Throws:
        java.io.IOException
      • gen

        public abstract void gen​(RuleRefElement rr)
        Generate code for the given grammar element.
        Parameters:
        rr - The rule-reference to generate
      • gen

        public abstract void gen​(StringLiteralElement atom)
        Generate code for the given grammar element.
        Parameters:
        atom - The string-literal reference to generate
      • gen

        public abstract void gen​(TokenRangeElement r)
        Generate code for the given grammar element.
        Parameters:
        r - The token-range reference to generate
      • gen

        public abstract void gen​(TokenRefElement atom)
        Generate code for the given grammar element.
        Parameters:
        atom - The token-reference to generate
      • gen

        public abstract void gen​(TreeElement t)
        Generate code for the given grammar element.
        Parameters:
        t - The tree to generate code for.
      • gen

        public abstract void gen​(TreeWalkerGrammar g)
                          throws java.io.IOException
        Generate the code for a tree walker
        Throws:
        java.io.IOException
      • gen

        public abstract void gen​(WildcardElement wc)
        Generate code for the given grammar element.
        Parameters:
        wc - The wildcard element to generate
      • gen

        public abstract void gen​(ZeroOrMoreBlock blk)
        Generate code for the given grammar element.
        Parameters:
        blk - The (...)* block to generate
      • genTokenInterchange

        protected void genTokenInterchange​(TokenManager tm)
                                    throws java.io.IOException
        Generate the token types as a text file for persistence across shared lexer/parser
        Throws:
        java.io.IOException
      • processStringForASTConstructor

        public java.lang.String processStringForASTConstructor​(java.lang.String str)
        Process a string for a simple expression for use in xx/action.g. It is used to cast simple tokens/references to the right type for the generated language.
        Parameters:
        str - A String.
      • getASTCreateString

        public abstract java.lang.String getASTCreateString​(Vector v)
        Get a string for an expression to generate creation of an AST subtree.
        Parameters:
        v - A Vector of String, where each element is an expression in the target language yielding an AST node.
      • getASTCreateString

        public abstract java.lang.String getASTCreateString​(GrammarAtom atom,
                                                            java.lang.String str)
        Get a string for an expression to generate creation of an AST node
        Parameters:
        str - The text of the arguments to the AST construction
      • getBitsetName

        protected java.lang.String getBitsetName​(int index)
        Given the index of a bitset in the bitset list, generate a unique name. Specific code-generators may want to override this if the language does not allow '_' or numerals in identifiers.
        Parameters:
        index - The index of the bitset in the bitset list.
      • encodeLexerRuleName

        public static java.lang.String encodeLexerRuleName​(java.lang.String id)
      • decodeLexerRuleName

        public static java.lang.String decodeLexerRuleName​(java.lang.String id)
      • mapTreeId

        public abstract java.lang.String mapTreeId​(java.lang.String id,
                                                   ActionTransInfo tInfo)
        Map an identifier to its corresponding tree-node variable. This is context-sensitive, depending on the rule and alternative being generated.
        Parameters:
        id - The identifier name to map
        forInput - true if the input tree node variable is to be returned, otherwise the output variable is returned.
        Returns:
        The mapped id (which may be the same as the input), or null if the mapping is invalid due to duplicates
      • markBitsetForGen

        protected int markBitsetForGen​(BitSet p)
        Add a bitset to the list of bitsets to be generated. If the bitset is already in the list, the request is ignored. New bitsets are always added to the end of the list, so the caller can rely on the position of bitsets in the list. The returned position can be used to format the bitset name, since it is invariant.
        Parameters:
        p - Bit set to mark for code generation
        forParser - true if the bitset is used for the parser, false for the lexer
        Returns:
        The position of the bitset in the list.
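
The append-only registry described here, together with getBitsetName, can be sketched as follows. The sketch uses java.util.BitSet rather than ANTLR's own BitSet class, and the "_tokenSet_" prefix is an assumed naming scheme chosen for the example. BitsetRegistrySketch is a hypothetical class, not part of ANTLR.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-alone sketch of an append-only bitset registry; not ANTLR code.
final class BitsetRegistrySketch {
    private final List<BitSet> bitsetsUsed = new ArrayList<>();

    /** Returns the stable position of p, appending a copy if it is not yet registered. */
    int markBitsetForGen(BitSet p) {
        int existing = bitsetsUsed.indexOf(p); // BitSet.equals compares set bits
        if (existing >= 0) {
            return existing; // already registered: position is unchanged
        }
        bitsetsUsed.add((BitSet) p.clone()); // copy so later caller edits do not alter it
        return bitsetsUsed.size() - 1;
    }

    /** Derives a unique name from the position (prefix is an assumption). */
    String getBitsetName(int index) {
        return "_tokenSet_" + index;
    }

    public static void main(String[] args) {
        BitsetRegistrySketch registry = new BitsetRegistrySketch();
        BitSet a = new BitSet();
        a.set(1);
        a.set(3);
        int pos = registry.markBitsetForGen(a);
        System.out.println(registry.getBitsetName(pos)); // _tokenSet_0
    }
}
```

Because entries are never removed or reordered, a name derived from the position stays valid for the whole generation run.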
      • print

        protected void print​(java.lang.String s)
        Output tab indent followed by a String, to the currentOutput stream. Ignored if string is null.
        Parameters:
        s - The string to output.
      • printAction

        protected void printAction​(java.lang.String s)
        Print an action with leading tabs, attempting to preserve the current indentation level for multi-line actions. Ignored if string is null.
        Parameters:
        s - The action string to output
      • println

        protected void println​(java.lang.String s)
        Output tab indent followed by a String followed by newline, to the currentOutput stream. Ignored if string is null.
        Parameters:
        s - The string to output
      • printTabs

        protected void printTabs()
        Output the current tab indentation. This outputs the number of tabs indicated by the "tabs" variable to the currentOutput stream.
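
The relationship between the underscore-prefixed methods (no indent) and their plain counterparts (tab indent first) can be sketched with a small writer. IndentedOutputSketch is a hypothetical class that mirrors the field names tabs and currentOutput from this page. It writes '\n' directly rather than the platform line separator so the output is predictable, which may differ from what ANTLR does.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-alone sketch of the indented print helpers; not ANTLR code.
final class IndentedOutputSketch {
    int tabs = 0;                    // current indentation depth
    final PrintWriter currentOutput; // where generated text goes

    IndentedOutputSketch(PrintWriter out) {
        this.currentOutput = out;
    }

    void printTabs() {
        for (int i = 0; i < tabs; i++) {
            currentOutput.print('\t');
        }
    }

    // Underscore variants: no leading indent.
    void _print(String s) {
        if (s != null) currentOutput.print(s);
    }

    void _println(String s) {
        if (s != null) { currentOutput.print(s); currentOutput.print('\n'); }
    }

    // Plain variants: emit the current indent first.
    void print(String s) {
        if (s != null) { printTabs(); currentOutput.print(s); }
    }

    void println(String s) {
        if (s != null) { printTabs(); currentOutput.print(s); currentOutput.print('\n'); }
    }

    /** Emits a three-line block and returns it as a string. */
    static String demo() {
        StringWriter buf = new StringWriter();
        IndentedOutputSketch g = new IndentedOutputSketch(new PrintWriter(buf));
        g.println("if (x) {");
        g.tabs++;
        g.print("y = ");   // indented start of a line
        g._println("1;");  // continues the same line, so no indent
        g.println(null);   // ignored: null strings produce no output
        g.tabs--;
        g.println("}");
        g.currentOutput.flush();
        return buf.toString();
    }
}
```

Splitting a line between print and _println, as in demo(), is the typical reason to have both families of methods.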
      • processActionForSpecialSymbols

        protected abstract java.lang.String processActionForSpecialSymbols​(java.lang.String actionStr,
                                                                           int line,
                                                                           RuleBlock currentRule,
                                                                           ActionTransInfo tInfo)
        Lexically process $ and # references within the action. This will replace #id and #(...) with the appropriate function calls and/or variables etc...
      • getFOLLOWBitSet

        public java.lang.String getFOLLOWBitSet​(java.lang.String ruleName,
                                                int k)
      • getFIRSTBitSet

        public java.lang.String getFIRSTBitSet​(java.lang.String ruleName,
                                               int k)
      • removeAssignmentFromDeclaration

        protected java.lang.String removeAssignmentFromDeclaration​(java.lang.String d)
        Remove the assignment portion of a declaration, if any.
        Parameters:
        d - the declaration
        Returns:
        the declaration without any assignment portion
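
One simple reading of "remove the assignment portion" is to cut the declaration at its first '=' and trim the result. The sketch below takes that approach as an assumption. DeclarationSketch is a hypothetical class, and it would mishandle a declaration whose type itself contains '='.

```java
// Hypothetical stand-alone sketch; not the ANTLR implementation.
final class DeclarationSketch {
    /** Returns d without its initializer, or d unchanged if it has none. */
    static String removeAssignmentFromDeclaration(String d) {
        int eq = d.indexOf('=');
        return eq >= 0 ? d.substring(0, eq).trim() : d;
    }

    public static void main(String[] args) {
        System.out.println(removeAssignmentFromDeclaration("int count = 0")); // int count
        System.out.println(removeAssignmentFromDeclaration("Token t"));       // Token t
    }
}
```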
      • reverseLexerRuleName

        public static java.lang.String reverseLexerRuleName​(java.lang.String id)
      • setGrammar

        protected void setGrammar​(Grammar g)
        Set a grammar for the code generator to use
      • setTool

        public void setTool​(Tool tool)