schrodinger.application.desmond.antlr3.debug module

class schrodinger.application.desmond.antlr3.debug.DebugParser(stream, state=None, dbg=None, *args, **kwargs)

Bases: schrodinger.application.desmond.antlr3.recognizers.Parser

__init__(stream, state=None, dbg=None, *args, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

setDebugListener(dbg)

Provide a new debug event listener for this parser. Notify the input stream too that it should send events to this listener.

getDebugListener()
dbg
beginResync()

A hook to listen in on the token consumption during error recovery. The DebugParser overrides this to fire events to the listener.

endResync()

A hook to listen in on the token consumption during error recovery. The DebugParser overrides this to fire events to the listener.

beginBacktrack(level)
endBacktrack(level, successful)
reportError(exc)

Report a recognition problem.

This method sets errorRecovery to indicate the parser is recovering not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go:

  1. error occurs
  2. enter recovery mode, report error
  3. consume until token found in resynch set
  4. try to resume parsing
  5. next match() will reset errorRecovery mode

If you override, make sure to update syntaxErrors if you care about that.
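The recovery cycle above can be sketched with a small stand-in class. This is only an illustration of the described behaviour, not the runtime's code: `SketchRecognizer`, its attribute names, and `match_succeeded` are hypothetical.

```python
class SketchRecognizer:
    """Hypothetical stand-in: an errorRecovery flag suppresses error cascades."""

    def __init__(self):
        self.error_recovery = False   # True while the parser is resynchronizing
        self.syntax_errors = 0        # number of errors actually reported
        self.messages = []

    def report_error(self, message):
        # While recovering, further errors are considered spurious and dropped.
        if self.error_recovery:
            return
        self.error_recovery = True
        self.syntax_errors += 1
        self.messages.append(message)

    def match_succeeded(self):
        # A successful token match ends recovery mode.
        self.error_recovery = False


rec = SketchRecognizer()
rec.report_error("first error")    # reported, enters recovery mode
rec.report_error("second error")   # suppressed, still recovering
rec.match_succeeded()              # leaves recovery mode
rec.report_error("third error")    # reported again
```

An override that bypasses the counter would leave the equivalent of syntaxErrors stale, which is why the note above asks overriding code to update it.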

DEFAULT_TOKEN_CHANNEL = 0
HIDDEN = 99
MEMO_RULE_FAILED = -2
MEMO_RULE_UNKNOWN = -1
alreadyParsedRule(input, ruleIndex)

Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED.

This method has a side-effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time.
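A minimal sketch of the memoization lookup and its seek-ahead side effect, following the description above. The class `MemoSketch`, its `position` attribute (a stand-in for the input stream's index), and the dictionary layout are assumptions made for illustration; the runtime's actual storage may differ.

```python
MEMO_RULE_UNKNOWN = -1
MEMO_RULE_FAILED = -2


class MemoSketch:
    """Hypothetical memo table keyed by rule index, then by start token index."""

    def __init__(self):
        self.rule_memo = {}
        self.position = 0   # stand-in for the input stream's index()

    def memoize(self, rule_index, rule_start_index, stop_index, success):
        # Record the stop token index on success, or the failure marker.
        value = stop_index if success else MEMO_RULE_FAILED
        self.rule_memo.setdefault(rule_index, {})[rule_start_index] = value

    def get_rule_memoization(self, rule_index, rule_start_index):
        return self.rule_memo.get(rule_index, {}).get(
            rule_start_index, MEMO_RULE_UNKNOWN)

    def already_parsed_rule(self, rule_index):
        stop = self.get_rule_memoization(rule_index, self.position)
        if stop >= 0:
            # Side effect: seek to one past the stop token matched last time.
            self.position = stop + 1
        return stop


memo = MemoSketch()
memo.memoize(rule_index=3, rule_start_index=0, stop_index=4, success=True)
result = memo.already_parsed_rule(3)
```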

antlr_version = (3, 0, 1, 0)
antlr_version_str = '3.0.1'
combineFollows(exact)
computeContextSensitiveRuleFOLLOW()

Compute the context-sensitive FOLLOW set for current rule. This is set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with the definition of plain FOLLOW for rule r:

FOLLOW(r)={x | S=>*alpha r beta in G and x in FIRST(beta)}

where x in T* and alpha, beta in V*; T is set of terminals and V is the set of terminals and nonterminals. In other words, FOLLOW(r) is the set of all tokens that can possibly follow references to r in any sentential form (context). At runtime, however, we know precisely which context applies as we have the call chain. We may compute the exact (rather than covering superset) set of following tokens.

For example, consider grammar:

    stat : ID '=' expr ';'        // FOLLOW(stat)=={EOF}
         | "return" expr '.'
         ;

    expr : atom ('+' atom)* ;     // FOLLOW(expr)=={';','.',')'}

    atom : INT                    // FOLLOW(atom)=={'+',')',';','.'}
         | '(' expr ')'
         ;

The FOLLOW sets are all inclusive whereas context-sensitive FOLLOW sets are precisely what could follow a rule reference. For input “i=(3);”, here is the derivation:

    stat => ID '=' expr ';'
         => ID '=' atom ('+' atom)* ';'
         => ID '=' '(' expr ')' ('+' atom)* ';'
         => ID '=' '(' atom ')' ('+' atom)* ';'
         => ID '=' '(' INT ')' ('+' atom)* ';'
         => ID '=' '(' INT ')' ';'

At the “3” token, you’d have a call chain of

stat -> expr -> atom -> expr -> atom

What can follow that specific nested ref to atom? Exactly ‘)’ as you can see by looking at the derivation of this specific input. Contrast this with the FOLLOW(atom)={‘+’,’)’,’;’,’.’}.

You want the exact viable token set when recovering from a token mismatch. Upon token mismatch, if LA(1) is member of the viable next token set, then you know there is most likely a missing token in the input stream. “Insert” one by just not throwing an exception.

computeErrorRecoverySet()

Compute the error recovery set for the current rule. During rule invocation, the parser pushes the set of tokens that can follow that rule reference on the stack; this amounts to computing FIRST of what follows the rule reference in the enclosing rule. This local follow set only includes tokens from within the rule; i.e., the FIRST computation done by ANTLR stops at the end of a rule.

EXAMPLE

When you find a “no viable alt exception”, the input is not consistent with any of the alternatives for rule r. The best thing to do is to consume tokens until you see something that can legally follow a call to r or any rule that called r. You don’t want the exact set of viable next tokens because the input might just be missing a token–you might consume the rest of the input looking for one of the missing tokens.

Consider grammar:

    a : '[' b ']'
      | '(' b ')'
      ;

    b : c '^' INT ;

    c : ID
      | INT
      ;

At each rule invocation, the set of tokens that could follow that rule is pushed on a stack. Here are the various “local” follow sets:

    FOLLOW(b1_in_a) = FIRST(']') = ']'
    FOLLOW(b2_in_a) = FIRST(')') = ')'
    FOLLOW(c_in_b)  = FIRST('^') = '^'

Upon erroneous input “[]”, the call chain is

a -> b -> c

and, hence, the follow context stack is:

    depth   local follow set   after call to rule
    0       <EOF>              a (from main())
    1       ']'                b
    2       '^'                c

Notice that ‘)’ is not included, because b would have to have been called from a different context in rule a for ‘)’ to be included.

For error recovery, we cannot consider FOLLOW(c) (context-sensitive or otherwise). We need the combined set of all context-sensitive FOLLOW sets–the set of all tokens that could follow any reference in the call chain. We need to resync to one of those tokens. Note that FOLLOW(c)=’^’ and if we resync’d to that token, we’d consume until EOF. We need to sync to context-sensitive FOLLOWs for a, b, and c: {‘]’,’^’}. In this case, for input “[]”, LA(1) is in this set so we would not consume anything and after printing an error rule c would return normally. It would not find the required ‘^’ though. At this point, it gets a mismatched token error and throws an exception (since LA(1) is not in the viable following token set). The rule exception handler tries to recover, but finds the same recovery set and doesn’t consume anything. Rule b exits normally returning to rule a. Now it finds the ‘]’ (and with the successful match exits errorRecovery mode).

So, you can see that the parser walks up the call chain looking for the token that was a member of the recovery set.

Errors are not generated in errorRecovery mode.

ANTLR’s error recovery mechanism is based upon original ideas:

“Algorithms + Data Structures = Programs” by Niklaus Wirth

and

“A note on error recovery in recursive descent parsers”: http://portal.acm.org/citation.cfm?id=947902.947905

Later, Josef Grosch had some good ideas:

“Efficient and Comfortable Error Recovery in Recursive Descent Parsers”: ftp://www.cocolab.com/products/cocktail/doca4.ps/ell.ps.zip

Like Grosch I implemented local FOLLOW sets that are combined at run-time upon error to avoid overhead during parsing.
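The combination step can be sketched as a union over the follow context stack. This is a simplified illustration using the example above: the helper name and the string token representation are hypothetical, and the sketch also keeps the `<EOF>` entry pushed for the outermost call.

```python
EOF = "<EOF>"


def compute_error_recovery_set(follow_stack):
    """Union every local follow set on the stack (hypothetical helper)."""
    recovery = set()
    for local_follow in follow_stack:
        recovery |= local_follow
    return recovery


# Follow context stack for the call chain a -> b -> c on input "[]".
follow_stack = [{EOF}, {"]"}, {"^"}]
recovery_set = compute_error_recovery_set(follow_stack)

# LA(1) is ']' for input "[]", so resynchronization consumes nothing.
la1 = "]"
consume_nothing = la1 in recovery_set
```

Because the sets are only combined when an error actually occurs, normal parsing pays just the cost of pushing and popping one set per rule invocation.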

consumeUntil(input, tokenTypes)

Consume tokens until one matches the given token or token set

tokenTypes can be a single token type or a set of token types
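A minimal sketch of the loop this describes, assuming tokens are represented by a plain list of integer token types. The function name, the `position` argument, and the `eof_type` default are assumptions for illustration, not the runtime's signature.

```python
def consume_until(tokens, position, token_types, eof_type=-1):
    """Advance position until the type there is in token_types or is EOF.

    token_types may be a single int or a set of ints.
    """
    if isinstance(token_types, int):
        token_types = {token_types}
    while (position < len(tokens)
           and tokens[position] != eof_type
           and tokens[position] not in token_types):
        position += 1
    return position


types = [4, 4, 7, 5, -1]
at_single = consume_until(types, 0, 7)        # stops at the token of type 7
at_set = consume_until(types, 0, {5, 9})      # stops at the token of type 5
at_eof = consume_until(types, 0, 42)          # never found, stops at EOF
```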

displayRecognitionError(tokenNames, e)
emitErrorMessage(msg)

Override this method to change where error messages go
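As a sketch of such an override, the subclass below collects messages in a list instead of emitting them. `ParserStandIn` is a hypothetical placeholder for a generated parser class, and its default body is an assumption made so the example is self-contained.

```python
class ParserStandIn:
    """Hypothetical placeholder for a generated parser class."""

    def emitErrorMessage(self, msg):
        # Placeholder default; the runtime's own destination is not shown here.
        print(msg)


class CollectingParser(ParserStandIn):
    """Redirects error messages into a list for later inspection."""

    def __init__(self):
        self.errors = []

    def emitErrorMessage(self, msg):
        self.errors.append(msg)


parser = CollectingParser()
parser.emitErrorMessage("line 1:4 mismatched input")
```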

failed()

Return whether or not a backtracking attempt failed.

getBacktrackingLevel()
getCurrentInputSymbol(input)

Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is.

This is ignored for lexers.

getErrorHeader(e)

What is the error header, normally line/character position information?

getErrorMessage(e, tokenNames)

What error message should be generated for the various exception types?

Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclass the exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me–it makes the most sense in terms of flexibility.

For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state etc…

Override this to change the message generated for one or more exception types.

getGrammarFileName()

For debugging and other purposes, might want the grammar name.

Have ANTLR generate an implementation for this method.

getMissingSymbol(input, e, expectedTokenType, follow)

Conjure up a missing token during error recovery.

The recognizer attempts to recover from single missing symbols. But, actions might refer to that missing symbol. For example, x=ID {f($x);}. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as ‘{‘ and ‘,’, the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens.

getNumberOfSyntaxErrors()

Get number of recognition errors (lexer, parser, tree parser). Each recognizer tracks its own number, so the parser and lexer each have a separate count. Does not count the spurious errors found between an error and the next valid token match.

See also reportError()

getRuleInvocationStack()

Return List<String> of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser java code a rule is invoked.

This is very useful for error messages and for context-sensitive error recovery.

You must be careful, if you subclass a generated recognizers. The default implementation will only search the module of self for rules, but the subclass will not contain any rules. You probably want to override this method to look like

    def getRuleInvocationStack(self):
        return self._getRuleInvocationStack(<class>.__module__)

where <class> is the class of the generated recognizer, e.g. the superclass of self.

getRuleMemoization(ruleIndex, ruleStartIndex)

Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing. It returns the index of the last token matched by the rule.

getSourceName()
getTokenErrorDisplay(t)

How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.toString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don’t have to go modify your lexer so that it creates a new Java type.

getTokenStream()
match(input, ttype, follow)

Match current input symbol against ttype. Attempt single token insertion or deletion error recovery. If that fails, throw MismatchedTokenException.

To turn off single token insertion or deletion error recovery, override recoverFromMismatchedToken() and have it throw an exception. See TreeParser.recoverFromMismatchedToken(). This way any error in a rule will cause an exception and immediate exit from rule. Rule would recover by resynchronizing to the set of symbols that can follow rule ref.

matchAny(input)

Match the wildcard: any single input symbol.

memoize(input, ruleIndex, ruleStartIndex, success)

Record whether or not this rule parsed the input at this position successfully.

mismatchIsMissingToken(input, follow)
mismatchIsUnwantedToken(input, ttype)
recover(input, re)

Recover from an error found on the input stream. This is for NoViableAlt and mismatched symbol exceptions. If you enable single token insertion and deletion, this will usually not handle mismatched symbol exceptions but there could be a mismatched token that the match() routine could not recover from.

recoverFromMismatchedSet(input, e, follow)

Not currently used

recoverFromMismatchedToken(input, ttype, follow)

Attempt to recover from a single missing or extra token.

EXTRA TOKEN

LA(1) is not what we are looking for. If LA(2) has the right token, however, then assume LA(1) is some extra spurious token. Delete it and consume LA(2) as if we were doing a normal match(), which advances the input.

MISSING TOKEN

If the current token is consistent with what could come after ttype then it is ok to ‘insert’ the missing token, else throw an exception. For example, input ‘i=(3;’ is clearly missing the ‘)’. When the parser returns from the nested call to expr, it will have call chain:

stat -> expr -> atom

and it will be trying to match the ‘)’ at this point in the derivation:

    => ID '=' '(' INT ')' ('+' atom)* ';'
                       ^

match() will see that ‘;’ doesn’t match ‘)’ and report a mismatched token error. To recover, it sees that LA(1)==’;’ is in the set of tokens that can follow the ‘)’ token reference in rule atom. It can assume that you forgot the ‘)’.
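The two recovery cases can be sketched as follows. This is a simplified illustration, not the runtime's implementation: tokens are plain strings, `follow` is passed in as a ready-made set, and the function name, the `MismatchedToken` class, and the `<missing ...>` placeholder text are all hypothetical.

```python
class MismatchedToken(Exception):
    """Hypothetical stand-in for the runtime's mismatched token exception."""


def recover_from_mismatched_token(tokens, position, ttype, follow):
    """Return (matched_symbol, new_position) or raise MismatchedToken."""
    la1 = tokens[position] if position < len(tokens) else None
    la2 = tokens[position + 1] if position + 1 < len(tokens) else None

    # EXTRA TOKEN: LA(2) is what we wanted, so drop LA(1) and match LA(2).
    if la2 == ttype:
        return la2, position + 2

    # MISSING TOKEN: LA(1) may legally follow ttype, so pretend ttype was
    # present and do not advance the input.
    if la1 in follow:
        return "<missing " + ttype + ">", position

    raise MismatchedToken("expected " + ttype)


# Input 'i=(3;' is missing ')'. At the ';' token the parser expects ')'.
missing = recover_from_mismatched_token(
    ["ID", "=", "(", "INT", ";"], 4, ")", follow={"+", ";"})

# A spurious second '=' sits in front of the expected INT.
extra = recover_from_mismatched_token(
    ["ID", "=", "=", "INT", ";"], 2, "INT", follow={";"})
```

Checking the extra-token case first mirrors the order in which the two cases are described above; when neither applies, the exception propagates to the rule's own handler.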

reset()

Reset the parser’s state; subclasses must rewind the input stream.

setBacktrackingLevel(n)
setInput(input)
setTokenStream(input)

Set the token stream and reset the parser

toStrings(tokens)

A convenience method for use most often with template rewrites.

Convert a List<Token> to List<String>

tokenNames = None
traceIn(ruleName, ruleIndex)
traceOut(ruleName, ruleIndex)
class schrodinger.application.desmond.antlr3.debug.DebugTokenStream(input, dbg=None)

Bases: schrodinger.application.desmond.antlr3.streams.TokenStream

__init__(input, dbg=None)

Initialize self. See help(type(self)) for accurate signature.

getDebugListener()
setDebugListener(dbg)
dbg
consume()
consumeInitialHiddenTokens()

consume all initial off-channel tokens

LT(i)

Get Token at current input pointer + i ahead where i=1 is next Token. i<0 indicates tokens in the past. So -1 is previous token and -2 is two tokens ago. LT(0) is undefined. For i>=n, return Token.EOFToken. Return null for LT(0) and any index that results in an absolute address that is negative.
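The indexing rules for LT can be sketched over a fully buffered list of tokens. `BufferedStreamSketch`, the `EOF_TOKEN` sentinel, and the use of plain strings as tokens are assumptions made to keep the example self-contained.

```python
EOF_TOKEN = "<EOF>"


class BufferedStreamSketch:
    """Hypothetical buffered stream illustrating the LT(i) index rules."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.p = 0   # index of the next token to be consumed

    def consume(self):
        if self.p < len(self.tokens):
            self.p += 1

    def LT(self, i):
        if i == 0:
            return None                  # LT(0) is undefined
        # i=1 is the next token; i=-1 is the token just consumed.
        index = self.p + i - 1 if i > 0 else self.p + i
        if index < 0:
            return None                  # negative absolute address
        if index >= len(self.tokens):
            return EOF_TOKEN             # at or beyond the end of input
        return self.tokens[index]


stream = BufferedStreamSketch(["a", "b", "c"])
stream.consume()   # "a" is now the previous token, "b" is next
```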

LA(i)

Get int at current input pointer + i ahead where i=1 is next int.

Negative indexes are allowed. LA(-1) is previous token (token just matched). LA(-i) where i is before first token should yield -1, invalid char / EOF.

get(i)

Get a token at an absolute index i; 0..n-1. This is really only needed for profiling and debugging and token stream rewriting. If you don’t want to buffer up tokens, then this method makes no sense for you. Naturally you can’t use the rewrite stream feature. I believe DebugTokenStream can easily be altered to not use this method, removing the dependency.

index()

Return the current input symbol index 0..n where n indicates the last symbol has been read. The index is the symbol about to be read not the most recently read symbol.

mark()

Tell the stream to start buffering if it hasn’t already. Return current input position, index(), or some other marker so that when passed to rewind() you get back to the same spot. rewind(mark()) should not affect the input cursor. The Lexer tracks line/col info as well as input index so its markers are not pure input indexes. Same for tree node streams.

rewind(marker=None)

Reset the stream so that next call to index would return marker. The marker will usually be index() but it doesn’t have to be. It’s just a marker to indicate what state the stream was in. This is essentially calling release() and seek(). If there are markers created after this marker argument, this routine must unroll them like a stack. Assume the state the stream was in when this marker was created.

If marker is None: Rewind to the input position of the last marker. Used currently only after a cyclic DFA and just before starting a sem/syn predicate to get the input position back to the start of the decision. Do not “pop” the marker off the state. mark(i) and rewind(i) should balance still. It is like invoking rewind(last marker) but it should not “pop” the marker off. It’s like seek(last marker’s input position).

release(marker)

You may want to commit to a backtrack but don’t want to force the stream to keep bookkeeping objects around for a marker that is no longer necessary. This will have the same behavior as rewind() except it releases resources without the backward seek. This must throw away resources for all markers back to the marker argument. So if you’re nested 5 levels of mark(), and then release(2) you have to release resources for depths 2..5.
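How mark(), rewind() and release() interact can be sketched with a stack of saved positions. This is one possible arrangement under stated assumptions: the class name is hypothetical and markers are taken to be 1-based nesting depths, whereas a real stream is free to use any marker value.

```python
class MarkableStreamSketch:
    """Hypothetical stream whose markers are 1-based nesting depths."""

    def __init__(self, symbols):
        self.symbols = list(symbols)
        self.p = 0
        self.markers = []   # markers[k] is the position saved by marker k + 1

    def index(self):
        return self.p

    def consume(self):
        self.p += 1

    def seek(self, index):
        self.p = index

    def mark(self):
        self.markers.append(self.p)
        return len(self.markers)

    def release(self, marker=None):
        if marker is None:
            marker = len(self.markers)
        # Drop this marker and every marker created after it.
        del self.markers[marker - 1:]

    def rewind(self, marker=None):
        if marker is None:
            # Seek to the last marker's position without popping it.
            self.seek(self.markers[-1])
            return
        self.seek(self.markers[marker - 1])
        self.release(marker)


s = MarkableStreamSketch("abcdef")
outer = s.mark()       # saved at position 0
s.consume()
s.consume()
inner = s.mark()       # saved at position 2
s.consume()
s.rewind()             # back to position 2, both markers still held
s.rewind(outer)        # back to position 0, both markers unrolled
```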

seek(index)

Set the input cursor to the position indicated by index. This is normally used to seek ahead in the input stream. No buffering is required to do this unless you know your stream will use seek to move backwards such as when backtracking.

This is different from rewind in its multi-directional requirement and in that its argument is strictly an input cursor (index).

For char streams, seeking forward must update the stream state such as line number. For seeking backwards, you will be presumably backtracking using the mark/rewind mechanism that restores state and so this method does not need to update state when seeking backwards.

Currently, this method is only used for efficient backtracking using memoization, but in the future it may be used for incremental parsing.

The index is 0..n-1. A seek to position i means that LA(1) will return the ith symbol. So, seeking to 0 means LA(1) will return the first element in the stream.

size()

Only makes sense for streams that buffer everything up probably, but might be useful to display the entire stream or for testing. This value includes a single EOF.

getTokenSource()

Where is this stream pulling tokens from? This is not the name, but the object that provides Token objects.

getSourceName()

Where are you getting symbols from? Normally, implementations will pass the buck all the way to the lexer who can ask its input stream for the file name or whatever.

toString(start=None, stop=None)

Return the text of all tokens from start to stop, inclusive. If the stream does not buffer all the tokens then it can just return “” or null; Users should not access $ruleLabel.text in an action of course in that case.

Because the user is not required to use a token with an index stored in it, we must provide a means for two token objects themselves to indicate the start/end location. Most often this will just delegate to the other toString(int,int). This is also parallel with the TreeNodeStream.toString(Object,Object).

class schrodinger.application.desmond.antlr3.debug.DebugTreeAdaptor(dbg, adaptor)

Bases: schrodinger.application.desmond.antlr3.tree.TreeAdaptor

A TreeAdaptor proxy that fires debugging events to a DebugEventListener delegate and uses the TreeAdaptor delegate to do the actual work. All AST events are triggered by this adaptor; no code gen changes are needed in generated rules. Debugging events are triggered after invoking tree adaptor routines.

Trees created with actions in rewrite actions like “-> ^(ADD {foo} {bar})” cannot be tracked as they might not use the adaptor to create foo, bar. The debug listener has to deal with tree node IDs for which it did not see a createNode event. A single <unknown> node is sufficient even if it represents a whole tree.
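The proxy arrangement can be sketched as below: each call is forwarded to the wrapped adaptor first and the debug event is fired afterwards. All three classes are hypothetical stand-ins; trees are modelled as nested lists and the token type numbers are arbitrary.

```python
class ListAdaptorSketch:
    """Hypothetical adaptor building trees as nested lists: [label, child, ...]."""

    def createFromType(self, tokenType, text):
        return [text]

    def becomeRoot(self, newRoot, oldRoot):
        newRoot.append(oldRoot)
        return newRoot


class RecordingListener:
    """Hypothetical listener that records the events it receives."""

    def __init__(self):
        self.events = []

    def createNode(self, node, token=None):
        self.events.append(("createNode", node[0]))

    def becomeRoot(self, newRoot, oldRoot):
        self.events.append(("becomeRoot", newRoot[0], oldRoot[0]))


class DebugAdaptorSketch:
    """Delegates the real work, then notifies the listener."""

    def __init__(self, dbg, adaptor):
        self.dbg = dbg
        self.adaptor = adaptor

    def createFromType(self, tokenType, text):
        node = self.adaptor.createFromType(tokenType, text)
        self.dbg.createNode(node)
        return node

    def becomeRoot(self, newRoot, oldRoot):
        root = self.adaptor.becomeRoot(newRoot, oldRoot)
        self.dbg.becomeRoot(newRoot, oldRoot)
        return root


listener = RecordingListener()
adaptor = DebugAdaptorSketch(listener, ListAdaptorSketch())
add = adaptor.createFromType(10, "ADD")
x = adaptor.createFromType(11, "x")
root = adaptor.becomeRoot(add, x)
```

Because the events come from the adaptor wrapper, generated rule code can stay unchanged, which is the point made in the class description.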

__init__(dbg, adaptor)

Initialize self. See help(type(self)) for accurate signature.

createWithPayload(payload)

Create a tree node from Token object; for CommonTree type trees, then the token just becomes the payload. This is the most common create call.

Override if you want another kind of node to be built.

createFromToken(tokenType, fromToken, text=None)

Create a new node derived from a token, with a new token type and (optionally) new text.

This is invoked from an imaginary node ref on right side of a rewrite rule as IMAG[$tokenLabel] or IMAG[$tokenLabel “IMAG”].

This should invoke createToken(Token).

createFromType(tokenType, text)

Create a new node derived from a token, with a new token type.

This is invoked from an imaginary node ref on right side of a rewrite rule as IMAG[“IMAG”].

This should invoke createToken(int,String).

errorNode(input, start, stop, exc)

Return a tree node representing an error. This node records the tokens consumed during error recovery. The start token indicates the input symbol at which the error was detected. The stop token indicates the last symbol consumed during recovery.

You must specify the input stream so that the erroneous text can be packaged up in the error node. The exception could be useful to some applications; default implementation stores ptr to it in the CommonErrorNode.

This only makes sense during token parsing, not tree parsing. Tree parsing should happen only when parsing and tree construction succeed.

dupTree(tree)

Duplicate tree recursively, using dupNode() for each node

simulateTreeConstruction(t)

^(A B C): emit create A, create B, add child, …

dupNode(treeNode)

Duplicate a single tree node.

Override if you want another kind of node to be built.

nil()

Return a nil node (an empty but non-null node) that can hold a list of elements as the children. If you want a flat tree (a list) use “t=adaptor.nil(); t.addChild(x); t.addChild(y);”

isNil(tree)

Is tree considered a nil node used to make lists of child nodes?

addChild(t, child)

Add a child to the tree t. If child is a flat tree (a list), make all in list children of t. Warning: if t has no children, but child does and child isNil then you can decide it is ok to move children to t via t.children = child.children; i.e., without copying the array. Just make sure that this is consistent with how the user will build ASTs. Do nothing if t or child is null.

becomeRoot(newRoot, oldRoot)

If oldRoot is a nil root, just copy or move the children to newRoot. If not a nil root, make oldRoot a child of newRoot.

    old=^(nil a b c), new=r yields ^(r a b c)
    old=^(a b c), new=r yields ^(r ^(a b c))

If newRoot is a nil-rooted single child tree, use the single child as the new root node.

    old=^(nil a b c), new=^(nil r) yields ^(r a b c)
    old=^(a b c), new=^(nil r) yields ^(r ^(a b c))

If oldRoot was null, it’s ok, just return newRoot (even if isNil).

    old=null, new=r yields r
    old=null, new=^(nil r) yields ^(nil r)

Return newRoot. Throw an exception if newRoot is not a simple node or nil root with a single child node–it must be a root node. If newRoot is ^(nil x) return x as newRoot.

Be advised that it’s ok for newRoot to point at oldRoot’s children; i.e., you don’t have to copy the list. We are constructing these nodes so we should have this control for efficiency.
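The rules above can be checked against a small tree model. This is an illustrative sketch only: `Node`, `tree`, and `become_root` are hypothetical names, a nil node is modelled as a node whose label is None, and the exception type is an arbitrary choice.

```python
class Node:
    """Hypothetical tree node; label None marks a nil (list) node."""

    def __init__(self, label=None):
        self.label = label
        self.children = []

    def is_nil(self):
        return self.label is None

    def add_child(self, child):
        if child is None:
            return
        if child.is_nil():
            self.children.extend(child.children)   # flatten a list into self
        else:
            self.children.append(child)

    def to_string(self):
        head = "nil" if self.is_nil() else self.label
        if not self.children:
            return head
        parts = [head] + [c.to_string() for c in self.children]
        return "^(" + " ".join(parts) + ")"


def tree(label, *labels):
    """Build ^(label a b ...) from leaf labels; label None gives ^(nil ...)."""
    node = Node(label)
    for leaf in labels:
        node.children.append(Node(leaf))
    return node


def become_root(new_root, old_root):
    if old_root is None:
        return new_root                      # returned as-is, even if nil
    if new_root.is_nil():
        if len(new_root.children) != 1:
            raise ValueError("new root must be a single node or ^(nil x)")
        new_root = new_root.children[0]      # ^(nil x) becomes x
    new_root.add_child(old_root)
    return new_root


case1 = become_root(Node("r"), tree(None, "a", "b", "c")).to_string()
case2 = become_root(Node("r"), tree("a", "b", "c")).to_string()
case3 = become_root(tree(None, "r"), tree(None, "a", "b", "c")).to_string()
case4 = become_root(tree(None, "r"), tree("a", "b", "c")).to_string()
case5 = become_root(tree(None, "r"), None).to_string()
```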

rulePostProcessing(root)

Given the root of the subtree created for this rule, post process it to do any simplifications or whatever you want. A required behavior is to convert ^(nil singleSubtree) to singleSubtree as the setting of start/stop indexes relies on a single non-nil root for non-flat trees.

Flat trees such as for lists like “idlist : ID+ ;” are left alone unless there is only one ID. For a list, the start/stop indexes are set in the nil node.

This method is executed after all rule tree construction and right before setTokenBoundaries().

getType(t)

For tree parsing, I need to know the token type of a node

setType(t, type)

Node constructors can set the type of a node

getText(t)
setText(t, text)

Node constructors can set the text of a node

getToken(t)

Return the token object from which this node was created.

Currently used only for printing an error message. The error display routine in BaseRecognizer needs to display where in the input the error occurred. If your tree implementation does not store information that can lead you to the token, you can create a token filled with the appropriate information and pass that back. See BaseRecognizer.getErrorMessage().

setTokenBoundaries(t, startToken, stopToken)

Where are the bounds in the input token stream for this node and all children? Each rule that creates AST nodes will call this method right before returning. Flat trees (i.e., lists) will still usually have a nil root node just to hold the children list. That node would contain the start/stop indexes then.

getTokenStartIndex(t)

Get the token start index for this subtree; return -1 if no such index

getTokenStopIndex(t)

Get the token stop index for this subtree; return -1 if no such index

getChild(t, i)

Get a child 0..n-1 node

setChild(t, i, child)

Set ith child (0..n-1) to t; t must be non-null and non-nil node

deleteChild(t, i)

Remove ith child and shift children down from right.

getChildCount(t)

How many children? If 0, then this is a leaf node

getUniqueID(node)

For identifying trees.

How to identify nodes so we can say “add node to a prior node”? Even becomeRoot is an issue. Use System.identityHashCode(node) usually.

getParent(t)

Who is the parent node of this node; if null, implies node is root. If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

getChildIndex(t)

What index is this node in the child list? Range: 0..n-1 If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

setParent(t, parent)

Who is the parent node of this node; if null, implies node is root. If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

setChildIndex(t, index)

What index is this node in the child list? Range: 0..n-1 If your node type doesn’t handle this, it’s ok but the tree rewrites in tree parsers need this functionality.

replaceChildren(parent, startChildIndex, stopChildIndex, t)

Replace from start to stop child index of parent with t, which might be a list. Number of children may be different after this call.

If parent is null, don’t do anything; must be at root of overall tree. Can’t replace whatever points to the parent externally. Do nothing.
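With children held in a Python list, the replacement amounts to a slice assignment over an inclusive index range. The function below is a hypothetical sketch operating directly on the child list rather than on a tree node.

```python
def replace_children(children, start, stop, replacement):
    """Replace children[start..stop] (inclusive) in place.

    replacement may be a single item or a list of items. A children
    value of None stands for a missing parent and is left alone.
    """
    if children is None:
        return
    if isinstance(replacement, list):
        new_items = replacement
    else:
        new_items = [replacement]
    children[start:stop + 1] = new_items


kids = ["a", "b", "c", "d"]
replace_children(kids, 1, 2, "x")               # two children become one

more = ["a", "b", "c"]
replace_children(more, 1, 1, ["x", "y", "z"])   # one child becomes three
```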

getDebugListener()
setDebugListener(dbg)
getTreeAdaptor()
create(*args)

Deprecated, use createWithPayload, createFromToken or createFromType.

This method only exists to mimic the Java interface of TreeAdaptor.

class schrodinger.application.desmond.antlr3.debug.DebugEventListener

Bases: object

All debugging events that a recognizer can trigger.

I did not create a separate AST debugging interface as it would create lots of extra classes and DebugParser has a dbg var defined, which makes it hard to change to ASTDebugEventListener. I looked hard at this issue and it is easier to understand as one monolithic event interface for all possible events. Hopefully, adding ST debugging stuff won’t be bad. Leave for future. 4/26/2006.

PROTOCOL_VERSION = '2'
enterRule(grammarFileName, ruleName)

The parser has just entered a rule. No decision has been made about which alt is predicted. This is fired AFTER init actions have been executed. Attributes are defined and available etc… The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterAlt(alt)

Because rules can have lots of alternatives, it is very useful to know which alt you are entering. This is 1..n for n alts.

exitRule(grammarFileName, ruleName)

This is the last thing executed before leaving a rule. It is executed even if an exception is thrown. This is triggered after error reporting and recovery have occurred (unless the exception is not caught in this rule). This implies an “exitAlt” event. The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterSubRule(decisionNumber)

Track entry into any (…) subrule or other EBNF construct.

exitSubRule(decisionNumber)
enterDecision(decisionNumber)

Every decision, fixed k or arbitrary, has an enter/exit event so that a GUI can easily track what LT/consume events are associated with prediction. You will see a single enter/exit subrule but multiple enter/exit decision events, one for each loop iteration.

exitDecision(decisionNumber)
consumeToken(t)

An input token was consumed; matched by any kind of element. Trigger after the token was matched by things like match(), matchAny().

consumeHiddenToken(t)

An off-channel input token was consumed. Trigger after the token was matched by things like match(), matchAny(). (unless of course the hidden token is first stuff in the input stream).

mark(marker)

The parser is going to look arbitrarily ahead; mark this location, the token stream’s marker is sent in case you need it.

rewind(marker=None)

After an arbitrarily long lookahead as with a cyclic DFA (or with any backtrack), this informs the debugger that the stream should be rewound to the position associated with marker.

beginBacktrack(level)
endBacktrack(level, successful)
location(line, pos)

To watch a parser move through the grammar, the parser needs to inform the debugger what line/charPos it is passing in the grammar. For now, this does not know how to switch from one grammar to the other and back for island grammars etc…

This should also allow breakpoints because the debugger can stop the parser whenever it hits this line/pos.

recognitionException(e)

A recognition exception occurred such as NoViableAltException. I made this a generic event so that I can alter the exception hierarchy later without having to alter all the debug objects.

Upon error, the stack of enter rule/subrule must be properly unwound. If no viable alt occurs it is within an enter/exit decision, which also must be rewound. Even the rewind for each mark must be unwound. In the Java target this is pretty easy using try/finally, if a bit ugly in the generated code. The rewind is generated in DFA.predict() actually so no code needs to be generated for that. For languages w/o this “finally” feature (C++?), the target implementor will have to build an event stack or something.

Across a socket for remote debugging, only the RecognitionException data fields are transmitted. The token object or whatever that caused the problem was the last object referenced by LT. The immediately preceding LT event should hold the unexpected Token or char.

Here is a sample event trace for grammar:

    b : C ({;}A|B) // {;} is there to prevent A|B becoming a set
      | D
      ;

The sequence for this rule (with no viable alt in the subrule) for input ‘c c’ (there are 3 tokens) is:

    commence
    LT(1)
    enterRule b
    location 7 1
    enter decision 3
    LT(1)
    exit decision 3
    enterAlt1
    location 7 5
    LT(1)
    consumeToken [c/<4>,1:0]
    location 7 7
    enterSubRule 2
    enter decision 2
    LT(1)
    LT(1)
    recognitionException NoViableAltException 2 1 2
    exit decision 2
    exitSubRule 2
    beginResync
    LT(1)
    consumeToken [c/<4>,1:1]
    LT(1)
    endResync
    LT(-1)
    exitRule b
    terminate

beginResync()

Indicates the recognizer is about to consume tokens to resynchronize the parser. Any consume events from here until the recovered event are not part of the parse–they are dead tokens.

endResync()

Indicates that the recognizer has finished consuming tokens in order to resynchronize. There may be multiple beginResync/endResync pairs before the recognizer comes out of errorRecovery mode (in which multiple errors are suppressed). This will be useful in a GUI where you probably want to grey out tokens that are consumed but not matched to anything in the grammar. Anything between a beginResync/endResync pair was tossed out by the parser.
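A listener can use the beginResync/endResync pair to separate matched tokens from discarded ones, for example to grey them out in a display. The sketch below is a hypothetical standalone class rather than a subclass of DebugEventListener, and tokens are represented by plain strings.

```python
class ResyncTracker:
    """Hypothetical listener sorting consumed tokens into matched and dead."""

    def __init__(self):
        self.in_resync = False
        self.matched = []
        self.dead = []

    def beginResync(self):
        self.in_resync = True

    def endResync(self):
        self.in_resync = False

    def consumeToken(self, t):
        # Tokens consumed between beginResync and endResync were tossed out.
        if self.in_resync:
            self.dead.append(t)
        else:
            self.matched.append(t)


# Replay the consume and resync events from a trace for input 'c c'.
tracker = ResyncTracker()
tracker.consumeToken("c@1:0")
tracker.beginResync()
tracker.consumeToken("c@1:1")
tracker.endResync()
```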

semanticPredicate(result, predicate)

A semantic predicate was evaluated with this result and action text.

commence()

Announce that parsing has begun. Not technically useful except for sending events over a socket. A GUI for example will launch a thread to connect and communicate with a remote parser. The thread will want to notify the GUI when a connection is made. ANTLR parsers trigger this upon entry to the first rule (the ruleLevel is used to figure this out).

terminate()

Parsing is over; successfully or not. Mostly useful for telling remote debugging listeners that it’s time to quit. When the rule invocation level goes to zero at the end of a rule, we are done parsing.
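
The relationship between the rule invocation level and the commence/terminate events can be sketched as follows. `RuleLevelTracker` and its method names are hypothetical; the real parser keeps its own ruleLevel counter, and this only illustrates the idea.

```python
class RuleLevelTracker:
    """Hypothetical sketch: derive commence/terminate from a rule
    invocation level."""

    def __init__(self):
        self.rule_level = 0
        self.events = []

    def enter_rule(self, name):
        # Entering the first rule means parsing has begun.
        if self.rule_level == 0:
            self.events.append("commence")
        self.rule_level += 1
        self.events.append("enterRule " + name)

    def exit_rule(self, name):
        self.events.append("exitRule " + name)
        self.rule_level -= 1
        # Back at level zero: the outermost rule has finished.
        if self.rule_level == 0:
            self.events.append("terminate")


levels = RuleLevelTracker()
levels.enter_rule("a")
levels.enter_rule("b")
levels.exit_rule("b")
levels.exit_rule("a")
```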

consumeNode(t)

Input for a tree parser is an AST, but we know nothing for sure about a node except its type and text (obtained from the adaptor). This is the analog of the consumeToken method. Again, the ID is usually the hashCode of the node, so it only works if hashCode is not overridden. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

LT(i, t)

The tree parser looked ahead. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

nilNode(t)

A nil was created (even nil nodes have a unique ID… they are not “null” per se). As of 4/28/2006, this seems to be uniquely triggered when starting a new subtree such as when entering a subrule in automatic mode and when building a tree in rewrite mode.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

errorNode(t)

Upon syntax error, recognizers bracket the error with an error node if they are building ASTs.

createNode(node, token=None)

Announce a new node built from token elements such as type etc…

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID, type, text are set.

becomeRoot(newRoot, oldRoot)

Make a node the new root of an existing root.

Note: the newRootID parameter is possibly different than the TreeAdaptor.becomeRoot() newRoot parameter. In our case, it will always be the result of calling TreeAdaptor.becomeRoot() and not root_n or whatever.

The listener should assume that this event occurs only when the current subrule (or rule) subtree is being reset to newRootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.becomeRoot().

addChild(root, child)

Make childID a child of rootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.addChild().

setTokenBoundaries(t, tokenStartIndex, tokenStopIndex)

Set the token start/stop token index for a subtree root or node.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

__init__

Initialize self. See help(type(self)) for accurate signature.

class schrodinger.application.desmond.antlr3.debug.BlankDebugEventListener

Bases: schrodinger.application.desmond.antlr3.debug.DebugEventListener

A blank listener that does nothing; useful for real classes so they don’t have to have lots of blank methods and are less sensitive to updates to debug interface.

Note: this class is identical to DebugEventListener and exists purely for compatibility with Java.
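
A minimal sketch of the pattern this class is meant for: subclass a do-nothing listener and override only the events of interest. To keep the example self-contained, `NoOpListener` stands in for BlankDebugEventListener and reproduces just three of its methods; `TokenCounter` is hypothetical.

```python
class NoOpListener:
    """Stand-in for BlankDebugEventListener: every event is a no-op.
    Only three of the event methods are reproduced here."""

    def consumeToken(self, t):
        pass

    def consumeHiddenToken(self, t):
        pass

    def enterRule(self, grammarFileName, ruleName):
        pass


class TokenCounter(NoOpListener):
    """Hypothetical listener that overrides only the events it needs."""

    def __init__(self):
        self.visible = 0
        self.hidden = 0

    def consumeToken(self, t):
        self.visible += 1

    def consumeHiddenToken(self, t):
        self.hidden += 1


counter = TokenCounter()
counter.enterRule("T.g", "b")    # inherited no-op
counter.consumeToken("c")
counter.consumeHiddenToken(" ")
counter.consumeToken("c")
```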

LT(i, t)

The tree parser looked ahead. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

PROTOCOL_VERSION = '2'
__init__

Initialize self. See help(type(self)) for accurate signature.

addChild(root, child)

Make childID a child of rootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.addChild().

becomeRoot(newRoot, oldRoot)

Make a node the new root of an existing root.

Note: the newRootID parameter is possibly different than the TreeAdaptor.becomeRoot() newRoot parameter. In our case, it will always be the result of calling TreeAdaptor.becomeRoot() and not root_n or whatever.

The listener should assume that this event occurs only when the current subrule (or rule) subtree is being reset to newRootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.becomeRoot().

beginBacktrack(level)
beginResync()

Indicates the recognizer is about to consume tokens to resynchronize the parser. Any consume events from here until the recovered event are not part of the parse–they are dead tokens.

commence()

Announce that parsing has begun. Not technically useful except for sending events over a socket. A GUI for example will launch a thread to connect and communicate with a remote parser. The thread will want to notify the GUI when a connection is made. ANTLR parsers trigger this upon entry to the first rule (the ruleLevel is used to figure this out).

consumeHiddenToken(t)

An off-channel input token was consumed. Triggered after the token was matched by things like match() or matchAny() (unless, of course, the hidden token is the first thing in the input stream).

consumeNode(t)

Input for a tree parser is an AST, but we know nothing for sure about a node except its type and text (obtained from the adaptor). This is the analog of the consumeToken method. Again, the ID is usually the hashCode of the node, so it only works if hashCode is not overridden. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

consumeToken(t)

An input token was consumed, matched by any kind of element. Triggered after the token was matched by things like match() or matchAny().

createNode(node, token=None)

Announce a new node built from token elements such as type etc…

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID, type, text are set.

endBacktrack(level, successful)
endResync()

Indicates that the recognizer has finished consuming tokens in order to resynchronize. There may be multiple beginResync/endResync pairs before the recognizer comes out of errorRecovery mode (in which multiple errors are suppressed). This is useful in a GUI where you probably want to grey out tokens that are consumed but not matched to anything in the grammar. Anything between a beginResync/endResync pair was tossed out by the parser.

enterAlt(alt)

Because rules can have lots of alternatives, it is very useful to know which alt you are entering. This is 1..n for n alts.

enterDecision(decisionNumber)

Every decision, fixed k or arbitrary, has an enter/exit event so that a GUI can easily track what LT/consume events are associated with prediction. You will see a single enter/exit subrule but multiple enter/exit decision events, one for each loop iteration.

enterRule(grammarFileName, ruleName)

The parser has just entered a rule. No decision has been made about which alt is predicted. This is fired AFTER init actions have been executed. Attributes are defined and available etc… The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterSubRule(decisionNumber)

Track entry into any (…) subrule or other EBNF construct.

errorNode(t)

Upon syntax error, recognizers bracket the error with an error node if they are building ASTs.

exitDecision(decisionNumber)
exitRule(grammarFileName, ruleName)

This is the last thing executed before leaving a rule. It is executed even if an exception is thrown. This is triggered after error reporting and recovery have occurred (unless the exception is not caught in this rule). This implies an “exitAlt” event. The grammarFileName allows composite grammars to jump around among multiple grammar files.

exitSubRule(decisionNumber)
location(line, pos)

To watch a parser move through the grammar, the parser needs to inform the debugger what line/charPos it is passing in the grammar. For now, this does not know how to switch from one grammar to the other and back for island grammars etc…

This should also allow breakpoints because the debugger can stop the parser whenever it hits this line/pos.
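
One way a debugger could build breakpoints on top of location events is sketched below. `BreakpointListener` is hypothetical, and it only records hits; a real debugger would pause the parser at that point.

```python
class BreakpointListener:
    """Hypothetical listener fragment that records breakpoint hits."""

    def __init__(self, breakpoints):
        # Breakpoints are (line, pos) positions in the grammar file.
        self.breakpoints = set(breakpoints)
        self.hits = []

    def location(self, line, pos):
        if (line, pos) in self.breakpoints:
            # A real debugger would stop the parser here.
            self.hits.append((line, pos))


bp = BreakpointListener([(7, 5)])
for line, pos in [(7, 1), (7, 5), (7, 7)]:
    bp.location(line, pos)
```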

mark(marker)

The parser is going to look arbitrarily far ahead; mark this location. The token stream’s marker is sent in case you need it.

nilNode(t)

A nil was created (even nil nodes have a unique ID… they are not “null” per se). As of 4/28/2006, this seems to be uniquely triggered when starting a new subtree such as when entering a subrule in automatic mode and when building a tree in rewrite mode.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

recognitionException(e)

A recognition exception occurred, such as NoViableAltException. I made this a generic event so that I can alter the exception hierarchy later without having to alter all the debug objects.

Upon error, the stack of enter rule/subrule events must be properly unwound. If no viable alt occurs, it happens within an enter/exit decision, which must also be rewound. Even the rewind for each mark must be unwound. In the Java target this is fairly easy using try/finally, if a bit ugly in the generated code. The rewind is actually generated in DFA.predict(), so no code needs to be generated for that. For languages without this “finally” feature (C++?), the target implementor will have to build an event stack or something similar.

Across a socket for remote debugging, only the RecognitionException data fields are transmitted. The token object or whatever that caused the problem was the last object referenced by LT. The immediately preceding LT event should hold the unexpected Token or char.

Here is a sample event trace for grammar:

    b : C ({;}A|B) // {;} is there to prevent A|B becoming a set
      | D
      ;

The sequence for this rule (with no viable alt in the subrule) for input ‘c c’ (there are 3 tokens) is:

    commence
    LT(1)
    enterRule b
    location 7 1
    enter decision 3
    LT(1)
    exit decision 3
    enterAlt1
    location 7 5
    LT(1)
    consumeToken [c/<4>,1:0]
    location 7 7
    enterSubRule 2
    enter decision 2
    LT(1)
    LT(1)
    recognitionException NoViableAltException 2 1 2
    exit decision 2
    exitSubRule 2
    beginResync
    LT(1)
    consumeToken [c/<4>,1:1]
    LT(1)
    endResync
    LT(-1)
    exitRule b
    terminate
rewind(marker=None)

After an arbitrarily long lookahead, as with a cyclic DFA (or with any backtrack), this informs the debugger that the stream should be rewound to the position associated with marker.
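
The mark/rewind pairing can be illustrated with a toy stream. `TinyStream` is a hypothetical class written for this sketch and is not the token stream used by the runtime; it only shows how a marker ties a rewind back to a saved position.

```python
class TinyStream:
    """Hypothetical token stream with mark/rewind, for illustration."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.index = 0
        self.markers = []

    def consume(self):
        self.index += 1

    def mark(self):
        # Remember the current position; the marker is its slot number.
        self.markers.append(self.index)
        return len(self.markers) - 1

    def rewind(self, marker):
        # Restore the position saved by mark() and release that marker
        # along with any markers created after it.
        self.index = self.markers[marker]
        del self.markers[marker:]


stream = TinyStream(["C", "A", "D"])
stream.consume()
m = stream.mark()
stream.consume()
stream.consume()    # speculative lookahead
stream.rewind(m)
```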

semanticPredicate(result, predicate)

A semantic predicate was evaluated with this result and action text.

setTokenBoundaries(t, tokenStartIndex, tokenStopIndex)

Set the token start/stop token index for a subtree root or node.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

terminate()

Parsing is over; successfully or not. Mostly useful for telling remote debugging listeners that it’s time to quit. When the rule invocation level goes to zero at the end of a rule, we are done parsing.

class schrodinger.application.desmond.antlr3.debug.TraceDebugEventListener(adaptor=None)

Bases: schrodinger.application.desmond.antlr3.debug.DebugEventListener

A listener that simply records text representations of the events.

Useful for debugging the debugging facility ;)

Subclasses can override the record() method (which defaults to printing to stdout) to record the events in a different way.
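
A minimal sketch of overriding record() follows. Because the real class cannot be imported here, `TraceListenerStub` stands in for TraceDebugEventListener; the event text it produces is an assumption, and `ListRecorder` is a hypothetical subclass.

```python
class TraceListenerStub:
    """Stand-in for TraceDebugEventListener: each event is turned into
    text and handed to record(), which prints by default. The exact
    text format used here is an assumption."""

    def record(self, event):
        print(event)

    def enterRule(self, grammarFileName, ruleName):
        self.record("enterRule " + ruleName)

    def exitRule(self, grammarFileName, ruleName):
        self.record("exitRule " + ruleName)


class ListRecorder(TraceListenerStub):
    """Hypothetical subclass that keeps events in a list instead."""

    def __init__(self):
        self.events = []

    def record(self, event):
        self.events.append(event)


rec = ListRecorder()
rec.enterRule("T.g", "b")
rec.exitRule("T.g", "b")
```

Only record() is replaced; the event methods themselves are inherited unchanged.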

__init__(adaptor=None)

Initialize self. See help(type(self)) for accurate signature.

record(event)
enterRule(grammarFileName, ruleName)

The parser has just entered a rule. No decision has been made about which alt is predicted. This is fired AFTER init actions have been executed. Attributes are defined and available etc… The grammarFileName allows composite grammars to jump around among multiple grammar files.

exitRule(grammarFileName, ruleName)

This is the last thing executed before leaving a rule. It is executed even if an exception is thrown. This is triggered after error reporting and recovery have occurred (unless the exception is not caught in this rule). This implies an “exitAlt” event. The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterSubRule(decisionNumber)

Track entry into any (…) subrule or other EBNF construct.

exitSubRule(decisionNumber)
location(line, pos)

To watch a parser move through the grammar, the parser needs to inform the debugger what line/charPos it is passing in the grammar. For now, this does not know how to switch from one grammar to the other and back for island grammars etc…

This should also allow breakpoints because the debugger can stop the parser whenever it hits this line/pos.

consumeNode(t)

Input for a tree parser is an AST, but we know nothing for sure about a node except its type and text (obtained from the adaptor). This is the analog of the consumeToken method. Again, the ID is usually the hashCode of the node, so it only works if hashCode is not overridden. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

LT(i, t)

The tree parser looked ahead. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

nilNode(t)

A nil was created (even nil nodes have a unique ID… they are not “null” per se). As of 4/28/2006, this seems to be uniquely triggered when starting a new subtree such as when entering a subrule in automatic mode and when building a tree in rewrite mode.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

createNode(t, token=None)

Announce a new node built from token elements such as type etc…

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID, type, text are set.

becomeRoot(newRoot, oldRoot)

Make a node the new root of an existing root.

Note: the newRootID parameter is possibly different than the TreeAdaptor.becomeRoot() newRoot parameter. In our case, it will always be the result of calling TreeAdaptor.becomeRoot() and not root_n or whatever.

The listener should assume that this event occurs only when the current subrule (or rule) subtree is being reset to newRootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.becomeRoot().

addChild(root, child)

Make childID a child of rootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.addChild().

setTokenBoundaries(t, tokenStartIndex, tokenStopIndex)

Set the token start/stop token index for a subtree root or node.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

PROTOCOL_VERSION = '2'
beginBacktrack(level)
beginResync()

Indicates the recognizer is about to consume tokens to resynchronize the parser. Any consume events from here until the recovered event are not part of the parse–they are dead tokens.

commence()

Announce that parsing has begun. Not technically useful except for sending events over a socket. A GUI for example will launch a thread to connect and communicate with a remote parser. The thread will want to notify the GUI when a connection is made. ANTLR parsers trigger this upon entry to the first rule (the ruleLevel is used to figure this out).

consumeHiddenToken(t)

An off-channel input token was consumed. Triggered after the token was matched by things like match() or matchAny() (unless, of course, the hidden token is the first thing in the input stream).

consumeToken(t)

An input token was consumed, matched by any kind of element. Triggered after the token was matched by things like match() or matchAny().

endBacktrack(level, successful)
endResync()

Indicates that the recognizer has finished consuming tokens in order to resynchronize. There may be multiple beginResync/endResync pairs before the recognizer comes out of errorRecovery mode (in which multiple errors are suppressed). This is useful in a GUI where you probably want to grey out tokens that are consumed but not matched to anything in the grammar. Anything between a beginResync/endResync pair was tossed out by the parser.

enterAlt(alt)

Because rules can have lots of alternatives, it is very useful to know which alt you are entering. This is 1..n for n alts.

enterDecision(decisionNumber)

Every decision, fixed k or arbitrary, has an enter/exit event so that a GUI can easily track what LT/consume events are associated with prediction. You will see a single enter/exit subrule but multiple enter/exit decision events, one for each loop iteration.

errorNode(t)

Upon syntax error, recognizers bracket the error with an error node if they are building ASTs.

exitDecision(decisionNumber)
mark(marker)

The parser is going to look arbitrarily far ahead; mark this location. The token stream’s marker is sent in case you need it.

recognitionException(e)

A recognition exception occurred, such as NoViableAltException. I made this a generic event so that I can alter the exception hierarchy later without having to alter all the debug objects.

Upon error, the stack of enter rule/subrule events must be properly unwound. If no viable alt occurs, it happens within an enter/exit decision, which must also be rewound. Even the rewind for each mark must be unwound. In the Java target this is fairly easy using try/finally, if a bit ugly in the generated code. The rewind is actually generated in DFA.predict(), so no code needs to be generated for that. For languages without this “finally” feature (C++?), the target implementor will have to build an event stack or something similar.

Across a socket for remote debugging, only the RecognitionException data fields are transmitted. The token object or whatever that caused the problem was the last object referenced by LT. The immediately preceding LT event should hold the unexpected Token or char.

Here is a sample event trace for grammar:

    b : C ({;}A|B) // {;} is there to prevent A|B becoming a set
      | D
      ;

The sequence for this rule (with no viable alt in the subrule) for input ‘c c’ (there are 3 tokens) is:

    commence
    LT(1)
    enterRule b
    location 7 1
    enter decision 3
    LT(1)
    exit decision 3
    enterAlt1
    location 7 5
    LT(1)
    consumeToken [c/<4>,1:0]
    location 7 7
    enterSubRule 2
    enter decision 2
    LT(1)
    LT(1)
    recognitionException NoViableAltException 2 1 2
    exit decision 2
    exitSubRule 2
    beginResync
    LT(1)
    consumeToken [c/<4>,1:1]
    LT(1)
    endResync
    LT(-1)
    exitRule b
    terminate
rewind(marker=None)

After an arbitrarily long lookahead, as with a cyclic DFA (or with any backtrack), this informs the debugger that the stream should be rewound to the position associated with marker.

semanticPredicate(result, predicate)

A semantic predicate was evaluated with this result and action text.

terminate()

Parsing is over; successfully or not. Mostly useful for telling remote debugging listeners that it’s time to quit. When the rule invocation level goes to zero at the end of a rule, we are done parsing.

class schrodinger.application.desmond.antlr3.debug.RecordDebugEventListener(adaptor=None)

Bases: schrodinger.application.desmond.antlr3.debug.TraceDebugEventListener

A listener that records events as strings in an array.

__init__(adaptor=None)

Initialize self. See help(type(self)) for accurate signature.

record(event)
LT(i, t)

The tree parser looked ahead. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

PROTOCOL_VERSION = '2'
addChild(root, child)

Make childID a child of rootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.addChild().

becomeRoot(newRoot, oldRoot)

Make a node the new root of an existing root.

Note: the newRootID parameter is possibly different than the TreeAdaptor.becomeRoot() newRoot parameter. In our case, it will always be the result of calling TreeAdaptor.becomeRoot() and not root_n or whatever.

The listener should assume that this event occurs only when the current subrule (or rule) subtree is being reset to newRootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.becomeRoot().

beginBacktrack(level)
beginResync()

Indicates the recognizer is about to consume tokens to resynchronize the parser. Any consume events from here until the recovered event are not part of the parse–they are dead tokens.

commence()

Announce that parsing has begun. Not technically useful except for sending events over a socket. A GUI for example will launch a thread to connect and communicate with a remote parser. The thread will want to notify the GUI when a connection is made. ANTLR parsers trigger this upon entry to the first rule (the ruleLevel is used to figure this out).

consumeHiddenToken(t)

An off-channel input token was consumed. Triggered after the token was matched by things like match() or matchAny() (unless, of course, the hidden token is the first thing in the input stream).

consumeNode(t)

Input for a tree parser is an AST, but we know nothing for sure about a node except its type and text (obtained from the adaptor). This is the analog of the consumeToken method. Again, the ID is usually the hashCode of the node, so it only works if hashCode is not overridden. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

consumeToken(t)

An input token was consumed, matched by any kind of element. Triggered after the token was matched by things like match() or matchAny().

createNode(t, token=None)

Announce a new node built from token elements such as type etc…

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID, type, text are set.

endBacktrack(level, successful)
endResync()

Indicates that the recognizer has finished consuming tokens in order to resynchronize. There may be multiple beginResync/endResync pairs before the recognizer comes out of errorRecovery mode (in which multiple errors are suppressed). This is useful in a GUI where you probably want to grey out tokens that are consumed but not matched to anything in the grammar. Anything between a beginResync/endResync pair was tossed out by the parser.

enterAlt(alt)

Because rules can have lots of alternatives, it is very useful to know which alt you are entering. This is 1..n for n alts.

enterDecision(decisionNumber)

Every decision, fixed k or arbitrary, has an enter/exit event so that a GUI can easily track what LT/consume events are associated with prediction. You will see a single enter/exit subrule but multiple enter/exit decision events, one for each loop iteration.

enterRule(grammarFileName, ruleName)

The parser has just entered a rule. No decision has been made about which alt is predicted. This is fired AFTER init actions have been executed. Attributes are defined and available etc… The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterSubRule(decisionNumber)

Track entry into any (…) subrule or other EBNF construct.

errorNode(t)

Upon syntax error, recognizers bracket the error with an error node if they are building ASTs.

exitDecision(decisionNumber)
exitRule(grammarFileName, ruleName)

This is the last thing executed before leaving a rule. It is executed even if an exception is thrown. This is triggered after error reporting and recovery have occurred (unless the exception is not caught in this rule). This implies an “exitAlt” event. The grammarFileName allows composite grammars to jump around among multiple grammar files.

exitSubRule(decisionNumber)
location(line, pos)

To watch a parser move through the grammar, the parser needs to inform the debugger what line/charPos it is passing in the grammar. For now, this does not know how to switch from one grammar to the other and back for island grammars etc…

This should also allow breakpoints because the debugger can stop the parser whenever it hits this line/pos.

mark(marker)

The parser is going to look arbitrarily far ahead; mark this location. The token stream’s marker is sent in case you need it.

nilNode(t)

A nil was created (even nil nodes have a unique ID… they are not “null” per se). As of 4/28/2006, this seems to be uniquely triggered when starting a new subtree such as when entering a subrule in automatic mode and when building a tree in rewrite mode.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

recognitionException(e)

A recognition exception occurred, such as NoViableAltException. I made this a generic event so that I can alter the exception hierarchy later without having to alter all the debug objects.

Upon error, the stack of enter rule/subrule events must be properly unwound. If no viable alt occurs, it happens within an enter/exit decision, which must also be rewound. Even the rewind for each mark must be unwound. In the Java target this is fairly easy using try/finally, if a bit ugly in the generated code. The rewind is actually generated in DFA.predict(), so no code needs to be generated for that. For languages without this “finally” feature (C++?), the target implementor will have to build an event stack or something similar.

Across a socket for remote debugging, only the RecognitionException data fields are transmitted. The token object or whatever that caused the problem was the last object referenced by LT. The immediately preceding LT event should hold the unexpected Token or char.

Here is a sample event trace for grammar:

    b : C ({;}A|B) // {;} is there to prevent A|B becoming a set
      | D
      ;

The sequence for this rule (with no viable alt in the subrule) for input ‘c c’ (there are 3 tokens) is:

    commence
    LT(1)
    enterRule b
    location 7 1
    enter decision 3
    LT(1)
    exit decision 3
    enterAlt1
    location 7 5
    LT(1)
    consumeToken [c/<4>,1:0]
    location 7 7
    enterSubRule 2
    enter decision 2
    LT(1)
    LT(1)
    recognitionException NoViableAltException 2 1 2
    exit decision 2
    exitSubRule 2
    beginResync
    LT(1)
    consumeToken [c/<4>,1:1]
    LT(1)
    endResync
    LT(-1)
    exitRule b
    terminate
rewind(marker=None)

After an arbitrarily long lookahead, as with a cyclic DFA (or with any backtrack), this informs the debugger that the stream should be rewound to the position associated with marker.

semanticPredicate(result, predicate)

A semantic predicate was evaluated with this result and action text.

setTokenBoundaries(t, tokenStartIndex, tokenStopIndex)

Set the token start/stop token index for a subtree root or node.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

terminate()

Parsing is over; successfully or not. Mostly useful for telling remote debugging listeners that it’s time to quit. When the rule invocation level goes to zero at the end of a rule, we are done parsing.

class schrodinger.application.desmond.antlr3.debug.DebugEventSocketProxy(recognizer, adaptor=None, port=None, debug=None)

Bases: schrodinger.application.desmond.antlr3.debug.DebugEventListener

A proxy debug event listener that forwards events over a socket to a debugger (or any other listener) using a simple text-based protocol, one event per line. ANTLRWorks listens on a server socket with a RemoteDebugEventSocketListener instance. These two objects must therefore be kept in sync: new events must be handled on both sides of the socket.
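
The idea of a line-oriented event protocol with acknowledgements can be sketched as below. This is not the wire format used by this class: the tab separator, the literal `ack` reply, and the helper `format_event` are all assumptions made for illustration. A connected socket pair stands in for the parser and debugger endpoints.

```python
import socket


def format_event(name, *fields):
    # One event per line; tab-separated fields are an assumption here.
    return "\t".join([name] + [str(f) for f in fields]) + "\n"


# A connected socket pair stands in for parser and debugger.
parser_side, debugger_side = socket.socketpair()
parser_side.settimeout(2.0)
debugger_side.settimeout(2.0)

# Parser side: write one event line.
line = format_event("enterRule", "T.g", "b")
parser_side.sendall(line.encode("utf-8"))

# Debugger side: read the event and acknowledge it (hypothetical reply).
received = debugger_side.recv(1024).decode("utf-8")
debugger_side.sendall(b"ack\n")

# Parser side: wait for the acknowledgement before sending more.
reply = parser_side.recv(1024).decode("utf-8")

parser_side.close()
debugger_side.close()
```

Waiting for an acknowledgement after each line keeps the two ends in lockstep, which is what lets a remote debugger pause the parser between events.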

DEFAULT_DEBUGGER_PORT = 49100
__init__(recognizer, adaptor=None, port=None, debug=None)

Initialize self. See help(type(self)) for accurate signature.

log(msg)
handshake()
write(msg)
ack()
transmit(event)
commence()

Announce that parsing has begun. Not technically useful except for sending events over a socket. A GUI for example will launch a thread to connect and communicate with a remote parser. The thread will want to notify the GUI when a connection is made. ANTLR parsers trigger this upon entry to the first rule (the ruleLevel is used to figure this out).

terminate()

Parsing is over; successfully or not. Mostly useful for telling remote debugging listeners that it’s time to quit. When the rule invocation level goes to zero at the end of a rule, we are done parsing.

enterRule(grammarFileName, ruleName)

The parser has just entered a rule. No decision has been made about which alt is predicted. This is fired AFTER init actions have been executed. Attributes are defined and available etc… The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterAlt(alt)

Because rules can have lots of alternatives, it is very useful to know which alt you are entering. This is 1..n for n alts.

exitRule(grammarFileName, ruleName)

This is the last thing executed before leaving a rule. It is executed even if an exception is thrown. This is triggered after error reporting and recovery have occurred (unless the exception is not caught in this rule). This implies an “exitAlt” event. The grammarFileName allows composite grammars to jump around among multiple grammar files.

enterSubRule(decisionNumber)

Track entry into any (…) subrule or other EBNF construct.

exitSubRule(decisionNumber)
enterDecision(decisionNumber)

Every decision, fixed k or arbitrary, has an enter/exit event so that a GUI can easily track what LT/consume events are associated with prediction. You will see a single enter/exit subrule but multiple enter/exit decision events, one for each loop iteration.

exitDecision(decisionNumber)
consumeToken(t)

An input token was consumed, matched by any kind of element. Triggered after the token was matched by things like match() or matchAny().

consumeHiddenToken(t)

An off-channel input token was consumed. Triggered after the token was matched by things like match() or matchAny() (unless, of course, the hidden token is the first thing in the input stream).

LT(i, o)

The tree parser looked ahead. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

LT_token(i, t)
mark(i)

The parser is going to look arbitrarily far ahead; mark this location. The token stream’s marker is sent in case you need it.

rewind(i=None)

After an arbitrarily long lookahead, as with a cyclic DFA (or with any backtrack), this informs the debugger that the stream should be rewound to the position associated with marker.

beginBacktrack(level)
endBacktrack(level, successful)
location(line, pos)

To watch a parser move through the grammar, the parser needs to inform the debugger what line/charPos it is passing in the grammar. For now, this does not know how to switch from one grammar to the other and back for island grammars etc…

This should also allow breakpoints because the debugger can stop the parser whenever it hits this line/pos.

recognitionException(exc)

A recognition exception occurred such as NoViableAltException. I made this a generic event so that I can alter the exception hierachy later without having to alter all the debug objects.

Upon error, the stack of enter rule/subrule must be properly unwound. If no viable alt occurs it is within an enter/exit decision, which also must be rewound. Even the rewind for each mark must be unwount. In the Java target this is pretty easy using try/finally, if a bit ugly in the generated code. The rewind is generated in DFA.predict() actually so no code needs to be generated for that. For languages w/o this “finally” feature (C++?), the target implementor will have to build an event stack or something.

Across a socket for remote debugging, only the RecognitionException data fields are transmitted. The token object or whatever that caused the problem was the last object referenced by LT. The immediately preceding LT event should hold the unexpected Token or char.

Here is a sample event trace for the grammar:

    b : C ({;}A|B) // {;} is there to prevent A|B becoming a set
        D
      ;

The sequence for this rule (with no viable alt in the subrule) for input ‘c c’ (there are 3 tokens) is:

    commence
    LT(1)
    enterRule b
    location 7 1
    enter decision 3
    LT(1)
    exit decision 3
    enterAlt1
    location 7 5
    LT(1)
    consumeToken [c/<4>,1:0]
    location 7 7
    enterSubRule 2
    enter decision 2
    LT(1)
    LT(1)
    recognitionException NoViableAltException 2 1 2
    exit decision 2
    exitSubRule 2
    beginResync
    LT(1)
    consumeToken [c/<4>,1:1]
    LT(1)
    endResync
    LT(-1)
    exitRule b
    terminate

beginResync()

Indicates the recognizer is about to consume tokens to resynchronize the parser. Any consume events from here until the recovered event are not part of the parse–they are dead tokens.

endResync()

Indicates that the recognizer has finished consuming tokens in order to resynchronize. There may be multiple beginResync/endResync pairs before the recognizer comes out of errorRecovery mode (in which multiple errors are suppressed). This is useful in a GUI where you probably want to grey out tokens that are consumed but not matched to anything in the grammar. Anything between a beginResync/endResync pair was tossed out by the parser.
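A minimal sketch of how a listener might separate matched tokens from the dead tokens consumed during resynchronization. The class ResyncTracker is hypothetical, and the string tokens are stand-ins for real token objects; the method names beginResync, endResync, and consumeToken are the only parts taken from this interface.

```python
class ResyncTracker:
    """Hypothetical listener that separates matched tokens from dead ones."""

    def __init__(self):
        self.in_resync = False
        self.matched = []  # tokens consumed during normal parsing
        self.dead = []     # tokens consumed between beginResync/endResync

    def beginResync(self):
        self.in_resync = True

    def endResync(self):
        self.in_resync = False

    def consumeToken(self, t):
        # Tokens consumed while resynchronizing were tossed out by the parser.
        if self.in_resync:
            self.dead.append(t)
        else:
            self.matched.append(t)


# Replay the consume/resync events from a trace for input 'c c'.
resync_tracker = ResyncTracker()
resync_tracker.consumeToken("c@1:0")
resync_tracker.beginResync()
resync_tracker.consumeToken("c@1:1")
resync_tracker.endResync()
```

A GUI could then render everything in dead in grey and everything in matched normally.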

semanticPredicate(result, predicate)

A semantic predicate was evaluated with this result and action text.

consumeNode(t)

Input for a tree parser is an AST, but we know nothing for sure about a node except its type and text (obtained from the adaptor). This is the analog of the consumeToken method. Again, the ID is usually the hashCode of the node, so it only works if hashCode is not implemented. If the type is UP or DOWN, then the ID is not really meaningful as it’s fixed: there is just one UP node and one DOWN navigation node.

LT_tree(i, t)
serializeNode(buf, t)
nilNode(t)

A nil was created (even nil nodes have a unique ID… they are not “null” per se). As of 4/28/2006, this seems to be uniquely triggered when starting a new subtree such as when entering a subrule in automatic mode and when building a tree in rewrite mode.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

errorNode(t)

Upon syntax error, recognizers bracket the error with an error node if they are building ASTs.

createNode(node, token=None)

Announce a new node built from token elements such as type etc…

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID, type, text are set.

becomeRoot(newRoot, oldRoot)

Make a node the new root of an existing root.

Note: the newRootID parameter is possibly different than the TreeAdaptor.becomeRoot() newRoot parameter. In our case, it will always be the result of calling TreeAdaptor.becomeRoot() and not root_n or whatever.

The listener should assume that this event occurs only when the current subrule (or rule) subtree is being reset to newRootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.becomeRoot().

addChild(root, child)

Make childID a child of rootID.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only IDs are set.

See antlr3.tree.TreeAdaptor.addChild().
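Since only IDs are available on the remote side, a listener can rebuild the tree shape from the ID pairs alone. The sketch below is a simplified illustration under several assumptions: the class TreeMirror and its to_text helper are hypothetical, the methods take bare integer IDs rather than node objects, and the special handling of nil roots by the tree adaptor is ignored.

```python
class TreeMirror:
    """Hypothetical listener that rebuilds tree shape from node IDs only."""

    def __init__(self):
        self.children = {}  # node ID -> list of child IDs, in order

    def createNode(self, node_id):
        self.children.setdefault(node_id, [])

    def addChild(self, root_id, child_id):
        self.children.setdefault(child_id, [])
        self.children.setdefault(root_id, []).append(child_id)

    def becomeRoot(self, new_root_id, old_root_id):
        # Simplification: the old root just becomes a child of the new root.
        self.addChild(new_root_id, old_root_id)

    def to_text(self, node_id):
        # Render the subtree as a parenthesized list of IDs.
        kids = self.children.get(node_id, [])
        if not kids:
            return str(node_id)
        return "(%s %s)" % (node_id, " ".join(self.to_text(k) for k in kids))


mirror = TreeMirror()
for node_id in (1, 2, 3):
    mirror.createNode(node_id)
mirror.addChild(1, 2)    # node 2 becomes a child of node 1
mirror.becomeRoot(3, 1)  # node 3 becomes the new root above node 1
tree_text = mirror.to_text(3)
```

Keeping only an ID-to-children map is enough to redraw the tree in a remote debugger, with type and text filled in from the createNode events where they are available.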

setTokenBoundaries(t, tokenStartIndex, tokenStopIndex)

Set the token start/stop token index for a subtree root or node.

If you are receiving this event over a socket via RemoteDebugEventSocketListener then only t.ID is set.

setTreeAdaptor(adaptor)
getTreeAdaptor()
serializeToken(t)
PROTOCOL_VERSION = '2'
escapeNewlines(txt)