Extending the AIR Equation Scoring Engine¶
The AIR Equation Scoring Engine supports a limited subset of the MathML standard. The most likely place that users will want to extend the engine is to implement support for MathML constructs that are not currently supported. This can be accomplished by writing subclasses of airscore.mathmlsympy.base_mathml_element.BaseMathmlElement or airscore.mathmlsympy.mathml_containers.BaseMathmlContainer, and registering the new classes with the parser using the airscore.mathmlsympy.parser.mathml_element() decorator. The interfaces for the relevant classes and methods are described below.
BaseMathmlElement¶
- class airscore.mathmlsympy.base_mathml_element.BaseMathmlElement(tag, attrib={}, **extra)¶
This is the base class for all MathML elements in the XML tree.
- is_implicit_addend¶
bool() - This node may appear as the implicit addend of an integer (as the fractional part of a mixed number).
- is_implicit_multiplicand¶
bool() - When a number or a symbol appears to the left of this node, an implicit multiplication should be performed.
- is_inequality¶
bool() - This is an equal sign or inequality operator. A chained equation can be broken on this node.
- is_number¶
bool() - This node contains a number (digits, decimal points, etc).
- is_non_neg_integer¶
bool() - This node is a non-negative integer. This flag is used to detect when the numerator and denominator of a fraction contain simple numbers, allowing us to use the fraction as part of a mixed number (see is_implicit_addend).
- validate_max_children¶
int() - Maximum number of children permitted for this node.
- validate_min_children¶
int() - Minimum number of children permitted for this node.
- validate_no_text¶
bool() - If True, it is an error for this node to contain text directly. Text may still exist inside of nested nodes.
- validate_required_attributes¶
set of str() - A set containing the names of required attributes. Not namespace aware.
- decoded_text¶
unicode - The text content.
The default implementation decodes a limited dictionary of common unicode values to sympy equivalents.
- get_sympy_text()¶
Get a string representing this node, including children
This is the main method that you will need to override in order to control how this node and its children are rendered in the sympy output.
The default implementation simply returns the decoded_text attribute.
Override to_sympy() instead if you need to control how this node relates to its neighbors.
Returns: unicode - The string representing this node in a sympy expression
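As a rough illustration of overriding get_sympy_text(), the sketch below uses a stand-in base class so that it runs without the airscore package. FakeElement, Sqrt, and the choice to render a square-root element as sqrt(...) are assumptions made for this example; they are not taken from the engine's source.

```python
import xml.etree.ElementTree as ET


class FakeElement(ET.Element):
    """Hypothetical stand-in for BaseMathmlElement (not the real class)."""

    @property
    def decoded_text(self):
        # The real attribute also decodes some unicode values; this stub only strips.
        return (self.text or "").strip()

    def get_sympy_text(self):
        # Mirrors the documented default: return the decoded text.
        return self.decoded_text


class Sqrt(FakeElement):
    """Hypothetical element that wraps the text of its children in sqrt(...)."""

    def get_sympy_text(self):
        inner = "".join(child.get_sympy_text() for child in self)
        return "sqrt(" + inner + ")"


# Build <msqrt><mn>16</mn></msqrt> by hand and render it.
root = Sqrt("msqrt")
number = FakeElement("mn")
number.text = "16"
root.append(number)
rendered = root.get_sympy_text()
```

Only the rendering of the node and its children changes here; how the node combines with its siblings is left to to_sympy().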
- pick_subclass()¶
Change the leopard’s spots to zebra stripes
The xml.etree.ElementTree parsing mechanism forces us to choose the class for new elements when the start tag has been parsed, but before any of the content has been read. There are a few MathML constructs for which we need different classes, but they are represented by the same start tag. Our parser calls this method after the end tag has been processed, in order to give the element a chance to make any changes it needs to make to finalize its class selection.
In most cases, this method does nothing (the default behavior). In a few cases, however, this method will assign a new value to the instance’s __class__ property in order to change the object into an instance of a subclass of its original class.
We are using an admittedly obscure Python “feature,” and I can’t recommend that you make a habit of altering the classes of existing objects. But for this limited purpose it seemed the cleanest solution.
Returns: None
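The class-switching technique described above can be sketched with plain Python classes. FractionNode and SimpleFractionNode are hypothetical names, and the rule used to pick the subclass is an assumption for illustration; the real element classes may decide differently.

```python
class FractionNode:
    """Hypothetical class chosen when the start tag is seen."""

    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def pick_subclass(self):
        # Called once the content is known: switch to the subclass
        # when both parts are plain digit strings.
        if self.numerator.isdigit() and self.denominator.isdigit():
            self.__class__ = SimpleFractionNode

    def describe(self):
        return "general fraction"


class SimpleFractionNode(FractionNode):
    """Hypothetical subclass for fractions made of non-negative integers."""

    def describe(self):
        return "simple fraction"


a = FractionNode("1", "2")
a.pick_subclass()  # becomes a SimpleFractionNode
b = FractionNode("x", "2")
b.pick_subclass()  # stays a FractionNode
```

Because the new class is a subclass of the original one and adds no instance state, the existing object remains valid after the switch.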
- to_sympy(tail=<airscore.mathmlsympy.partial_sympy_object.PartialSympyObject object>)¶
Return oneself as a “partial sympy object”
You will need to override this routine if you are changing the way this node combines with its neighbor nodes. The default implementation concatenates this node with its right-hand neighbor, adding a multiplication operator to the list first if the right-hand neighbor is a suitable implicit multiplicand.
Override get_sympy_text() instead if you need to change how this node and its children are represented in the output, but not how this node is related to its siblings.
Parameters: tail (airscore.mathmlsympy.partial_sympy_object.PartialSympyObject) – The head of a linked list of sympy objects which will become the tail of the newly created object
Returns: airscore.mathmlsympy.partial_sympy_object.PartialSympyObject
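The following is a simplified sketch of the right-to-left list building that to_sympy() is described as performing. Partial, prepend, render, and the is_operand flag are hypothetical names introduced for this example; the real PartialSympyObject carries more state than shown here.

```python
class Partial:
    """Hypothetical stand-in for PartialSympyObject: one link in the list."""

    def __init__(self, text, is_implicit_multiplicand, tail):
        self.text = text
        self.is_implicit_multiplicand = is_implicit_multiplicand
        self.next = tail


def prepend(text, is_operand, is_implicit_multiplicand, tail):
    """Put a node in front of tail, adding '*' for implicit multiplication."""
    if is_operand and tail is not None and tail.is_implicit_multiplicand:
        tail = Partial("*", False, tail)
    return Partial(text, is_implicit_multiplicand, tail)


def render(head):
    """Walk the list from left to right and join the text of each link."""
    parts = []
    while head is not None:
        parts.append(head.text)
        head = head.next
    return "".join(parts)


# Nodes of "2x+1" listed right to left:
# (text, is_operand, is_implicit_multiplicand)
nodes = [("1", True, False), ("+", False, False),
         ("x", True, True), ("2", True, False)]
head = None
for text, is_operand, implicit in nodes:
    head = prepend(text, is_operand, implicit, head)
result = render(head)  # "2*x+1"
```

Each call receives the list built so far as its tail and returns the new head, which is why the nodes are visited from right to left.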
- validate()¶
Validate the node content
This method is called during the processing of the XML end tag, immediately after the call to pick_subclass(). Subclasses should perform any required validation of the newly created MathML object. A validation failure should raise an error (usually ValueError).
The default implementation performs the following steps:
- Validate the number of children against the validate_min_children and validate_max_children properties
- If validate_no_text is True, confirm that the element contains no text.
- Confirm that children (if any) are subclasses of BaseMathmlElement
- Confirm that any attributes listed in validate_required_attributes are present (but perform no validation on the attribute values)
Returns: None
Raises: ValueError
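The default validation steps listed above could look roughly like the sketch below. FakeElement and Frac are stand-ins written for this example, and treating None as "no upper limit" is an assumption; only the attribute names come from the documentation above.

```python
import xml.etree.ElementTree as ET


class FakeElement(ET.Element):
    """Hypothetical stand-in for BaseMathmlElement (not the real class)."""

    validate_min_children = 0
    validate_max_children = None  # assumption: None means "no upper limit"
    validate_no_text = False
    validate_required_attributes = frozenset()

    def validate(self):
        count = len(self)  # number of direct children
        if count < self.validate_min_children:
            raise ValueError("too few children in <%s>" % self.tag)
        if (self.validate_max_children is not None
                and count > self.validate_max_children):
            raise ValueError("too many children in <%s>" % self.tag)
        if self.validate_no_text and (self.text or "").strip():
            raise ValueError("unexpected text in <%s>" % self.tag)
        for child in self:
            if not isinstance(child, FakeElement):
                raise ValueError("unsupported child in <%s>" % self.tag)
        missing = set(self.validate_required_attributes) - set(self.attrib)
        if missing:
            raise ValueError("missing attributes: %s" % sorted(missing))


class Frac(FakeElement):
    """Hypothetical element requiring exactly two children and no text."""

    validate_min_children = 2
    validate_max_children = 2
    validate_no_text = True


good = Frac("mfrac")
good.append(FakeElement("mn"))
good.append(FakeElement("mn"))
good.validate()  # passes silently

bad = Frac("mfrac")
bad.append(FakeElement("mn"))
try:
    bad.validate()
    error = None
except ValueError as exc:
    error = str(exc)
```

A subclass that only needs different limits can set the class attributes, as Frac does, without overriding validate() at all.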
BaseMathmlContainer¶
- class airscore.mathmlsympy.mathml_containers.BaseMathmlContainer(tag, attrib={}, **extra)¶
The base class for all MathML container elements.
Container elements are those that can contain an arbitrary list of other MathML elements. This includes elements like <math> and <mrow>, as well as elements listed in the MathML spec as containing an implicit <mrow> element.
- get_sympy_text_list()¶
Return a list containing text representations of simple equations or inequalities.
This method is the main loop of the parser for most MathML expressions. Most containers will not need to override this method, but it is worth understanding how it works. One oddity worth noting is that the parser parses the expression from right to left, instead of the usual left-to-right.
If the expression is a “chained” equation, containing more than one equals or inequality operator, then this function will return multiple strings in its return list. Each return value will be a simple equation or inequality, i.e., one that contains only one equals or inequality operator.
If the expression is already a simple equation or inequality, then a list containing a single string representing that equation or inequality will be returned.
If the expression contains no equality or inequality operators, then a list containing a single string representing that expression will be returned.
Returns: list() of str()
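To make the three return cases concrete, here is a small sketch that splits an already-rendered string on its relational operators. It does not reproduce the parser's right-to-left node walk. The split_chain function, the operator tokens, and the choice to repeat the shared middle term in both results are assumptions made for this example.

```python
import re

# Assumption: operators have already been rendered as these text tokens.
OPERATOR = re.compile(r"(<=|>=|==|<|>)")


def split_chain(expression):
    """Split a chained relation into simple ones that share middle terms."""
    pieces = [piece.strip() for piece in OPERATOR.split(expression)]
    operands, operators = pieces[0::2], pieces[1::2]
    if not operators:
        # No equality or inequality operator: return the expression itself.
        return [expression.strip()]
    return [
        "%s %s %s" % (operands[i], operators[i], operands[i + 1])
        for i in range(len(operators))
    ]


chained = split_chain("a < b <= c")   # two simple inequalities
simple = split_chain("y == 2*x")      # one equation, returned as-is
plain = split_chain("x + 1")          # no operator at all
```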
mathml_element()¶
- airscore.mathmlsympy.parser.mathml_element(*args)¶
A decorator which registers the decorated class as a mathml element class
This decorator registers the association between a class and an element name for the MathMLBuilder. In order to have the MathMLBuilder use a particular class instead of the default xml.etree.ElementTree.Element class to represent a given XML element, decorate your class definition with this decorator.
There are three permitted calling conventions. You can use the decorator without arguments, like so:
@mathml_element
class bob( BaseMathmlElement ):
    ...
in which case the new class will be used for elements named <bob> in the MathML namespace. This usage is not ideal, as it requires your Python class to have the same name (including case) as the MathML element.
You can avoid this problem by specifying an element name like this:
@mathml_element( 'bob' )
class MathMLBob( BaseMathmlElement ):
    ...
Finally, if you have a class that should be associated with multiple MathML tag names, you can specify all of the names as arguments to the mathml_element() decorator:
@mathml_element( 'bob', 'jim', 'joe' )
class MathMLBob( BaseMathmlElement ):
    ...
In every case, the classes will only be used for elements in the MathML namespace (http://www.w3.org/1998/Math/MathML).
No special effort beyond the use of this decorator is required to register new classes for handling MathML elements. However, you must be certain that the modules containing your classes have been imported before attempting to process the XML data.
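For readers curious how one decorator can accept all three calling conventions, the sketch below shows one possible way to write such a registry. It is not the engine's implementation: register_element and REGISTRY are hypothetical names, and the "{namespace}tag" key format is simply the qualified-name notation that xml.etree.ElementTree uses.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
REGISTRY = {}  # hypothetical: maps "{namespace}tag" to a class


def register_element(*args):
    """Sketch of a decorator usable bare or with one or more tag names."""

    def register(cls, names):
        for name in names:
            REGISTRY["{%s}%s" % (MATHML_NS, name)] = cls
        return cls

    # Bare usage: the decorator receives the class itself.
    if len(args) == 1 and isinstance(args[0], type):
        cls = args[0]
        return register(cls, [cls.__name__])

    # Called with tag names: return the function that does the registering.
    return lambda cls: register(cls, args)


@register_element
class bob:
    pass


@register_element("jim", "joe")
class MathMLJim:
    pass
```

Registration happens as a side effect of executing the class definition, which is why the modules containing the classes must be imported before any XML is parsed.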
PartialSympyObject¶
- class airscore.mathmlsympy.partial_sympy_object.PartialSympyObject(el, tail)¶
An element in a linked list that represents a Sympy expression
Parameters: - el (BaseMathmlElement) – The mathml element from which this object is derived
- tail (PartialSympyObject) – The next item in the list
- next¶
PartialSympyObject - The next rightward neighbor in the list.
- is_closed¶
bool() - Used in balancing absolute value bars. True if the parser has encountered an odd number of absolute value bars to the right of this point.
- is_implicit_mutliplicand¶
bool() - The is_implicit_multiplicand attribute of the airscore.mathmlsympy.base_mathml_element.BaseMathmlElement that generated this object.
- is_implicit_addend¶
bool() - The is_implicit_addend attribute of the airscore.mathmlsympy.base_mathml_element.BaseMathmlElement that generated this object.
- is_number¶
bool() - The is_number attribute of the airscore.mathmlsympy.base_mathml_element.BaseMathmlElement that generated this object.
- text¶
unicode() - The result of the BaseMathmlElement.get_sympy_text() method of the airscore.mathmlsympy.base_mathml_element.BaseMathmlElement object that generated this object.