Summary: MathML infrastructure consists of elements that affect the structure of the display, Unicode characters that provide a wide spectrum of character representation, and attributes that provide special rendering effects.
The MathML infrastructure comprises three important resource segments: elements, Unicode characters and attributes.
MathML captures the markup requirements of mathematical display through layered encapsulation, using three groups of elements (token, layout and table elements).
The underlying philosophy of the markup design is easily understood from the way a fraction is implemented.
Figure: Layered coding
We see that the “mfrac” element provides the core structure for the numerator and the denominator. Either the numerator or the denominator, or both, may in turn contain elements that require display in two dimensions. As a matter of fact, the numerator expression in the illustration above consists of nested superscripts.
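As a minimal sketch of this layering, the markup below encodes a hypothetical fraction whose numerator carries a nested superscript (x raised to y squared) over the denominator z + 1. The expression is chosen only for illustration and is not the one in the figure. The namespace declaration on the root element is an addition so that the fragment can stand alone.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <!-- Outer layer: mfrac takes two children, numerator first, then denominator -->
  <m:mfrac>
    <!-- Numerator: a superscript whose exponent is itself a superscript -->
    <m:msup>
      <m:mi>x</m:mi>
      <m:msup>
        <m:mi>y</m:mi>
        <m:mn>2</m:mn>
      </m:msup>
    </m:msup>
    <!-- Denominator: mrow groups several tokens into a single child -->
    <m:mrow>
      <m:mi>z</m:mi>
      <m:mo>+</m:mo>
      <m:mn>1</m:mn>
    </m:mrow>
  </m:mfrac>
</m:math>
```

Each two-dimensional construct is simply wrapped by the element that lays it out, so the nesting depth of the markup mirrors the nesting of the displayed expression.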
This layered coding design is an immense help in generating MathML code. In contrast to the two-dimensional display requirement, the coding requirement is essentially linear. The display of the “Rogers-Ramanujan identity” is a case in point. The figure below outlines the coding sequence. Note that we write the code from left to right, encapsulating two-dimensional aspects such as subscripting and ratio formation with MathML constructors along the way, one after another in sequence.
Figure: Sequential coding
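The left-to-right sequence can be sketched on a much smaller expression than the identity itself. The fragment below is an illustrative assumption, not the coding of the figure: it writes a subscripted x, then a plus sign, then the ratio of a to b, in the order in which the expression is read.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- First term: x with subscript 1 -->
    <m:msub>
      <m:mi>x</m:mi>
      <m:mn>1</m:mn>
    </m:msub>
    <!-- Next in sequence: the operator -->
    <m:mo>+</m:mo>
    <!-- Last term: the ratio a over b -->
    <m:mfrac>
      <m:mi>a</m:mi>
      <m:mi>b</m:mi>
    </m:mfrac>
  </m:mrow>
</m:math>
```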
MathML displays data, symbols and text with the help of token elements, which are capable of rendering a certain set of characters. Whatever the rendering mechanism, the display output is a character or a series of characters. These characters are drawn from the Unicode character set: about 900 in number, and growing all the time to meet the need of mathematics to convey special meaning in its expressions. Different font variants of a letter are treated as separate symbols, as they may represent different mathematical meanings, concepts or ideas.
There are three ways to encode Unicode characters within the token elements that display characters:
1: Use the keyboard and type the characters that are available on it.
2: Use the name reference known as an “entity reference”. Write the entity name preceded by "&" and followed by a semicolon ";". For example, the entity name “beta” is encoded as "&beta;" to represent “β”.
3: Use the corresponding Unicode hexadecimal numeric reference. For example, beta has the Unicode code point U+003B2 and is encoded as "&#x003B2;" to represent “β”.
It is evident that keyboard typing is the easiest option. The difficulty is that the keyboard supports only a limited number of characters. We can type further characters by holding the “Alt” key and typing numbers on the numeric keypad on the right side of the keyboard. However, this approach is not very helpful, as the number gives no clue to the visual form of the corresponding character.
The best approach for encoding Unicode characters is to create an XML file with the extension "xml" in the desktop Word program, as instructed earlier in the course. Then select the top-level “Insert” menu of MS Word and choose “Symbol”.
Figure: Inserting symbols
A “Symbol” dialog box appears, as shown below. It is important to select the option “normal text” from the “Font” drop-down list. Keep selecting and inserting the characters that are not available on the keyboard but that we foresee being useful for our application. Save the "xml" file under some meaningful name like "unicode characters", and copy the special characters from this file as and when required.
Figure: Inserting symbols
This technique serves most requirements, except for characters that have no visual form, such as “ThickSpace”, “ApplyFunction” and “nbsp”. These are called “invisible” characters. They are few in number, so we can simply keep a note of their entity reference names for use as and when required.
Using an entity reference like "&beta;" is the second-best approach, as the names suggest the visual forms of many important characters. The third alternative, a Unicode numeric reference of the form "&#x003B2;", is not desirable for those who code MathML by hand. Of course, it could be the best approach for software developers, since a numeric representation is probably easier to handle in a program.
A list of entity references with their display forms is presented in the Appendix. The listing is divided into three modules, (A to H), (I to Q) and (R to Z), to accommodate about 900 Unicode characters. You may visit these pages for first-hand experience of what these characters look like and how they are named.
The example given here demonstrates character rendering, using each of the encoding techniques discussed above.
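<m:math display="block">
  <m:mtext>Alpha is displayed as : </m:mtext>
  <!-- 1: character typed or pasted directly -->
  <m:mi> α </m:mi>
  <!-- 2: entity reference -->
  <m:mi> &alpha; </m:mi>
  <!-- 3: hexadecimal numeric reference -->
  <m:mi> &#x003B1; </m:mi>
</m:math>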
Save the file after editing as “test.xml”. The display looks like :
Besides the space rendering characters, some characters in the “operator” category have no visual display. They are called non-marking characters and are used with the “mo” element. These characters are important for the quality of print and for alternative renderings such as audio. They provide specific and consistent spacing around the operands on which they act, thus distinguishing the mathematical operation from normal character rendering. The non-marking operators are :
"&InvisibleTimes;", "&InvisibleComma;" and "&ApplyFunction;"
These non-marking characters must be distinguished from the space or blank rendering characters, which are used to format the display for better readability, like
"&emsp;", "&ensp;", "&emsp14;", "&nbsp;"
etc. In addition to these space rendering characters, the “mspace” token element is handy wherever we need to insert and manage space.
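The difference between non-marking operators and explicit space can be sketched as follows. The expression is a hypothetical one chosen for illustration. Numeric references are used in place of the entity names (U+2061 for function application, U+2062 for invisible multiplication) on the assumption that no DTD defining the names is available to the parser; with the course setup, the named forms should serve equally well.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <m:mi>f</m:mi>
    <!-- U+2061: function application, nothing is drawn -->
    <m:mo>&#x2061;</m:mo>
    <m:mrow>
      <m:mo>(</m:mo>
      <m:mi>x</m:mi>
      <m:mo>)</m:mo>
    </m:mrow>
    <m:mo>=</m:mo>
    <m:mn>2</m:mn>
    <!-- U+2062: invisible multiplication between 2 and x -->
    <m:mo>&#x2062;</m:mo>
    <m:mi>x</m:mi>
    <!-- mspace inserts an explicit blank, here one em wide -->
    <m:mspace width="1em"/>
    <m:mtext>for all x</m:mtext>
  </m:mrow>
</m:math>
```

The two “mo” elements carry mathematical meaning without producing a glyph, whereas “mspace” carries no meaning and only adjusts the layout.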
MathML elements, including token elements, support an array of attributes. Managing them is greatly simplified by their structured organization: there are four attribute classes and one specialized tool for managing the attributes of MathML elements. It must be emphasized, however, that these are not exclusive groups; they merely group attributes that share certain implementation features.
All MathML elements support the “class”, “style”, “id”, “xlink:href” and “xref” attributes in order to work with a style sheet mechanism. If the renderer does not use a style sheet, these attributes may simply be ignored.
The style attributes of MathML elements are inherited from the rendering environment. We can change these inherited values in two ways. The first mechanism is the common attribute design for token elements. The second mechanism is the “mstyle” element, which belongs to the layout presentation category.
Rendering by token elements is characterized by the default values of their attributes. Most of these values are inherited from the environment of the renderer, such as a particular browser.
The style of an individual element can, however, be set to differ from the defaults via a set of style attributes. To keep matters simple, one group of attributes applies to the five token elements capable of rendering characters (“mi”, “mn”, “mo”, “mtext”, “ms”) and to one layout element, “mstyle”.
The common attribute mechanism is a great help in managing and studying style attributes for content display. It lets us concentrate on the attributes specific to each element, while the common attributes are managed under the same names across all six elements.
The common style attributes are “mathvariant”, “mathsize”, “mathcolor” and “mathbackground”.
The example below sets most of these attribute values and suggests the ways in which the common attributes are used.
<m:math display="block">
  <m:mrow>
    <m:mi mathvariant="bold" mathsize="2em" mathcolor="blue"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="italic" mathsize="10pt"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="bold-italic"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="double-struck"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="bold-fraktur" mathbackground="red"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="script"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="fraktur" mathsize="1cm"> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi mathvariant="sans-serif" mathsize=".2in"> A </m:mi>
  </m:mrow>
</m:math>
Save the file after editing as “test.xml”. The display looks like :
MathML also supports a deprecated mechanism for setting common attributes. The deprecated attributes are “fontsize”, “fontweight”, “fontstyle”, “fontfamily” and “color”.
However, if both mechanisms are used to set a particular property, the new math style attribute prevails over the deprecated one.
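A small sketch may make the precedence rule concrete. It assumes a MathML 2.0 renderer that still honours the deprecated “fontweight” attribute; the letter and the attribute values are chosen only for illustration.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- Deprecated mechanism only: expected to render bold -->
    <m:mi fontweight="bold">a</m:mi>
    <m:mo>+</m:mo>
    <!-- New mechanism only: expected to render bold -->
    <m:mi mathvariant="bold">a</m:mi>
    <m:mo>+</m:mo>
    <!-- Both set and in conflict: mathvariant should prevail, so not bold -->
    <m:mi fontweight="bold" mathvariant="normal">a</m:mi>
  </m:mrow>
</m:math>
```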
The only layout element in the common attribute group, “mstyle”, is a single-point specialized tool for setting a variety of style attributes on the elements it encloses. It is a one-go attribute setting mechanism that can set not only the common attributes but also additional attributes inherited from the rendering environment. Further, “mstyle” can set style attributes on any MathML element, not just the five token elements that implement the common token attributes. In this sense, it is an extremely powerful tool for managing style attributes in MathML. Let us consider setting attributes on token elements as shown here.
<m:math display="block">
  <m:mstyle mathvariant="bold" color="blue" mathsize="1cm">
    <m:mi> A </m:mi>
    <m:mo> + </m:mo>
    <m:mi> B </m:mi>
  </m:mstyle>
</m:math>
Save the file after editing as “test.xml”. The display looks like :
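Because values set on “mstyle” are inherited by the enclosed elements, an attribute set directly on an inner element takes precedence for that element alone. The fragment below is a sketch of that interaction with made-up content: the outer “mstyle” sets color and size, one token overrides the color, and a nested “mstyle” adds a variant for part of the expression.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mstyle mathcolor="blue" mathsize="big">
    <m:mrow>
      <!-- Inherits blue and big from the enclosing mstyle -->
      <m:mi>A</m:mi>
      <m:mo>+</m:mo>
      <!-- A local attribute overrides the inherited color for this token only -->
      <m:mi mathcolor="red">B</m:mi>
      <m:mo>+</m:mo>
      <!-- A nested mstyle adds bold while still inheriting blue and big -->
      <m:mstyle mathvariant="bold">
        <m:mi>C</m:mi>
      </m:mstyle>
    </m:mrow>
  </m:mstyle>
</m:math>
```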
Much of the coding efficiency in a markup language like MathML depends on assigning appropriate values to attributes. Usually, there is more than one way to assign an attribute’s value.
Size attributes may be expressed either as one of the predefined keywords “small”, “normal” or “big”, or as numerical values. A variety of units is available for numerical values, and the choice depends on context. Some units are relative to the normal size: “em”, “ex” or “%”. Others are absolute: “pt” (point; 1 point = 1/72 inch), “px” (pixel), “pc” (pica; 1 pica = 12 points), “in” (inch), “mm” (millimeter) and “cm” (centimeter).
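The alternatives can be compared side by side on the “mathsize” attribute. The values below are arbitrary illustrations; note that 18pt and 0.25in denote the same absolute size, since 1 point = 1/72 inch.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- Predefined keywords -->
    <m:mi mathsize="small">A</m:mi>
    <m:mi mathsize="big">A</m:mi>
    <!-- Relative units: scaled with respect to the normal size -->
    <m:mi mathsize="150%">A</m:mi>
    <m:mi mathsize="1.5em">A</m:mi>
    <!-- Absolute units: 18pt is the same size as 0.25in -->
    <m:mi mathsize="18pt">A</m:mi>
    <m:mi mathsize="0.25in">A</m:mi>
  </m:mrow>
</m:math>
```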
In some cases, a plain number can be assigned to an attribute. The number indicates the size relative to the default implementation. The "mfrac" element, for example, implements the "linethickness" attribute. A value of "2" indicates that the fraction bar is drawn twice as thick as the default thickness determined by the particular renderer.
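As a brief sketch, the two fractions below differ only in the "linethickness" attribute; the operands are placeholders chosen for illustration.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- Default bar thickness, as chosen by the renderer -->
    <m:mfrac>
      <m:mi>a</m:mi>
      <m:mi>b</m:mi>
    </m:mfrac>
    <m:mo>+</m:mo>
    <!-- Bar drawn at twice the default thickness -->
    <m:mfrac linethickness="2">
      <m:mi>a</m:mi>
      <m:mi>b</m:mi>
    </m:mfrac>
  </m:mrow>
</m:math>
```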
In some cases, as with the "mo" element, we can assign names of predetermined spaces to space controlling attributes like "lspace" and "rspace". These names are called "named spaces". The named space used in MathML may be one of "veryverythinmathspace", "verythinmathspace", "thinmathspace", "mediummathspace", "thickmathspace", "verythickmathspace", or "veryverythickmathspace".
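The effect of named spaces can be sketched by rendering the same operator with the narrowest and the widest names on both sides. The operands are placeholders, and the actual widths depend on the renderer.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <m:mi>a</m:mi>
    <!-- Narrowest named space on both sides of the operator -->
    <m:mo lspace="veryverythinmathspace" rspace="veryverythinmathspace">+</m:mo>
    <m:mi>b</m:mi>
    <!-- Widest named space on both sides of the operator -->
    <m:mo lspace="veryverythickmathspace" rspace="veryverythickmathspace">+</m:mo>
    <m:mi>c</m:mi>
  </m:mrow>
</m:math>
```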
The “mathcolor” or “color” attribute is set in one of three possible ways :
1: rgb scheme : A color is considered to be composed of red, green and blue components, which may be mixed in various proportions. Each component is represented by a single hexadecimal digit (0,1,…,9,a,b,c,d,e,f). The three digits are preceded by “#” and assigned as : mathcolor="#00f".
2: rrggbb scheme : This is similar to the rgb scheme, except that each component is represented by a two-digit hexadecimal number. Thus, the smallest pair is “00” and the largest pair is “ff”. The six digits are preceded by “#” and assigned as : color="#0000ff".
3: html-color-name : We can assign the mathcolor or color attribute one of the HTML color names : "aqua", "black", "blue", "fuchsia", "gray", "green", "lime", "maroon", "navy", "olive", "purple", "red", "silver", "teal", "white", and "yellow".
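The three schemes can be placed side by side. In the sketch below, all three tokens are expected to render in the same pure blue, since "#00f" is shorthand for "#0000ff", which is also the value of the HTML name "blue". The letter itself is a placeholder.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- rgb scheme: one hexadecimal digit per component -->
    <m:mi mathcolor="#00f">A</m:mi>
    <!-- rrggbb scheme: two hexadecimal digits per component -->
    <m:mi mathcolor="#0000ff">A</m:mi>
    <!-- html-color-name scheme -->
    <m:mi mathcolor="blue">A</m:mi>
  </m:mrow>
</m:math>
```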
MathML provides control at the individual levels of layout structure elements like the “mtable” element. This control requires setting attributes on individual rows and columns. In such cases, the attribute is given a sequence of values, which are read in order and applied to consecutive rows or columns as the case may be.
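A minimal sketch of such sequenced values, using the “columnalign” and “rowalign” attributes of “mtable” with made-up cell contents: the first value applies to the first column (or row), the second value to the second, and so on.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <!-- columnalign: 1st column left, 2nd column center, 3rd column right -->
  <!-- rowalign: 1st row top, 2nd row bottom -->
  <m:mtable columnalign="left center right" rowalign="top bottom">
    <m:mtr>
      <m:mtd><m:mn>1</m:mn></m:mtd>
      <m:mtd><m:mn>20</m:mn></m:mtd>
      <m:mtd><m:mn>300</m:mn></m:mtd>
    </m:mtr>
    <m:mtr>
      <m:mtd><m:mn>4000</m:mn></m:mtd>
      <m:mtd><m:mn>5</m:mn></m:mtd>
      <m:mtd><m:mn>60</m:mn></m:mtd>
    </m:mtr>
  </m:mtable>
</m:math>
```

The cell contents differ in width on purpose, so that the left, center and right alignment of the three columns is visible in the rendered table.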