MathML infrastructure

Module by: Sunil Kumar Singh

Summary: MathML infrastructure consists of Unicode characters, attributes for special rendering effect and structured design to present mathematics in a document.

Note: You are viewing an old version of this document. The latest version is available here.

The MathML infrastructure comprises three important groups of resources:

MathML resource base

  • Elements
  • Unicode character set
  • Attributes

Elements

1: Elements : MathML captures the markup needed to display mathematical content through a layered encapsulation of markup, using three groups of elements (token, layout and table elements).

The underlying philosophy of the markup design is easily understood by looking at how a fraction of the form P/Q is coded. Once the basic form of the fraction is coded, a second layer of coding is taken up for the numerator expression, followed by coding for the denominator expression. Thus, the coding of a fraction of the form P/Q is layered around the core structure of the “mfrac” element. It must be emphasized that layering here refers to the sequence in which the code is written, one piece after another, and not to the display, which is structured in two dimensions. The illustration in the figure displays a portion of the “Rogers-Ramanujan identity”.

Figure 1: Layered coding. Coding is implemented around key layout elements (layereddesign.gif)

We see that the “mfrac” element provides the core structure of numerator and denominator. Each of these may in turn contain elements that require display in two dimensions. As a matter of fact, the numerator expression here involves nested superscripts.
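The layering described above can be sketched with a small fraction. The expression used here, q raised to n squared over 1 - q, is only an illustrative assumption and is not the exact content of the figure. The namespace declaration on the “math” element is added so that the fragment stands on its own.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <!-- Layer 1: mfrac supplies the numerator/denominator structure -->
  <m:mfrac>
    <!-- Layer 2: the numerator, a nested superscript (q to the power n squared) -->
    <m:msup>
      <m:mi>q</m:mi>
      <m:msup>
        <m:mi>n</m:mi>
        <m:mn>2</m:mn>
      </m:msup>
    </m:msup>
    <!-- Layer 3: the denominator, 1 - q, grouped in an mrow -->
    <m:mrow>
      <m:mn>1</m:mn>
      <m:mo>-</m:mo>
      <m:mi>q</m:mi>
    </m:mrow>
  </m:mfrac>
</m:math>
```

Although the result is displayed in two dimensions, the code itself is written top to bottom, one layer after another.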

This type of layered coding design is an immense help in generating code for MathML. In contrast to the two-dimensional display requirements, the coding requirement is essentially linear. The display of the “Rogers-Ramanujan identity” is a case in point. The figure below outlines the coding sequence needed to display the content in MathML. Note that we proceed from left to right, encapsulating two-dimensional aspects, such as subscripting and ratio formation, in MathML constructs that provide two-dimensional layout on display. We do this one after another, in sequence. This means that coding MathML is a linear task.

Figure 2: Coding is implemented in sequence (coding.gif)

Unicode character set

MathML displays data, symbols and text with the help of token elements, which render a defined set of characters. Whatever the rendering mechanism used in MathML, the display output is a character or a series of characters. These characters are defined as Unicode characters: about 900 in number and growing all the time to meet the needs of mathematics in conveying special meaning through its expressions. For this reason, different font variants of a letter are treated as separate symbols, as they may represent different mathematical meanings, concepts or ideas.

There are three ways to encode Unicode characters with token elements specialized in displaying characters:

1: Use the keyboard and type the characters as available on the keyboard.

2: Use the name reference known as an “entity reference”. Write the entity name preceded by “&” and followed by a semicolon “;”. For example, we would encode the entity reference “beta” as “&beta;” in the content of a token element.

3: Use the corresponding Unicode hexadecimal numeric reference. For example, beta has the Unicode code point U+03B2 and is encoded as “&#x003B2;”.

It is evident that keyboard typing is the easiest option. The difficulty is that the keyboard supports only a limited number of the most commonly used characters. We can also type characters by holding the “Alt” key and typing numbers on the numeric keypad on the right side of the keyboard. However, this approach is not very helpful, as the numbers give us no clue about the visual form of the character.

The best approach is to create an “xml” file in the desktop Word program, as instructed earlier in the course. Then open the top-level “Insert” menu of MS Word and choose “Symbol”.

Figure 3: Inserting symbols (symbol1.JPG)

A “Symbol” dialog box appears, as shown below. It is important to select the option “normal text” from the “Font” drop-down list. Keep selecting and inserting the characters that are not available on the keyboard but that we expect to be useful for our application. Save the “xml” file and copy the special characters from it as and when required.

Figure 4: Inserting symbols (symbol2.JPG)

This covers most of the requirement, except for characters that have no visual form, such as “ThickSpace”, “ApplyFunction” and “nbsp”. These are called “invisible” characters. They are few in number, so we can simply keep a note of their entity reference names for use as and when required.

Using an entity reference such as “&beta;” is the second best approach, as the names suggest the visual form of many important characters. The third alternative, a Unicode numeric reference of the form “&#x003B2;”, is not convenient for those who write MathML by hand. Of course, it could be the best approach for software developers, since a numeric representation is probably easier to handle in a program.

A list of entity references along with their display forms is presented in the Appendix. The listing is divided into three modules, (A to H), (I to Q) and (R to Z), to accommodate about 900 Unicode characters. You may visit these pages to get first-hand experience of what these characters look like and how they are named.

The example given here demonstrates character rendering, using each of the encoding techniques.

Example 1: Display using Unicode hexadecimal number


	    
    <m:math display="block">
       <m:mtext>Alpha is displayed as :  </m:mtext>
       <m:mi> α</m:mi>
       <m:mi> &alpha; </m:mi>
       <m:mi> &#x003B1;</m:mi>
    </m:math> 

	    
	  

Save the file after editing as “test.xml”. The display looks like :

Alpha is displayed as : α α α

Some characters in the “operator” category have no visual display. They are called non-marking characters and are used with the “mo” element. These characters are important for the quality of print and for alternative renderings such as audio. They provide specific and consistent spacing around the operands on which they act, thus distinguishing the mathematical operation from ordinary character rendering. The non-marking operators are:

"&InvisibleTimes;", "&InvisibleComma;" and "&ApplyFunction;"

These non-marking characters must be distinguished from space or blank characters, such as "&nbsp;", "&ThinSpace;", "&emsp14;" and "&emsp13;", which are used to format the display and improve readability. Further, the “mspace” token element is handy where we need to manage space beyond what the space characters provide.
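A minimal sketch of how the non-marking operators and “mspace” might appear together is given below. The expression f(x) = 2x is an illustrative assumption. To keep the fragment usable without a DTD that declares the named entities, the operators are written as numeric references, with the corresponding entity names noted in the comments.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- f applied to x: &ApplyFunction; is U+2061 -->
    <m:mi>f</m:mi>
    <m:mo>&#x2061;</m:mo>
    <m:mrow>
      <m:mo>(</m:mo>
      <m:mi>x</m:mi>
      <m:mo>)</m:mo>
    </m:mrow>
    <m:mo>=</m:mo>
    <!-- 2 times x with no visible sign: &InvisibleTimes; is U+2062 -->
    <m:mn>2</m:mn>
    <m:mo>&#x2062;</m:mo>
    <m:mi>x</m:mi>
    <!-- explicit blank space, 1em wide, before the trailing text -->
    <m:mspace width="1em"/>
    <m:mtext>for all x</m:mtext>
  </m:mrow>
</m:math>
```

The invisible operators add no ink to the display, but they tell a renderer (visual or audio) that “f” is applied to “x” and that “2” and “x” are multiplied.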

Managing attributes

MathML elements, including token elements, support an array of attributes. Managing them, however, is greatly simplified because they are organized in a structured manner. There are four attribute classes and one specialized tool for managing the attributes of MathML elements.

Attribute class

  1. Attributes common to all MathML elements
  2. Attributes inherited from the rendering environment
  3. Attributes common to a group of token elements
  4. Attributes specific to an element (additional attributes)

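All MathML elements support the “class”, “style”, “id”, “xlink:href” and “xref” attributes. The first three are provided for use with style sheet mechanisms, while “xref” supports parallel markup and “xlink:href” supports linking. If the renderer does not use a style sheet, the style sheet attributes may simply be ignored.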

The style attributes of MathML elements are inherited from the rendering environment. We can change these inherited values in two ways. The first mechanism is the set of common attributes of token elements. The second is the “mstyle” element, which belongs to the layout presentation elements.

Attributes common to a group of token elements

Rendering by token elements is characterized by the default values of their attributes. Most of these values are inherited from the renderer's environment, such as a particular browser.

The style of an individual element can, however, be set to differ from the defaults through a set of style attributes. To keep matters simple, one group of attributes applies to five of the token elements (“mi”, “mn”, “mo”, “mtext”, “ms”) and to one layout element, “mstyle”.

The common attribute mechanism is a great help in managing and studying style attributes for content display. It lets us concentrate only on the element-specific attributes.

The common style attribute values are :

Attribute values types

  • mathvariant (default = normal for all, italic for “mi”) : normal | bold | italic | bold-italic | double-struck | bold-fraktur | script | bold-script | fraktur | sans-serif | bold-sans-serif | sans-serif-italic | sans-serif-bold-italic | monospace
  • mathsize (default = inherited) : small | normal | big | number v-unit
  • mathcolor (default = inherited) : #rgb | #rrggbb | html-color-name
  • mathbackground (default = inherited) : #rgb | #rrggbb | html-color-name

The example below sets most of these attribute values and is suggestive of the ways attributes are set, using common attributes.

Example 2: Using common attributes


	    
    <m:math display="block">
      <m:mrow> 
        <m:mi mathvariant="bold" mathsize="2em" mathcolor="Blue"> A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="italic" mathsize="10pt" > A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="bold-italic"> A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="double-struck"> A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="bold-fraktur" mathbackground="red">  A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="script"> A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="fraktur" mathsize="1cm"> A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi mathvariant="sans-serif" mathsize=".2in" > A </m:mi> 
      </m:mrow> 
    </m:math> 

	    
	  

Save the file after editing as “test.xml”. The display looks like :

A + A + A + A + A + A + A + A

MathML also still accepts a deprecated way of setting these common style attributes. The deprecated attributes are:

Attribute values types

  • fontsize : number v-unit
  • fontstyle : normal | italic
  • fontweight : normal | bold
  • color : #rgb | #rrggbb | html-color-name
  • fontfamily : string

However, if both mechanisms are used to set the same property, the newer “math”-prefixed attribute (such as “mathcolor”) prevails over the deprecated one. For example, if “mathcolor” is set to “red” and the deprecated “color” attribute is set to “blue” on the same element, the character is displayed in red, not blue.
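The precedence rule can be sketched as follows. This is a reconstruction under the assumptions stated in the text, not the module's original listing. The letter “A” and the namespace declaration are illustrative choices.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <!-- Both mechanisms are set on the same element:
       mathcolor (current) and color (deprecated).
       Per the rule above, mathcolor prevails, so A is displayed in red. -->
  <m:mi mathcolor="red" color="blue"> A </m:mi>
</m:math>
```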

Specialized style setting tool

The “mstyle” element, the only layout element in the common attribute group, is a single-point specialized tool for setting a variety of style attributes on the elements it encloses. It is a kind of one-go attribute setting mechanism, which can set not only the common attributes but also other attributes inherited from the rendering environment. Further, “mstyle” can be used to set style attributes on any MathML element, not just the five token elements listed above. In this sense, it serves as an extremely powerful tool for managing attributes in MathML. Let us consider setting attributes on token elements as shown here.

Example 3: Using the “mstyle” element


	    
    <m:math display="block">
      <m:mstyle mathvariant="bold" color="blue" mathsize="1cm"> 
        <m:mi> A </m:mi> 
        <m:mo> + </m:mo> 
        <m:mi> B </m:mi> 
      </m:mstyle> 
    </m:math> 

	    
	  

Save the file after editing as “test.xml”. The display looks like :

A + B
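Since “mstyle” is not limited to token elements, it may also enclose a layout element. The fragment below is a minimal sketch of that case. The fraction A over B and the chosen attribute values are illustrative assumptions. It also shows that an attribute set directly on an enclosed element takes precedence over the value that element would otherwise inherit from “mstyle”.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <!-- mstyle sets the color and size for everything it encloses -->
  <m:mstyle mathcolor="blue" mathsize="big">
    <m:mfrac>
      <!-- A inherits blue and big from mstyle -->
      <m:mi> A </m:mi>
      <!-- B sets its own mathcolor, which overrides the inherited blue -->
      <m:mi mathcolor="red"> B </m:mi>
    </m:mfrac>
  </m:mstyle>
</m:math>
```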

Attribute values

Much of the coding efficiency in a markup language like MathML depends on assigning appropriate values to attributes. Usually, there is more than one way to assign an attribute's value.

Size attributes may be expressed either with the keywords “small”, “normal” or “big”, or as numerical values with a unit. The unit is chosen according to context. Some units are relative to the normal size: “em”, “ex” and “%”. Others are absolute: “pt” (point; 1 point = 1/72 inch), “px” (pixel), “pc” (pica; 1 pica = 12 points), “in” (inch), “mm” (millimeter) and “cm” (centimeter).

The mathcolor or color attribute can be set in three ways:

1: rgb scheme : A color is considered to be composed of red, green and blue components, which may be mixed in various proportions. Each component is represented by a single hexadecimal digit (0, 1, …, 9, a, b, c, d, e, f). The three digits are preceded by “#” and assigned as: mathcolor="#00f".

2: rrggbb scheme : This is similar to the rgb scheme, except that each component is represented by a two-digit hexadecimal number. Thus, the smallest pair is “00” and the largest pair is “ff”. The six digits are preceded by “#” and assigned as: color="#0000ff".

3: html-color-name : We can assign the mathcolor or color attribute one of the HTML color names: "aqua", "black", "blue", "fuchsia", "gray", "green", "lime", "maroon", "navy", "olive", "purple", "red", "silver", "teal", "white", and "yellow".
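The three ways of writing a color can be compared side by side. In the sketch below, all three values denote the same pure blue, since “#00f” is shorthand for “#0000ff”. The letter “A” and the namespace declaration are illustrative choices.

```xml
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="block">
  <m:mrow>
    <!-- rgb scheme: one hexadecimal digit per component -->
    <m:mi mathcolor="#00f"> A </m:mi>
    <m:mo> + </m:mo>
    <!-- rrggbb scheme: two hexadecimal digits per component -->
    <m:mi mathcolor="#0000ff"> A </m:mi>
    <m:mo> + </m:mo>
    <!-- html-color-name -->
    <m:mi mathcolor="blue"> A </m:mi>
  </m:mrow>
</m:math>
```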
