

\chapter{Introduction}

Nearly eight years after the appearance of the World Wide Web, it is still a difficult medium to use for the transmission
of mathematics and scientific material, in spite of its success in other areas. Sending mathematics via e-mail or reading
mathematics into a software package from a web page is not a simple task, which deprives the scientific community of the
powerful communications tool that is the Internet. Likewise, displaying mathematics on the Internet in a way that allows
editing and reuse has until now been impossible.

As the Internet continues to grow it is becoming ever more important to facilitate the exchange of mathematics amongst
users and computer algebra software packages, offering automatic processing of expressions, searching, editing and reuse.

To overcome these difficulties, various companies and societies have joined together to produce standards for representing mathematics whilst
preserving mathematical meaning. The World Wide Web Consortium\index{World Wide Web Consortium}~\cite{w3c} and the OpenMath
Society\index{OpenMath Society}~\cite{openmath} have developed the two leading standards currently receiving most attention. These are
MathML\index{MathML}~\cite{mathml} and OpenMath\index{OpenMath}~\cite{openmathspec} respectively.

The chief purpose of OpenMath\index{OpenMath} is to facilitate consistent communication of mathematics between
mathematical applications. MathML\index{MathML}, however, concentrates on displaying mathematics on the web whilst
maintaining its meaning. The two standards are complementary and, used together, can expand our
ability to represent, encode and successfully communicate mathematical ideas with one another across the Internet.

The primary aim of this project is to understand the differences and similarities between OpenMath\index{OpenMath} and
MathML\index{MathML}, to assess their exchangeability and develop a way of mapping one standard to the other. The main
objective will be to ultimately design and implement an interface running on REDUCE\index{REDUCE} which will translate
OpenMath\index{OpenMath} into MathML\index{MathML} and vice versa. This interface will provide REDUCE\index{REDUCE} with
the capability of exchanging mathematics with other applications as well as displaying output on the World Wide Web and
reading from it, allowing REDUCE to join the MathML/OpenMath trend.

\chapter{Literature Review}

The notation of mathematics has constantly evolved with the appearance of new concepts and ideas. Modern mathematical
notation is the result of centuries of refinement. As a result, the sophisticated symbols with which we write
mathematics pose certain problems when they are put onto the printed page. Publishing mathematics is a difficult task simply
because mathematics does not lend itself easily to publication.

Recently, the advances in Internet publishing, following the expansion of the Internet, have added a new dimension to
mathematical publishing. New problems as well as new requirements must be dealt with. We want the Internet not only to be
a medium for displaying mathematics around the world, but also a communications tool for transmitting it.

How can we ensure that mathematics published on a web page is reusable? Editable? The output of one application should
be displayed on the Internet in a way humans can understand and other applications can reuse. But because there is a
distinction between presenting mathematical objects and transmitting their content, merging both into one notation to
achieve this duality is a non-trivial task.

In order to fully understand the motivations of this project, and to appreciate its outcome, it is important to
set out the related issues carefully. We will look into the development of mathematical publishing and how it has
evolved with the growth of the Internet. This will permit us to better understand the need for mathematical representation
standards such as MathML\index{MathML} and OpenMath\index{OpenMath}, which we shall introduce. Finally we will discuss
the relation between these standards, the existing software supporting them, and their future.

With such an overview of the current situation, the necessity of a MathML\index{MathML} to OpenMath\index{OpenMath}
interface for REDUCE\index{REDUCE} will become clear.

\section{Mathematical Publishing}

Before the foundation of the World Wide Web, encoding of mathematical documents was already a widespread practice. Back in
the days when computers were starting to become popular, the ASCII\index{ASCII} character set (and encodings based on it)
was the only widely available encoding scheme. The restrictions of such a limited symbol set were soon apparent.

In the late seventies, Donald Knuth developed \TeX\index{\TeX}, from which variants such as \LaTeX\index{\LaTeX} stemmed. Layout and
typesetting of mathematics is extremely demanding, and \TeX\index{\TeX} has addressed these difficulties so
successfully that the scientific community has made it a standard in scientific
publishing. \TeX\index{\TeX} has become the tool of choice for producing scientific and mathematical documents.

Despite its widespread use and the ease with which it is authored, \TeX\index{\TeX} does not preserve mathematical semantic
value, making it impractical for use in web documents and useless for transmission between applications. \TeX\index{\TeX}
is only concerned with describing the presentation of mathematics, not the content. Because people are interested in
transmitting their ideas and research via e-mail or web pages, it is fundamental that semantic value is kept.
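
As a small illustration of this point (the formulae are chosen here for the example and are not taken from any of the cited
documents), the following \TeX{} fragments each describe a single layout that admits more than one mathematical reading. An
application receiving only the markup has no way of telling which reading the author intended.

\begin{verbatim}
$f(x + y)$    % the function f applied to x + y, or the product f * (x + y)?
$f^{-1}(x)$   % the inverse function of f, or the reciprocal 1/f(x)?
\end{verbatim}

A human reader resolves such ambiguities from context; a computer algebra package cannot do so reliably.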

While \TeX\index{\TeX} is mainly a UNIX-based application, PC applications dealing with mathematical encoding have also emerged. Generally these
are equipped with a graphical user interface, making them easier to use: Design Science\index{Design Science}'s MS Word Equation Editor,
FrameMaker\index{FrameMaker}, WordPerfect\index{WordPerfect} and ScientificWord\index{ScientificWord} are a few examples. All these
applications\footnote{It is worth noting that PC applications have not had the same success as \TeX\index{\TeX}.} deal only with displaying
mathematics and ignore semantic value. They are usually vendor-specific, making them impractical for use in mathematical web publishing.

\section{Mathematics and the Internet Challenge}

\subsection{Html and Mathematics}

In the early 1990s, the World Wide Web Consortium\index{World Wide Web Consortium}'s Html\index{Html} became the
standard markup language for publishing on the World Wide Web. It has since evolved and has become an extensible and very
powerful means of representing interactive Internet documents. For representing mathematics, however, Html offers
little support.

The first versions of Html\index{Html} included no support for mathematics. It was not until 1993 that the first
attempt at embedding mathematics within Internet documents was made, in the Html+\index{Html!Html+} draft \cite{htmlp}
presented by the World Wide Web Consortium\index{World Wide Web Consortium}. Equations were represented directly in
Html+\index{Html!Html+} using an SGML\index{SGML} \cite{sgml} based notation, inspired by \LaTeX's\index{\LaTeX} approach.

In 1994, the World Wide Web Consortium\index{World Wide Web Consortium} went further in mathematics Internet publishing by
presenting the Html 3.0\index{Html!Html 3.0} draft \cite{html3} (which was later officially published, with a few modifications, as the Html
3.2\index{Html!Html 3.2} \cite{html3.2} specification), which offered more comprehensive support.
They claimed {\it ``Html math is powerful enough to describe the range of math expressions you can create in common word
processing packages, as well as being suitable for rendering to speech.''}

Nonetheless, both drafts failed because of a lack of interest from popular browser vendors. But even though the mathematical
ideas in the Html 3.2\index{Html!Html 3.2} specification were never fully deployed, people started thinking more carefully
about mathematics and how it could be represented on the WWW.

In the meantime, while the World Wide Web Consortium\index{World Wide Web Consortium} and other societies continued
working on developing mathematical support for Internet documents, other solutions to transmitting mathematics on the web
arose. The lack of a standard approach to uniformly represent mathematics on the Internet pushed mathematicians and
scientists to use a variety of different techniques to achieve this purpose. Let us give a brief overview of the main
ones.
	
\subsection{Embedded Graphics}

One way of displaying mathematics on the web is by the use of embedded graphics inside Html documents. Mathematical
equations are represented by graphical images (e.g. gifs) which all browsers display without difficulties. Formulae can be
viewed in their original rendering, without the browser requiring additional fonts or external viewing programs.

Nevertheless, these images have a low resolution, and printing them results in poor-quality documents. There are also
problems with alignment and sizing. Because graphical images are generally slow to download, documents might take more
time than desired to be rendered. Since we are only dealing with images, the equations are not editable: no modifications
can be made to them. For the same reason they are not reusable, because semantic value is completely lost.

This method is widespread but not well regarded. In the Html 3.0\index{Html!Html 3.0} draft, the World Wide Web
Consortium\index{World Wide Web Consortium} specifically states its intention of helping users avoid the use of inline
images to display equations.

This is the approach used by programs such as \LaTeX\index{\LaTeX}2Html \cite{latex2html} or \TeX\index{\TeX}4ht
\cite{tex4ht}, which convert \LaTeX\index{\LaTeX} and \TeX\index{\TeX} documents to Html\index{Html} format for direct publication on the
Internet. \LaTeX\index{\LaTeX} markup is translated into Html while mathematical equations are converted into graphical
images. It is worth noting, however, that there exist programs such as TtM\index{TtM} \cite{TtM} which translate the
mathematical sections directly into MathML\index{MathML} presentation markup\index{MathML!presentation markup}.

\subsection{Graphical Page Display}

Another way of approaching the problem is by using graphical page displays. The page is rendered into a page-description
language such as PostScript\index{PostScript} or PDF\index{PDF}. Internet browsers, aided by an external viewer or plug-in,
can then display the page in its entirety, including any mathematical formulae within it. When using this method,
documents are displayed with exactly the same layout as the original documents, which could be \TeX\index{\TeX} documents
for instance. The printing resolution is also maintained at a high quality level.

But using an external viewer or plug-in requires every reader to possess a copy. A viewer also requires a verbose and large
file format including all the non-standard fonts used. Just as with embedded graphics, any
mathematics contained within these documents loses its semantic value, as well as the possibility of editing or modifying
it.



\section{OpenMath\index{OpenMath} and MathML\index{MathML}}

These interim solutions have only added to the problem, highlighting the need for a consistent, standardized methodology for the
transmission of mathematics via the World Wide Web. In view of the failure of existing methods, the significance of MathML and OpenMath\footnote{Describing these
standards in detail is beyond the scope of this report. We encourage the reader to read both standard specifications
\cite{openmath}\cite{mathml} carefully in order to better understand this report and its implications.} has increased. The two standards
are complementary yet serve different purposes.

The primary aim of OpenMath\index{OpenMath} is to facilitate reliable communication of mathematical objects between mathematical applications. It
ensures semantic content is preserved within the notation. The semantic scope of OpenMath\index{OpenMath} is defined within its content
dictionaries\index{content dictionaries} (CDs), which describe every symbol used and define its semantic value. Related symbols and functions are
grouped into CD groups. It is expected that applications using OpenMath\index{OpenMath} declare which CD groups they understand.

MathML\index{MathML}, however, is World Wide Web oriented in that it seeks to display mathematics on web pages.
MathML\index{MathML} has two combinable forms of markup, one encoding mathematical notation (presentation
markup\index{MathML!presentation markup}) and the other encoding mathematical meaning (content markup\index{content
markup}). Together they allow authors to encode both the notation which represents a mathematical object and the
mathematical structure of the object itself. Moreover, authors can mix both kinds of encoding in order to specify both the
presentation and content of a mathematical idea.
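
To make the distinction concrete, here is a minimal sketch of the expression $(a+b)^2$ in each kind of markup. The expression is chosen
for illustration only, and the element names are those of the MathML specification \cite{mathml} as we understand it. The
presentation version describes a row of symbols in parentheses carrying a superscript:

\begin{verbatim}
<msup>
  <mfenced>
    <mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>
  </mfenced>
  <mn>2</mn>
</msup>
\end{verbatim}

The content version instead records that the power operator is applied to a sum and to the number 2, saying nothing about layout:

\begin{verbatim}
<apply>
  <power/>
  <apply><plus/><ci>a</ci><ci>b</ci></apply>
  <cn>2</cn>
</apply>
\end{verbatim}

A renderer needs only the first form, whereas a computer algebra package can evaluate or transform only the second.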

In fact there are strong links between the two recommendations. The communities developing the standards are closely
related, with some members belonging to both groups. This has resulted in each standard superseding the other in some
areas.

The {\it core} OpenMath\index{OpenMath} CD group is the principal CD group. It was designed based on
MathML\index{MathML!MathML 1.0} 1.0, extending the set of symbols covered by MathML\index{MathML!MathML 1.0} 1.0. Its
intention is not to be very specific, covering only everyday and K-12 (kindergarten to high school level) mathematics, just
as MathML\index{MathML} does.

For completeness, a MathML\index{MathML} CD group was introduced in the OpenMath\index{OpenMath} standard. It is a subset
of the {\it core} CD group and has the same semantic scope as do the content elements of MathML\index{MathML}. It is
expected that most applications will understand the {\it core} CD group, automatically understanding the
MathML\index{MathML} CD group.

The recently published MathML\index{MathML!MathML 2.0} 2.0 version has incorporated elements of the {\it core}
OpenMath\index{OpenMath} CD group which were not present in MathML\index{MathML!MathML 1.0} 1.0. But in order to keep the
scope of content markup\index{content markup} down to a reasonable size, the designers of MathML\index{MathML} have
restricted the mathematics that it attempts to cover to high school level mathematics, limiting MathML\index{MathML}'s
ability to convey mathematical meaning. Because OpenMath\index{OpenMath} is more powerful in this respect, the designers
of MathML\index{MathML} have introduced means allowing for extensibility. It is possible to encode semantic information
inside MathML by embedding OpenMath\index{OpenMath} objects within MathML\index{MathML} code.
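
The following fragment is a minimal sketch of such an embedding, assuming the \verb|semantics| and \verb|annotation-xml|
elements of MathML 2.0 and the XML encoding of OpenMath. The expression $x^2$ is deliberately trivial, and the content
dictionary name \verb|arith1| is an assumption made for the example; the specifications \cite{mathml} and
\cite{openmathspec} should be consulted for the exact names.

\begin{verbatim}
<semantics>
  <!-- MathML content markup for x^2 -->
  <apply><power/><ci>x</ci><cn>2</cn></apply>
  <!-- The same object as an OpenMath annotation -->
  <annotation-xml encoding="OpenMath">
    <OMOBJ>
      <OMA>
        <OMS cd="arith1" name="power"/>
        <OMV name="x"/>
        <OMI>2</OMI>
      </OMA>
    </OMOBJ>
  </annotation-xml>
</semantics>
\end{verbatim}

In practice the annotation is most useful when the OpenMath object uses a symbol for which MathML has no content element, so
that the meaning travels with the expression even though MathML itself cannot express it.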

This demonstrates the close ties existing between the World Wide Web Consortium\index{World Wide Web Consortium} and
the OpenMath Society\index{OpenMath Society}. In the MathML\index{MathML!MathML 2.0} 2.0 specification one can read: {\it
``The MathML\index{MathML} content elements are heavily indebted to the OpenMath\index{OpenMath} project \ldots''}

\section{Current Support}
   
Both standards have received considerable attention and have mobilized many developers. Support for MathML\footnote{For a comprehensive list of software supporting MathML, see the W3C web site~\cite{w3c}.}\index{MathML}
and OpenMath\index{OpenMath} is being introduced in many areas now that both standards appear to have a future.

The dominance of Java\index{Java} on the Internet today has made it a good candidate for offering a solution to the
problem of publishing mathematics. The flexibility and power of Java\index{Java} applets can be used in conjunction with
MathML or OpenMath to display mathematical formulae.

This approach is currently best represented by WebEQ\index{WebEQ} \cite{webeq}. WebEQ\index{WebEQ} is a collection of programs and Java\index{Java}
programming libraries dealing with all aspects of putting math on the Web. Because WebEQ\index{WebEQ} is based on MathML\index{MathML},
WebEQ\index{WebEQ} tools can easily be combined with each other and with other MathML\index{MathML} software to accomplish a wide range of tasks.
The applet takes a representation of an equation as input and displays it. The representation has to be in a markup language which the applet
supports (MathML\index{MathML} or Web\TeX\index{WebTeX}). Another Java\index{Java} application is ICEBrowser \cite{ice}, a browser component
written in Java\index{Java} which renders MathML\index{MathML}.

By using a Java\index{Java} applet we encounter the same difficulties as when using embedded graphics. In addition to
this, Java\index{Java} applets have a larger initial download overhead, which can be disturbing to some users.
Java\index{Java} applets usually offer good equation displays, but different vendors supply different solutions and markup
languages.

Another set of applications currently offering MathML support are plug-ins. The main distinction in principle between
using plug-ins and Java\index{Java} applets is that plug-ins need to be pre-installed in the Internet browser for any
rendering to take place. IBM\index{IBM} Techexplorer\index{TechExplorer} \cite{ibm} is a representative example under
development. It currently supports MathML\index{MathML} encodings. IBM\index{IBM}'s approach to the problem is definitely
close to the solution the scientific community is hoping to see. Techexplorer can display MathML\index{MathML} and the
quality of display is acceptable. Hopefully, IBM\index{IBM}'s Techexplorer initiative will push other browser vendors and
companies to adopt MathML\index{MathML} as the leading standard.

But as with the other temporary solutions, plug-ins also have their limitations.
Plug-ins have trouble getting the current Html document font size, changing the size of the window to fit the display, or
getting the current Html document background color. Plug-ins such as IBM\index{IBM}'s are not
yet widespread, and most people are not familiar with plug-in download and installation.

In the area of computer algebra, many packages should soon have interfaces to both standards. Examples
are the MathML\index{MathML} to REDUCE\index{REDUCE} interface available in REDUCE\index{REDUCE} 3.7 and the MathML
interface built into Mathematica Version 4.

Various programs convert \LaTeX~documents into MathML. This is important because of the large number of documents written
in \LaTeX\index{\LaTeX} until now. An example of a program accomplishing this task is TtM\index{TtM} \cite{TtM}.

Various equation editors such as MathType or Design Science\index{Design Science}'s MS equation editor also support
MathML\index{MathML}. They manipulate expressions and offer easy to use graphical user interfaces. It is possible to
export equations to MathML format.

Until now, however, neither Explorer\index{Explorer} nor Netscape\index{Netscape} has incorporated support for
MathML\index{MathML}, although both have committed themselves to doing so in the near future. Because these are the most
popular browsers, it is important that they soon provide MathML\index{MathML} facilities in order to boost the use of
MathML\index{MathML}.

\newpage

\section{The future}

\begin{quotation}

\emph{``While many in the mathematical and scientific community have already adopted \LaTeX~as the standard for writing
papers, it appears that MathML\index{MathML} is the future of scientific and mathematical notation on the Web.''} Bob
Henshaw, UNC.

\end{quotation}

Regardless of how efficient MathML\index{MathML} and OpenMath are in transmitting and displaying mathematics, it is clear
that they will only be of any use if all communities adopt them. It is expected, however, that most popular software companies
working on the Internet or on computer algebra packages will soon support MathML and OpenMath. It seems that MathML and
OpenMath will receive the necessary support, given the commitment that various big companies have already shown
(IBM\index{IBM}, Netscape\index{Netscape}, Microsoft\index{Microsoft}, Wolfram\index{Wolfram}, Design Science\index{Design
Science}, and many others).

At the moment some browsers have already implemented MathML\index{MathML} rendering facilities (Amaya\index{Amaya} for
instance), and other, larger browser vendors will soon join the trend. Mozilla has recently released its latest browser,
which does render MathML. Netscape should follow soon with Navigator~5\index{Netscape!Navigator 5}. Design
Science\index{Design Science} has released a new version of MathType incorporating various tools for dealing with MathML and OpenMath.
For those not familiar with Design Science\index{Design Science}, they also make MS Word's equation editor. Other
companies (mainly Stilo) are developing equation editors with MathML and OpenMath facilities which will soon reach the
market.

While substantial progress has been made, there are still areas in which more work is required before MathML can be
incorporated easily into the Internet. Further improvement in coordination between browsers and embedded elements will be
necessary. Furthermore, higher printing resolution must be achieved.

MathML and OpenMath are the first XML\index{XML} based markup languages to appear on the Internet. They will show the power and limitations of XML.
An example has been set for other specialist areas which also want to benefit from the Internet; areas such as Chemical Engineering or Music are
using XML to develop representation standards. Both standards have been received enthusiastically and it will surely not take long before they are
used widely by the scientific community.



