\chapter{Introduction}

Nearly eight years after the appearance of the World Wide Web, it remains a difficult medium for transmitting mathematics and scientific material, in spite of its success in other areas. Sending mathematics via e-mail, or reading mathematics from a web page into a software package, is not a simple task, and this deprives the scientific community of the powerful communications tool that the Internet represents. Likewise, displaying mathematics on the Internet in a way that allows editing and reuse has until now been impossible. As the Internet continues to grow, it is becoming ever more important to facilitate the exchange of mathematics amongst users and computer algebra software packages, offering automatic processing of expressions, searching, editing and reuse.

To overcome these difficulties, various companies and societies have joined together to produce standards for representing mathematics whilst preserving mathematical meaning. The World Wide Web Consortium\index{World Wide Web Consortium}~\cite{w3c} and the OpenMath\index{OpenMath Society} society~\cite{openmath} have developed the two leading standards currently receiving most attention. These are MathML\index{MathML} \cite{mathml} and OpenMath\index{OpenMath} \cite{openmathspec} respectively. The chief purpose of OpenMath\index{OpenMath} is to facilitate consistent communication of mathematics between mathematical applications. MathML\index{MathML}, however, concentrates on displaying mathematics on the web whilst maintaining its meaning. The two standards are complementary and, used together, can expand our ability to represent, encode and successfully communicate mathematical ideas with one another across the Internet.

The primary aim of this project is to understand the differences and similarities between OpenMath\index{OpenMath} and MathML\index{MathML}, to assess their exchangeability and to develop a way of mapping one standard to the other.
The ultimate objective is to design and implement an interface running on REDUCE\index{REDUCE} which will translate OpenMath\index{OpenMath} into MathML\index{MathML} and vice versa. This interface will provide REDUCE\index{REDUCE} with the capability of exchanging mathematics with other applications, as well as displaying output on the World Wide Web and reading from it, allowing REDUCE to join the MathML/OpenMath trend.

\chapter{Literature Review}

The notation of mathematics has constantly evolved with the appearance of new concepts and ideas. Modern mathematical notation is the result of centuries of refinement. As a result, the sophisticated symbols with which we write mathematics pose certain problems when they are brought onto printed paper. Publishing mathematics is a difficult task simply because mathematics does not lend itself easily to publication.

Recently, the advances in Internet publishing that followed the expansion of the Internet have added a new dimension to mathematical publishing. New problems as well as new requirements must be dealt with. We want the Internet to be not only a medium for displaying mathematics around the world, but also a communications tool for transmitting it. How can we ensure that mathematics published on a web page is reusable? Editable? The output of one application should be displayed on the Internet in a way that humans can understand and other applications can reuse. But because there is a distinction between presenting mathematical objects and transmitting their content, merging both into one notation to achieve this duality is a non-trivial task.

In order to fully understand the motivations of this project, and to appreciate its outcome, it is important to set out the related issues carefully. We will look into the development of mathematical publishing and how it has evolved with the growth of the Internet.
This will permit us to better understand the need for mathematical representation standards such as MathML\index{MathML} and OpenMath\index{OpenMath}, which we shall introduce. Finally we will discuss the relation between these standards, the existing software supporting them, and their future. With such an overview of the current situation, the necessity of a MathML\index{MathML} to OpenMath\index{OpenMath} interface for REDUCE\index{REDUCE} will become clear.

\section{Mathematical Publishing}

Before the foundation of the World Wide Web, encoding of mathematical documents was already a widespread practice. Back in the days when computers were starting to become popular, the ASCII\index{ASCII} character set (and encodings based on it) was the only widely available encoding scheme. The restrictions of such a limited symbol set were soon apparent.

In the late seventies, Donald Knuth developed \TeX\index{\TeX}, from which variants such as \LaTeX\index{\LaTeX} stemmed. Layout and typesetting of mathematics is extremely demanding, and Donald Knuth's \TeX\index{\TeX} has been able to address these difficulties successfully, appealing to the scientific community, which has now made it a standard in scientific publishing. \TeX\index{\TeX} has become the tool of choice for producing scientific and mathematical documents.

Despite its widespread use and the ease with which it is authored, \TeX\index{\TeX} does not preserve mathematical semantic value, making it impractical for use in web documents and useless for transmission between applications. \TeX\index{\TeX} is only concerned with describing the presentation of mathematics, not the content. Because people are interested in transmitting their ideas and research via e-mail or web pages, it is fundamental that semantic value is kept.

While \TeX\index{\TeX} is mainly a UNIX-based application, PC applications dealing with mathematical encoding have also emerged.
Generally these are equipped with a graphical user interface, making them easier to use: Design Science\index{Design Science}'s MS Word Equation Editor, FrameMaker\index{FrameMaker}, WordPerfect\index{WordPerfect} and ScientificWord\index{ScientificWord}, to name a few examples. All these applications\footnote{It is worth noting that PC applications have not had the same success as \TeX\index{\TeX}.} deal only with displaying mathematics and ignore semantic value. They are usually vendor-specific, making them impractical for use in mathematical web publishing.

\section{Mathematics and the Internet Challenge}

\subsection{Html and Mathematics}

In the early 1990s, the World Wide Web Consortium\index{World Wide Web Consortium}'s Html\index{Html} became the standard markup language for publishing on the World Wide Web. It has since evolved and has become an extensible and very powerful means of representing interactive Internet documents. For representing mathematics, however, Html offers little support.

In the first versions of Html\index{Html}, no support for mathematics was included. It was not until 1993 that the first attempt at embedding mathematics within Internet documents was made, in the Html+\index{Html!Html+} draft \cite{htmlp} presented by the World Wide Web Consortium\index{World Wide Web Consortium}. Equations were represented directly as Html+\index{Html!Html+} using an SGML\index{SGML} \cite{sgml} based notation, inspired by \LaTeX's\index{\LaTeX} approach. In 1994, the World Wide Web Consortium\index{World Wide Web Consortium} went further in mathematical Internet publishing by presenting the Html 3.0\index{Html!Html 3.0} draft \cite{html3} (later officially published, with a few modifications, as the Html 3.2\index{Html!Html 3.2} \cite{html3.2} specification), which offered more comprehensive support.
They claimed {\it ``Html math is powerful enough to describe the range of math expressions you can create in common word processing packages, as well as being suitable for rendering to speech.''} Nonetheless, both drafts failed because of a lack of interest from popular browser vendors. But even though the mathematical ideas in the Html 3.2\index{Html!Html 3.2} specification were never fully deployed, people started thinking more carefully about mathematics and how it could be represented on the WWW.

In the meantime, while the World Wide Web Consortium\index{World Wide Web Consortium} and other societies continued working on developing mathematical support for Internet documents, other solutions for transmitting mathematics on the web arose. The lack of a standard approach to representing mathematics uniformly on the Internet pushed mathematicians and scientists to use a variety of different techniques to achieve this purpose. Let us give a brief overview of the main ones.

\subsection{Embedded Graphics}

One way of displaying mathematics on the web is to embed graphics inside Html documents. Mathematical equations are represented by graphical images (e.g.\ GIFs), which all browsers display without difficulty. Formulae can be viewed in their original rendering, without the browser requiring additional fonts or external viewing programs.

Nevertheless, these images have a low resolution, and printing them results in poor-quality documents. There are also problems with alignment and sizing. Because graphical images are generally slow to download, documents might take more time than desired to be rendered. Since we are only dealing with images, the equations are not editable: no modifications can be made to them. For the same reason they are not reusable, because semantic value is completely lost.

This method is widespread but not well liked.
In the Html 3.0\index{Html!Html 3.0} draft, the World Wide Web Consortium\index{World Wide Web Consortium} specifically states its intention of helping users avoid the use of inline images to display equations. Embedded graphics are nevertheless the approach used by programs such as \LaTeX\index{\LaTeX}2Html \cite{latex2html} or \TeX\index{\TeX}4ht \cite{tex4ht}, which convert \LaTeX\index{\LaTeX} and \TeX\index{\TeX} documents to Html\index{Html} format for direct publication on the Internet. \LaTeX\index{\LaTeX} markup is translated into Html, while mathematical equations are converted into graphical images. It is worth noting, however, that there exist programs such as TtM\index{TtM} \cite{TtM} which translate the mathematical sections directly into MathML\index{MathML} presentation markup\index{MathML!presentation markup}.

\subsection{Graphical Page Display}

Another way of approaching the problem is to use graphical page displays. The page is rendered into a page-description language such as PostScript\index{postscript} or PDF\index{PDF}. Internet browsers, aided by an external viewer or plug-in, can then display the page in its entirety, including any mathematical formulae within it.

With this method, documents are displayed with exactly the same layout as the original documents, which could be \TeX\index{\TeX} documents for instance. The printing resolution is also maintained at a high quality level.

But using an external viewer or plug-in requires every reader to have a copy of it. A viewer also requires a large and verbose file format that includes all the non-standard fonts used. Just as with embedded graphics, any mathematics contained within these documents loses its semantic value, along with the possibility of editing or modifying it.
\section{OpenMath\index{OpenMath} and MathML\index{MathML}}

These interim solutions have only highlighted the problem, making evident the need for a consistent, standardized methodology for the transmission of mathematics via the World Wide Web. In view of the failure of existing methods, the importance of MathML and OpenMath\footnote{Describing these standards in detail is not in the scope of this report. We do encourage the reader to have a careful read through both standard specifications \cite{openmath}\cite{mathml} in order to better understand this report and its implications.} has increased. The two standards are complementary, yet serve different purposes.

The primary aim of OpenMath\index{OpenMath} is to facilitate reliable communication of mathematical objects between mathematical applications. It ensures that semantic content is preserved within the notation. The semantic scope of OpenMath\index{OpenMath} is defined by its content dictionaries\index{content dictionaries} (CDs), which describe all the symbols used and define their semantic value. Related symbols and functions are grouped into CD groups. Applications using OpenMath\index{OpenMath} are expected to declare which CD groups they understand.

MathML\index{MathML}, however, is World Wide Web oriented, in that it seeks to display mathematics on web pages. MathML\index{MathML} has two kinds of markup that can be combined: one encoding mathematical notation (presentation markup\index{MathML!presentation markup}) and the other encoding mathematical meaning (content markup\index{content markup}). Together they allow authors to encode both the notation which represents a mathematical object and the mathematical structure of the object itself. Moreover, authors can mix both kinds of encoding in order to specify both the presentation and the content of a mathematical idea.

In fact there are strong links between both recommendations.
The communities developing both standards are closely related, with some members belonging to both groups. This has resulted in each standard superseding the other in some areas.

The {\it core} OpenMath\index{OpenMath} CD group is the principal CD group. The {\it core} CD group was designed based on MathML\index{MathML!MathML 1.0} 1.0, extending the set of symbols covered by MathML\index{MathML!MathML 1.0} 1.0. It is not intended to be very specific, covering only everyday and K-12 (kindergarten to high school level) mathematics, just as MathML\index{MathML} does. For completeness, a MathML\index{MathML} CD group was introduced in the OpenMath\index{OpenMath} standard. It is a subset of the {\it core} CD group and has the same semantic scope as the content elements of MathML\index{MathML}. It is expected that most applications will understand the {\it core} CD group, and will therefore automatically understand the MathML\index{MathML} CD group.

The recently published MathML\index{MathML!MathML 2.0} 2.0 version has incorporated elements of the {\it core} OpenMath\index{OpenMath} CD group which were not previously in MathML\index{MathML!MathML 1.0} 1.0. But in order to keep the scope of content markup\index{content markup} down to a reasonable size, the designers of MathML\index{MathML} have restricted the mathematics that it attempts to cover to high school level mathematics, limiting MathML\index{MathML}'s ability to convey mathematical meaning. Because OpenMath\index{OpenMath} is more powerful in this respect, the designers of MathML\index{MathML} have introduced means allowing for extensibility. It is possible to encode semantic information inside MathML by embedding OpenMath\index{OpenMath} objects within MathML\index{MathML} code. This demonstrates the close ties existing between the World Wide Web Consortium\index{World Wide Web Consortium} and the OpenMath\index{OpenMath Society} society.
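
As a brief illustration of how presentation markup, content markup and an embedded OpenMath object can be combined, the following sketch encodes the expression $x+1$ using the MathML 2.0 \texttt{semantics} and \texttt{annotation-xml} elements. The fragment is our own illustrative example and is not taken from either specification; the choice of expression and the use of the \texttt{arith1} content dictionary for the OpenMath symbol are assumptions made for the sake of the example.

\begin{verbatim}
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <!-- presentation markup: how the expression looks -->
    <mrow>
      <mi>x</mi><mo>+</mo><mn>1</mn>
    </mrow>
    <!-- content markup: what the expression means -->
    <annotation-xml encoding="MathML-Content">
      <apply>
        <plus/>
        <ci>x</ci>
        <cn>1</cn>
      </apply>
    </annotation-xml>
    <!-- the same object as an embedded OpenMath object -->
    <annotation-xml encoding="OpenMath">
      <OMOBJ>
        <OMA>
          <OMS cd="arith1" name="plus"/>
          <OMV name="x"/>
          <OMI>1</OMI>
        </OMA>
      </OMOBJ>
    </annotation-xml>
  </semantics>
</math>
\end{verbatim}

An application that only renders mathematics can use the first child of \texttt{semantics}, whereas an application interested in meaning can read whichever annotation it understands.
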
In the MathML\index{MathML!MathML 2.0} 2.0 specification one can read: {\it ``The MathML\index{MathML} content elements are heavily indebted to the OpenMath\index{OpenMath} project \ldots''}

\section{Current Support}

Both standards have received considerable attention and have mobilized many developers. Support for MathML\footnote{For a comprehensive list of software supporting MathML, see the W3C web site~\cite{w3c}.}\index{MathML} and OpenMath\index{OpenMath} is being introduced in many areas now that their future looks promising.

The dominance of Java\index{Java} on the Internet today has made it a good candidate for offering a solution to the problem of publishing mathematics. The flexibility and power of Java\index{Java} applets can be used in conjunction with MathML or OpenMath to display mathematical formulae. This approach is currently best represented by WebEQ\index{WebEQ} \cite{webeq}. WebEQ\index{WebEQ} is a collection of programs and Java\index{Java} programming libraries dealing with all aspects of putting math on the Web. Because WebEQ\index{WebEQ} is based on MathML\index{MathML}, WebEQ\index{WebEQ} tools can easily be combined with each other and with other MathML\index{MathML} software to accomplish a wide range of tasks. The applet takes a representation of an equation as input and displays it. The representation must be written in a markup language which the applet supports (MathML\index{MathML} or Web\TeX\index{WebTeX}). Another Java\index{Java} application is ICEBrowser \cite{ice}, a browser component written in Java\index{Java} which renders MathML\index{MathML}.

By using a Java\index{Java} applet we encounter the same difficulties as when using embedded graphics. In addition, Java\index{Java} applets have a larger initial download overhead, which can be off-putting to some users. Java\index{Java} applets usually offer good equation displays, but different vendors supply different solutions and markup languages.

Another set of applications currently offering MathML support is plug-ins. The main distinction in principle between plug-ins and Java\index{Java} applets is that plug-ins need to be pre-installed in the Internet browser for any rendering to take place. IBM\index{IBM} Techexplorer\index{TechExplorer} \cite{ibm} is a representative example under development. It currently supports MathML\index{MathML} encodings. IBM\index{IBM}'s approach to the problem is definitely close to the solution the scientific community is hoping to see. Techexplorer can display MathML\index{MathML}, and the quality of display is acceptable. Hopefully, IBM\index{IBM}'s Techexplorer initiative will push other browser vendors and companies to adopt MathML\index{MathML} as the leading standard.

But as with the other temporary solutions, plug-ins also have their limitations. Plug-ins have trouble obtaining the current HTML document font size, changing the size of the window to fit the display, and obtaining the current HTML document background color. Plug-ins such as IBM\index{IBM}'s are not yet widespread, and most people are not familiar with downloading and installing plug-ins.

In the area of computer algebra, many packages should soon have interfaces to both standards. Examples are the MathML\index{MathML} to REDUCE\index{REDUCE} interface available in REDUCE\index{REDUCE} 3.7 and the MathML interface built into Mathematica Version 4.

Various programs convert \LaTeX~documents into MathML. This is important because of the large number of documents written in \LaTeX\index{\LaTeX} until now. An example of a program accomplishing this task is TtM\index{TtM} \cite{TtM}.

Various equation editors such as MathType or Design Science\index{Design Science}'s MS equation editor also support MathML\index{MathML}. They manipulate expressions and offer easy-to-use graphical user interfaces. It is possible to export equations to MathML format.

Until now, however, neither Explorer\index{Explorer} nor Netscape\index{Netscape} has incorporated support for MathML\index{MathML}, although both have committed themselves to doing so in the near future. Because these are the most popular browsers, it is important that they soon provide MathML\index{MathML} facilities in order to boost the use of MathML\index{MathML}.

\newpage
\section{The future}

\begin{quotation} \emph{``While many in the mathematical and scientific community have already adopted \LaTeX~as the standard for writing papers, it appears that MathML\index{MathML} is the future of scientific and mathematical notation on the Web.''} Bob Henshaw, UNC. \end{quotation}

Regardless of how efficient MathML\index{MathML} and OpenMath are at transmitting and displaying mathematics, it is clear that they will only be of any use if all communities adopt them. It is expected, however, that most popular software companies working on the Internet or on computer algebra packages will soon support MathML and OpenMath. It seems that MathML and OpenMath will receive the necessary support, given the commitment that various big companies have already shown (IBM\index{IBM}, Netscape\index{Netscape}, Microsoft\index{Microsoft}, Wolfram\index{Wolfram}, Design Science\index{Design Science}, and many others).

At the moment some browsers have already implemented MathML\index{MathML} rendering facilities (Amaya\index{Amaya} for instance), and other, bigger browser vendors will soon join the trend. Mozilla has recently released its latest browser, which does render MathML. Netscape should follow soon with Navigator 5\index{Netscape!Navigator 5}.

Design Science\index{Design Science} has released a new version of MathType incorporating various tools for dealing with MathML and OpenMath. For those not familiar with Design Science\index{Design Science}, they also make MS Word's equation editor.
Other companies (mainly Stilo) are developing equation editors with MathML and OpenMath facilities which will soon reach the market.

While substantial progress has been made, there are still areas in which more work is required before MathML can be incorporated easily into the Internet. Further improvement in coordination between browsers and embedded elements will be necessary. Furthermore, higher printing resolution must be achieved.

MathML and OpenMath are the first XML\index{XML} based markup languages to appear on the Internet. They will show the power and limitations of XML. An example has been set for other specialist areas which also want to benefit from the Internet: areas such as Chemical Engineering or Music are using XML to develop representation standards.

Both standards have been received enthusiastically, and it will surely not take long before they are widely used by the scientific community.