Proceedings of the 14th International
Conference on Information and Intelligent Systems (IIS 2003 Proceedings),
24-26 September 2003, Varaždin, Croatia; ISBN 953-6071-22-3; Published
by Faculty of
Information System Transformation after Implementing XML Technology
dr. sc. Vilko Žiljak
University of Zagreb, Faculty of Graphics Art, Zagreb
Abstract: Information scientists have become uneasy about defining the development steps of new technologies. They have realized that changes in operating techniques and technology are inevitable. Businesses need to work together on multimedia implementation, e-business and digital publishing, because this appears to be the only way to preserve the culture of informatization. Databases used to be organized for internal servicing only, without extending into Internet-style communication. The time has come to take a wider view of the whole cycle, covering distributed search and independence from computer networks, computer platforms, programming experience, and databases and data sources. In this network era it is necessary to create a new type of workplace for more successful production, trade and marketing. This requires a new type of education and training for work in the computer center of the future, as well as study of the general trends in the development of new communication technologies.
Keywords: iis, XML, WebPoskok
1. AN INTRODUCTION TO WEB, HTML AND XML RELATIONS
XML (eXtensible Markup Language) has offered independence in networking the information system inside and outside the computer center. XML technology is independent of the type or category of computer, including general-purpose computers; independent of the operating system and programming base inside the computer; and independent of the databases and data banks it is linked to. XML and its languages are becoming generally known, and intensive short courses are being held so that XML can be applied in the shortest possible time. Because knowledge is increasing exponentially, XML can be implemented, and existing systems reprogrammed in its family of languages, within a short period. “New” programmers, “new” computer experts and “new” managers are being formed. It was not new computers, new databases or new proposals for arranging computer center systems that changed the direction information science is taking, but the creation of a new technology named XML: a communication language that unites all previous computer technology and integrates Web systems so that they are fully connected to the Internet.
XML is a turning point in the history of information systems, altering them more than any other event. The Web, in number and effect, in quality and quantity, is spreading at a rate that doubles its size every 13 months. This has been going on for 23 years (www.nw.com “Internet domain survey” www.isc.org/ds/host-count-history.html). Let us assume that this exponential trend will continue, so that we can give in more easily to the pleasures the Web brings. Even software standards are not immune to radical change, and we are witnessing how HTML no longer fits further development. XML was created out of the wish to make the Internet the backbone of data and information exchange. XML is the common denominator for information exchange between different databases, with the goal of raising compatibility and communication to the level of the Internet. XML is becoming the format for communication between databases, “the language of machines”, services, archives, technical and administrative production plans, and commercial and investment systems. When placing data on the Web we aim to use forms that can be taken over not only by us but by anyone, in a simple, acceptable and understandable manner. We aim to improve the paths information takes on the Web, structuring enormous quantities of content so that use can be made of them. From the technological point of view, XML is an independent concept. The spectacular arrival of the Internet and HTML opened new fields for applying computer science. Fast spreading, simplicity of implementation and unreserved acceptance all evoked new desires and motivated computer users, programmers and information scientists to create new values in the digital world. In the process, inadequacies were observed in implementing new ideas for global information science applications.
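The growth claim above can be checked with a few lines of arithmetic. The sketch below assumes an idealized, perfectly steady doubling every 13 months over the full 23 years, which is a simplification of the host-count survey the text cites; the function name is illustrative only.

```python
# Assumption: growth is modelled as a clean doubling every 13 months.
DOUBLING_PERIOD_MONTHS = 13
YEARS = 23


def growth_factor(years, doubling_months):
    """Return the multiplication factor after `years` of steady doubling."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


doublings = YEARS * 12 / DOUBLING_PERIOD_MONTHS
factor = growth_factor(YEARS, DOUBLING_PERIOD_MONTHS)
print(f"{doublings:.1f} doublings, growth factor of about {factor:,.0f}")
```

Under that assumption the period contains a little over 21 doublings, that is, growth by a factor of a few million.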
The remedies were defined. First, it was necessary to separate the data from the design, so that the data could be transformed and used for other, completely different needs, at a different time and in a different place. Second, it was necessary to extend linking beyond a connection from one position to just one other position, and to make multiple links to heterogeneous sources possible, so that data could be taken from various databases simultaneously to create a new document. A third requirement was set: a single document should contain not only data from the commercial database but also live information from a wide variety of digital devices. This was the reason for creating the new Web information logic contained in XML. Through XML, data and data structure have been separated from their physical presentation and from the design of edited reports. Secondly, XML provides multiple linking of data files and databases, bringing various sources into one single document. Thirdly, XML is used for collecting, linking and using data gathered from instruments, machines and other units that can be connected to a digital network, used interactively, and integrated with production control data. XML allows Web content to spread to all digital devices and offers more complex Web services at every point of the network. The Web of the future is the XML Web.
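The idea of drawing on several sources to build one presentation-free document can be sketched as follows. This is a minimal illustration using Python's standard `xml.etree.ElementTree`; the order rows, the press readings and all element names are hypothetical stand-ins for a commercial database and a networked instrument, not part of WebPoskok.

```python
import xml.etree.ElementTree as ET

# Hypothetical inputs: in practice these would come from a database query
# and from an instrument connected to the network.
order_rows = [{"id": "1001", "customer": "Example Press", "copies": "5000"}]
press_readings = [{"order": "1001", "sheets_printed": "3200"}]


def build_report(orders, readings):
    """Combine commercial rows and live readings into one XML document."""
    report = ET.Element("report")
    for row in orders:
        order = ET.SubElement(report, "order", id=row["id"])
        ET.SubElement(order, "customer").text = row["customer"]
        ET.SubElement(order, "copies").text = row["copies"]
        # Attach every instrument reading that belongs to this order.
        for reading in readings:
            if reading["order"] == row["id"]:
                live = ET.SubElement(order, "sheetsPrinted", source="press")
                live.text = reading["sheets_printed"]
    return report


report = build_report(order_rows, press_readings)
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
```

The resulting document carries only data and structure; nothing in it says how the report should look on a screen or on paper.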
The same XML entry, stored only once, may be displayed in many different ways: rearranged in tables on the screen, differently on the printer, on a mobile phone, on video, as a book layout or through some other medium. The difference from HTML is that HTML is not modular and can be displayed only in the way it was programmed. XML, by contrast, is independent: the same entry can be used on any communication device (the path to multimedia is open). Because an XML entry contains no built-in display instructions, many companion languages have been developed. Among the first is XSL, which builds an HTML file containing the XML data together with the commands for display design. XML has thus become the main technical platform for the exchange of information. In the period just before XML appeared, many new methods, techniques, theories and forms of data manipulation were developed around the world, but almost all of them were quickly abandoned. Many vendors now boast of having already programmed direct translations for transforming existing installations and recent development concepts into an XML system. Such announcements are aimed at keeping existing customers, covering investments, and creating plenty of work for employees in this new setting. The authors of many good development models admit the superiority of the XML Web environment. No vanity can compete with the advantages of XML. Scientists no longer have time to keep developing information science technologies that do not include XML.
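The principle of one stored entry and several presentations can be illustrated with a short sketch. Plain Python functions stand in here for XSL stylesheets, since the Python standard library has no XSLT processor; the sample book entry and the function names are assumptions made for the example.

```python
import html
import xml.etree.ElementTree as ET

# A single stored entry with no display instructions (sample data).
ENTRY = "<book><title>XML in Print</title><author>A. Author</author><year>2003</year></book>"


def to_html_row(entry):
    """Render the entry as one HTML table row (screen presentation)."""
    cells = "".join(f"<td>{html.escape(child.text or '')}</td>" for child in entry)
    return f"<tr>{cells}</tr>"


def to_plain_text(entry):
    """Render the same entry as labelled text lines (e.g. for a simple device)."""
    return "\n".join(f"{child.tag}: {child.text}" for child in entry)


entry = ET.fromstring(ENTRY)
html_row = to_html_row(entry)
text_view = to_plain_text(entry)
print(html_row)
print(text_view)
```

Adding a further output medium means adding another rendering function (or, in an XSL setting, another stylesheet) while the stored entry stays untouched.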
2. WEBPOSKOK, THE PROJECT FOR IMPROVING DIGITAL CULTURE IN COMPUTER SCIENCE
This paper describes project work on WebPoskok (Web Programsko Orijentirani Sustav Komponenata za Otvorenu Komunikaciju; Web Program Oriented Component System for Open Communication). It is a Croatian project for developing the XML-based program components needed to provide Internet communication for existing relational databases (Informix, DB2, Microsoft SQL Server, Oracle), communication between those databases, and communication with Web applications and Web services. WebPoskok is being developed for industrial and computer applications. It uses contemporary XML technologies that have been established as industry standards: XQuery, XPath, XML Schema, XSL/XSLT/XSL-FO, JDF, SOAP, WSDL and Web Services. The WebPoskok project will enable old and new databases to coexist and be interconnected with optimal investment in hardware and software. WebPoskok aims at the following goal: information from databases should be available anywhere, at any time. WebPoskok, the project and system first described and published in this paper, organizes work groups that share an interest in developing a new information system:
2.1. The Group for Training, Development and Study
WebPoskok is developing methods for teaching XML principles on a large scale through multiple examples and pilot solutions for real projects. Unless XML technology is studied, and students create their own XML languages, as early as the first semester at schools and university departments whose names include “information”, “computers” or “graphics”, there will not be enough time to cover material with a contemporary orientation in information science. XML should be built into all subjects. The following topics must not be skipped: DTD (Document Type Definition), DOM (Document Object Model), JDF (Job Definition Format), XSLT, PPF (Print Production Format), PJTF (Portable Job Ticket Format), CIP4 (International Cooperation for the Integration of Processes in Prepress, Press and Postpress), PDF, JMF (Job Messaging Format), PPML (Personalized Print Markup Language).
2.2. Group for the Conversion and Linking of Various Data Bases
XML technology makes it possible to create formats for various media from a single source and from different sources. XML is a “code”, a way of realizing a publication in various forms: a printed book in various sizes and designs, a CD-ROM book, processed book excerpts, a statistical analysis of the book. A book designed in XML may be redesigned for another type of publication. The XML book is open to future forms of the book, for example a compilation of book contents on a common theme. Electronic XML entries of daily newspapers will finally open the door to archive research by content, title, author and time. It will be necessary to include “marking” when preparing electronic XML entries. People in the newspaper business, as well as those working in other fields, are developing their own XML dictionaries and XML patterns. The goal of this work group is to work out procedures for linking various sources in order to create one document. Its tasks include:
- Data analysis in production and preparation for the XML entry technique
- Data base structure organization
- Form and linking pattern organization
- Conversion of existing databases: Informix to XML, MS SQL to XML, Access to XML, Excel to XML, program packages to XML
- XML interfaces to the databases: Informix, Oracle, MS SQL, Access, Excel
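A database-to-XML conversion of the kind listed above might look like the following sketch. It uses an in-memory SQLite database purely as a stand-in for Informix, MS SQL or Access, because `sqlite3` ships with Python; the `jobs` table, its columns and the `table_to_xml` helper are hypothetical and do not describe the actual WebPoskok converters.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Export every row of `table` as a <row> element with one child per column."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for values in cursor:
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, values):
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return root


# Sample data standing in for an existing production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER, title TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, ?)", [(1, "Brochure"), (2, "Poster")])

jobs_xml = table_to_xml(conn, "jobs")
print(ET.tostring(jobs_xml, encoding="unicode"))
```

The column names become element names here, which only works when they are valid XML names; a real converter would need a mapping step for columns that are not.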
WebPoskok is developing its own set of tags for describing and interpreting data. New languages are being created for the XML family according to application, branch of business, specialization, research method and data decomposition. Translators and converters are being developed to interpret the relationships in existing databases and records in various formats. Computer science is holding on firmly to relational databases and fighting for their survival, but Web applications need input and output paths, and for this the organization of information centers has to change.
2.3. Work Group for Developing XSL Display Families and XML Instrumentation Patterns
To control the paths that data take, patterns are created. A pattern is in fact a set of rules defining the scope and contents of the extracted data and how the document containing the selected information may look. This group is working on developing XSL and XSLT patterns for different output devices: the monitor, PostScript, the Web, CD and WAP.
HTML has an excellent solution for linking one Web site to another, and this is the basis of the Web's success. Linking can take different technological and graphic forms, but only two parties are involved in the task: the source and the target Web site. XML offers a freer mechanism aimed at multi-point linking and complex integration. XML can create data temporarily, generating a database by linking to various sources and many different databases.
XML can communicate with external data and with various applications; data can be read from instruments. The display of existing quantities may be modified, and the display can shift from the originally designed reports to a different medium. Because XML is not a language but a system for defining languages, there is no universal DTD. Each application can define its own DTD. For “instrumental integration” to succeed, XML experts must cooperate with process control experts who also understand XML technology. This problem is the main reason for organizing a training center with an emphasis on developing one's own XML pattern languages. Data transactions to date have used countless record formats, encodings and command systems. XML, by contrast, brings compatibility to this area. All entries are textual, so people can read and recognize them in a more natural way. Because the form is textual, XML may be used to integrate various types of data: text, pictures, video and sound. Different branches of business are linked through the same data sources. Each user draws out only the data he needs and is authorized to have. Besides horizontal integration, vertical integration is of major significance: each branch of business forms its own rules for making tags and builds up its own vocabulary in an extremely simplified form. Such a dictionary is the basis for writing XML patterns within a given branch of business, and because the textual basis is understandable it is possible to spread horizontally towards other types of business, beginning, for example, with management and finance. In practical terms, if an insurance company's administration needs to be linked with non-insurance organizations, for instance with hospital operations, XML technology is what makes such a link possible.
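The statement that each user draws out only the data he needs and is authorized to have can be sketched with the hospital and insurance example. The patient record, the role names and the `ALLOWED` rule table below are all hypothetical; the sketch only shows the filtering idea, not a real access-control design.

```python
import xml.etree.ElementTree as ET

# Sample shared record (hypothetical vocabulary).
RECORD = """<patient id="p1">
  <name>Jane Doe</name>
  <diagnosis>fracture</diagnosis>
  <invoice currency="EUR">250.00</invoice>
</patient>"""

# Hypothetical access rules: which child elements each role may read.
ALLOWED = {
    "hospital": {"name", "diagnosis", "invoice"},
    "insurer": {"name", "invoice"},
}


def view_for(role, record_xml):
    """Return a copy of the record containing only the elements `role` may see."""
    source = ET.fromstring(record_xml)
    view = ET.Element(source.tag, source.attrib)
    for child in source:
        # Unknown roles get an empty permission set and therefore see nothing.
        if child.tag in ALLOWED.get(role, set()):
            view.append(child)
    return view


insurer_view = view_for("insurer", RECORD)
print([child.tag for child in insurer_view])
```

Both organizations work from the same textual source, while the rule table decides which part of it each of them receives.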
2.4. The Group of XML Dictionaries for XML Pattern Development in Various Branches of Business
Each branch of business develops its own style of presentation and its own group of languages, specific terms and supporting vocabulary. For instance, JDF (Job Definition Format) with MIS is being developed in the printing business. Geography is developing GML (Geography Markup Language) alongside RDF and the standard DTD. On the other hand, all branches of business use common independent languages such as SVG (Scalable Vector Graphics). An XML language starts with defining the vocabulary and the cookbook for a certain branch of business. Neither the vocabulary nor the cookbook is anywhere near fully defined in JDF, the XML printing pattern that has been under development for the past five years. The data files on processes, materials, methods, equipment and other chapters of the printing business classification have not yet been completed. To date approximately eighty processes have been defined, such as imposition, RIPing and approval, along with around two hundred logical and physical actions. Creating a printing business language on the basis of an approach organized in this way leads to standardization. Optimization requires defining the boundaries within which we operate and adapting the data to the conditions under which they move between systems. A vocabulary is being formed as a common basis for the development of JDF, of a workflow language, and of their integration.
3. MARKUP LANGUAGES AND GRAPHICS ART
What circumstances led to the creation of XML technology? I believe it was not initiated by the computer business lobbies. They had a settled line of business built on relational databases. Only in the past three years have the owners of Informix, Oracle and others attached XML to their input and output paths, and they can be strongly criticized for how they did it. Their solutions are closer to HTML logic than to the principle XML stands for so strongly, namely the separation of data display from the data source. XML technology posed a threat through the logic of native XML databases, which could develop on a large scale and become the primary reason for restructuring computer technology. It is better to define XML data as “contents or structured text”. Textual markup makes it possible for every user to recognize what type of data is in question. Markup languages were known in industry before XML was defined in its present form. Some groups within branches of business formed consortia to standardize data transfer, data modification and search. The best example is the general integration of the printing industry, where competing machine manufacturers, users and publishers have joined the CIP4 consortium: www.cip4.org (CIP4 is an international organization following the CIP3 idea from 1995; Integration in Prepress, Press and Postpress). Their language today is XML, and their communication vocabulary is what would today be called an XML pattern, all of it created before XML appeared.
The most pleasing surprise for me was the phototypesetting languages developed in the mid-seventies (www.fotosoft.hr computer style book). Those languages are real markup languages on two levels. The basic level is identical to the logic of HTML, created twenty years later: commands and contents (text) are placed together. In the mid-eighties, screens and images were programmed in phototypesetting. The second level consisted of sub-programs with several functions, like the sub-programs of general-purpose languages. The use of sub-programs was, however, interesting from the organizational point of view. The text was simply “marked” at the beginning and the end of a certain activity, for example the title, emphasis, sub-title, graphical shifts and many other typographical interventions. Some newspaper publishers developed up to a thousand sub-programs so that their reporters could mark the contents with the help of copying systems, without being forced to do typographical and layout work as well. Graphic editors could then format texts marked in this way in any manner they wished. The work we do today with XML data and with XML as a language is almost identical. However, this is not mentioned today. In general, today's literature exaggerates the significance of SGML (Standard Generalized Markup Language) as a link in the creation of XML technology. The misfortune of the typographical languages was that each machine manufacturer (Linotype, Monotype, Bobst, Hell) had its own languages and its own expensive dedicated machines. They failed in the mid-eighties, when PCs were introduced into graphic preparation and many new graphics languages were created. All of them died out except PostScript, which was published without restrictions. An army of PostScript programmers emerged who programmed typography and reproduction photography for phototypesetting using general-purpose computers.
New programming languages are expected to emerge in the XML family, along with a new generation of XML Web languages. Languages are needed that will simplify the exchange of, and search for, targeted useful information, with an emphasis on personalization and individuality, so that the Web becomes easier to use for creating new knowledge.
With the use of the Internet in business, XML is becoming a global skeleton for electronic data exchange, as well as for storing, searching and acquiring corporate information. In the future, relational databases will remain and will play an important role as central data banks, extending towards XML and coexisting with native XML database technology. XML technology may, for instance, integrate a native XML database with relational database technology. For the present the following recommendation may be followed: when representing hierarchically structured data, native XML should be preferred. There will be problems with data in relational format because good-quality software tools for conversion are still lacking. WebPoskok is solely XML-oriented and developed for the Web. XML is built into the browser and accessible on our own computers. XML is becoming the general tool for tool description, but many tools needed for wider application are still missing. WebPoskok intends to promote and offer new tools in the XML environment so that they become widely used in XML implementation as soon as possible. WebPoskok respects all the characteristics of XML technology, and its goal is to take part in the internationalization of media and in independent electronic publishing. When creating XML entries, one does not need to be concerned with how they will later be applied. WebPoskok groups are studying data exchange between different applications and operating systems, and data exchange between various databases. One fact must be accepted: due to its simplicity, XML will be adopted rapidly, and this will significantly alter existing information systems. The area of extensive XML pattern programming remains open. WebPoskok's most intensive development is in creating patterns for each data manipulation action.
The patterns are Web filters. On the one hand, they store information in the existing databases or create permanent or temporary XML forms of it. On the other hand, data is drawn from various sources through special-purpose patterns, producing a unique temporary document that serves as a database for display technologies, most often for dynamic reports.
www.isc.org/ds/host-count-history.html
www.xml.com/pub/a/2001/11/21/svgtools.html
www.svgfoundation.org/siggraph.html