Koren Antun, Cigula Tomislav, Wendling Ernest.

Abstract :

The development of computer equipment, and particularly of networked computing systems, has been growing exponentially in recent years, but there are still working and research areas in which the performance of such systems is inadequate. In multimedia applications that handle large files (complex pictures, video, etc.), or in other time-consuming processes, distributed computing systems, multiprocessing, and cluster computing are used. The purpose of this paper is to point out the main advantages of distributed processing and its usage in multimedia applications.

Key words: multimedia, computing systems, multiprocessing

1 Introduction

Distributed computing can be defined in many different ways, but one type has received a lot of attention lately: an environment in which the idle CPU cycles and storage space of tens, hundreds, or even thousands of networked systems are connected to work together on a potentially processing-intensive problem. Increasing desktop CPU power and communications bandwidth have also helped to make distributed computing a more practical idea. The number of real applications is still limited, and the challenges, particularly standardization, are still significant.

2 Types of distributed computing

There are two similar trends: distributed computing and grid computing. Grid computing got its name from an ideal scenario in which the CPU cycles and storage of millions of systems across a worldwide network function as a flexible, readily accessible pool that can be used by anyone who needs it, similar to the way power companies and their users share the electrical grid.

Grid computing can utilize desktop PCs, but more often its focus is on more powerful workstations, servers, and even mainframes and supercomputers working on problems involving huge datasets that can run for days.

Large-scale distributed computing usually refers to a similar concept, but is more geared to pooling the resources of hundreds or thousands of networked end-user PCs, which individually are more limited in their memory and processing power, and whose primary purpose is not distributed computing, but rather serving their user. There are various levels and types of distributed computing architectures, and both Grid and distributed computing can be limited to CPUs among a group of users, a department, several departments inside a corporate firewall, or a few trusted partners across the firewall.

2.1 Operational Basics

In most cases today, a distributed computing architecture consists of software agents installed on a number of client systems, and one or more distributed computing management servers. There may also be requesting clients with software that allows them to submit jobs along with lists of their required resources.

An agent running on a processing client detects when the system is idle, notifies the management server that the system is available for processing, and usually requests an application package. The client then receives an application package from the server and runs the software when it has spare CPU cycles, and sends the results back to the server. The application may run as a screen saver, or simply in the background, without impacting normal use of the computer. If the user of the client system needs to run his own applications at any time, control is immediately returned, and processing of the distributed application package ends.
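The agent cycle described above can be sketched in Python. This is a minimal illustration only: the package format, the server interaction (here just a print), and the computation itself are hypothetical placeholders, not part of any real agent software.

```python
def fetch_package(server):
    """Ask the management server for a work unit (placeholder data)."""
    return {"task_id": 1, "data": list(range(8))}

def send_results(server, task_id, result):
    """Report the finished work unit back to the server (here: just print)."""
    print(f"task {task_id}: result {result} sent to {server}")

def agent_loop(server, user_active=lambda: False):
    """One agent cycle: request work, compute while idle, report back."""
    package = fetch_package(server)
    result = 0
    for item in package["data"]:
        if user_active():      # the user has reclaimed the machine:
            return None        # abandon processing immediately
        result += item * item  # stand-in for the real computation
    send_results(server, package["task_id"], result)
    return result
```

The `user_active` check models the behaviour described above: as soon as the owner needs the machine, control is returned and the distributed work unit is dropped.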

3 Applications view

The most obvious advantage of this type of architecture is the ability to provide access to supercomputer-level processing power, or better, for a fraction of the cost of a typical supercomputer. According to (1), the most powerful computer, IBM's ASCI White, is rated at 12 TeraFLOPS and costs $110 million, while one of the most famous distributed applications, SETI@home, currently achieves about 15 TeraFLOPS and has cost about $500,000 so far. Further savings come from the fact that distributed computing does not require the expensive electrical power, environmental controls, and extra infrastructure that a supercomputer requires. And while supercomputing applications are written in specialized languages like mpC, distributed applications can be written in C, C++, etc.

Scalability is also a great advantage of distributed computing. Though they provide massive processing power, supercomputers are typically not very scalable once they are installed. A distributed computing installation, by contrast, can be scaled simply by adding more systems to the environment. Another advantage of distributed computing is more efficient use of existing system resources. Estimates by various analysts have indicated that up to 90 percent of CPU cycles go unused. This can be addressed by allocating applications that require processing power to a grid of client machines or servers that can spare some idle time.

Not all applications are suitable for distributed computing. The closer an application gets to running in real time, the less appropriate it is. Generally the most appropriate applications, according to (2), consist of “loosely coupled, non-sequential tasks in batch processes with a high compute-to-data ratio.”

4 Distributed computing applications examples

Besides the very popular SETI@home application, there are many other types of application tasks that can take advantage of distributed computing.

• A query search against a huge database that can be split across lots of desktops, with the submitted query running concurrently against each fragment on each desktop.

• Complex modeling and simulation techniques that increase the accuracy of results by increasing the number of random trials would also be appropriate, as trials could be run concurrently on many desktops, and combined to achieve greater statistical significance (this is a common method used in various types of financial risk analysis).

• Exhaustive search techniques that require searching through a huge number of results to find solutions to a problem also make sense. Drug screening is a prime example.

• Complex financial modeling, weather forecasting, and geophysical exploration are on the radar screens of these vendors, as well as car crash and other complex simulations.
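The concurrent-random-trials pattern from the list above can be illustrated with a minimal Python sketch. Here the "desktops" are simulated sequentially as independent batches, and the trial (estimating pi from random points in a unit square) is an arbitrary stand-in; in a real deployment each batch would run on a different machine and only the partial counts would be sent back and combined.

```python
import random

def run_batch(n_trials, seed):
    """One worker's batch: count random points that fall inside a unit circle."""
    rng = random.Random(seed)  # per-worker seed keeps batches independent
    return sum(rng.random()**2 + rng.random()**2 <= 1.0 for _ in range(n_trials))

def estimate_pi(total_trials=100_000, workers=4):
    """Split the trials across workers, then combine the partial counts."""
    per_worker = total_trials // workers
    # in a real deployment each call would execute on a different desktop;
    # here the batches simply run one after another
    hits = [run_batch(per_worker, seed) for seed in range(workers)]
    return 4 * sum(hits) / (per_worker * workers)
```

Because the trials are independent, adding more workers increases statistical accuracy without any coordination between them - exactly the "loosely coupled, high compute-to-data ratio" property cited below.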

The need for computing power in the graphic industry is on the rise, especially in prepress and graphic design, where originals need to be prepared and improved before printing to get the best possible results in the end. Rendering and picture processing are very demanding for computers, and multimedia, as one part of the graphic industry that grows every day, needs even more computing power to meet its demands. This area is definitely a place where this technology can be used.

5 Distributed computing in multimedia graphical applications

Figure 1: Distributed computing example in multimedia technology

Current trends indicate that 3D computer graphics requirements are increasing at an extremely fast pace. In order to keep up with these demands, graphics systems need to be high performance as well as scalable. Many projects are focused on designing high-performance rendering systems by clustering a number of low-cost PCs, each of which contains a PC graphics accelerator. Because such systems are comprised purely of commodity parts, it is easy to keep pace with technological progress, since the components can be easily upgraded. The key factors that set these systems apart from traditional parallel rendering systems are their loosely coupled nature and the high cost of communication between the nodes. As seen in Figure 1, the client computers do not need to do large jobs; they just send the job to the servers, where each server renders only a part of the whole image and the servers do all the hard work. The result is the same as if it were produced on one computer, but in this example it is at least four times faster.
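The image-splitting step shown in Figure 1 can be sketched as follows. The horizontal-band layout, the frame size, and the server count are illustrative assumptions; actual rendering and network transport are omitted.

```python
def split_into_tiles(width, height, servers):
    """Divide the frame into horizontal bands, one per render server."""
    band = height // servers
    tiles = []
    for i in range(servers):
        top = i * band
        # the last band absorbs any remainder so the full frame is covered
        bottom = height if i == servers - 1 else top + band
        tiles.append({"server": i, "region": (0, top, width, bottom)})
    return tiles

def merge(tiles):
    """Reassemble the rendered bands in top-to-bottom order (regions only)."""
    return [t["region"] for t in sorted(tiles, key=lambda t: t["region"][1])]

# four servers each render a 1024x192 band of a 1024x768 frame concurrently
tiles = split_into_tiles(1024, 768, 4)
```

Since the bands are rendered independently, the client only distributes regions and reassembles results, which matches the loosely coupled, communication-sensitive design described above.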

5.1 Cluster System at Faculty of Graphic Arts

Hypermedia is an interactive environment that includes text, color, voice, sound, graphics, and video, and allows user interactivity in the information retrieval process. Educational institutions are moving towards a situation in which all students will become capable of reading (that is, using) hypermedia. Students are learning to retrieve information stored on CD-ROMs, in hypermedia computer files, in computerized databases, and on the Internet's World Wide Web. Another goal is to enable students to write (create) hypermedia documents. In total, facilitating students in developing basic skills in reading and writing hypermedia - interactivity, sound, color, still photography, computer-based drawing and painting, and video - adds new dimensions to communication. With the limited resources of educational institutions, it is hard to achieve those goals. One obvious solution is to use distributed and grid-based computer systems.

A project of this nature took place at the Faculty of Graphic Arts in Zagreb. One of the goals of that project was to build a cluster within the multimedia computer classroom. This classroom contains 12 personal computers with 1 GHz CPUs that are used for educational purposes, laboratory exercises, and presentations. The utilization under this scheme is very low - the computers are working approximately 10% of the time - so to avoid that, a multifunctional computer classroom was developed. It is used for educational purposes on weekdays from 7:00 am to 7:00 pm. During this time the computers run under the Windows OS; during the night and over weekends they turn into a Linux cluster system. Automated transition between these working modes is achieved by the parallel implementation of some open-source tools and the development and adaptation of parts of the system software.

This remote cluster was initially connected to the main cluster located at IRB (Ruđer Bošković Institute) via CARNet network resources. The main cluster works 24 hours a day, so during the night the computing power of this initial "grid" is increased by the power of the remote cluster.

However, a significant problem with job distribution towards the dislocated cluster occurred. Some attempts to solve this problem have been made - the Globus toolkit is being implemented, and the Silver Metascheduler is being tested. An additional problem is the bottleneck of the 2 Mb Internet access from the Faculty towards CARNet. This capacity is much too slow to allow the distribution of user home directories over the cluster. To increase data transfer speed, an RF link between IRB and the Faculty was built, and the next step is the implementation of a laser optical communication link.

6 Conclusion

A large number of present-day processes in multimedia, graphic design, and the graphic industry as a whole need substantial computing power. It is technologically challenging and expensive to keep pace with this trend. Deadlines are getting shorter and the competition is strong, so tasks must be performed not only fast, but cheaply too. To make this happen, the needed technology must be inexpensive and at the same time very powerful. One solution that meets these demands is distributed computing.

7 Literature

1., visited: May 26, 2004
2., visited: May 26, 2004
3. L. Erlanger, Distributed Computing, An Introduction
4. H. Chen et al., Data Distribution Strategies for High-Resolution Displays, Princeton University
5. R. Samanta et al., Sort-First Parallel Rendering with a Cluster of PCs, Princeton University