Grid computing is widely regarded as the next major revolution in information technology after the advent of the Internet.
A Brief History of the Grid
The ancestor of the Grid is Metacomputing. The term was coined in the early eighties by NCSA Director Larry Smarr. The idea of Metacomputing was to interconnect supercomputer centers in order to achieve superior processing resources. One of the first infrastructures in this area, named Information Wide Area Year (I-WAY), was demonstrated at Supercomputing 1995. This project strongly influenced the subsequent Grid computing activities. In fact, one of the researchers who led the I-WAY project was Ian Foster, who, together with Carl Kesselman, published a paper in 1997 that clearly links the Globus Toolkit [5], currently the heart of many Grid projects, to Metacomputing.
In 1997 the Foster-Kesselman duo organized, at Argonne National Laboratory, a workshop entitled "Building a Computational Grid" [6]. It was at this workshop that the term "Grid" was born. The workshop was followed in 1998 by the publication of the book "The Grid: Blueprint for a New Computing Infrastructure" by Foster and Kesselman themselves. For these reasons they are not only considered the fathers of the Grid; their book, which in the meantime was almost entirely rewritten and republished in 2003, is also regarded as the "Grid bible".
What Is the Grid?
In 1998 Foster and Kesselman defined the Grid as follows: "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities."
In fact, one of the main ideas of the Grid, which also explains the origin of the word itself, was to make computational resources available like electricity. One remarkable fact about the electric power grid is that when we plug an appliance into it we don't care where the generators are located or how they are wired; we are only interested in getting the electric power. Unfortunately, in practice, the similarities between the electric power grid and the computational Grid are few. We cannot simply draw computational resources from a Grid; instead, we have to provide the Grid with the program to be processed, along with access to the data needed for the computation.
According to Foster's checklist, the minimum properties of a Grid system are the following:
• A Grid coordinates resources that are not subject to centralized control (e.g. resources owned by different companies or under the control of different administrative units) and at the same time addresses the issues of security, policy, payment, membership, and so forth that arise in these settings.
• A Grid uses standard, open, general-purpose protocols and interfaces that address such fundamental issues as authentication, authorization, resource discovery, and resource access.
• A Grid delivers nontrivial qualities of service, i.e. it is able to meet complex user demands.
Therefore, a Grid needs middleware that integrates distributed, heterogeneous computational resources into a large virtual computer that can be used to solve a single problem at a given time. To achieve this, applications must be completely decoupled from the physical components: instead of directly accessing a physical component of the Grid, an application has to request it through the middleware, as the sketch below illustrates.
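
To make the decoupling concrete, the following minimal sketch (in Python, with all names hypothetical) shows an application stating its resource requirements to a middleware layer, which resolves them against registered physical nodes:

from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpus: int
    free: bool = True

class Middleware:
    def __init__(self):
        self.nodes = []            # physical resources registered with the Grid

    def register(self, node: Node):
        self.nodes.append(node)

    def acquire(self, min_cpus: int) -> Node:
        # The application never names a machine; it only states requirements.
        for node in self.nodes:
            if node.free and node.cpus >= min_cpus:
                node.free = False
                return node
        raise RuntimeError("no matching resource currently available")

grid = Middleware()
grid.register(Node("cluster-a/n01", cpus=8))
grid.register(Node("uni-b/ws17", cpus=2))

node = grid.acquire(min_cpus=4)    # resolved by the middleware, not the app
print(f"task dispatched to {node.name}")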
Grid computing provides clustering of remotely distributed computing resources. Its principal focus to date has been on maximizing the use of available processor resources for compute-intensive applications. Together with storage virtualization and server virtualization, grid computing enables utility computing.
Functionally, we can also speak of several types of grids:
• Computational grids (including CPU-scavenging grids), which focus primarily on computationally intensive operations.
• Data grids, for the controlled sharing and management of large amounts of distributed data.
• Equipment grids, which are built around a primary piece of equipment, e.g. a telescope, where the surrounding Grid is used to control the equipment remotely and to analyze the data it produces.
CPU Scavenging
CPU scavenging, cycle scavenging, cycle stealing, or shared computing creates a "grid" from the unused resources in a network of participants. Usually this technique exploits the instruction cycles on desktop computers that would otherwise be wasted at night, during lunch, or even in the scattered seconds throughout the day when the computer is waiting for user input or slow devices.
Volunteer computing projects use the CPU scavenging model almost exclusively.
In practice, participating computers also donate some supporting amount of disk storage space, RAM, and network
bandwidth, in addition to raw CPU power. Nodes in this model are also more vulnerable to going "offline" in one
way or another from time to time, as their owners use their resources for their primary purpose.
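
The following simplified sketch (hypothetical helper names; a real volunteer-computing client such as BOINC is far more elaborate) illustrates the basic scavenging loop: work is fetched and run only while the host appears idle:

import random, time

IDLE_THRESHOLD = 300  # seconds of no user input before donating cycles

def seconds_since_user_input() -> float:
    # Platform-specific in a real agent; simulated here.
    return random.uniform(0, 600)

def fetch_task():
    # In a real client this downloads a work unit from the project server.
    return {"id": random.randint(1, 10**6),
            "data": [random.random() for _ in range(1000)]}

def run_and_report(task):
    # Run at low priority and upload the result; here just a toy computation.
    result = sum(task["data"])
    print(f"work unit {task['id']} done: {result:.3f}")

def agent_loop(cycles=3):
    for _ in range(cycles):                    # a real agent loops forever
        if seconds_since_user_input() > IDLE_THRESHOLD:
            run_and_report(fetch_task())       # host is idle: donate cycles
        time.sleep(1)                          # a real agent re-checks ~1/min

agent_loop()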
Grids versus conventional supercomputers
"Distributed" or "grid computing" in general is a special type of parallel computing which relies on complete
computers (with onboard CPU, storage, power supply, network interface, etc.) connected to a network
(private, public or the Internet) by a conventional network interface, such as Ethernet. This is in
contrast to the traditional impression of a supercomputer, which has many CPUs connected by a local high-speed
computer bus.
The primary advantage of distributed computing is that each node can be purchased as commodity hardware, which
when combined can produce similar computing resources to a many-CPU supercomputer, but at lower cost. This is
due to the economies of scale of producing commodity hardware, compared to the lower efficiency of designing
and constructing a small number of custom supercomputers. The primary performance disadvantage is that the
various CPUs and local storage areas do not have high-speed connections. This arrangement is thus well-suited
to applications where multiple parallel computations can take place independently, without the need to
communicate intermediate results between CPUs.
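
The following small Python example illustrates the kind of workload that suits this arrangement: a parameter sweep whose tasks run independently and never exchange intermediate results. Here the tasks are mapped over local processes; on a grid, each parameter set would instead be shipped to a separate node:

from multiprocessing import Pool

def simulate(param: float) -> float:
    # Stand-in for a compute-intensive, self-contained task.
    acc = 0.0
    for i in range(1, 100_000):
        acc += (param / i) ** 0.5
    return acc

if __name__ == "__main__":
    params = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
    with Pool() as pool:
        results = pool.map(simulate, params)   # tasks never talk to each other
    for p, r in zip(params, results):
        print(f"param={p}: {r:.2f}")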
The high-end scalability of geographically dispersed grids is generally favorable, due to the low need for
connectivity between nodes relative to the capacity of the public Internet. Conventional supercomputers also
create physical challenges in supplying sufficient electricity and cooling capacity in a single location.
Both supercomputers and grids can be used to run multiple parallel computations at the same time, which might be
different simulations for the same project, or computations for completely different applications. The
infrastructure and programming considerations needed to do this on each type of platform are different,
however.
There are also differences in programming and deployment. It can be costly and difficult to write programs so
that they can be run in the environment of a supercomputer, which may have a custom operating system, or require
the program to address concurrency issues.
Understanding Grids
Grid computing is increasingly being viewed as the next phase of distributed computing. Built on pervasive
Internet standards, grid computing enables organizations to share computing and information resources across
department and organizational boundaries in a secure, highly efficient manner.
Organizations around the world are utilizing grid computing today in such diverse areas as collaborative
scientific research, drug discovery, financial risk analysis, and product design. Grid computing enables
research-oriented organizations to solve problems that were previously infeasible due to computing and data-integration constraints. Grids also reduce costs through automation and improved IT resource utilization.
Finally, grid computing can increase an organization's agility, enabling more efficient business processes and greater responsiveness to change. Over time, grid computing will enable a more flexible, efficient, and utility-like global computing infrastructure.
The key to realizing the benefits of grid computing is standardization, so that the diverse resources that make
up a modern computing environment can be discovered, accessed, allocated, monitored, and in general managed as
a single virtual system, even when provided by different vendors and/or operated by different organizations.
Grid Computing on Windows
The Digipede Network is a grid computing solution that provides the advantages of traditional grid solutions
with additional features to simplify job creation and allow developers to grid-enable applications. The Digipede
Network is a grid infrastructure which comprises a server to manage the system and many agent-nodes to execute
the distributed work. The Digipede Server receives all job requests and maintains a prioritized queue of work to
be done, along with a history of work completed on the system. It also guarantees execution of work on the
system by monitoring the status of all Digipede Agents working on the grid. The Digipede Agent software is
installed on each of the grid compute nodes; it manages that compute node's work on the grid. Although the
Digipede Server receives all the job requests, the Digipede Agent decides what work can be performed on the
compute node. It moves files and data as appropriate, controls the execution of the task, and returns results
and status information. The Digipede Network also includes comprehensive job creation tools, detailed below.
The Digipede Server runs natively on all editions of Windows Server 2003; the Digipede Agent runs on any 32-bit or
64-bit Windows operating system since Windows 2000. As a result, the Digipede Network can be installed quickly and
easily onto new or existing networked Windows computers.
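
The following sketch is not the Digipede API; it merely illustrates, with hypothetical names, the division of labour described above: the server keeps a prioritized queue and a work history, while each agent decides whether its own node can take on work:

import heapq

class Server:
    def __init__(self):
        self.queue = []                       # (priority, job) min-heap
        self.history = []                     # record of completed work

    def submit(self, priority: int, job: str):
        heapq.heappush(self.queue, (priority, job))

    def next_job(self):
        return heapq.heappop(self.queue)[1] if self.queue else None

    def report(self, node: str, job: str, status: str):
        self.history.append((node, job, status))

class Agent:
    def __init__(self, node: str, free_cores: int):
        self.node, self.free_cores = node, free_cores

    def poll(self, server: Server):
        # The agent, not the server, decides if this node can take work.
        if self.free_cores > 0:
            job = server.next_job()
            if job:
                server.report(self.node, job, "completed")

server = Server()
server.submit(1, "render frame 42")
server.submit(0, "risk model batch 7")        # lower number = higher priority
Agent("node-01", free_cores=4).poll(server)
print(server.history)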
Development Considerations
With the Digipede Network the development considerations are mainly architectural. Command-line applications and
scripts can be written without the Digipede Framework SDK and easily distributed using the Digipede Workbench or
command-line interface.
Applications written using the Digipede Framework initiate the distribution of work themselves and can be any type of application: GUI, command-line, or script.
Executables and Scripts
Distributable applications can be written using any Windows development environment and language. These
applications must be command-line applications or scripts that do not require user interaction. The user can then
use the Digipede Workbench to define jobs to quickly and easily distribute execution. The Digipede Network will
distribute the appropriate applications and files and manage remote execution.
Grid-enabled Applications
Grid-enabled applications require the Digipede Framework SDK and can be written using any development environment
and language that supports COM, .NET 1.1, or .NET 2.0, including all versions of Visual Studio. Because the
Digipede Framework supports existing API methodologies, Windows developers can quickly and easily grid-enable
applications and scripts.
Strategic Benefits
• Collaboration
• Increased productivity
• Cost-efficient storage
• Efficient use of resources
1. Value of Collaboration
• Bring together resources in a dynamic and geographically distributed fashion.
• Use resources to provide a powerful tool for today's enterprise.
2. Strategic Benefit: Productivity
• Productivity jumps due to increased computational activity.
• Conserve resources in a cost-effective approach, as power is drawn from downtime.
3. Grids Powered By Under Utilized Resources
• PCs not in use on evenings, weekends, and during daytime hours can provide significant computational resources.
4. Cost-Effective Storage
• Prior to grid technology, enterprises overbought storage.
• Grids tear down storage silos, enabling effective sharing of storage investments with other enterprises.
5. Storage ROI
• Efficient use of modular storage; balanced workloads and capacity on demand.
• Scaling out in increments yields low-cost performance and reliability.
• Costs could be allocated by storage usage or by processing-capacity utilization.
• Grid costs are still to be determined, as give and take among participants is needed.
6. Flavors of Grid Computing
• An enterprise could use a grid internally.
• An enterprise could share a grid with external partners and groups.
7. Capital Savings Provided
• Decreased storage costs
• Decreased processing-power costs
• Efficient use of resources
• Ability to pool resources
8. Driving Forces
• Maturation
• Greater awareness
• Standardization of software
• Desire to go beyond traditional high-performance computing applications
9. Staffing Implications
• Grids allow enterprises to reallocate network administrators.
• Grids require specific skill sets and personnel.
10. Utilization Facts
• PCs are idle 80-90% of the time.
• PCs offer tremendous potential capacity, since they are not in use every hour.
• Scheduling pulls together compute power.
• Compute jobs are done when resources are idle.
• The approach is similar to mainframe scheduling.
11. Security Implications For Enterprise Customer
• Internal grids behind firewalls raise virtually no security issues.
• Open grids are less secure but prompt stronger security measures.
12. Service Level Agreements
• A contract is used by vendors and customers, as well as internally.
• It can specify bandwidth availability, response times, and problem resolution.
What is a Grid?
Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of
geographically distributed "autonomous" resources dynamically at runtime depending on their availability,
capability, performance, cost, and users' quality-of-service requirements.
What are key Design Issues in Grid Computing?
The following articles explain the design of Grid computing in detail:
http://www.gridbus.org/papers/gridtech.pdf
http://www.aip.org/pt/vol-55/iss-2/p42.html
How is Grid different from other technologies such as Clusters/P2P/ASP? What about Grid Economy and Resource
Management?
The key distinction between clusters and grids lies mainly in the way resources are managed. In the case of clusters, resource allocation is performed by a centralised resource manager, and all nodes cooperatively work together as a single unified resource. In the case of grids, each node has its own resource manager, and the grid does not aim to provide a single system view. Autonomous resources in the Grid can span a single organisation or multiple organisations, as the sketch below illustrates.
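
A minimal sketch of this distinction, with hypothetical names: a cluster manager owns all nodes and allocates them directly, whereas a grid broker can only ask each site's autonomous resource manager, which may refuse according to local policy:

class ClusterManager:
    """Centralised: one scheduler owns every node and sees a single system."""
    def __init__(self, nodes):
        self.free = list(nodes)

    def allocate(self, n):
        grant, self.free = self.free[:n], self.free[n:]
        return grant

class SiteManager:
    """Autonomous: each grid site applies its own local policy."""
    def __init__(self, name, nodes, accepts_external):
        self.name, self.free = name, list(nodes)
        self.accepts_external = accepts_external

    def request(self, n):
        if not self.accepts_external or len(self.free) < n:
            return []                          # local policy may refuse
        grant, self.free = self.free[:n], self.free[n:]
        return grant

def grid_broker(sites, n):
    # The broker cannot command sites; it asks each one in turn.
    for site in sites:
        grant = site.request(n)
        if grant:
            return site.name, grant
    return None, []

cluster = ClusterManager(["c1", "c2", "c3", "c4"])
print("cluster grant:", cluster.allocate(2))

sites = [SiteManager("uni-a", ["a1", "a2"], accepts_external=False),
         SiteManager("lab-b", ["b1", "b2", "b3"], accepts_external=True)]
print("grid grant:", grid_broker(sites, 2))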
What sort of "problems" is grid computing used for?
Those problems that are beyond the processing limits of individual computers. Right now that primarily means
scientific or technical projects such as cancer and other medical research -- projects that involve the analysis
of inordinate amounts of data.
What are some examples of current uses of grid computing?
Perhaps the most ambitious is the project of Oxford University's Centre for Computational Drug Discovery, which utilizes more than one million PCs to look for a cancer cure. People around the world donate a few CPU cycles
from their PCs through "screensaver time." The project eventually will analyze 3.5 billion molecules for
cancer-fighting potential. More than 50,000 years of CPU power (based on a 1.5 gigahertz chip) have been put
to work so far.
One highly publicized project is SETI@home (Search for Extraterrestrial Intelligence), in which PC users worldwide donate unused processor cycles to help the search for signs of extraterrestrial life by analyzing signals coming from outer space.
How does grid computing work in practice? Is special hardware or software needed?
Having a computer tied to a network is a good start. The most far-reaching network, of course, is the Internet,
which is enabling regular people with home PCs to participate in grid computing projects from anywhere in the
world.
Beyond that, PC owners must download simple software from the project's host site. On the other end, grid
computing projects use software that can divide and distribute pieces of a program to thousands of computers
for processing.
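
As a toy illustration of this "divide and distribute" step (the unit size and names are assumptions), a large dataset can be cut into self-contained work units, each small enough to send to a single volunteer machine:

def make_work_units(items, unit_size):
    """Split a big task list into self-contained work units."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]

molecules = [f"mol-{i:06d}" for i in range(10_000)]   # stand-in dataset
units = make_work_units(molecules, unit_size=500)
print(f"{len(units)} work units of up to 500 molecules each")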
Are any commercial companies involved with grid computing?
Absolutely. Ambujex Technologies has recently started working on this technology. Sun Microsystems released its Grid Engine software in September 2000. IBM is involved in several grid computing projects; in November 2001, IBM announced it was building a computing grid for the University of Pennsylvania designed to bring advanced methods of breast cancer diagnosis and screening to patients across the nation, while reducing costs. Intel and Compaq are also involved in grid computing, as are a number of private companies.
What software technologies are needed to build a Grid network?
Several Grid middleware software systems are available for building high-end computing Grids and their
applications.
In building and enhancing a world-wide Grid network in collaboration with colleagues from around the globe, it is important to choose Grid technologies that support and work on a wide variety of resources that are heterogeneous in terms of architecture, instruction set, configuration, node operating system, and local resource managers.
To Grid-enable such resources, we used the following middleware technologies:
• Gridbus/Alchemi for Windows-based resources (running the .NET Framework).
• Globus for Unix-based resources.
• Gridbus Broker for aggregating and deploying applications on such Grid resources.
• Gridscape for creating interactive and dynamic testbed/network portals.
• Gridbus G-monitor portal for accessing and monitoring applications.
Can you give me examples of some collaborative Grid networks?
Collaborative Grid networks are generally assembled to meet some challenge or to demonstrate new ideas.
Examples include:
• World-Wide Grid (WWG): A Grid network constructed by the Global Data-Intensive Grid Collaboration to demonstrate several data-intensive Grid applications as part of the SC 2003 HPC Challenge.
• UK eScience Grid: A Grid network of UK academics.
• TeraGrid: A Grid network of American researchers.
• LHC Grid: A Grid network of high-energy physics researchers.