ASCI Red

ASCI Red
Active	Two-thirds operational March 1997, fully operational June 1997, decommissioned 2006
Sponsors	Intel Corporation
Operators	Sandia National Laboratories, US Department of Energy
Location	Sandia National Laboratories, United States
Power	850 kW
Operating system	Cougar / TOS (a Mach kernel derivative)
Space	1,600 sq ft (150 m²)
Memory	1212 gigabytes
Speed	1.3 teraflops (peak)
Ranking	TOP500: 1, June 2000
Purpose	nuclear materials testing, other
Legacy	First supercomputer to achieve over 1.0 teraflops on the LINPACK benchmark
Website	web.archive.org

ASCI Red (also known as ASCI Option Red or TFLOPS) was the first computer built under the Accelerated Strategic Computing Initiative (ASCI),^[5]^[6] the supercomputing initiative of the United States government created to help maintain the United States nuclear arsenal after the 1992 moratorium on nuclear testing.

ASCI Red was built by Intel and installed at Sandia National Laboratories in late 1996. The design was based on the Intel Paragon computer. Both original goals were met: to deliver a true teraflops machine by the end of 1996, and to have it running an ASCI application using all memory and nodes by September 1997.^[7] It was used by the US government from 1997 to 2005 and was the world's fastest supercomputer until late 2000.^[4]^[6] It was the first ASCI machine that the Department of Energy acquired,^[6] and also the first supercomputer to score above one teraflops on the LINPACK benchmark, a test that measures a computer's calculation speed. Later upgrades to ASCI Red allowed it to perform above two teraflops.

ASCI Red earned a reputation for reliability that some veterans say has never been beaten. Sandia director Bill Camp said that ASCI Red had the best reliability of any supercomputer ever built, and “was supercomputing’s high-water mark in longevity, price, and performance.”^[8]

ASCI Red was decommissioned in 2006.^[2]

System structure

The ASCI Red supercomputer was a distributed memory MIMD (Multiple Instruction, Multiple Data) message-passing computer. The design provided high degrees of scalability for I/O, memory, compute nodes, storage capacity, and communications; standard parallel interfaces also made it possible to port parallel applications to the machine. The machine was structured into four partitions: Compute, Service, I/O, and System. Parallel applications executed in the Compute Partition, which contained nodes optimized for floating-point performance. The compute nodes had only the features required for efficient computation; they were not intended for general interactive services. The Service Partition provided an integrated, scalable host that supported interactive users (log-in sessions), application development, and system administration. The I/O Partition supported disk I/O, a scalable parallel file system, and network services. The System Partition supported initial booting and system Reliability, Availability, and Serviceability (RAS) capabilities.^[7]

The Service partition tied the different parts of ASCI Red together: it provided a scalable host for users and was used for general system administration.^[1] It comprised the log-in sessions, tools for application development, and utilities for network connections, while the I/O partition provided a file system and network services.^[5] The Compute partition contained nodes designed for floating-point performance; this is where the actual computing took place.^[5] Each compute node held two 200 MHz Pentium Pro processors, each with a 16 KB level-1 cache and a 256 KB level-2 cache. These were later upgraded to two 333 MHz Pentium II OverDrive processors, each with a 32 KB level-1 cache and a 512 KB level-2 cache.^[9] According to Intel, ASCI Red was also the first large-scale supercomputer to be built entirely from common, commercially available components.^[10]

All of ASCI Red's partitions were interconnected to form one supercomputer, but none of the nodes supported global shared memory. Each node worked in its own memory and shared data with the others through "explicit message-passing".^[11]
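The idea of nodes with private memory that cooperate only through explicit messages can be illustrated with a small sketch. This is not ASCI Red's programming interface: the names `node`, `distributed_sum`, and `mailboxes` are hypothetical, and Python threads with queues merely imitate separate nodes (real threads share an address space, so the isolation here is by convention only).

```python
import queue
import threading


def node(rank, local_data, mailboxes, results):
    """One simulated node: it reads only its own local_data and
    communicates with other nodes solely through mailboxes."""
    partial = sum(local_data)  # compute on "private" memory
    if rank != 0:
        # Explicit send: ship the partial result to node 0.
        mailboxes[0].put((rank, partial))
    else:
        total = partial
        # Explicit (blocking) receive: one message from every other node.
        for _ in range(len(mailboxes) - 1):
            _sender, value = mailboxes[0].get()
            total += value
        results.put(total)


def distributed_sum(chunks):
    """Sum all values, one chunk per simulated node, by passing partial sums."""
    mailboxes = [queue.Queue() for _ in chunks]  # one inbox per node
    results = queue.Queue()
    threads = [
        threading.Thread(target=node, args=(rank, chunk, mailboxes, results))
        for rank, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.get()


if __name__ == "__main__":
    # Four simulated nodes, each owning a different slice of the data.
    print(distributed_sum([[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]))  # prints 55
```

No node ever reads another node's list; the only data that crosses between them is what is placed in a mailbox, which is the essential property of a distributed-memory, message-passing design.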

Technical specifications

The computer itself took up almost 1,600 square feet (150 m²) of space,^[3] and was made up of 104 "cabinets". Of those cabinets, 76 held computers (processors), 8 held switches, and 20 held disks. It had a total of 1212 GB of RAM and 9298 separate processors. The original machine used Intel Pentium Pro processors, each clocked at 200 MHz. These were later upgraded to specially packaged Pentium II Xeon processors, each clocked at 333 MHz. Overall, it required 850 kW of power (not including air conditioning). What set ASCI Option Red apart from its predecessors in supercomputing was its high I/O bandwidth. Earlier supercomputers had multi-GFLOPS performance, yet their slow I/O bottlenecked the systems. Intel's TFLOPS PFS, an extremely efficient "Parallel File System", could sustain transfer speeds of up to 1 GB/s, eliminating that bottleneck.^[12]
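A few per-unit figures follow from the numbers in this section by plain arithmetic. The sketch below is only illustrative: it treats the "almost" and "up to" values as exact, assumes 1 GB = 1024 MB, and divides total system power by the processor count even though that power also fed disks and switches. The constant and function names are hypothetical.

```python
# Figures quoted in the text above.
CABINETS = {"compute": 76, "switch": 8, "disk": 20}
TOTAL_CABINETS = 104
MEMORY_GB = 1212
PROCESSORS = 9298
POWER_KW = 850          # excludes air conditioning
FLOOR_SQ_FT = 1600      # "almost 1,600", treated as exact here
PFS_GB_PER_S = 1.0      # "up to 1 GB/s", treated as sustained here


def derived_figures():
    """Return per-unit figures derived from the quoted totals."""
    return {
        # Consistency check: the three cabinet types should add up to 104.
        "cabinets_sum": sum(CABINETS.values()),
        # Assumption: 1 GB = 1024 MB.
        "memory_mb_per_processor": MEMORY_GB * 1024 / PROCESSORS,
        # Whole-system power spread over the processors (an upper bound
        # on what each processor itself drew).
        "watts_per_processor": POWER_KW * 1000 / PROCESSORS,
        "sq_ft_per_cabinet": FLOOR_SQ_FT / TOTAL_CABINETS,
        # Time to stream all of RAM through the file system at 1 GB/s.
        "minutes_to_write_all_memory": MEMORY_GB / PFS_GB_PER_S / 60,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions each processor had roughly 133 MB of memory, the system drew about 91 W per processor, each cabinet occupied about 15 square feet, and writing the entire memory to disk at the quoted peak rate would take a little over 20 minutes.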

First to TFLOPS

In December 1996, three quarters of ASCI Red was measured at a world-record 1.06 TFLOPS on MP LINPACK. The machine held the record for fastest supercomputer in the world for several consecutive years, reaching 2.38 TFLOPS after a processor and memory upgrade in 1999.^[4]^[7] The system used Pentium Pro processors when initially constructed and when it first recorded performance above one TFLOPS. In that configuration, when fully built, it recorded 1.6 TFLOPS of performance. Upgrades later in 1999 to specially packaged Pentium II Xeon processors pushed performance to 3.1 TFLOPS.^[8]
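A LINPACK-style figure is obtained by timing the solution of a dense linear system and dividing a nominal operation count by the elapsed time. The single-threaded sketch below shows the principle only; it is not MP LINPACK and says nothing about how ASCI Red was actually measured. The function names are hypothetical, and the operation count 2n³/3 + 2n² is assumed here as the conventional LINPACK figure for an n × n system.

```python
import time


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def linpack_flop_count(n):
    """Nominal operation count assumed for an n x n solve."""
    return 2 * n**3 / 3 + 2 * n**2


def measured_flops(n):
    """Time one solve; return (operations per second, max error in x)."""
    # Diagonally dominant test matrix whose exact solution is all ones.
    a = [[float(n) if i == j else 1.0 / (1 + i + j) for j in range(n)]
         for i in range(n)]
    b = [sum(row) for row in a]
    start = time.perf_counter()
    x = solve(a, b)
    elapsed = time.perf_counter() - start
    error = max(abs(v - 1.0) for v in x)
    return linpack_flop_count(n) / elapsed, error


if __name__ == "__main__":
    rate, error = measured_flops(100)
    print(f"{rate / 1e6:.2f} MFLOPS, max error {error:.2e}")
```

The rate reported depends on the problem size and the machine running it, which is why benchmark results such as those quoted above are tied to a specific configuration.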

Operating system

The different partitions of ASCI Red ran different operating systems. Users of the computer worked in an environment called "Teraflops OS", an operating system (once called Paragon OS) that was originally developed for the Intel Paragon XP/S Supercomputer.^[5] ASCI Red's Compute partition ran an operating system named Cougar.^[11] Cougar was a collaboration between Sandia Labs and the University of New Mexico; it was a lightweight OS based on PUMA and SUNMOS, two systems that were also designed for use on the Paragon supercomputer.^[11] It consisted of a lightweight kernel, the Process Control Thread, and other utilities and libraries. The Linux 2.4 kernel was ported to the system and a custom CNIC driver was written, but the heavyweight OS did not perform as well as the Cougar lightweight kernel on many benchmarks.^[11]

References

[1] Thomas, Robert. "ASCI Red Homepage". Sandia National Laboratories. Archived from the original on September 26, 2011. Retrieved October 30, 2011.
[2] "Sandia's ASCI Red, world's first teraflop supercomputer, is decommissioned". sandia.gov. June 29, 2006. Archived from the original on September 29, 2013. Retrieved May 26, 2014.
[3] Mattson, Timothy. "An Overview of the Intel TFLOPS Supercomputer" (PDF). MIT. Retrieved October 30, 2011.
[4] "TOP500.org Ranking History for ASCI Red". TOP500 Supercomputer Sites. Retrieved October 29, 2011.
[5] Mattson, Timothy. "The ASCI Option Red Supercomputer". Archived from the original on May 28, 2010. Retrieved October 27, 2011.
[6] Garg, Sharad (2001). "Performance Evaluation of Parallel File Systems for PC Clusters and ASCI Red". Proceedings 2001 IEEE International Conference on Cluster Computing. IEEE. pp. 172–177. doi:10.1109/CLUSTR.2001.959973. ISBN 0-7695-1116-3. S2CID 13224481.
[7] "7X Performance Results – Final Report: ASCI Red vs. Red Storm" (PDF). Retrieved November 17, 2011.
[8] "Sandia's ASCI Red, world's first teraflop supercomputer, is decommissioned" (PDF). Retrieved January 8, 2013.
[9] "TOP500.org feature page on the ASCI Red of the Sandia National Laboratory". Archived from the original on January 9, 2016. Retrieved January 8, 2016.
[10] Warren, Michael (November 1997). "Pentium Pro Inside: I. A Treecode at 430 Gigaflops on ASCI Red, II. Price/Performance of $50/Mflop on Loki and Hyglac". Proceedings of the ACM/IEEE Conference. IEEE: 61. doi:10.1109/SC.1997.10057. S2CID 13167835.
[11] Brightwell; Riesen; Underwood; Hudson; Bridges; MacCabe (2003). "A performance comparison of Linux and a lightweight kernel". Proceedings IEEE International Conference on Cluster Computing CLUSTR-03. pp. 251–258. doi:10.1109/CLUSTR.2003.1253322. ISBN 0-7695-2066-9. S2CID 7454194.
[12] Garg, Sharad (1998). "TFLOPS PFS: Architecture and Design of A Highly Efficient Parallel File System". Proceedings of the IEEE/ACM SC98 Conference. IEEE. p. 2. doi:10.1109/SC.1998.10003. ISBN 0-8186-8707-X. S2CID 8683745.

Records
Title	World's most powerful supercomputer, June 1997 – June 2000
Preceded by	CP-PACS/2048 (368.20 gigaflops)
Succeeded by	ASCI White (4.938 teraflops)
