S (programming language)

S
Paradigm: Multi-paradigm: imperative, object oriented
Developer: Rick Becker, Allan Wilks, John Chambers, William S. Cleveland, Trevor Hastie
First appeared: 1976
Typing discipline: dynamic, strong
License: depends on implementation
Website: ect.bell-labs.com/sl/S/ at the Wayback Machine (archived 2018-10-14)
Major implementations: S-PLUS
Influenced by: C, APL, PPL, Fortran
Influenced: R

S[1] is a statistical programming language developed primarily by John Chambers and (in earlier versions) Rick Becker, Trevor Hastie, William Cleveland and Allan Wilks of Bell Laboratories. The aim of the language, as expressed by John Chambers, is "to turn ideas into software, quickly and faithfully".[1] It is widely used by academic researchers.[2]

A major implementation of S is S-PLUS, a commercial product that was formerly sold by TIBCO Software.

The modern R, a part of the GNU free software project, was based on S[3] and can run many S programs, although it is not fully backwards compatible.[4]

History

"Old S"

S is one of several statistical computing languages that were designed at Bell Laboratories, and first took form between 1975 and 1976. Up to that time, much statistical computing was done by directly calling Fortran subroutines; S was designed to offer an alternative, more interactive approach, motivated in part by the exploratory data analysis advocated by John Tukey.[5] Early design decisions that still hold today include interactive graphics devices (printers and character terminals at the time) and easily accessible documentation for the functions.[citation needed]

Development of the project was led by John Chambers and Trevor Hastie, and included developers Richard Becker, Allan Wilks, and William Cleveland,[6] all of whom were then employees of AT&T.[7] Of the developers who contributed to S, Chambers is generally agreed to be the most significant contributor.[3] Chambers received the Software System Award from the Association for Computing Machinery for his work on S.[8]

The first working version of S was built in 1976 and ran on the GCOS operating system. At this time, S was unnamed, and suggestions included ISCS (Interactive SCS), SCS (Statistical Computing System), and SAS (Statistical Analysis System), the last of which was already taken by the SAS System. The name 'S' (used with single quotation marks until 1979) was chosen, as it was a common letter in the suggestions and consistent with other programming languages designed at the same institution at the time (namely the C programming language).[5] It stands for the word "statistics".[9]

When UNIX/32V was ported to the (then new) 32-bit DEC VAX, computing on the Unix platform became feasible for S. In late 1979, S2 was ported from GCOS to UNIX, which would become the new primary platform.[10]

In 1980, the first version of S was distributed outside Bell Laboratories, and in 1981 source versions were made available.[5] S was distributed freely in academic circles and became popular among academic statisticians.[11] In 1984, two books were published by the research team at Bell Laboratories: S: An Interactive Environment for Data Analysis and Graphics[12] (1984 Brown Book) and Extending the S System.[13] Also in 1984, the source code for S became licensed through AT&T Software Sales for educational and commercial purposes.

"New S"

The first version of S-PLUS was released by Statistical Sciences, Inc. in 1988. S-PLUS was later sold to TIBCO Software.[9] By this time, the release of S3 had brought many changes to S and to the syntax of the language.[10] The New S Language[14] (1988 Blue Book) was published to introduce the new features, such as the transition from macros to functions and the ability to pass functions to other functions (such as apply). Many other changes to the S language extended the concept of "objects" and made the syntax more consistent (and strict). However, many users found the transition to New S difficult, since their macros needed to be rewritten. Other changes to S also took hold, such as the use of X11 and PostScript graphics devices, the rewriting of many internal functions from Fortran to C, and the exclusive use of double-precision arithmetic. The New S language is very similar to that used in modern versions of S-PLUS and R.
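As a rough illustration of passing a function to another function, the following is a minimal sketch written in R, on the assumption that R's syntax here closely matches New S. The names m, spread, row_spread and col_sum are hypothetical and do not come from the Blue Book.

```r
# A small numeric matrix with 2 rows and 3 columns, filled column by column.
m <- matrix(1:6, nrow = 2)

# A user-defined function, stored in an ordinary variable.
spread <- function(x) max(x) - min(x)

# apply() receives a function as its third argument and calls it
# once per row (second argument 1) or once per column (second argument 2).
row_spread <- apply(m, 1, spread)
col_sum <- apply(m, 2, sum)
```

Both the user-defined spread and the built-in sum are handed to apply in the same way, which is the point of treating functions as ordinary objects rather than as macros.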

The graphical user interface of S was also updated with interactive graphical features after integration with Axum.[9]

In 1991, Statistical Models in S[15] (1991 White Book) was published, which introduced formula notation[16] (which uses the ~ operator), data frame objects, and modifications to the use of object methods and classes.
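A minimal sketch of how formula notation and data frames fit together, written in R on the assumption that it mirrors the White Book usage. The data values and the names d, dose, response and fit are invented for illustration.

```r
# A data frame holds columns of equal length under named variables.
d <- data.frame(
  dose     = c(1, 2, 3, 4, 5),
  response = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# The formula response ~ dose reads "response modelled by dose";
# lm() looks up both variable names in the data frame d.
fit <- lm(response ~ dose, data = d)

# Extract the fitted intercept and slope.
coefs <- coef(fit)
```

The formula names the variables symbolically, so the same model specification can be reused with a different data frame that has columns of the same names.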

S4

The latest version of the S standard is S4, released in 1998.[17] It provides advanced object-oriented features. S4 classes differ markedly from S3 classes; S4 formally defines the representation and inheritance for each class, and has multiple dispatch: the generic function can be dispatched to a method based on the class of any number of arguments, not just one.[18]
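The following sketch shows formal class definitions and multiple dispatch using R's implementation of S4 (the methods package). The classes Shape, Circle and Square and the generic overlapKind are hypothetical names chosen for illustration.

```r
library(methods)

# Formal definitions: representation (slots) and inheritance are declared up front.
setClass("Shape", representation = representation("VIRTUAL"))
setClass("Circle", contains = "Shape",
         representation = representation(r = "numeric"))
setClass("Square", contains = "Shape",
         representation = representation(side = "numeric"))

# A generic function with two arguments that both take part in dispatch.
setGeneric("overlapKind", function(a, b) standardGeneric("overlapKind"))

# Methods are selected by the classes of both arguments.
setMethod("overlapKind", signature("Circle", "Circle"),
          function(a, b) "circle-circle")
setMethod("overlapKind", signature("Circle", "Square"),
          function(a, b) "circle-square")
# Fallback for any other pair of shapes, reached through inheritance.
setMethod("overlapKind", signature("Shape", "Shape"),
          function(a, b) "generic shapes")

k1 <- overlapKind(new("Circle", r = 1), new("Square", side = 2))
k2 <- overlapKind(new("Square", side = 2), new("Circle", r = 1))
```

Here k1 is "circle-square" because a method exists for that exact pair, while k2 falls back to "generic shapes" because no method is defined for a Square followed by a Circle. In S3, by contrast, dispatch would normally consider only the class of the first argument.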

References

  1. ^ a b Chambers, John M (1998). Programming with Data: A Guide to the S Language. Springer. ISBN 978-0-387-98503-9.
  2. ^ "S-Plus: An Introduction". www.stat.rice.edu. Retrieved 2024-02-28.
  3. ^ a b Ashwani, Kumar; Satyanarayana, Reddy, Seelam Sai (2020-09-25). Advancements in Security and Privacy Initiatives for Multimedia Images. IGI Global. p. 179. ISBN 978-1-7998-2797-9.
  4. ^ Nicholls, Andy; Pugh, Richard; Gott, Aimee (2015-12-16). R in 24 Hours, Sams Teach Yourself. Sams Publishing. ISBN 978-0-13-428880-2.
  5. ^ a b c Becker, Richard A., A Brief History of S, Murray Hill, New Jersey: AT&T Bell Laboratories, archived from the original (PS) on 2015-07-23, retrieved 2015-07-23
  6. ^ Berry, Kenneth J.; Johnston, Janis E.; Jr, Paul W. Mielke (2014-04-11). A Chronicle of Permutation Statistical Methods: 1920–2000, and Beyond. Springer Science & Business Media. pp. 207–208. ISBN 978-3-319-02744-9.
  7. ^ Encyclopedia of Statistical Sciences, Volume 12. John Wiley & Sons. 2005-12-16. p. 8088. ISBN 978-0-471-74406-1.
  8. ^ Charpentier, Arthur (2014-08-26). Computational Actuarial Science with R. CRC Press. p. 4. ISBN 978-1-4987-5982-3.
  9. ^ a b c Nicholls, Andy; Pugh, Richard; Gott, Aimee (2015-12-16). R in 24 Hours, Sams Teach Yourself. Sams Publishing. ISBN 978-0-13-428880-2.
  10. ^ a b Chambers, John (2008-06-14). Software for Data Analysis: Programming with R. Springer. pp. 477–478. ISBN 978-0-387-75936-4.
  11. ^ Hardin, James W.; Hilbe, Joseph M. (2002-07-30). Generalized Estimating Equations. CRC Press. p. 12. ISBN 978-1-4200-3528-5.
  12. ^ Becker, R.A.; Chambers, J.M. (1984). S: An Interactive Environment for Data Analysis and Graphics. Pacific Grove, CA, USA: Wadsworth & Brooks/Cole. ISBN 0-534-03313-X.
  13. ^ Becker, R.A.; Chambers, J.M. (1985). Extending the S System. Pacific Grove, CA, USA: Wadsworth & Brooks/Cole. ISBN 0-534-05016-6.
  14. ^ Becker, R.A.; Chambers, J.M.; Wilks, A.R. (1988). The New S Language: A Programming Environment for Data Analysis and Graphics. Pacific Grove, CA, USA: Wadsworth & Brooks/Cole. ISBN 0-534-09192-X.
  15. ^ Chambers, J.M.; Hastie, T.J. (1991). Statistical Models in S. Pacific Grove, CA, USA: Wadsworth & Brooks/Cole. p. 624. ISBN 0-412-05291-1.
  16. ^ Wilkinson, G.N.; Rogers, C.E. (1973). "Symbolic description of factorial models for analysis of variance". Applied Statistics. 22 (3): 392–399. doi:10.2307/2346786. JSTOR 2346786.
  17. ^ Chambers, John (January 1, 2001). "The S System". Bell Labs. Archived from the original on 2018-10-14.
  18. ^ Wickham, Hadley (2019). "S4". Advanced R. adv-r.had.co.nz. ISBN 9781466586963. Retrieved 2020-02-18.