Pipeline (Unix)

A pipeline of three program processes run on a text terminal

In Unix-like computer operating systems, a pipeline is a mechanism for inter-process communication using message passing. A pipeline is a set of processes chained together by their standard streams, so that the output text of each process (stdout) is passed directly as input (stdin) to the next one. The second process is started while the first process is still executing, and they are executed concurrently.

The concept of pipelines was championed by Douglas McIlroy at Unix's ancestral home of Bell Labs, during the development of Unix, shaping its toolbox philosophy. It is named by analogy to a physical pipeline. A key feature of these pipelines is their "hiding of internals". This in turn allows for more clarity and simplicity in the system.

The pipes in the pipeline are anonymous pipes (as opposed to named pipes), where data written by one process is buffered by the operating system until it is read by the next process, and this uni-directional channel disappears when the processes are completed. The standard shell syntax for anonymous pipes is to list multiple commands, separated by vertical bars ("pipes" in common Unix verbiage).

History

The pipeline concept was invented by Douglas McIlroy[1] and first described in the man pages of Version 3 Unix.[2][3] McIlroy noticed that much of the time command shells passed the output file from one program as input to another, and he championed the concept at Bell Labs during the development of Unix, shaping its toolbox philosophy.[4][5]

His ideas were implemented in 1973 when ("in one feverish night", wrote McIlroy) Ken Thompson added the pipe() system call and pipes to the shell and several utilities in Version 3 Unix. "The next day", McIlroy continued, "saw an unforgettable orgy of one-liners as everybody joined in the excitement of plumbing." McIlroy also credits Thompson with the | notation, which greatly simplified the description of pipe syntax in Version 4.[6][2]

Although developed independently, Unix pipes are related to, and were preceded by, the 'communication files' developed by Ken Lochner[7] in the 1960s for the Dartmouth Time Sharing System.[8]

Other operating systems

This feature of Unix was borrowed by other operating systems, such as MS-DOS and the CMS Pipelines package on VM/CMS and MVS, and eventually came to be designated the pipes and filters design pattern of software engineering.

Further concept development

Tony Hoare's communicating sequential processes (CSP) develops McIlroy's pipes further.[9]

Implementation

A pipeline connects processes through their standard streams: the standard output (stdout) of each process is passed directly to the standard input (stdin) of the next, and the processes run concurrently. A key feature of these pipelines is their "hiding of internals",[10] which in turn allows for more clarity and simplicity in the system.

In most Unix-like systems, all processes of a pipeline are started at the same time, with their streams appropriately connected, and managed by the scheduler together with all other processes running on the machine. An important aspect of this, setting Unix pipes apart from other pipe implementations, is the concept of buffering: for example, a sending program may produce 5000 bytes per second while a receiving program can accept only 100 bytes per second, but no data is lost. Instead, the output of the sending program is held in the buffer, and the receiving program reads from the buffer whenever it is ready. If the buffer fills up, the sending program is stopped (blocked) until the receiver removes at least some data from the buffer. In Linux, the default size of the buffer is 65,536 bytes (64 KiB). An open-source third-party filter called bfr is available to provide larger buffers if required.
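
The blocking behaviour can be observed with a small sketch. The choice of seq as the producer and the one-second sleep in the consumer are assumptions made for illustration: seq writes roughly 106 KiB here, more than a 64 KiB buffer can hold, so it is blocked until the consumer starts reading, yet every line still arrives.

```shell
# Fast producer, slow consumer. The consumer waits one second before reading,
# so the producer fills the pipe buffer and is blocked; no data is lost.
received=$(seq 1 20000 | { sleep 1; wc -l; })
echo "lines received: $received"
```

The count reported by wc matches the number of lines produced, even though the producer finished generating data long before the consumer began to read.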

Network pipes

Tools like netcat and socat can connect pipes to TCP/IP sockets.

Pipelines in command line interfaces

All widely used Unix shells have a special syntax construct for the creation of pipelines. In every case, one writes the commands in sequence, separated by the ASCII vertical bar character | (which, for this reason, is often called the "pipe character"). The shell starts the processes and arranges for the necessary connections between their standard streams (including some amount of buffer storage).

The pipeline uses anonymous pipes. For anonymous pipes, data written by one process is buffered by the operating system until it is read by the next process, and this uni-directional channel disappears when the processes are completed; this differs from named pipes, where messages are passed to or from a pipe that is named by making it a file, and remains after the processes are completed. The standard shell syntax for anonymous pipes is to list multiple commands, separated by vertical bars ("pipes" in common Unix verbiage):

command1 | command2 | command3

For example, to list files in the current directory (ls), retain only the lines of ls output containing the string "key" (grep), and view the result in a scrolling page (less), a user types the following into the command line of a terminal:

ls -l | grep key | less

The command ls -l is executed as a process, the output (stdout) of which is piped to the input (stdin) of the process for grep key; and likewise for the process for less. Each process takes input from the previous process and produces output for the next process via standard streams. Each | tells the shell to connect the standard output of the command on the left to the standard input of the command on the right by an inter-process communication mechanism called an (anonymous) pipe, implemented in the operating system. Pipes are unidirectional; data flows through the pipeline from left to right.
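
As a self-contained variant of the same pattern, the sketch below replaces ls with fixed printf input and counts the matching lines instead of paging them. The file names are made up for illustration, so the result does not depend on the current directory.

```shell
# Three processes connected by two anonymous pipes.
# printf stands in for "ls -l"; wc -l stands in for "less".
matches=$(printf '%s\n' keyring notes.txt monkey.png |
    grep key |
    wc -l)
echo "matching lines: $matches"
```

Two of the three names contain the string "key", so grep passes two lines on to wc.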

Example

Below is an example of a pipeline that implements a kind of spell checker for the web resource indicated by a URL. An explanation of what it does follows.

curl "https://en.wikipedia.org/wiki/Pipeline_(Unix)" |
sed 's/[^a-zA-Z ]/ /g' |
tr 'A-Z ' 'a-z\n' |
grep '[a-z]' |
sort -u |
comm -23 - <(sort /usr/share/dict/words) |
less

  1. curl obtains the HTML contents of a web page (could use wget on some systems).
  2. sed replaces all characters (from the web page's content) that are not spaces or letters, with spaces. (Newlines are preserved.)
  3. tr changes all of the uppercase letters into lowercase and converts the spaces in the lines of text to newlines (each 'word' is now on a separate line).
  4. grep includes only lines that contain at least one lowercase alphabetical character (removing any blank lines).
  5. sort sorts the list of 'words' into alphabetical order, and the -u switch removes duplicates.
  6. comm compares two sorted files line by line; -23 suppresses the lines unique to the second file and the lines common to both, leaving only those found only in the first file named. The - in place of a filename causes comm to use its standard input (from the pipeline in this case). sort /usr/share/dict/words sorts the contents of the words file alphabetically, as comm expects, and <( ... ) makes the sorted output available to comm under a file name (via process substitution, which shells such as Bash typically implement with a /dev/fd entry or a named pipe rather than a temporary file). The result is a list of words (lines) that are not found in /usr/share/dict/words.
  7. less allows the user to page through the results.
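
The steps above can be tried without network access or a system dictionary. The following is a minimal sketch under stated assumptions: the helper name unknown_words, the four-word stand-in dictionary, and the sample sentence are all invented for illustration, and a temporary file created with mktemp replaces the process substitution so that the sketch does not depend on Bash.

```shell
# Use byte-wise collation so that sort and comm agree on ordering.
LC_ALL=C
export LC_ALL

# Hypothetical helper: prints the words read from stdin that are missing
# from the sorted word list named by $1 (steps 2 to 6 of the example).
unknown_words() {
    sed 's/[^a-zA-Z ]/ /g' |
    tr 'A-Z ' 'a-z\n' |
    grep '[a-z]' |
    sort -u |
    comm -23 - "$1"
}

wordlist=$(mktemp)
printf '%s\n' this is a pipeline | sort >"$wordlist"   # tiny stand-in dictionary
result=$(printf 'This is a pipline, a PIPELINE!\n' | unknown_words "$wordlist")
rm -f "$wordlist"
echo "$result"
```

Only the misspelled "pipline" is reported, because every other word of the sample sentence appears in the stand-in dictionary.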

Error stream

By default, the standard error streams ("stderr") of the processes in a pipeline are not passed on through the pipe; instead, they are merged and directed to the console. However, many shells have additional syntax for changing this behavior. In the csh shell, for instance, using |& instead of | signifies that the standard error stream should also be merged with the standard output and fed to the next process. Since version 4.0, the Bash shell also supports |& for this purpose;[11] alternatively, writing 2>&1 before the | has the same effect, and standard error can also be redirected to a different file.
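
The difference can be seen in a short sketch. The function emit is a hypothetical stand-in for any command that writes to both streams; the portable 2>&1 form is used rather than |&.

```shell
# Hypothetical command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Only stdout enters the pipe; stderr is discarded here to keep the demo quiet.
stdout_only=$(emit 2>/dev/null | wc -l)
# 2>&1 points stderr at the pipe as well, so both lines reach wc.
merged=$(emit 2>&1 | wc -l)
echo "stdout only: $stdout_only, merged: $merged"
```

The first pipeline delivers one line to wc and the second delivers two.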

Pipemill

In the most commonly used simple pipelines the shell connects a series of sub-processes via pipes, and executes external commands within each sub-process. Thus the shell itself is doing no direct processing of the data flowing through the pipeline.

However, it's possible for the shell to perform processing directly, using a so-called mill or pipemill (since a while command is used to "mill" over the results from the initial command). This construct generally looks something like:

command | while read -r var1 var2 ...; do
    # Process each line, using the fields parsed into var1, var2, etc.
    # Note that this loop may run in a subshell, in which case var1, var2,
    # etc. will not be available after the while loop terminates. Some
    # shells, such as zsh and newer versions of the Korn shell, instead run
    # the commands to the left of the pipe operator in a subshell.
done

Such a pipemill may not perform as intended if the body of the loop includes commands, such as cat and ssh, that read from stdin:[12] on the loop's first iteration, such a program (let's call it the drain) will read the remaining output from command, and the loop will then terminate (with results depending on the specifics of the drain). There are two common ways to avoid this behavior. First, some drains support an option to disable reading from stdin (e.g. ssh -n). Alternatively, if the drain does not need to read any input from stdin to do something useful, it can be given < /dev/null as input.
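
The effect of a drain, and of the < /dev/null remedy, can be reproduced with a minimal sketch. Here cat plays the part of the drain, and the function names mill_drained and mill_protected are invented for illustration.

```shell
# Drain present: cat stands in for a command such as ssh that reads stdin.
mill_drained() {
    printf 'a\nb\nc\n' | while read -r line; do
        echo "got $line"
        cat >/dev/null              # swallows the remaining lines "b" and "c"
    done
}

# Same loop, but the drain's stdin is redirected from /dev/null.
mill_protected() {
    printf 'a\nb\nc\n' | while read -r line; do
        echo "got $line"
        cat >/dev/null </dev/null   # reads nothing from the pipe
    done
}

mill_drained      # the loop body runs once
mill_protected    # the loop body runs three times
```

In the first function the loop ends after a single iteration because the drain has consumed the rest of the input; in the second, read sees all three lines.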

As all components of a pipe are run in parallel, a shell typically forks a subprocess (a subshell) to handle its contents, making it impossible to propagate variable changes to the outside shell environment. To remedy this issue, the "pipemill" can instead be fed from a here document containing a command substitution, which waits for the pipeline to finish running before milling through the contents. Alternatively, a named pipe or a process substitution can be used for parallel execution. GNU bash also has a lastpipe option to disable forking for the last pipe component.[13]
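
A minimal sketch of the here-document approach follows, with printf standing in for an arbitrary pipeline and the variable name total chosen for illustration. Because the loop takes its input from a here document rather than a pipe, it runs in the current shell and the variable survives.

```shell
# The command substitution runs to completion first; the loop then mills
# through its output in the current shell, so "total" is still set afterwards.
total=0
while read -r n; do
    total=$((total + n))
done <<EOF
$(printf '1\n2\n3\n')
EOF
echo "total: $total"
```

The trade-off is that the producer and the loop no longer run in parallel, and the whole output is held in memory before the loop starts.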


Creating pipelines programmatically

Pipelines can be created under program control. The Unix pipe() system call asks the operating system to construct a new anonymous pipe object. This results in two new, opened file descriptors in the process: the read-only end of the pipe, and the write-only end. The pipe ends appear to be normal, anonymous file descriptors, except that they have no ability to seek.

To avoid deadlock and exploit parallelism, the Unix process with one or more new pipes will then, generally, call fork() to create new processes. Each process will then close the end(s) of the pipe that it will not be using before producing or consuming any data. Alternatively, a process might create new threads and use the pipe to communicate between them.

Named pipes may also be created using mkfifo() or mknod() and then presented as the input or output file to programs as they are invoked. They allow multi-path pipes to be created, and are especially effective when combined with standard error redirection, or with tee.
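
From the shell, the mkfifo utility (the command-line counterpart of mkfifo()) can be combined with tee to build such a multi-path pipeline. The following is a sketch under stated assumptions: the function name fan_out and the sample input are invented for illustration, and mktemp -d is assumed to be available for creating a scratch directory.

```shell
# Hypothetical helper: sends the same input down two paths at once.
fan_out() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/fifo"

    # Path 1 (background): count the lines arriving through the named pipe.
    wc -l <"$dir/fifo" >"$dir/count" &

    # Path 2: tee copies its input into the named pipe and also passes it
    # on to the next stage of the ordinary pipeline.
    printf 'a\nb\n' | tee "$dir/fifo" | tr 'a-z' 'A-Z'

    wait    # let the background reader finish
    echo "lines: $(tr -d ' ' <"$dir/count")"
    rm -rf "$dir"
}

fan_out
```

Unlike an anonymous pipe, the named pipe exists in the file system until it is removed, which is why the sketch deletes the scratch directory at the end.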

The robot in the icon for Apple's Automator, which also uses a pipeline concept to chain repetitive commands together, holds a pipe in homage to the original Unix concept.

References

  1. "The Creation of the UNIX Operating System". Bell Labs. Archived from the original on September 14, 2004.
  2. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
  3. Thompson K, Ritchie DM (February 1973). UNIX Programmer's Manual Third Edition (PDF) (Technical report) (3rd ed.). Bell Labs. p. 178.
  4. Mahoney, Michael S. "The Unix Oral History Project: Release.0, The Beginning". McIlroy: It was one of the only places where I very nearly exerted managerial control over Unix, was pushing for those things, yes.
  5. "Prophetic Petroglyphs". www.bell-labs.com. Archived from the original on 8 May 1999. Retrieved 22 May 2022.
  6. "Pipes: A Brief Introduction". The Linux Information Project. August 23, 2006 [Created April 29, 2004]. Retrieved January 7, 2024.
  7. "Dartmouth Timesharing" (DOC). Rochester Institute of Technology. Retrieved January 7, 2024.
  8. "Data". www.bell-labs.com. Archived from the original on 20 February 1999. Retrieved 22 May 2022.
  9. Cox, Russ. "Bell Labs and CSP Threads". Swtchboard. Retrieved January 7, 2024.
  10. Ritchie & Thompson, 1974
  11. "Bash release notes". tiswww.case.edu. Retrieved 2017-06-14.
  12. "Shell Loop Interaction with SSH". 6 March 2012. Archived from the original on 6 March 2012.
  13. John1024. "How can I store the "find" command results as an array in Bash". Stack Overflow.
