7z

7z file format
Filename extension: .7z
Internet media type: application/x-7z-compressed
Uniform Type Identifier (UTI): org.7-zip.7-zip-archive
Magic number: '7', 'z', 0xBC, 0xAF, 0x27, 0x1C
Size limitation: 2^64 bytes (roughly 18 exabytes)
Developed by: Igor Pavlov[1]
Initial release: 1999[2]
Type of format: Data compression
Open format?: Yes (GNU Lesser General Public License / public domain)
Website: 7-zip.org

7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest stable version of 7-Zip and LZMA SDK is version 24.05.[2]

The official, informal 7z file format specification has been distributed with 7-Zip's source code since 2015. It can be found in plain text format in the 'doc' sub-directory of the source code distribution.[3] Third parties have also attempted to write more concrete documentation based on the released code.[4]
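As a small illustration of the six-byte signature listed for the format ('7', 'z', 0xBC, 0xAF, 0x27, 0x1C), the following Python sketch checks whether some leading bytes start with it. The function names are hypothetical, and a matching signature only suggests a 7z archive; it does not validate the rest of the file.

```python
# Signature bytes as listed for the format: '7', 'z', 0xBC, 0xAF, 0x27, 0x1C
SEVEN_Z_MAGIC = b"7z\xbc\xaf\x27\x1c"


def looks_like_7z(header: bytes) -> bool:
    """Return True if the given leading bytes start with the 7z signature."""
    return header[:len(SEVEN_Z_MAGIC)] == SEVEN_Z_MAGIC


def file_looks_like_7z(path: str) -> bool:
    """Read only the first six bytes of a file and compare them to the signature."""
    with open(path, "rb") as handle:
        return looks_like_7z(handle.read(len(SEVEN_Z_MAGIC)))
```

Checking only the signature is cheap, which makes it suitable for quick file-type detection before handing the file to a real 7z reader.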

Features and enhancements

The 7z format provides the following main features:

  • Open, modular architecture that allows any compression, conversion, or encryption method to be stacked.
  • High compression ratios (depending on the compression method used).
  • AES-256 encryption.
  • Zip 2.0 (legacy) encryption.
  • Large file support (up to 2^64 bytes, approximately 16 exbibytes).
  • Unicode file names.
  • Support for solid compression, where multiple files of like type are compressed within a single stream, in order to exploit the combined redundancy inherent in similar files.
  • Compression and encryption of archive headers.
  • Support for multi-part archives, e.g. xxx.7z.001, xxx.7z.002, ... (in 7-Zip, the context menu items Split File... and Combine Files... create them and re-assemble an archive from a set of multi-part component files).
  • Support for custom codec plugin DLLs.

The format's open architecture allows additional future compression methods to be added to the standard.
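The multi-part naming scheme mentioned in the feature list (xxx.7z.001, xxx.7z.002, ...) can be re-assembled with a few lines of Python. This is a minimal sketch that assumes the parts are plain consecutive byte slices of the original archive, so that concatenating them in order restores it. That assumption is not stated in the text above, and the function name `join_volumes` is hypothetical.

```python
import re
import shutil
from pathlib import Path


def join_volumes(first_part: str, output: str) -> int:
    """Concatenate name.001, name.002, ... into one file; return the number of parts joined."""
    first = Path(first_part)
    match = re.fullmatch(r"(.+)\.(\d{3,})", first.name)
    if match is None:
        raise ValueError("expected a name ending in a numeric suffix such as .001")
    base, width = match.group(1), len(match.group(2))
    index = int(match.group(2))
    count = 0
    with open(output, "wb") as out:
        while True:
            # Build the next expected part name, keeping the zero padding.
            part = first.with_name(f"{base}.{index:0{width}d}")
            if not part.exists():
                break
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
            count += 1
            index += 1
    return count
```

The loop stops at the first missing number, so a gap in the sequence yields a truncated output; a caller may want to compare the returned count with the number of parts it expects.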

Compression methods

The following compression methods are currently defined:

  • LZMA – A variation of the LZ77 algorithm, using a sliding dictionary up to 4 GB in length for duplicate string elimination. The LZ stage is followed by entropy coding using a Markov chain-based range coder and binary trees.
  • LZMA2 – A modified version of LZMA providing better multithreading support and less expansion of incompressible data.[5]
  • Bzip2 – The standard Burrows–Wheeler transform algorithm. Bzip2 applies two reversible transformations, the BWT followed by move-to-front, and then Huffman coding for symbol reduction (the actual compression element).
  • PPMd – Dmitry Shkarin's 2002 PPMdH, with small changes. PPMdH implements PPMII (prediction by partial matching with information inheritance) and cPPMII (complicated PPMII); PPMII is an improved version of the 1984 PPM (prediction by partial matching) compression algorithm.
  • DEFLATE – Standard algorithm based on 32 kB LZ77 and Huffman coding. Deflate is found in several file formats including ZIP, gzip, PNG and PDF. 7-Zip contains a from-scratch DEFLATE encoder that frequently beats the de facto standard zlib version in compression size, but at the expense of CPU usage.

A suite of recompression tools called AdvanceCOMP contains a copy of the DEFLATE encoder from the 7-Zip implementation; these utilities can often be used to further compress the size of existing gzip, ZIP, PNG, or MNG files.
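Several of the methods listed above have counterparts in the Python standard library, which allows a rough comparison of their output sizes. The sketch below is only an illustration: it uses Python's `lzma`, `bz2` and `zlib` modules rather than 7-Zip's own encoders, it does not produce 7z containers, and PPMd is omitted because the standard library has no implementation of it. The function name `compressed_sizes` is hypothetical.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Compress the same input with several codecs and report the output sizes in bytes."""
    # A raw LZMA2 stream needs an explicit filter chain.
    raw_lzma2 = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
    return {
        # FORMAT_ALONE is the legacy .lzma container, which uses LZMA (LZMA1).
        "LZMA": len(lzma.compress(data, format=lzma.FORMAT_ALONE)),
        "LZMA2": len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=raw_lzma2)),
        "Bzip2": len(bz2.compress(data, 9)),
        # zlib wraps the DEFLATE stream in a small header and checksum.
        "DEFLATE": len(zlib.compress(data, 9)),
    }


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog\n" * 2000
    for name, size in compressed_sizes(sample).items():
        print(f"{name:8s} {size} bytes")
```

The relative ranking depends heavily on the input, so the numbers from one sample should not be read as a general statement about the methods.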

Pre-processing filters

The LZMA SDK includes the BCJ and BCJ2 preprocessors, which transform executable code so that later stages can achieve greater compression. For x86, ARM, PowerPC (PPC), IA-64 Itanium, and ARM Thumb processors, jump targets are "normalized"[5] before compression by converting relative positions into absolute values. For x86, this means that near jumps, calls and conditional jumps (but not short jumps and short conditional jumps) are converted from the machine-language "jump 1655 bytes backwards" style of notation to a normalized "jump to address 5554" style; all jumps to 5554, perhaps a common subroutine, are thus encoded identically, making them more compressible.
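The normalization idea can be sketched in Python for the simplest case, the x86 near call (opcode 0xE8 followed by a 32-bit little-endian displacement relative to the next instruction). This is a toy illustration, not the BCJ filter from the LZMA SDK: the real filter also covers other jump instructions and uses heuristics to avoid converting bytes that are not really instructions. The function name and the `start` parameter are hypothetical.

```python
import struct


def bcj_x86_toy(data: bytes, encode: bool, start: int = 0) -> bytes:
    """Rewrite CALL rel32 operands as absolute addresses (encode) or back again (decode)."""
    out = bytearray(data)
    i = 0
    while i + 5 <= len(out):
        if out[i] == 0xE8:  # CALL rel32
            (value,) = struct.unpack_from("<I", out, i + 1)
            next_ip = start + i + 5  # the displacement is relative to the next instruction
            if encode:
                value = (value + next_ip) & 0xFFFFFFFF  # relative -> absolute
            else:
                value = (value - next_ip) & 0xFFFFFFFF  # absolute -> relative
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the operand so that decoding scans the same positions
        else:
            i += 1
    return bytes(out)
```

Because the operand bytes are always skipped, the encoder and decoder visit exactly the same opcode positions, which is what makes the transformation reversible. After encoding, two calls to the same subroutine carry identical four-byte operands regardless of where they sit in the code.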

  • BCJ – Converter for 32-bit x86 executables. Normalizes target addresses of near jumps and calls from relative distances to absolute destinations.
  • BCJ2 – Pre-processor for 32-bit x86 executables. BCJ2 is an improvement on BCJ, adding additional x86 jump/call instruction processing. Near jump, near call and conditional near jump targets are split out and compressed separately in another stream.
  • Delta encoding – delta filter, basic preprocessor for multimedia data.
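The delta filter in the list above can be illustrated with a byte-wise sketch in Python. It assumes the common formulation in which each byte is replaced by its difference (modulo 256) from the byte a fixed distance earlier; the function names and the `distance` parameter are hypothetical and not taken from the 7z specification.

```python
def delta_encode(data: bytes, distance: int = 1) -> bytes:
    """Replace each byte with its difference from the byte `distance` positions earlier."""
    out = bytearray(data)
    for i in range(distance, len(data)):
        out[i] = (data[i] - data[i - distance]) & 0xFF
    return bytes(out)


def delta_decode(data: bytes, distance: int = 1) -> bytes:
    """Invert delta_encode by accumulating the differences in place."""
    out = bytearray(data)
    for i in range(distance, len(out)):
        out[i] = (out[i] + out[i - distance]) & 0xFF
    return bytes(out)
```

On slowly varying data such as audio samples, the differences cluster around a few small values, which a later compression stage can encode more compactly. A distance equal to the sample frame size (for example 4 for 16-bit stereo) compares each byte with the matching byte of the previous frame.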

Similar executable pre-processing technology is included in other software; the RAR compressor features displacement compression for 32-bit x86 executables and IA-64 executables, and the UPX runtime executable file compressor includes support for working with 16-bit values within DOS binary files.

Encryption

The 7z format supports encryption with the AES algorithm with a 256-bit key. The key is generated from a user-supplied passphrase using an algorithm based on the SHA-256 hash function. SHA-256 is executed 2^19 (524,288) times,[6] which causes a significant delay on slow PCs before compression or extraction starts. This technique is called key stretching and is used to make a brute-force search for the passphrase more difficult. Current GPU-based and custom hardware attacks limit the effectiveness of this particular method of key stretching,[7] so it is still important to choose a strong password. The 7z format also provides the option to encrypt the filenames of a 7z archive.
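The key-stretching step can be sketched in Python with `hashlib`. The text above states only that SHA-256 is executed 2^19 times; the remaining details here (a single running hash fed with the salt, the UTF-16LE passphrase and an 8-byte little-endian round counter on every round) are assumptions based on third-party descriptions of the 7-Zip source and should be checked against the specification before being relied on. The function and parameter names are hypothetical.

```python
import hashlib


def derive_7z_key(password: str, salt: bytes = b"", cycles_power: int = 19) -> bytes:
    """Stretch a passphrase into a 32-byte key using 2**cycles_power SHA-256 update rounds."""
    password_bytes = password.encode("utf-16-le")  # assumption: UTF-16LE passphrase
    digest = hashlib.sha256()
    for counter in range(1 << cycles_power):
        # Assumption: each round feeds salt, passphrase and an 8-byte counter.
        digest.update(salt)
        digest.update(password_bytes)
        digest.update(counter.to_bytes(8, "little"))
    return digest.digest()  # 32 bytes, the size of an AES-256 key
```

With the default of 19, the loop runs 524,288 times, which is the source of the delay described above; an attacker must pay the same cost for every passphrase guess.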

Limitations

The 7z format does not store filesystem permissions (such as UNIX owner/group permissions or NTFS ACLs), and hence can be inappropriate for backup or archival purposes. A workaround on UNIX-like systems is to convert the data to a tar bitstream before compressing it with 7z. However, GNU tar (common in many UNIX environments) can also compress with the LZMA2 algorithm ("xz") natively, without the use of 7z, using the "-J" switch. The resulting file extension is ".tar.xz" or ".txz" rather than ".tar.7z". This method of compression has been adopted by many distributions for packaging, such as Arch, Debian (deb), Fedora (rpm) and Slackware. (The older "lzma" format is less efficient.)[8] On the other hand, tar does not record the filesystem's filename encoding, which means that filenames in a tar archive can become unreadable if it is extracted on a different computer.
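The tar-plus-xz approach can be reproduced with Python's standard `tarfile` module, whose "w:xz" mode plays the role of tar's "-J" switch. The sketch below builds a small .tar.xz archive in memory and reads the permission bits back to show that they survive the round trip. The helper names and the sample files are hypothetical.

```python
import io
import tarfile


def pack_tar_xz(files: dict) -> bytes:
    """Build a .tar.xz archive in memory; `files` maps name -> (content, permission mode)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
        for name, (content, mode) in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            info.mode = mode  # UNIX permission bits are stored in the tar header
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def list_modes(blob: bytes) -> dict:
    """Return the permission bits recorded for each member of a .tar.xz archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:xz") as archive:
        return {member.name: member.mode for member in archive.getmembers()}
```

The tar headers carry the metadata and xz only compresses the resulting stream, which is why the permissions are preserved even though the compression algorithm is the same family used by 7z.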

The 7z format does not allow extraction of some "broken files": for example, if one has only the first segment of a multi-part 7z archive, 7z cannot extract the start of the files within the archive and must wait until all segments are downloaded. The 7z format also lacks recovery records, making it vulnerable to data degradation unless used in conjunction with external solutions, such as parchives, or within filesystems with robust error correction. By way of comparison, zip files also lack a recovery feature, while the rar format has one.

References

  1. ^ "A Few Questions for Igor Pavlov". Dr. Dobb's Data Compression Newsletter. 30 April 2003. Archived from the original on 28 October 2008. Retrieved 26 December 2009.
  2. ^ a b "History of 7-zip changes". Archived from the original on 27 February 2015. Retrieved 10 June 2010.
  3. ^ LZMA SDK, "DOC" directory, 7zFormat.txt
  4. ^ ".7z format specification — py7zr – 7-zip archive library". py7zr.readthedocs.io.
  5. ^ a b Collin, Lasse. "lzma_.lzma". liblzma bindings. Archived from the original on 8 February 2010. Retrieved 3 January 2010. Compared to LZMA1, LZMA2 adds support for LZMA_SYNC_FLUSH, uncompressed chunks (smaller expansion when trying to compress uncompressible data), possibility to change lc/lp/pb in the middle of encoding, and some other internal improvements.
  6. ^ "7-zip source code". Archived from the original on 22 March 2019. Retrieved 23 March 2018.
  7. ^ Percival, Colin. "scrypt". Archived from the original on 28 May 2019. As presented in "Stronger Key Derivation via Sequential Memory-Hard Functions" (archived 14 April 2019), BSDCan'09, May 2009.
  8. ^ "GNU tar 1.34: 8.1 Using Less Space through Compression". Archived from the original on 2 April 2015. Retrieved 17 March 2015.

Further reading

  • Salomon, David (2007). Data compression: the complete reference. Springer. p. 241. ISBN 978-1-84628-602-5.
