Sufficient dimension reduction

In statistics, sufficient dimension reduction (SDR) is a paradigm for analyzing data that combines the ideas of dimension reduction with the concept of sufficiency.

Dimension reduction has long been a primary goal of regression analysis. Given a response variable y and a p-dimensional predictor vector , regression analysis aims to study the distribution of , the conditional distribution of given . A dimension reduction is a function that maps to a subset of , k < p, thereby reducing the dimension of .[1] For example, may be one or more linear combinations of .

A dimension reduction is said to be sufficient if the distribution of is the same as that of . In other words, no information about the regression is lost in reducing the dimension of if the reduction is sufficient.[1]

Graphical motivation

In a regression setting, it is often useful to summarize the distribution of graphically. For instance, one may consider a scatterplot of versus one or more of the predictors or a linear combination of the predictors. A scatterplot that contains all available regression information is called a sufficient summary plot.

When is high-dimensional, particularly when , it becomes increasingly challenging to construct and visually interpret sufficiency summary plots without reducing the data. Even three-dimensional scatter plots must be viewed via a computer program, and the third dimension can only be visualized by rotating the coordinate axes. However, if there exists a sufficient dimension reduction with small enough dimension, a sufficient summary plot of versus may be constructed and visually interpreted with relative ease.

Hence sufficient dimension reduction allows for graphical intuition about the distribution of , which might not have otherwise been available for high-dimensional data.

Most graphical methodology focuses primarily on dimension reduction involving linear combinations of . The rest of this article deals only with such reductions.

Dimension reduction subspace

Suppose is a sufficient dimension reduction, where is a matrix with rank . Then the regression information for can be inferred by studying the distribution of , and the plot of versus is a sufficient summary plot.

Without loss of generality, only the space spanned by the columns of need be considered. Let be a basis for the column space of , and let the space spanned by be denoted by . It follows from the definition of a sufficient dimension reduction that

where denotes the appropriate distribution function. Another way to express this property is

or is conditionally independent of , given . Then the subspace is defined to be a dimension reduction subspace (DRS).[2]

Structural dimensionality

For a regression , the structural dimension, , is the smallest number of distinct linear combinations of necessary to preserve the conditional distribution of . In other words, the smallest dimension reduction that is still sufficient maps to a subset of . The corresponding DRS will be d-dimensional.[2]

Minimum dimension reduction subspace

A subspace is said to be a minimum DRS for if it is a DRS and its dimension is less than or equal to that of all other DRSs for . A minimum DRS is not necessarily unique, but its dimension is equal to the structural dimension of , by definition.[2]

If has basis and is a minimum DRS, then a plot of y versus is a minimal sufficient summary plot, and it is (d + 1)-dimensional.

Central subspace

If a subspace is a DRS for , and if for all other DRSs , then it is a central dimension reduction subspace, or simply a central subspace, and it is denoted by . In other words, a central subspace for exists if and only if the intersection of all dimension reduction subspaces is also a dimension reduction subspace, and that intersection is the central subspace .[2]

The central subspace does not necessarily exist because the intersection is not necessarily a DRS. However, if does exist, then it is also the unique minimum dimension reduction subspace.[2]

Existence of the central subspace

While the existence of the central subspace is not guaranteed in every regression situation, there are some rather broad conditions under which its existence follows directly. For example, consider the following proposition from Cook (1998):

Let and be dimension reduction subspaces for . If has density for all and everywhere else, where is convex, then the intersection is also a dimension reduction subspace.

It follows from this proposition that the central subspace exists for such .[2]

Methods for dimension reduction

There are many existing methods for dimension reduction, both graphical and numeric. For example, sliced inverse regression (SIR) and sliced average variance estimation (SAVE) were introduced in the 1990s and continue to be widely used.[3] Although SIR was originally designed to estimate an effective dimension reducing subspace, it is now understood that it estimates only the central subspace, which is generally different.

More recent methods for dimension reduction include likelihood-based sufficient dimension reduction,[4] estimating the central subspace based on the inverse third moment (or kth moment),[5] estimating the central solution space,[6] graphical regression,[2] envelope model, and the principal support vector machine.[7] For more details on these and other methods, consult the statistical literature.

Principal components analysis (PCA) and similar methods for dimension reduction are not based on the sufficiency principle.

Example: linear regression

Consider the regression model

Note that the distribution of is the same as the distribution of . Hence, the span of is a dimension reduction subspace. Also, is 1-dimensional (unless ), so the structural dimension of this regression is .

The OLS estimate of is consistent, and so the span of is a consistent estimator of . The plot of versus is a sufficient summary plot for this regression.

See also

Notes

  1. ^ a b Cook & Adragni (2009) Sufficient Dimension Reduction and Prediction in Regression In: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 367(1906): 4385–4405
  2. ^ a b c d e f g Cook, R.D. (1998) Regression Graphics: Ideas for Studying Regressions Through Graphics, Wiley ISBN 0471193658
  3. ^ Li, K-C. (1991) Sliced Inverse Regression for Dimension Reduction In: Journal of the American Statistical Association, 86(414): 316–327
  4. ^ Cook, R.D. and Forzani, L. (2009) "Likelihood-Based Sufficient Dimension Reduction", Journal of the American Statistical Association, 104(485): 197–208
  5. ^ Yin, X. and Cook, R.D. (2003) Estimating Central Subspaces via Inverse Third Moments In: Biometrika, 90(1): 113–125
  6. ^ Li, B. and Dong, Y.D. (2009) Dimension Reduction for Nonelliptically Distributed Predictors In: Annals of Statistics, 37(3): 1272–1298
  7. ^ Li, Bing; Artemiou, Andreas; Li, Lexin (2011). "Principal support vector machines for linear and nonlinear sufficient dimension reduction". The Annals of Statistics. 39 (6): 3182–3210. arXiv:1203.2790. doi:10.1214/11-AOS932. S2CID 88519106.

References

Read other articles:

Football leaguePrimera FuerzaFounded1902; 122 years ago (1902)Folded1943; 81 years ago (1943)Country MexicoNumber of teams37Level on pyramid1Domestic cup(s)Copa MexicoMost championshipsReal Club España (14 titles) The Liga Mexicana de Fútbol Amateur Association, known as the Primera Fuerza, was an amateur football league founded in Mexico in 1902 with five clubs: Orizaba A.C., Pachuca A.C., Reforma A.C., Mexico Cricket Club and British Club. Orizaba won the …

Belgian CEO, software developer, author Pieter HintjensHintjens in 2014Born(1962-12-03)3 December 1962Congo-LéopoldvilleDied4 October 2016(2016-10-04) (aged 53)Brussels, BelgiumNationalityBelgianOccupation(s)CEO, software developer, authorWebsitehintjens.com Pieter Hintjens (3 December 1962 – 4 October 2016) was a Belgian software developer, author, and past president of the Foundation for a Free Information Infrastructure (FFII), an association that fights against software patents.…

كونغريسو   الإحداثيات 35°10′45″N 2°26′29″W / 35.1792°N 2.4414°W / 35.1792; -2.4414   تقسيم إداري  البلد إسبانيا[1]  خصائص جغرافية  المساحة 0.256 كيلومتر مربع  عدد السكان  عدد السكان 0   الكثافة السكانية 00000 نسمة/كم2 معلومات أخرى منطقة زمنية ت ع م+01:00  رمز جيونيمز …

Television program T-Pop Stage ShowAlso known asT-POP StageGenreMusicPresented byTeeratee Buddeehong, Suppapong UdomkaewkanjanaCountry of originThailandOriginal languageThaiNo. of seasons2No. of episodes52 (as of March 5, 2022 (2022-03-05))ProductionProduction locationWorkpoint StudioRunning time60 minutesProduction companiesWorkpoint EntertainmentTPOP IncorporationOriginal releaseNetworkWorkpoint TVRelease8 February 2021 (2021-02-08) –present T-POP Stage Show (Thai: …

Fictional character from the Fox series Glee Fictional character Quinn FabrayGlee characterDianna Agron as Quinn Fabray in GleeFirst appearancePilot (2009)Last appearanceDreams Come True (2015)Created byRyan MurphyBrad FalchukIan BrennanPortrayed byDianna AgronIn-universe informationFull nameLucy Quinn Fabray[1]OccupationCollege studentFamilyRussell Fabray (father)Judy Fabray (mother) Frannie Fabray (sister)Significant otherNoah Puck Puckerman (boyfriend; father of her child) † Finn Hu…

Rosie Huntington-WhiteleyHuntington-Whiteley pada bulan Juni 2011.LahirRosie Alice Huntington-Whiteley18 April 1987 (umur 37)Plymouth, Devon, Britania RayaPekerjaanModel, aktrisTahun aktif2003–sekarangPasanganJason Statham (2010-sekarang)Modelling modelingTinggi175 cm (5 ft 9 in)Warna rambutPirang-hitamWarna mataBiruManajerElite New York City Situs webwww.rosiehuntingtonwhiteley.net Rosie Alice Huntington-Whiteley[1] (lahir 18 April 1987) adalah seorang model d…

Economic theory promoting local control Distributivism redirects here. For the algebraic concept, see Distributivity. This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these template messages) This article possibly contains original research. Please improve it by verifying the claims made and adding inline citations. Statements consisting only of original research should be removed. (March 2021) (Learn how and when to…

Permanent mechanical fastener For other uses, see Rivet (disambiguation). This article's lead section may be too short to adequately summarize the key points. Please consider expanding the lead to provide an accessible overview of all important aspects of the article. (May 2024) Solid rivets Sophisticated riveted joint on a railway bridge Riveters work on the Liberty ship SS John W. Brown (December 2014). A rivet is a permanent mechanical fastener. Before being installed, a rivet consists of a s…

Не следует путать с Рослесхоз. Федеральная служба по надзору в сфере природопользованиясокращённо: Росприроднадзор Общая информация Страна  Россия Юрисдикция Россия Дата создания 2004 Руководство Подчинено Министерство природных ресурсов и экологии Российской Федера…

State Police of New Hampshire This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: New Hampshire State Police – news · newspapers · books · scholar · JSTOR (April 2023) (Learn how and when to remove this message) Law enforcement agency New Hampshire State PolicePatch of New Hampshire State PoliceAbbreviationNHSPAgen…

Traditional clothing of the Khmer people See also: Sompot Chong Kben and Sompot Chong Somloy Khmer traditional clothes displayed at the MGC Asian Traditional Textiles Museum in Siem Reap Part of a series on theCulture of Cambodia Society Khmers Ethnic groups Folklore History Languages Holidays Religion Script Women Youth Topics Art Architecture Ceramics Cinema Clothing Cuisine (Royal cuisine) Dance Literature Media Newspapers Radio Television Music Sculpture Sports Theatre Symbols Flag Coat of a…

Argentine film maker and theorist (1925–2017) Fernando Birri, March 2008 Fernando Birri (March 13, 1925 – December 27, 2017) was an Argentine film maker and theorist. He was considered by many to be the father of the new Latin American cinema. Biography Birri was born in Santa Fe, Argentina. After being involved in theater and poetry, he went to Rome to study film-making at the Centro Sperimentale di Cinematografia, from 1950 to 1953, and appeared in the 1955 Italian film Gli Sbandati. In 19…

Tumbangnya Seorang Diktator Sampul edisi IndonesiaPengarangGabriel García MárquezJudul asliEl otoño del patriarcaPenerjemahGregory RabassaNegaraKolombiaBahasaSpanyolPenerbitPlaza & Janes (Spain)Tanggal terbit1975Jenis mediaCetak (Hardback & Paperback)ISBNISBN 0-06-011419-3OCLC2464022Desimal Dewey863LCCPQ8180.17.A73 O813 1976 Tumbangnya Seorang Diktator (judul aslinya: El otoño del patriarca) adalah sebuah novel yang ditulis oleh Gabriel García Márquez pada tahun…

Artikel ini tidak memiliki referensi atau sumber tepercaya sehingga isinya tidak bisa dipastikan. Tolong bantu perbaiki artikel ini dengan menambahkan referensi yang layak. Tulisan tanpa sumber dapat dipertanyakan dan dihapus sewaktu-waktu.Cari sumber: Gackt – berita · surat kabar · buku · cendekiawan · JSTOR GACKTInformasi latar belakangNama lainGackt Camui, Gackt C.Lahir4 Juli 1973 (umur 51)[1][2]GenreJ-Rock, J-Pop,PekerjaanPenyanyi…

The O2 ArenaNorth Greenwich Arena LokasiLokasiThe O2 Drawdock Road North Greenwich London, SE10 0BB InggrisKoordinat51°30′10.79″N 0°0′11.28″E / 51.5029972°N 0.0031333°E / 51.5029972; 0.0031333KonstruksiDibuatAntara 2003 dan 2007Dibuka24 Juni 2007ArsitekPopulous[1] (HOK Sport)Insinyur strukturBuro HappoldInsinyur pemeliharaanM-E EngineersKontraktor umumSir Robert McAlpineData teknisPermukaanVersatileKapasitas20.000[2]Situs webtheo2.co.ukPemakaiA…

Florence MasnadaFlorence Masnada a Chambéry nel 2019Nazionalità Francia Altezza176 cm Peso69 kg Sci alpino SpecialitàDiscesa libera, supergigante, slalom gigante, slalom speciale, combinata SquadraCS Chamrousse Termine carriera1999 Palmarès Competizione Ori Argenti Bronzi Olimpiadi 0 0 2 Mondiali 0 0 1 Vedi maggiori dettagli  Modifica dati su Wikidata · Manuale Florence Masnada (Grenoble, 16 dicembre 1968) è un'ex sciatrice alpina francese. Indice 1 Biografia 1.1 Carriera sc…

Standing committee of the United States House of Representatives House Judiciary CommitteeStanding committeeActiveUnited States House of Representatives118th CongressHistoryFormedJune 6, 1813LeadershipChairJim Jordan (R) Since January 7, 2023Ranking memberJerry Nadler (D) Since January 7, 2023Vice chairVacantStructureSeats44Political partiesMajority (25)   Republican (25) Minority (19)   Democratic (19) JurisdictionSenate counterpartSenate Committee on the Judiciary The U.S. House Comm…

Failed proposed aircraft manufacturing joint venture between Boeing and Embraer. Boeing Brasil–CommercialAn Embraer E-Jet E190 below a Boeing 777FIndustryAircraft manufacturerFateFailed proposalHeadquartersSão José dos Campos, São Paulo, BrazilProductsAirlinersBrandsE-Jet, E-Jet E2OwnersBoeing (80%)Embraer (20%) Boeing Brasil–Commercial was a proposed, but failed joint venture between Boeing and Embraer to design, build, and sell commercial airliners worldwide. The partnership was establi…

Russian politician For other people named Aleksandr Rumyantsev, see Aleksandr Rumyantsev. In this name that follows Eastern Slavic naming customs, the patronymic is Grigorievich and the family name is Rumyantsev. You can help expand this article with text translated from the corresponding article in Russian. (February 2024) Click [show] for important translation instructions. Machine translation, like DeepL or Google Translate, is a useful starting point for translations, but translator…

Beautiful Night《Beautiful Night》專輯封面藝聲的迷你专辑发行日期2021年5月3日录制时间2021年类型City-POP、Ballad唱片公司Label SJ、韓國SM娱樂有限公司藝聲专辑年表 Pink Magic(2019年) Beautiful Night(2021年) 收錄於《Beautiful Night》的單曲 Phantom Pain發行日期:2021年4月23日 Beautiful Night發行日期:2021年5月3日 音乐视频YouTube上的Phantom PainYouTube上的Beautiful Night 《Beautiful Night》是韓國團體Super Jun…