Lack-of-fit sum of squares

In statistics, a sum of squares due to lack of fit, or more tersely a lack-of-fit sum of squares, is one of the components of a partition of the sum of squares of residuals in an analysis of variance, used in the numerator in an F-test of the null hypothesis that says that a proposed model fits well. The other component is the pure-error sum of squares.

The pure-error sum of squares is the sum of squared deviations of each value of the dependent variable from the average value over all observations sharing its independent variable value(s). These are errors that could never be avoided by any predictive equation that assigned a predicted value for the dependent variable as a function of the value(s) of the independent variable(s). The remainder of the residual sum of squares is attributed to lack of fit of the model since it would be mathematically possible to eliminate these errors entirely.

Principle

In order for the lack-of-fit sum of squares to differ from the sum of squares of residuals, there must be more than one value of the response variable for at least one of the values of the set of predictor variables. For example, consider fitting a line

by the method of least squares. One takes as estimates of α and β the values that minimize the sum of squares of residuals, i.e., the sum of squares of the differences between the observed y-value and the fitted y-value. To have a lack-of-fit sum of squares that differs from the residual sum of squares, one must observe more than one y-value for each of one or more of the x-values. One then partitions the "sum of squares due to error", i.e., the sum of squares of residuals, into two components:

sum of squares due to error = (sum of squares due to "pure" error) + (sum of squares due to lack of fit).

The sum of squares due to "pure" error is the sum of squares of the differences between each observed y-value and the average of all y-values corresponding to the same x-value.

The sum of squares due to lack of fit is the weighted sum of squares of differences between each average of y-values corresponding to the same x-value and the corresponding fitted y-value, the weight in each case being simply the number of observed y-values for that x-value.[1][2] Because it is a property of least squares regression that the vector whose components are "pure errors" and the vector of lack-of-fit components are orthogonal to each other, the following equality holds:

Hence the residual sum of squares has been completely decomposed into two components.

Mathematical details

Consider fitting a line with one predictor variable. Define i as an index of each of the n distinct x values, j as an index of the response variable observations for a given x value, and ni as the number of y values associated with the i th x value. The value of each response variable observation can be represented by

Let

be the least squares estimates of the unobservable parameters α and β based on the observed values of x i and Y i j.

Let

be the fitted values of the response variable. Then

are the residuals, which are observable estimates of the unobservable values of the error term ε ij. Because of the nature of the method of least squares, the whole vector of residuals, with

scalar components, necessarily satisfies the two constraints

It is thus constrained to lie in an (N − 2)-dimensional subspace of R N, i.e. there are N − 2 "degrees of freedom for error".

Now let

be the average of all Y-values associated with the i th x-value.

We partition the sum of squares due to error into two components:

Probability distributions

Sums of squares

Suppose the error terms ε i j are independent and normally distributed with expected value 0 and variance σ2. We treat x i as constant rather than random. Then the response variables Y i j are random only because the errors ε i j are random.

It can be shown to follow that if the straight-line model is correct, then the sum of squares due to error divided by the error variance,

has a chi-squared distribution with N − 2 degrees of freedom.

Moreover, given the total number of observations N, the number of levels of the independent variable n, and the number of parameters in the model p:

  • The sum of squares due to pure error, divided by the error variance σ2, has a chi-squared distribution with N − n degrees of freedom;
  • The sum of squares due to lack of fit, divided by the error variance σ2, has a chi-squared distribution with n − p degrees of freedom (here p = 2 as there are two parameters in the straight-line model);
  • The two sums of squares are probabilistically independent.

The test statistic

It then follows that the statistic

has an F-distribution with the corresponding number of degrees of freedom in the numerator and the denominator, provided that the model is correct. If the model is wrong, then the probability distribution of the denominator is still as stated above, and the numerator and denominator are still independent. But the numerator then has a noncentral chi-squared distribution, and consequently the quotient as a whole has a non-central F-distribution.

One uses this F-statistic to test the null hypothesis that the linear model is correct. Since the non-central F-distribution is stochastically larger than the (central) F-distribution, one rejects the null hypothesis if the F-statistic is larger than the critical F value. The critical value corresponds to the cumulative distribution function of the F distribution with x equal to the desired confidence level, and degrees of freedom d1 = (n − p) and d2 = (N − n).

The assumptions of normal distribution of errors and independence can be shown to entail that this lack-of-fit test is the likelihood-ratio test of this null hypothesis.

See also

Notes

  1. ^ Brook, Richard J.; Arnold, Gregory C. (1985). Applied Regression Analysis and Experimental Design. CRC Press. pp. 48–49. ISBN 0824772520.
  2. ^ Neter, John; Kutner, Michael H.; Nachstheim, Christopher J.; Wasserman, William (1996). Applied Linear Statistical Models (Fourth ed.). Chicago: Irwin. pp. 121–122. ISBN 0256117365.

Read other articles:

Italian cyclist Mario BecciaPersonal informationFull nameMario BecciaBorn (1955-08-16) August 16, 1955 (age 68)Troia, Apulia, ItalyTeam informationCurrent teamRetiredDisciplineRoadRoleRiderDirecteur sportifTeam managerProfessional teams1977–1978Sanson1979–1980Mecap–Hoonved1981Santini–Selle Italia1982–1988Hoonved–Bottecchia Managerial team2006–2009Vorarlberger Major winsGiro d'Italia, 4 stagesLa Flèche Wallonne (1982)Tour de Suisse (1980) Mario Beccia (born August 16…

This article is about the borough in Washington County. For the community in Berks County, see Green Hills, Berks County, Pennsylvania. Borough in Pennsylvania, United StatesGreen HillsBoroughLone Pines Country ClubLocation of Green Hills in Washington County, Pennsylvania.Green HillsLocation of Green Hills in PennsylvaniaCoordinates: 40°6′48″N 80°17′58″W / 40.11333°N 80.29944°W / 40.11333; -80.29944CountryUnited StatesStatePennsylvaniaCountyWashingtonGovernme…

Head of the Catholic Church from 1288 to 1292 This article's lead section may be too short to adequately summarize the key points. Please consider expanding the lead to provide an accessible overview of all important aspects of the article. (June 2016) PopeNicholas IVBishop of RomePortrait of Nicolas IV, apse of Santa Maria Maggiore church in RomeChurchCatholic ChurchPapacy began22 February 1288Papacy ended4 April 1292PredecessorHonorius IVSuccessorCelestine VOrdersConsecration1281Created cardin…

XX secolo · XXI secolo · XXII secolo Anni 1980 · Anni 1990 · Anni 2000 · Anni 2010 · Anni 2020 2000 · 2001 · 2002 · 2003 · 2004 · 2005 · 2006 · 2007 · 2008 · 2009 Gli anni 2000 sono il decennio che comprende gli anni dal 2000 al 2009 inclusi. A cavallo tra il secondo e il terzo millennio, ovvero primo decennio del XXI secolo, è stato definito il decennio breve per la velocità delle innovazion…

非常尊敬的讓·克雷蒂安Jean ChrétienPC OM CC KC  加拿大第20任總理任期1993年11月4日—2003年12月12日君主伊利沙伯二世总督Ray HnatyshynRoméo LeBlancAdrienne Clarkson副职Sheila Copps赫布·格雷John Manley前任金·坎貝爾继任保羅·馬田加拿大自由黨黨魁任期1990年6月23日—2003年11月14日前任約翰·特納继任保羅·馬田 高級政治職位 加拿大官方反對黨領袖任期1990年12月21日—1993年11月4日…

Overview of the culture of Malaysia Part of a series on theCulture of Malaysia History Malaysians Immigration Holidays Languages Multiculturalism Women Topics Architecture Art Cinema Cuisine Festivals Hawker centre Literature Media Music Politics Religion Sports Manglish Television Symbols Anthem Flag Coat of arms Flower Tree Pledge of Allegiance  Malaysia portalvte The Culture of Malaysia draws on the varied cultures of the different people of Malaysia. The first people to live in the …

2020年夏季奥林匹克运动会波兰代表團波兰国旗IOC編碼POLNOC波蘭奧林匹克委員會網站olimpijski.pl(英文)(波兰文)2020年夏季奥林匹克运动会(東京)2021年7月23日至8月8日(受2019冠状病毒病疫情影响推迟,但仍保留原定名称)運動員206參賽項目24个大项旗手开幕式:帕维尔·科热尼奥夫斯基(游泳)和马娅·沃什乔夫斯卡(自行车)[1]闭幕式:卡罗利娜·纳亚(皮划艇)[2…

Anglican Church of ChileIglesia Anglicana de ChileClassificationProtestantOrientationAnglicanScriptureHoly BibleTheologyAnglican doctrinePolityEpiscopalPrimateHéctor ZavalaAssociationsAnglican Communion, GAFCON, Global SouthLanguageSpanish, Mapudungun, EnglishHeadquartersSantiago, ChileTerritory ChileMembersc. 20,000[1]Official websiteiach.cl The Anglican Church of Chile (Spanish: Iglesia Anglicana de Chile) is the ecclesiastical province of the Anglican Communion that covers four …

This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: Tutwiler, Mississippi – news · newspapers · books · scholar · JSTOR (October 2010) (Learn how and when to remove this message) Town in Mississippi, United StatesTutwiler, MississippiTownLocation of Tutwiler, MississippiTutwiler, MississippiLocation in the United S…

الدوري الهولندي الدرجة الأولى تفاصيل الموسم 2012–2013 البلد هولندا  التاريخ نهاية:10 أغسطس 2012  البطل نادي كامبور عدد المشاركين 18   الموقع الرسمي الموقع الرسمي  2011–2012 2013–2014 تعديل مصدري - تعديل   الدوري الهولندي الدرجة الأولى 2012–2013 هو الموسم السابع والخمسون من الدور…

Частина серії проФілософіяLeft to right: Plato, Kant, Nietzsche, Buddha, Confucius, AverroesПлатонКантНіцшеБуддаКонфуційАверроес Філософи Епістемологи Естетики Етики Логіки Метафізики Соціально-політичні філософи Традиції Аналітична Арістотелівська Африканська Близькосхідна іранська Буддійсь…

Phuntsholing atau Phuentsholing adalah sebuah kota di perbatasan Bhutan dengan India yang berada di bagian selatan Bhutan. Kota ini merupakan ibu kota administrasi dari Distrik Chukha.[1][2] Di kota ini terdapat bagian dari Gomez Phuentsholing dan Sampheling Gewog.[3] Pemandangan Kota Phuntsholing Phuentsholing berbatasan dengan kota Jaigaon di India. Perdagangan lintas batas telah menghasilkan ekonomi lokal yang berkembang. Di kota ini dulunya menjadi tempat kantor pusat…

Chief Scientist at BP Dame Angela StrankDBE FRS FREng FIChemEAngela Strank at the Royal Society admissions day in London, July 2018BornAngela Rosemary Emily StrankOctober 1952 (age 71)[4]Alma materUniversity of Manchester (BSc, PhD)ChildrenTwo[5]AwardsDoctor of Science (Honoris causa) (2018)[1]Scientific careerInstitutionsBPBritish Geological SurveyUniversity of Manchester[2]ThesisForaminiferal biostratigraphy of the Holkerian, Asbian an…

Польська жінка-солдат у трофейній формі вермахту перевіряє документи канадського водія 7 травня 1945 року Польська окупаційна зона була особливою територією в британській окупаційній зоні в післявоєнній Німеччині з 1945 по 1948 рік і була розташована в центрально-північній ча…

نادي العيون الألوان العنابي و الابيض تأسس عام 1976 م الملعب مدينة العيون ،  السعودية البلد السعودية  الدوري دوري الدرجة الثانية السعودي الإدارة المالك الهيئة العامة للرياضة سعد عبدالله الكليب المدرب وقاع الشمري الموقع الرسمي تويتر تعديل مصدري - تعديل   نادي العيون هو …

Lower house of U.S. state legislature South DakotaHouse of RepresentativesSouth Dakota LegislatureTypeTypeLower house Term limits4 terms (8 years)HistoryNew session startedJanuary 10, 2023LeadershipSpeakerHugh Bartels (R) since January 10, 2023 Speaker pro temporeMike Stevens (R) since January 10, 2023 Majority LeaderWill Mortenson (R) since January 10, 2023 Minority LeaderOren Lesmeister (D) since January 10, 2023 StructureSeats70Political groupsMajority   Republican (63) Minority …

Senior church official Part of a series on theHierarchy of theCatholic ChurchSaint Peter Ecclesiastical titles (order of precedence) Pope Cardinal Cardinal Vicar Crown Prince Protector Moderator of the curia Chaplain of His Holiness Papal legate Papal majordomo Apostolic nuncio Apostolic delegate Apostolic Syndic Apostolic visitor Vicar apostolic Apostolic exarch Apostolic prefect Assistant to the papal throne Eparch Metropolitan Patriarch Catholicos Bishop Archbishop Bishop emeritus Diocesan bi…

List of MPs for constituencies in Scotland (February 1974–October 1974)← (1970–Feb. 1974)(Oct. 1974–1979) → Colours on map indicate the party allegiance of each constituency's MP. This is a list of the 71 members of Parliament (MPs) elected to the House of Commons of the United Kingdom by Scottish constituencies for the Forty-sixth parliament of the United Kingdom (Feb. 1974 to Oct. 1974) at the February 1974 United Kingdom general election. Composition Affiliation Me…

Not to be confused with Victoria Rifles of Canada.Captain George R Anderson (left), commanding officer of the Victoria Rifles The Victoria Rifles was a military unit of black soldiers in Halifax, Nova Scotia, that was established in 1860 in the wake of the Crimean War and on the eve of the American Civil War.[1] It was one of the oldest black units established in Canada.[2][3] On January 30, 1860, at a meeting of the Victoria Rifles, George Anderson was elected Captain an…

Place in Bavaria, Germany Town in Bavaria, GermanyBad Aibling TownThe former Prantshausen castle and the Church of Saint Sebastian Coat of armsLocation of Bad Aibling within Rosenheim district Bad Aibling Show map of GermanyBad Aibling Show map of BavariaCoordinates: 47°51′50″N 12°00′36″E / 47.86389°N 12.01000°E / 47.86389; 12.01000CountryGermanyStateBavariaAdmin. regionUpper Bavaria DistrictRosenheim Subdivisions28 StadtteileGovernment • Mayor (20…