Data validation

In computing, data validation or input validation is the process of ensuring that data have undergone data cleansing so that they have data quality, that is, that they are both correct and useful. It uses routines, often called "validation rules", "validation constraints", or "check routines", that check the correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by including explicit validation logic in the application program.

This is distinct from formal verification, which attempts to prove or disprove the correctness of algorithms for implementing a specification or property.

Overview

Data validation is intended to provide certain well-defined guarantees for fitness and consistency of data in an application or automated system. Data validation rules can be defined and designed using various methodologies, and be deployed in various contexts.[1] Their implementation can use declarative data integrity rules, or procedure-based business rules.[2]

The guarantees of data validation do not necessarily include accuracy, and it is possible for data entry errors such as misspellings to be accepted as valid. Other clerical and/or computer controls may be applied to reduce inaccuracy within a system.

Different kinds

In evaluating the basics of data validation, generalizations can be made regarding the different kinds of validation according to their scope, complexity, and purpose.

For example:

  • Data type validation;
  • Range and constraint validation;
  • Code and cross-reference validation;
  • Structured validation; and
  • Consistency validation

Data-type check

Data type validation is customarily carried out on one or more simple data fields.

The simplest kind of data type validation verifies that the individual characters provided through user input are consistent with the expected characters of one or more known primitive data types as defined in a programming language or data storage and retrieval mechanism.

For example, an integer field may require input to use only characters 0 through 9.
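The integer-field example can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the sketch assumes that an unsigned integer field accepts the ASCII digits and nothing else.

```python
def is_integer_field(text: str) -> bool:
    """Return True if text is non-empty and uses only the characters 0 through 9."""
    # str.isdigit() also accepts other Unicode digits (for example superscripts),
    # so the characters are compared against an explicit set instead.
    return len(text) > 0 and all(ch in "0123456789" for ch in text)
```

A sign, a decimal point, or a letter that merely resembles a digit (such as the letter O) causes the check to fail.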

Simple range and constraint check

Simple range and constraint validation may examine input for consistency with a minimum/maximum range, or consistency with a test for evaluating a sequence of characters, such as one or more tests against regular expressions. For example, a counter value may be required to be a non-negative integer, and a password may be required to meet a minimum length and contain characters from multiple categories.
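A minimal sketch of both examples follows. The minimum length of 8, the four character categories, and the requirement of three categories are assumptions chosen for illustration, not values taken from any particular policy.

```python
import re

MIN_PASSWORD_LENGTH = 8  # assumed policy value
# Assumed categories: lowercase, uppercase, digits, everything else.
CATEGORY_PATTERNS = (r"[a-z]", r"[A-Z]", r"[0-9]", r"[^A-Za-z0-9]")


def is_valid_counter(value: int) -> bool:
    """Range check: a counter must be a non-negative integer."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def is_valid_password(password: str, min_categories: int = 3) -> bool:
    """Constraint check: minimum length plus characters from several categories."""
    if len(password) < MIN_PASSWORD_LENGTH:
        return False
    matched = sum(1 for pattern in CATEGORY_PATTERNS if re.search(pattern, password))
    return matched >= min_categories
```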

Code and cross-reference check

Code and cross-reference validation includes operations to verify that data is consistent with one or more possibly-external rules, requirements, or collections relevant to a particular organization, context or set of underlying assumptions. These additional validity constraints may involve cross-referencing supplied data with a known look-up table or directory information service such as LDAP.

For example, a user-provided country code might be required to identify a current geopolitical region.
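The country code example might look like the following sketch. The set of codes is a small illustrative subset; a real system would more likely load a maintained list from a look-up table or directory service.

```python
# Illustrative subset only; assumed to stand in for a maintained reference table.
CURRENT_COUNTRY_CODES = frozenset({"CA", "CH", "FR", "ID", "IT", "US"})


def is_known_country_code(code: str) -> bool:
    """Cross-reference check: the code must appear in the look-up collection."""
    return code.strip().upper() in CURRENT_COUNTRY_CODES
```

A code that is well formed but absent from the collection, such as a withdrawn code, is rejected even though it would pass a pure format check.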

Structured check

Structured validation allows for the combination of other kinds of validation, along with more complex processing. Such complex processing may include the testing of conditional constraints for an entire complex data object or set of process operations within a system.
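As a rough sketch, a structured check can combine several simple checks on one object and add a conditional constraint that depends on more than one field. The Payment class, its fields, and the two payment methods are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Payment:
    method: str                        # hypothetical values: "card" or "invoice"
    amount: float
    card_number: Optional[str] = None


def validate_payment(payment: Payment) -> List[str]:
    """Return a list of problems; an empty list means the object passed."""
    errors = []
    if payment.method not in ("card", "invoice"):
        errors.append("unknown payment method")
    if payment.amount <= 0:
        errors.append("amount must be positive")
    # Conditional constraint: a card number is required only for card payments.
    if payment.method == "card" and not payment.card_number:
        errors.append("card payments require a card number")
    return errors
```

Returning every problem found, rather than stopping at the first one, lets the caller report all issues with the object at once.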

Consistency check

Consistency validation ensures that data is logical. For example, the delivery date of an order can be prohibited from preceding its shipment date.
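The order example reduces to a comparison between two fields of the same record. In this sketch the function and parameter names are hypothetical, and delivery on the shipment date itself is assumed to be acceptable.

```python
from datetime import date


def dates_are_consistent(shipment_date: date, delivery_date: date) -> bool:
    """Consistency check: delivery may not precede shipment."""
    return delivery_date >= shipment_date
```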

Example

Multiple kinds of data validation are relevant to 10-digit pre-2007 ISBNs (the 2005 edition of ISO 2108 required ISBNs to have 13 digits from 2007 onwards[3]).

  • Size. A pre-2007 ISBN must consist of 10 digits, with optional hyphens or spaces separating its four parts.
  • Format checks. Each of the first 9 digits must be 0 through 9, and the 10th must be either 0 through 9 or an X.
  • Check digit. To detect transcription errors in which digits have been altered or transposed, the last digit of a pre-2007 ISBN must match the result of a mathematical formula incorporating the other 9 digits (ISBN-10 check digits).
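The three checks above can be combined in one routine. The sketch below uses the ISBN-10 rule that the digits, weighted 10 down to 1 and with X counting as 10, must sum to a multiple of 11. The function name is hypothetical, and the sketch assumes that only an uppercase X is accepted in the last position.

```python
def is_valid_isbn10(text: str) -> bool:
    """Validate a pre-2007 ISBN: size, format, and check digit."""
    # Size: ignore optional hyphens and spaces, then require 10 characters.
    chars = [c for c in text if c not in "- "]
    if len(chars) != 10:
        return False
    # Format: the first 9 characters are digits; the 10th is a digit or X.
    if not all(c in "0123456789" for c in chars[:9]):
        return False
    if chars[9] not in "0123456789X":
        return False
    # Check digit: the weighted sum (weights 10 down to 1) must be divisible by 11.
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0
```

For 0-306-40615-2 the weighted sum is 132, which is 12 × 11, so the number is accepted. Swapping the adjacent digits 1 and 5 gives a sum of 136, which is not a multiple of 11, so the transposition is detected.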

Validation types

Allowed character checks
Checks to ascertain that only expected characters are present in a field. For example a numeric field may only allow the digits 0–9, the decimal point and perhaps a minus sign or commas. A text field such as a personal name might disallow characters used for markup. An e-mail address might require at least one @ sign and various other structural details. Regular expressions can be effective ways to implement such checks.
Batch totals
Checks for missing records. Numerical fields may be added together for all records in a batch. The batch total is entered and the computer checks that the total is correct, e.g., add the 'Total Cost' field of a number of transactions together.
Cardinality check
Checks that a record has a valid number of related records. For example, if a contact record is classified as "customer" then it must have at least one associated order (cardinality > 0). This type of rule can be complicated by additional conditions. For example, if a contact record in a payroll database is classified as "former employee" then it must not have any associated salary payments after the separation date (cardinality = 0).
Check digits
Used for numerical data. To support error detection, an extra digit is added to a number which is calculated from the other digits.
Consistency checks
Checks fields to ensure data in these fields correspond, e.g., if expiration date is in the past then status is not "active".
Cross-system consistency checks
Compares data in different systems to ensure it is consistent. Systems may represent the same data differently, in which case comparison requires transformation (e.g., one system may store customer name in a single Name field as 'Doe, John Q', while another uses First_Name 'John' and Last_Name 'Doe' and Middle_Name 'Quality').
Data type checks
Checks input conformance with typed data. For example, an input box accepting numeric data may reject the letter 'O'.
File existence check
Checks that a file with a specified name exists. This check is essential for programs that use file handling.
Format check
Checks that the data is in a specified format (template), e.g., dates have to be in the format YYYY-MM-DD. Regular expressions may be used for this kind of validation.
Presence check
Checks that data is present, e.g., customers may be required to have an email address.
Range check
Checks that the data is within a specified range of values, e.g., a probability must be between 0 and 1.
Referential integrity
Values in two relational database tables can be linked through a foreign key and a primary key. If values in the foreign key field are not constrained by internal mechanisms, then they should be validated to ensure that the referencing table always refers to a row in the referenced table.
Spelling and grammar check
Looks for spelling and grammatical errors.
Uniqueness check
Checks that each value is unique. This can be applied to a combination of several fields (e.g., Address, First Name, Last Name).
Table look up check
A table look up check compares data to a collection of allowed values.
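Three of the checks listed above (format check, batch totals, and uniqueness check) can be sketched as follows. The record layout, with dictionaries and a total_cost key, and all function names are assumptions made for the illustration.

```python
import re
from datetime import datetime
from decimal import Decimal

DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def has_date_format(text: str) -> bool:
    """Format check: YYYY-MM-DD, and the date must exist in the calendar."""
    if not DATE_PATTERN.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def batch_total_matches(records, declared_total: Decimal) -> bool:
    """Batch total: the summed field must equal the total entered for the batch."""
    return sum((r["total_cost"] for r in records), Decimal("0")) == declared_total


def duplicate_keys(records, fields):
    """Uniqueness check: return the field combinations that occur more than once."""
    seen, duplicates = set(), set()
    for record in records:
        key = tuple(record[field] for field in fields)
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates
```

A batch total that does not match the declared value suggests that a record is missing or was altered, and a non-empty result from the uniqueness check identifies which combinations are repeated.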

Post-validation actions

Enforcement Action
Enforcement action typically rejects the data entry request and requires the input actor to make a change that brings the data into compliance. This is most suitable for interactive use, where a real person is sitting at the computer and making the entry. It also works well for batch upload, where a file input may be rejected and a set of messages sent back to the input source explaining why the data was rejected.
Another form of enforcement action involves automatically changing the data and saving a conformant version instead of the original version. This is most suitable for cosmetic changes. For example, converting an all-caps entry to a Pascal case entry does not need user input. Automatic enforcement is inappropriate where it leads to loss of business information, for example, saving a truncated comment when its length is longer than expected, since significant data may be lost.
Advisory Action
Advisory actions typically allow data to be entered unchanged but send a message to the source actor indicating the validation issues that were encountered. This is most suitable for non-interactive systems, for systems where the change is not business critical, for cleansing steps of existing data, and for verification steps of an entry process.
Verification Action
Verification actions are special cases of advisory actions. In this case, the source actor is asked to verify that the data is what they really want to enter, in light of a suggestion to the contrary. Here, the check step suggests an alternative (e.g., a check of a mailing address returns a different way of formatting that address or suggests a different address altogether). The user should then be given the option of accepting the recommendation or keeping their version. By design, this is not a strict validation process; it is useful for capturing addresses for a new location or for a location that is not yet supported by the validation databases.
Log of validation
Even in cases where data validation did not find any issues, providing a log of validations that were conducted and their results is important. This is helpful to identify any missing data validation checks in light of data issues and in improving the validation.
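The difference between enforcement, advisory action, and logging can be sketched as below. The length limit, the function names, and the logger name are assumptions made for the illustration; they do not come from any particular system.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger("validation")  # hypothetical logger name

MAX_COMMENT_LENGTH = 200  # assumed limit


def enforce_comment(comment: str) -> str:
    """Enforcement: reject an over-long comment instead of truncating it."""
    if len(comment) > MAX_COMMENT_LENGTH:
        logger.warning("comment rejected: %d characters", len(comment))
        raise ValueError("comment exceeds the maximum length")
    # Passing results are logged as well, so the log shows which checks ran.
    logger.info("comment passed length check")
    return comment


def advise_email(email: str) -> Tuple[str, Optional[str]]:
    """Advisory: keep the value unchanged and return a message, if any."""
    message = None if "@" in email else "email address has no @ sign"
    logger.info("email checked, advisory: %s", message)
    return email, message
```

The enforcement function raises an error so that the caller must correct the input, while the advisory function always returns the original value and leaves the decision to the caller.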

Validation and security

Failures or omissions in data validation can lead to data corruption or a security vulnerability.[4] Data validation checks that data are fit for purpose,[5] valid, sensible, reasonable and secure before they are processed.
