Data validation
Data validation is a check carried out by a computer program as data is entered. Its purpose is to catch data that does not conform to specific rules.
It allows a programmer to use several techniques to ensure that the data entered is sensible, reasonable, within acceptable boundaries and complete.
Consider data collected from a class test:
- Sensible: In the correct format (78 instead of seventy-eight)
- Reasonable: 89 instead of 88.56345
- Within acceptable boundaries: Exam scores between 1 and 100
- Complete: Data field is not left empty
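The four criteria above can be sketched as a single Python function. This is only an illustrative sketch: the function name validate_score and its messages are hypothetical, and the 1 to 100 range is taken from the list above.

```python
def validate_score(raw):
    """Return an error message for a typed-in test score, or None if it is valid.

    Hypothetical helper for illustration only.
    """
    value = raw.strip()
    # Complete: the field must not be left empty
    if value == "":
        return "score is missing"
    # Sensible and reasonable: digits only, so 'seventy-eight'
    # and '88.56345' are both rejected
    if not (value.isascii() and value.isdecimal()):
        return "score must be a whole number written in digits"
    # Within acceptable boundaries: 1 to 100 inclusive
    if not 1 <= int(value) <= 100:
        return "score must be between 1 and 100"
    return None


for entry in ["78", "seventy-eight", "88.56345", "101", ""]:
    print(repr(entry), "->", validate_score(entry))
```

Returning a message rather than printing it lets the calling code decide how to tell the user what went wrong.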
Presence check
When entering data into a computer application some fields may be optional. A presence check will not allow certain data fields to remain blank.
For example, in an employee file a mobile number may be optional as the person may not have a mobile phone, whereas a National Insurance number is mandatory on each record.
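A presence check along these lines might look like the sketch below. The function name missing_fields, the field names and the sample record are all hypothetical, chosen only to mirror the employee example.

```python
def missing_fields(record, required):
    """Return the names of required fields that are absent or blank."""
    missing = []
    for field in required:
        value = record.get(field)
        # Treat a missing key, None, or whitespace-only text as blank
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Hypothetical employee record: mobile is optional, ni_number is mandatory
employee = {"name": "A. Smith", "mobile": "", "ni_number": ""}
print(missing_fields(employee, ["name", "ni_number"]))  # prints ['ni_number']
```

The blank mobile number is not reported because it is not in the list of required fields.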
Length check
This is used to check that data entered into computer programs contains a certain number of characters. For example, a mobile phone number contains 11 digits.
This can be represented with the following function in Python:
def lengthCheck(mob):
    '''function to check length of a data item'''
    if len(mob) != 11:  # Checks if mobile number is NOT 11 digits long
        print('Invalid mobile number')
    else:
        print('Number accepted')

Type check
This ensures the data item is of a particular data type. For example, a mobile phone number should contain only digits. This can be represented with the following pseudocode:
INPUT Number
for i in range 0 - LENGTH Number:
    if not Number[i].isdecimal():
        PRINT 'Error'

In Python, this would be represented as:
num = input("Enter a number")
for i in range(0, len(num)):
    if not num[i].isdecimal():
        print("Error")
        break

Format check
This is used to ensure a data item matches a previously determined pattern, and that particular characters have particular values (such as letters or digits). For example, a vehicle number plate may be required to be entered in a predefined format using the pattern EU01 ABC.
Car registration numbers in England and Wales follow these rules:
- Exactly 8 characters long, including the space
- The first 2 characters are letters identifying the region where the vehicle was registered
- The next 2 characters are digits representing the 6-month period in which the vehicle was registered, followed by a space
- The last 3 characters are randomly assigned letters that help identify the vehicle
Running a format check on car registrations in England and Wales could be represented in Python as:
43 def isPlateValid(plate):
44     #function to validate length & format of reg plate
45     if len(plate)!=8:
46         return False
47     for i in range(0,2): #characters 0 and 1
48         if not plate[i].isalpha():
49             return False
50     for i in range(2,4): #characters 2 and 3
51         if not plate[i].isdecimal():
52             return False
53     if not plate[4]==' ':
54         return False
55     for i in range(5,8): #characters 5, 6 and 7
56         if not plate[i].isalpha():
57             return False
58     return True
59
60 plate = input('Enter Reg: ')
61 if isPlateValid(plate):
62     print('Valid Reg #')
63 else:
64     print('Invalid Reg #')

The function is isPlateValid. It accepts a value and checks that the value is eight characters in length, that the first two characters are letters, that the next two characters are digits followed by a space, and that the final three characters are letters.
If the value passed into the function does not meet these rules, False will be returned.
- Line 43: Function declared
- Line 45: Length check. If the length of the value is not equal to 8, it will not be accepted
- Line 47: If the first two characters are not letters they will not be accepted
- Line 50: If the next two characters are not numbers they will not be accepted
- Line 53: If the fifth character is not a space it will not be accepted
- Line 55: If the last three characters are not letters they will not be accepted
- Line 61: The value is passed to the function. If the function returns True the value is accepted, otherwise it is rejected
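The same format rule could also be written as a regular expression, which states the whole pattern in one place. This is an alternative sketch rather than the method used above; the names PLATE_PATTERN and is_plate_valid_regex are hypothetical. Note that the character classes here accept ASCII letters and digits only, whereas isalpha() and isdecimal() also accept letters and digits from other alphabets.

```python
import re

# Two letters, two digits, one space, three letters (ASCII only)
PLATE_PATTERN = re.compile(r"[A-Za-z]{2}[0-9]{2} [A-Za-z]{3}")


def is_plate_valid_regex(plate):
    """Return True if the whole string matches the plate pattern."""
    return PLATE_PATTERN.fullmatch(plate) is not None


print(is_plate_valid_regex("EU01 ABC"))  # True
print(is_plate_valid_regex("EU01ABC"))   # False, the space is missing
print(is_plate_valid_regex("E101 ABC"))  # False, second character is a digit
```

Because fullmatch must match the entire string, no separate length check is needed.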