Which should be performed first: sanitization or validation?

Question


I have a name field in my registration form, and its value will be stored in the database in a column called user_name varchar(20). Clearly I have to check the user input. Suppose I check this field with the code below:

    <?php
    if (empty($_POST['name']) || strlen($_POST['name']) > 20) {
        // send an "invalid input" error
    } else {
        $name = htmlspecialchars($_POST['name']);
        // check for SQL injection
        // insert $name into the database
    }
    ?>

If the user enters a name like <i> some one </i>, the string length is 17, so the else branch is executed, and the name becomes &lt;i&gt; some one &lt;/i&gt;, whose length is 29, which causes an error when inserting into the database. At that point, if I tell the user that the input is too long, they will be confused. What should I do? What is the best approach?
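The mismatch described above can be reproduced in isolation. This is a minimal sketch, assuming the default htmlspecialchars() flags; the variable names are only illustrative:

```php
<?php
// Raw input as typed by the user.
$raw = '<i> some one </i>';

// HTML-encoding turns each "<" and ">" into a 4-character entity.
$encoded = htmlspecialchars($raw);

// The raw value passes the 20-character check...
var_dump(strlen($raw) <= 20);

// ...but the encoded value no longer fits the varchar(20) column.
var_dump(strlen($encoded) <= 20);

echo $encoded, "\n";
```

The length check and the stored value refer to two different strings, which is why the error message makes no sense to the user.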

+6

php validation sanitize

naazanin Oct 18 '13 at 14:42


2 answers

Brianhall · Answer 1 · 2013-10-18T14:57:09+0000

In general, you should sanitize first, "for your protection and theirs." This includes removing any invalid characters (which depends on the character encoding, of course). If the field should contain only letters and spaces, strip out everything else first.

After that, you validate the result: is the name already in use (for unique fields), is it the right length, is it non-empty?

The reason you give is exactly the right one: to give the user the best possible experience. Do not confuse the user if it can be avoided. This helps guard against sloppy copy-and-paste input, but you have to be careful: if I want my name written as "Ke$h@", I may or may not be happy with you changing it to "Keh".

Secondly, it also prevents errors.

What happens if you want usernames that do not allow special characters? Say I enter "Brian" and your system rejects it because that name is already in use, so I submit "Brian" with special characters added. You test that first and it is not in use, then you strip the special characters and you are left with "Brian". Uh oh: now you either have to validate AGAIN, or you get a strange error when account creation fails (if your database is configured to require unique usernames), or, even worse, it succeeds and user accounts get overwritten or corrupted.

Another example is a minimum field length: if you require a name of at least 3 letters and accept only letters, and I enter "no", you will reject it; but if I enter "no@#$%", you could say it is valid (long enough), sanitize it, and now it is no longer valid, and so on.

An easy way to avoid this is to sanitize first and then you don’t need to think twice about validation.
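The minimum-length case can be sketched as follows. The two helper functions and the rule set (ASCII letters and spaces only, 3 to 20 characters) are assumptions made for illustration, not part of the question's code:

```php
<?php
// Hypothetical sanitizer: keep only ASCII letters and spaces, then trim the ends.
function sanitizeName(string $input): string
{
    return trim(preg_replace('/[^A-Za-z ]/', '', $input));
}

// Hypothetical validator: 3 to 20 characters.
function isValidName(string $name): bool
{
    $length = strlen($name);
    return $length >= 3 && $length <= 20;
}

$input = 'no@#$%';

// Validating the raw input first accepts it, because it is 6 characters long.
var_dump(isValidName($input));

// Sanitizing first leaves only "no", which the same check then rejects.
$clean = sanitizeName($input);
var_dump(isValidName($clean));
```

Running the validator on the sanitized value means the check is applied to the string that will actually be stored.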

However, Neath was right about not encoding data before storage. As a rule, it is much easier to encode the output as HTML when necessary than to remember to decode it whenever you just need plain text (for input into text fields, JSON strings, etc.). Most of the test cases you use will not include data with HTML entities, so it is easy to introduce silly bugs that are hard to catch.

The big problem is that when such a bug is introduced, it can quickly lead to data corruption that is not easy to fix. Example: you have plain text, mistakenly output it to a text field as HTML entities, the form is submitted and you encode it again ... every time it is opened and resubmitted, it gets re-encoded. On a busy site or form, you can end up with thousands of records encoded to different depths, with no clear way to tell which ones were or were not meant to be HTML-encoded.

Injection protection is good, but HTML encoding is not meant for it (and should not be relied on for it).
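The re-encoding cycle is easy to reproduce. A small sketch, assuming htmlspecialchars() with its default double_encode behaviour; encodeTimes() is a hypothetical helper that stands in for repeated save cycles:

```php
<?php
// Hypothetical helper: encode the same value once per simulated save cycle.
function encodeTimes(string $text, int $times): string
{
    for ($i = 0; $i < $times; $i++) {
        $text = htmlspecialchars($text);
    }
    return $text;
}

$original = '<b>';

// Each pass also encodes the "&" introduced by the previous pass.
echo encodeTimes($original, 1), "\n";
echo encodeTimes($original, 2), "\n";
echo encodeTimes($original, 3), "\n";

// A single decode pass does not recover text that was encoded twice.
var_dump(html_entity_decode(encodeTimes($original, 2)) === $original);
```

Because nothing in the stored value records how many times it was encoded, there is no reliable way to undo this afterwards.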

Kyo · Answer 2 · 2014-12-25T08:31:04+0000

No, you should validate first. Sanitization prepares data for the data storage layer, which is the last step. There is no point in going anywhere near the storage layer if the input fails the business rules in the validation phase. If you need a number and are given a string, that is an error, so you send the user back to the form. Sanitization, with the exception of plug-ins if required (not required as of 5.4), is unnecessary if you use SQL with prepared statements, and it would actually mess up the input.
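A rough sketch of that flow: validate the raw value, store it unmodified through a prepared statement, and HTML-encode only on output. It assumes the pdo_sqlite extension is available and uses an in-memory database in place of the real one; registerName() is a hypothetical helper, and the rule (non-empty, at most 20 characters) is taken from the question:

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database is only a stand-in.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (user_name VARCHAR(20))');

// Hypothetical helper: validate the raw value, then store it as typed.
function registerName(PDO $pdo, string $name): bool
{
    // Business rule from the question: non-empty and at most 20 characters.
    if ($name === '' || strlen($name) > 20) {
        return false; // send the user back to the form
    }
    // The placeholder keeps the value out of the SQL text, so no escaping is needed.
    $stmt = $pdo->prepare('INSERT INTO users (user_name) VALUES (:name)');
    return $stmt->execute([':name' => $name]);
}

var_dump(registerName($pdo, '<i> some one </i>')); // accepted and stored as typed
var_dump(registerName($pdo, str_repeat('x', 21))); // rejected: too long

// HTML-encode only at the point where the value is written into a page.
$name = $pdo->query('SELECT user_name FROM users')->fetchColumn();
echo htmlspecialchars($name), "\n";
```

With this arrangement the length check and the stored value refer to the same string, so the 20-character limit means what the user expects.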
