Fixed-length VS. Variable-length
The CHAR data type is fixed-length. VARCHAR, as its name implies, is variable-length. Let's understand it with an example:
CREATE TABLE myTable ( charField CHAR(5), varCharField VARCHAR(5) ) INSERT INTO myTable SELECT 'T','T' INSERT INTO myTable SELECT 'F','F' INSERT INTO myTable SELECT 'TRUE','TRUE' INSERT INTO myTable SELECT 'FALSE','FALSE' SELECT datalength(charField) charFieldLength, datalength(varCharField) varCharFieldLength FROM myTable
From the above example, we learned that CHAR automatically uses the maximum allotted column space. If we update data and add more characters, those additional characters will go in the same space.
If we input “Hello” and add “World”, all will go in one continuous memory location:
In the case of VARCHAR, the column does not “max out” – instead it only uses the needed space. Any added data to the system will add a pointer to the first allotted space that points to a different location, which is then used to store the additional data.
Let’s use the same example. Initially when the user writes “Hello” to the column value, the memory will appear as:
“EOL” (End Of Length) is the effective length of the string.
If the user adds “World” to the existing data, it would look like this:
“PTR1” indicates a pointer that connects “Hello “with “World”. If the data is updated 4 or 5 times there will be 4 or 5 pointers.
The number of bytes used to store length depends on the type of database engine being used. For example, SQL Server databases use 2 bytes. Database engines look at that value when pulling the data from the table or when joining the column with other tables or when VARCHAR is used as an index.
In the case of CHAR, a database engine looks at the column/variable width and pulls all the data straight from the column without checking the data or doing any data manipulation. If a query, index or join uses VARCHAR column, the performance will be affected. Since VARCHAR is a variable-length data-type, the database engine works to find the total length of the field value, resulting in a data storage overhead of at least one byte that contains the effective length of the string.
CHAR is better in performance than VARCHAR, but CHAR takes unnecessary memory space when the data does not have a fixed-length.
However, the database engine can use the unnecessary space (as described above, in the case of data manipulations). In addition, disk space is no longer an issue. Obtaining disk space is much cheaper now and is available in Exabyte, Petabyte, Terabyte, and Gigabyte, and can be stored in cloud space.
Since CHAR provides us with better performance, I would suggest using CHAR even in situations where data length varies. So… when should you use VARCHAR? Use it for data archives where there are no data manipulations and the data is of variable length.