MySQL Collation – A Beginner’s Introduction

MySQL Collation

In this tutorial, we will study Collation in MySQL. Let’s get right into the topic without any further ado.

Also read: MySQL COALESCE() Function

Introduction to MySQL Collation

A MySQL collation is basically a set of rules which is used to compare the characters in a character set. A character set is a set of characters that is legal in a string. Each character set has one default collation and it can have more than one collation but two character sets cannot have the same collation. The character sets are compared using the corresponding encoding technique.

The below statement is used to see the default collation of the character sets-

Show character set;
Character Set
Character Set

Here in the collation column, _ci means case insensitive, _cs means case sensitive and _bi means binary.

If we want to see all the collations for a given character set, the below statement will be used-

SHOW COLLATION LIKE 'latin1%';

Output-

Collation 1
Collation 1

MySQL Collation – Character set and collations

In MySQL, we can set character sets and collations at four different levels-

  • Server
  • Database
  • Table
  • Column

Also read: MySQL CREATE TABLE – How to create a table in MySQL?

Setting character sets and collations at the server level

In MySQL the default character set is latin1 and the default collation is latin_swedish_ci.

We can specify the character set at the server startup and then MySQL will use collations of that character set. Also, both character set and collation can be specified at the server startup then MySQL will use that for every database.

The syntax for changing the character set utf8 and collation to utf8_unicode_cs for server-

>mysqld --character-set-server=utf8 --collation-server=utf8_unicode_cs

Setting character sets and collations at the database level

When the database is created and the character set and collation are not specified then MySQL uses the default character set and collation of the server for a new database. To change the character set and collation at the database level, use the following syntax-

Create Database database_name
Character Set character_set_name
Collate collation_name;

or we can use Alter Database

Alter Database database_name
Character Set character_set_name
Collate collation_name;

Setting character sets and collations at the table level

A database contains tables whose character sets and collations may be different from the character sets and collations at the database level. To specify the character set and collation of the table use the following syntax –

Create Table table_name(
   ...
)
Character Set character_set_name
Collate collation_name;

or we can use Alter Table statement-

Alter Table table_name(
  ...
)
Character Set character_set_name
Collate collation_name;

Setting character sets and collations at the column level

A column has data types such as char, varchar, and text so it can have its own character set and collation which may be different from the table. To change the character set and collation of a particular column, use the following-

column_name [CHAR | VARCHAR | TEXT] (length)
Character Set character_set_name
Collate collation_name

Example

First, we will create a database adb and set the character set as utf8 and collation as utf8_unicode_ci

Create Database adb
Character Set utf8
Collate utf8_unicode_ci;

Now let us create a table named mytable in the database adb

Use adb;
 
Create Table mytable(
    a1 char(50)
);

Since we have not specified the character set and collation of the table, it will use the default character set and collation of the database. Let us change the character set to latin1 and collation to latin1_swedish_ci of the table-

Alter Table t1
Character Set latin1
Collate latin1_swedish_ci;

Conclusion

In this tutorial, we studied Collation and character sets in MySQL. For more reference, check out the official documentation of MySQL.