Tag Archives: type

MySQL – Should you put an index on a Boolean field to help query performance?

I am often asked whether it makes sense to place an index on a Boolean field in order to improve query performance.  In general, because a Boolean column can hold only three values (True, False, Null), its cardinality is low, and an index will usually not help: if the values are evenly distributed within your DB, the query optimizer will typically still perform a full table scan.

One situation in which an index on a boolean field (or other low cardinality field) might be useful is if there are relatively few of one of the values, for example 5 True values in a table of millions of records and you are searching for those few values on a regular basis.
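As a minimal sketch of that skewed case, assume a hypothetical `orders` table with an `is_flagged` column (both names are illustrative only). Comparing the two EXPLAIN plans on your own data is the way to confirm whether the index is actually chosen; the result depends on the real distribution and on table statistics.

```sql
-- Hypothetical table: millions of rows, only a handful with is_flagged = TRUE.
CREATE TABLE orders (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer   VARCHAR(100) NOT NULL,
    is_flagged BOOLEAN NOT NULL DEFAULT FALSE  -- BOOLEAN is an alias for TINYINT(1)
) ENGINE=InnoDB;

-- Index the low-cardinality column because the TRUE rows are rare.
CREATE INDEX idx_orders_is_flagged ON orders (is_flagged);

-- Check whether the optimizer picks the index for the rare value...
EXPLAIN SELECT id, customer FROM orders WHERE is_flagged = TRUE;

-- ...and compare with the common value, where a full scan is more likely.
EXPLAIN SELECT id, customer FROM orders WHERE is_flagged = FALSE;
```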

A Boolean can also be indexed as part of a combination of fields. Indexing a single Boolean is often pointless because there are only 2 (or 3) distinct values, but an index across 16 Boolean columns has up to 2^16 (65,536) distinct combinations.  A combined index may therefore help, but you should understand how it can and cannot be used, and be aware that the order of the columns matters: MySQL can generally use the index only for queries that filter on a leftmost prefix of its columns.
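The column-order point can be sketched as follows, assuming a hypothetical `accounts` table with three illustrative flag columns. The comments describe the usual behavior; newer MySQL versions have extra strategies (such as scanning the whole index), so treat EXPLAIN on your own server as the authority.

```sql
-- Hypothetical table with several Boolean flags and one combined index.
CREATE TABLE accounts (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    is_active   BOOLEAN NOT NULL DEFAULT TRUE,
    is_verified BOOLEAN NOT NULL DEFAULT FALSE,
    is_admin    BOOLEAN NOT NULL DEFAULT FALSE,
    KEY idx_accounts_flags (is_active, is_verified, is_admin)
) ENGINE=InnoDB;

-- Filters on the leftmost columns: the index can be used to narrow the rows.
EXPLAIN SELECT id FROM accounts WHERE is_active = TRUE AND is_verified = FALSE;

-- Skips the leftmost column: the index typically cannot be used to seek,
-- although the optimizer may still read the whole index instead of the table.
EXPLAIN SELECT id FROM accounts WHERE is_admin = TRUE;
```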

In general, you should always profile your system to see if there are queries that are too slow and consider adding another index to handle those queries. Sometimes a single combined index can serve multiple queries; other times you will need a separate index for each type of query. Remember that adding indexes slows down modifications to the data, so it is possible to have too many indexes. There is always a trade-off when creating multiple indexes.

MySQL – DateTime vs. TimeStamp – When to use?

This question often comes up, and while there is often a heated debate, these are my thoughts on the subject.

I typically use a timestamp when I need to record a fixed point in time, for example when a record was inserted into the database or when some user action resulted in a row being updated.  The TIMESTAMP data type has features that let it do this automatically.  With DEFAULT CURRENT_TIMESTAMP, the column is set to the current date and time when a new row is added.  If you add the ON UPDATE CURRENT_TIMESTAMP clause, the value is also refreshed whenever the row is updated.  Older MySQL versions apply both behaviors implicitly to the first TIMESTAMP column in a table; in newer versions, where explicit_defaults_for_timestamp is enabled by default, you must declare them yourself.
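A minimal sketch of these automatic features, using a hypothetical `articles` table (the table and column names are illustrative). It assumes MySQL 5.6.5 or later, since earlier versions allow only one automatically maintained TIMESTAMP column per table.

```sql
-- Hypothetical audit columns: one set on insert, one refreshed on every update.
CREATE TABLE articles (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title      VARCHAR(200) NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                         ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;

-- Neither timestamp is supplied; both default to the current time.
INSERT INTO articles (title) VALUES ('First draft');

-- Changing the row refreshes updated_at; created_at is left alone.
UPDATE articles SET title = 'Final draft' WHERE id = 1;

SELECT id, title, created_at, updated_at FROM articles;
```

Note that `updated_at` is refreshed only when the UPDATE actually changes a column value in the row.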

I use a DATETIME field when the date/time can be set and changed arbitrarily, for example when the field is updated by specific events or actions rather than by a simple row update.  In addition, a TIMESTAMP field can only store values from 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC, so if you need to store earlier dates, such as a birthday, you must use DATETIME.

A couple of other things to keep in mind.  TIMESTAMP fields respect the time zone setting: MySQL converts the value from the session time zone to UTC for storage, and back to the session time zone on retrieval.  For example, if I have a database in Europe and take a dump of that database to synchronize or populate a database in America, the timestamp is displayed as the real time of the event in the new time zone, while a DATETIME still shows the time of the event as it was in the European time zone.  By default, the time zone for each connection is the server's time zone, but it can be set on a per-connection basis.
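The difference is easy to see with a hypothetical `tz_demo` table (the name and sample values are illustrative). Numeric offsets are used so the sketch does not depend on the named time zone tables being loaded.

```sql
CREATE TABLE tz_demo (
    ts TIMESTAMP NOT NULL,
    dt DATETIME  NOT NULL
) ENGINE=InnoDB;

-- A session one hour ahead of UTC stores the same literal in both columns.
SET time_zone = '+01:00';
INSERT INTO tz_demo (ts, dt) VALUES ('2024-01-15 12:00:00', '2024-01-15 12:00:00');

-- A session five hours behind UTC reads the row back.
SET time_zone = '-05:00';
SELECT ts, dt FROM tz_demo;
-- ts is converted to the session zone: 2024-01-15 06:00:00
-- dt is returned exactly as stored:    2024-01-15 12:00:00
```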

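Lastly, TIMESTAMP stores its value in 4 bytes, while DATETIME uses 8 bytes (5 bytes as of MySQL 5.6.4, with both types taking extra bytes if fractional seconds are stored).  TIMESTAMP is stored as the number of seconds since the Unix epoch, which is the main reason it has a lower limit of 1970.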
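The limits of the TIMESTAMP range can be checked directly, on the understanding that the value is a count of seconds since the Unix epoch held in a signed 32-bit integer. The session time zone is set to UTC here so the results are not shifted by a local offset.

```sql
SET time_zone = '+00:00';  -- display results in UTC

SELECT FROM_UNIXTIME(0);           -- 1970-01-01 00:00:00, the Unix epoch
SELECT FROM_UNIXTIME(2147483647);  -- 2038-01-19 03:14:07, the upper TIMESTAMP limit
SELECT POW(2, 31) - 1;             -- 2147483647, the largest signed 32-bit value
```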

MySQL – Constraints on Hierarchical Data in a Self-Referential Table

I recently had a client whose table used a self-referential structure with ID and Parent_ID fields. After upgrading to InnoDB, his DBA tried to use a foreign key constraint to perform cascading UPDATE and DELETE statements. After much heartache, they called me in to find out why it was not working. Unfortunately, I had to tell them that MySQL does not fully support this use of constraints on self-referential tables.

This deviation from the SQL standard affects any ON UPDATE CASCADE or ON UPDATE SET NULL that recurses to update the same table it has previously updated during the cascade. Instead of cascading, it acts like RESTRICT. This means that you cannot use self-referential ON UPDATE CASCADE or ON UPDATE SET NULL operations in MySQL. According to the InnoDB developers, this is to prevent infinite loops resulting from cascaded updates, although I would think this could be addressed in future versions. A self-referential ON DELETE SET NULL, on the other hand, is still possible, as is a self-referential ON DELETE CASCADE.
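A minimal sketch of the behavior described above, using a hypothetical `category` table (the table, constraint, and row values are illustrative). The comments state the outcome expected from the restriction, not captured output.

```sql
CREATE TABLE category (
    id        INT NOT NULL PRIMARY KEY,
    parent_id INT NULL,
    name      VARCHAR(100) NOT NULL,
    CONSTRAINT fk_category_parent
        FOREIGN KEY (parent_id) REFERENCES category (id)
        ON UPDATE CASCADE
        ON DELETE CASCADE
) ENGINE=InnoDB;

-- Parents are listed before their children so each row passes the check.
INSERT INTO category (id, parent_id, name) VALUES
    (1, NULL, 'root'),
    (2, 1,    'child'),
    (3, 2,    'grandchild');

-- The self-referential ON UPDATE CASCADE acts like RESTRICT, so changing
-- the key of a row that has children is expected to fail with a
-- foreign key constraint error.
UPDATE category SET id = 10 WHERE id = 1;

-- The self-referential ON DELETE CASCADE does work: deleting the root
-- is expected to remove rows 2 and 3 as well.
DELETE FROM category WHERE id = 1;
```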

MySQL Foreign Key Nightmare

I was recently working with a 3rd-party DB that was well designed and used lots of foreign keys to enforce referential integrity. Unfortunately, the application that used the DB structure was very poorly written and caused all sorts of problems when the foreign keys were applied. I had to quickly remove all the foreign keys from about 160 tables.  I can never understand why developers create foreign keys but then fail to write their code to prevent errors, or at least to catch them.

The query below will quickly generate a script that removes all foreign keys from a database.  Many folks will simply change the engine type to MyISAM and then back to InnoDB.  This may work, but it is very slow if you already have a lot of data in your tables.

SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name, '` DROP FOREIGN KEY `', constraint_name, '`;')
FROM information_schema.table_constraints
WHERE constraint_type = 'FOREIGN KEY';

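You can limit the output to one schema by adding the line: AND table_schema = 'schema_name' (use RLIKE instead of = only if you want a regular-expression match, which can also pick up other schemas with similar names).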
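Put together, a schema-limited version of the generator might look like the sketch below. `my_database` is a placeholder, and the `drop_statement` alias and the ORDER BY are additions for readability rather than part of the original query.

```sql
-- 'my_database' is a placeholder; substitute the schema you want to clean up.
SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name,
              '` DROP FOREIGN KEY `', constraint_name, '`;') AS drop_statement
FROM information_schema.table_constraints
WHERE constraint_type = 'FOREIGN KEY'
  AND table_schema = 'my_database'
ORDER BY table_name, constraint_name;
```

The query only prints the ALTER TABLE statements; nothing is dropped until you review the output and run it yourself.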

Remember, foreign keys are only enforced when using InnoDB. MyISAM does not support foreign keys.
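To see which engine each table uses before relying on (or removing) foreign keys, a query along these lines can help. Again, `my_database` is a placeholder schema name.

```sql
-- List the storage engine of every base table in the placeholder schema.
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = 'my_database'
  AND table_type = 'BASE TABLE'
ORDER BY engine, table_name;
```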