Skip to content

Conversation

@javer
Copy link
Contributor

@javer javer commented Jan 15, 2022

The idea behing this is the following: by overriding DBAL built-in StringType we can use native enums directly in the database. But to make it possible we need enumType information on DBAL side, right now it's holded only in ORM mapping and isn't passed to DBAL in SchemaTool.

# config/packages/doctrine.yaml
doctrine:
    dbal:
        types:
            string: App\DBAL\Type\EnumType
# src/App/DBAL/Type/EnumType.php
<?php

namespace App\DBAL\Type;

use BackedEnum;
use Doctrine\DBAL\Platforms\AbstractPlatform;
use Doctrine\DBAL\Platforms\MySQLPlatform;
use Doctrine\DBAL\Types\StringType;

class EnumType extends StringType
{
    public function getSqlDeclaration(array $column, AbstractPlatform $platform): string
    {
        if ($platform instanceof MySQLPlatform && isset($column['enumType']) && enum_exists($column['enumType'])) {
            $values = array_map(static fn(BackedEnum $case) => "'{$case->value}'", $column['enumType']::cases());

            return 'ENUM(' . implode(', ', $values) . ')';
        }

        return parent::getSQLDeclaration($column, $platform);
    }
}

By doing this the following entity

<?php

namespace App\Entity;

use Doctrine\ORM\Mapping as ORM;

enum Suit: string {
    case Hearts = 'H';
    case Diamonds = 'D';
    case Clubs = 'C';
    case Spades = 'S';
}

#[ORM\Entity]
class Card
{
    #[ORM\Id, ORM\GeneratedValue, ORM\Column]
    public int $id;

    #[ORM\Column]
    public Suit $suit;
}

results in the migration:

CREATE TABLE card (
    id INT AUTO_INCREMENT NOT NULL,
    suit ENUM('H', 'D', 'C', 'S') NOT NULL,
    PRIMARY KEY(id)
) ENGINE = InnoDB;

@derrabus
Copy link
Member

I would not want a PHP enum to always be mapped to a MySQL enum. If we merge this, it must be an opt-in feature, please.

@javer
Copy link
Contributor Author

javer commented Jan 15, 2022

The proposed change doesn't force using MySQL enum for all PHP enums, it just forwards metadata from ORM to DBAL to make it possible to implement in your application. Without the code in the first comment (which should be implemented in the application, not in the library) nothing will be changed.

To make it clear, in no way I suggest using MySQL enums by default, it's just possibility for the application to implement this behavior if app want it, because right now it isn't possible, because there is no enough information on DBAL side.

@greg0ire
Copy link
Member

IMO we shouldn't make it easy to use ENUM(), as it is a footgun

@javer
Copy link
Contributor Author

javer commented Jan 15, 2022

We have been using MySQL ENUM() for more than 15 years in many projects without any issues. You just should understand where it fits the best (predefined set of never-changing values without any additional data), and where it isn't. It is very memory-efficient, especially when we talk about millions of records, and it is strange to reject all of these advantages just because of some "religion sights". Knife also can lead to devastating results, but at the same time it's very efficient tool for the appropriate purpose.

At the same time, PHP native Enum has identical native representation in MySQL - with the same advantages and disadvantages, so why we are okay with PHP Enum and want make harder it reflection to MySQL Enum?

Again, I don't suggest using MySQL Enum by default for everyone, I just propose to forward information about enumType from ORM to DBAL layer to give application developers the right to make a decision whether they want to use MySQL Enum or not. Default behavior is not changed in any way. And this small patch will help to avoid forking/patching doctrine/orm to be able to do this.

Thank you for your understanding.

@javer
Copy link
Contributor Author

javer commented Jan 15, 2022

It is worth mentioning that we still have a chapter in the official Doctrine ORM documentation about

the Enum Type of MySQL that is used quite a lot by developers

And we were given a solution to make it possible with ORM: https://www.doctrine-project.org/projects/doctrine-orm/en/2.11/cookbook/mysql-enums.html#solution-2-defining-a-type

So the proposed solution is already used by many developers in many projects.

And it seems logical to replace old-fashioned approach with EnumSomethingType class with a native Something enum (without duplicating its content in EnumSomethingType), especially when people already use it quite a lot.

@greg0ire
Copy link
Member

I have myself used enums in the past year, and I must say I have been burnt, because of how key the "never-changing values without any additional data" part is. I had this table with millions of rows that I needed to migrate to allow one more value, and it was very painful. So you see, it's not religion or superstition, it's based on experience.

Knife also can lead to devastating results, but at the same time it's very efficient tool for the appropriate purpose.

This is a great analogy. And I was saying that knives shouldn't be easily accessible to anybody. But since you still have a define a DBAL type, maybe this is acceptable.

It is very memory-efficient, especially when we talk about millions of records,

If you're that worried about memory, you can have a lookup table with an id column and a column to hold the value for the enum, and pick an appropriate type for the id column, see https://stackoverflow.com/questions/761211/how-to-handle-enumerations-without-enum-fields-in-a-database/761343#761343

And we were given a solution to make it possible with ORM: https://www.doctrine-project.org/projects/doctrine-orm/en/2.11/cookbook/mysql-enums.html#solution-2-defining-a-type

Maybe we should alter this cookbook… in the example, a status is used, which is definitely not something with a fixed set of values that will never evolve. Also, the fact that there is a cookbook entry for this shows that we have no intention of supporting ENUM in directly Doctrine, and we warn against it:

But first a word of warning. The MySQL Enum type has considerable downsides:

Adding new values requires to rebuild the whole table, which can take hours depending on the size.
Enums are ordered in the way the values are specified, not in their "natural" order.
Enums validation mechanism for allowed values is not necessarily good, specifying invalid values leads to an empty enum for the default MySQL error settings. You can easily replicate the "allow only some values" requirement in your Doctrine entities.

I'm not favorable to this PR, but maybe it's just me, let's see if other maintainers think this is fine. Regardless of the benefits and drawbacks of ENUM, I find it weird to forward information to the DBAL if it natively will not do anything with it.

@javer
Copy link
Contributor Author

javer commented Jan 15, 2022

I have myself used enums in the past year, and I must say I have been burnt, because of how key the "never-changing values without any additional data" part is. I had this table with millions of rows that I needed to migrate to allow one more value, and it was very painful.

Online altering of ENUM values (without table rebuilding) is supported since MySQL 8.0.12 (2018-07-27), see Online DDL Operations:

Modifying the definition of an ENUM or SET column by adding new enumeration or set members to the end of the list of valid member values may be performed instantly or in place, as long as the storage size of the data type does not change.

So it is a fast operation since 2018, at least when you add a new value to the end (the most often use case in my experience).

If you're that worried about memory, you can have a lookup table with an id column and a column to hold the value for the enum, and pick an appropriate type for the id column, see https://stackoverflow.com/questions/761211/how-to-handle-enumerations-without-enum-fields-in-a-database/761343#761343

It might be the case for a very tiny application, but no more. One of our application has 84 EnumTypes, so instead of creating 84 tiny classes with enumerated values we should:

  • create 84 lookup tables and fill them with data
  • create 84 entities for them
  • create 84 repositories to lookup through them
  • wrap each of them to some cache layer, because it's waste of CPU and I/O to fetch a lookup table every time we need a key from it
  • or hardcode table content as constants in entities (which is even worse)
  • create some monitors to invalidate these caches
  • support all this bunch of code for lifetime

Well, it seems too much work for such an easy task, isn't it? Remember we need to create business value also, not only write code for the code itself. And there is already existing solution out-of-the-box, why not to use it if you know where and how to use it? Again, I'm not evangelist of ENUM and don't suggest using it for everybody and everywhere, but let's not reject the things which do their job efficiently. It's like a supercars, they don't fit to everybody, but they greatly do their job for people who can control them.

Enums validation mechanism for allowed values is not necessarily good

It's not true since PHP Enums, because now the value is validated far before persisting it to the DB, because Enum won't be instantiated at all. That is one of the reasons why I prefer PHP Enums over EnumTypes.

I find it weird to forward information to the DBAL if it natively will not do anything with it.

Yes, it also confuses me, but it just informs DBAL that a column holds Enum, this information might be necessary in the future for full-fledged support of PHP Enums, for example, for handling Enum default values or something else.

Copy link
Member

@beberlei beberlei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a simple and good enough change to do. All other things set in a @Column are exposed to Schema as well, so this should too.

// the 'default' option can be overwritten here
$options = $this->gatherColumnOptions($mapping) + $options;

if (isset($mapping['enumType'])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this code should be moved into gatherColumnOptions, because that is where it should come from.

@beberlei
Copy link
Member

The pros and cons of using an ENUM type are irrelevant to this change imho, its a choice everyone can make, and this just exposes the value to Schema component, similar to all other data that is available when set in ORM mappings.

Copy link
Member

@greg0ire greg0ire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the cookbook you mentioned be changed in any way? As I said, I find it weird to forward something to the DBAL if it does not natively do something with it, but if at least the possible use of this option is mentioned somewhere, that's already better.

@derrabus
Copy link
Member

derrabus commented Jan 15, 2022

I'm fine with the change after reading the discussion. But:

  • We should add a test for it. We're forwarding a piece of information that we don't use anywhere. If we break this feature, there should be a test in place that lets us know.
  • We should add documentation, at least a cookbook entry about how to use the forwarded metatdata.

@beberlei
Copy link
Member

beberlei commented Jan 15, 2022

SchemaToolTest already does a lot in this area, it can probably be extended for this.

Docs can be done seperately, they haven't existed before either.

@javer javer force-pushed the expose-enum-type-to-dbal branch 3 times, most recently from b03983d to 9ccf93a Compare January 16, 2022 08:51
@javer
Copy link
Contributor Author

javer commented Jan 16, 2022

@beberlei Thank you! I've moved the code to gatherColumnOptions() and added the unit test.

Should we target this change to 2.11.x instead?

@beberlei
Copy link
Member

@javer Why not, this is somewhat an omission of the enum feature PR and can be considered a bug for adding this without integrating in all the proper places. Its also tiny enough to be a low risk for breakage if we consider this a new feature instead.

@javer javer changed the base branch from 2.12.x to 2.11.x January 16, 2022 19:53
@javer javer force-pushed the expose-enum-type-to-dbal branch from 9ccf93a to d02d9c7 Compare January 16, 2022 19:54
@javer
Copy link
Contributor Author

javer commented Jan 16, 2022

@beberlei Done, I've rebased on 2.11.x and changed target branch to 2.11.x.

@javer javer force-pushed the expose-enum-type-to-dbal branch from d02d9c7 to b5c94c3 Compare January 17, 2022 09:27
@derrabus derrabus added the Bug label Jan 17, 2022
@derrabus derrabus added this to the 2.11.1 milestone Jan 17, 2022
@derrabus derrabus merged commit 223b265 into doctrine:2.11.x Jan 17, 2022
@derrabus
Copy link
Member

Thank you, @javer!

@javer javer deleted the expose-enum-type-to-dbal branch January 17, 2022 10:51
derrabus added a commit that referenced this pull request Jan 18, 2022
* Expose enumType to DBAL to make native DB Enum possible (#9382)

* Accessing private properties and methods from the same class is forbidden (#9311)

Resolves issue doctrine/common#934

Update docs/en/cookbook/accessing-private-properties-of-the-same-class-from-different-instance.rst

Co-authored-by: Claudio Zizza <[email protected]>

Update docs/en/cookbook/accessing-private-properties-of-the-same-class-from-different-instance.rst

Co-authored-by: Claudio Zizza <[email protected]>

Fix review issues

* Update baselines for DBAL 3.3 (#9393)

Co-authored-by: Vadim Borodavko <[email protected]>
Co-authored-by: olsavmic <[email protected]>
derrabus added a commit to derrabus/orm that referenced this pull request Jan 18, 2022
* 2.12.x:
  Deprecate MultiGetRegion (doctrine#9397)
  Fix type on loadCacheEntry (doctrine#9398)
  Update baselines for DBAL 3.3 (doctrine#9393)
  Accessing private properties and methods from the same class is forbidden (doctrine#9311)
  Expose enumType to DBAL to make native DB Enum possible (doctrine#9382)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants