@chaen chaen commented Aug 19, 2022

The goal is to compare what the MySQL optimizer does in versions 5.7 and 8.0 when running the integration tests of the DFC.
This PR adds an environment variable, DIRAC_MYSQL_OPTIMIZER_TRACES_PATH, which should point to an existing directory. Every call to a method of MySQL (_query, _update, executeStoredProcedure, executeStoredProcedureWithCursor) will then create a JSON file in that directory containing the optimizer traces (https://dev.mysql.com/doc/internals/en/optimizer-tracing.html).
The logic is in ff45676.

This is a very advanced optimization tool, so the decorator itself may lead to crashes, but that's by design (see the code documentation). If the environment variable is NOT set, the decorator is not even applied.
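
To illustrate the "applied only if the variable is set" behaviour described above, here is a minimal sketch. It is not the code from ff45676: `trace_to_json` and `maybe_traced` are hypothetical names, and the trace content is a placeholder (a real implementation would presumably read `information_schema.OPTIMIZER_TRACE` on the same connection). Only the environment variable name comes from this PR.

```python
import functools
import json
import os
import tempfile
import time


def trace_to_json(traces_dir):
    """Hypothetical decorator factory: write one JSON file per decorated call."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # Placeholder trace. The real decorator would collect the
            # optimizer traces from the database instead.
            traces = [{"query": repr(args), "trace": {"steps": []}}]
            # No error handling on purpose: a missing directory raises,
            # mirroring the "crashes by design" note above.
            fd, _path = tempfile.mkstemp(
                prefix=f"{func.__name__}_trace_{time.time()}_",
                suffix=".json",
                dir=traces_dir,
            )
            with os.fdopen(fd, "w") as handle:
                json.dump(traces, handle)
            return result

        return wrapper

    return decorator


def maybe_traced(func, environ=os.environ):
    """Wrap func only when DIRAC_MYSQL_OPTIMIZER_TRACES_PATH is set."""
    traces_dir = environ.get("DIRAC_MYSQL_OPTIMIZER_TRACES_PATH")
    if not traces_dir:
        return func  # variable not set: the function is returned untouched
    return trace_to_json(traces_dir)(func)
```

With the variable unset, `maybe_traced(f)` returns `f` itself, so there is no wrapper and no overhead on the normal code path.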

Example of the JSON file:
[
  {
    "query": "call ps_get_lfns_from_guids(\"'1000'\")",
    "trace": {
      "steps": []
    }
  },
  {
    "query": "SET @sql = CONCAT('SELECT SQL_NO_CACHE GUID, CONCAT(d.Name, \"/\", f.FileName) FROM FC_Files f JOIN FC_DirectoryList d on f.DirID = d.DirID WHERE GUID IN (', guids, ')')",
    "trace": {
      "steps": []
    }
  },
  {
    "query": "SELECT SQL_NO_CACHE GUID, CONCAT(d.Name, \"/\", f.FileName) FROM FC_Files f JOIN FC_DirectoryList d on f.DirID = d.DirID WHERE GUID IN ('1000')",
    "trace": {
      "steps": [
        {
          "join_preparation": {
            "select#": 1,
            "steps": [
              {
                "expanded_query": "/* select#1 */ select `f`.`GUID` AS `GUID`,concat(`d`.`Name`,'/',`f`.`FileName`) AS `CONCAT(d.Name, \"/\", f.FileName)` from (`FC_Files` `f` join `FC_DirectoryList` `d` on((`f`.`DirID` = `d`.`DirID`))) where (`f`.`GUID` = '1000')"
              },
              {
                "transformations_to_nested_joins": {
                  "transformations": [
                    "JOIN_condition_to_WHERE",
                    "parenthesis_removal"
                  ],
                  "expanded_query": "/* select#1 */ select `f`.`GUID` AS `GUID`,concat(`d`.`Name`,'/',`f`.`FileName`) AS `CONCAT(d.Name, \"/\", f.FileName)` from `FC_Files` `f` join `FC_DirectoryList` `d` where ((`f`.`GUID` = '1000') and (`f`.`DirID` = `d`.`DirID`))"
                }
              }
            ]
          }
        }
      ]
    }
  },
  {
    "query": "SELECT SQL_NO_CACHE GUID, CONCAT(d.Name, \"/\", f.FileName) FROM FC_Files f JOIN FC_DirectoryList d on f.DirID = d.DirID WHERE GUID IN ('1000')",
    "trace": {
      "steps": [
        {
          "join_optimization": {
            "select#": 1,
            "steps": [
              {
                "condition_processing": {
                  "condition": "WHERE",
                  "original_condition": "((`f`.`GUID` = '1000') and (`f`.`DirID` = `d`.`DirID`))",
                  "steps": [
                    {
                      "transformation": "equality_propagation",
                      "resulting_condition": "(multiple equal('1000', `f`.`GUID`) and multiple equal(`f`.`DirID`, `d`.`DirID`))"
                    },
                    {
                      "transformation": "constant_propagation",
                      "resulting_condition": "(multiple equal('1000', `f`.`GUID`) and multiple equal(`f`.`DirID`, `d`.`DirID`))"
                    },
                    {
                      "transformation": "trivial_condition_removal",
                      "resulting_condition": "(multiple equal('1000', `f`.`GUID`) and multiple equal(`f`.`DirID`, `d`.`DirID`))"
                    }
                  ]
                }
              },
              {
                "substitute_generated_columns": {}
              },
              {
                "table_dependencies": [
                  {
                    "table": "`FC_Files` `f`",
                    "row_may_be_null": false,
                    "map_bit": 0,
                    "depends_on_map_bits": []
                  },
                  {
                    "table": "`FC_DirectoryList` `d`",
                    "row_may_be_null": false,
                    "map_bit": 1,
                    "depends_on_map_bits": []
                  }
                ]
              },
              {
                "ref_optimizer_key_uses": [
                  {
                    "table": "`FC_Files` `f`",
                    "field": "DirID",
                    "equals": "`d`.`DirID`",
                    "null_rejecting": true
                  },
                  {
                    "table": "`FC_Files` `f`",
                    "field": "GUID",
                    "equals": "'1000'",
                    "null_rejecting": true
                  },
                  {
                    "table": "`FC_DirectoryList` `d`",
                    "field": "DirID",
                    "equals": "`f`.`DirID`",
                    "null_rejecting": true
                  }
                ]
              },
              {
                "rows_estimation": [
                  {
                    "table": "`FC_Files` `f`",
                    "rows": 1,
                    "cost": 1,
                    "table_type": "const",
                    "empty": false
                  },
                  {
                    "table": "`FC_DirectoryList` `d`",
                    "rows": 1,
                    "cost": 1,
                    "table_type": "const",
                    "empty": false
                  }
                ]
              },
              {
                "condition_on_constant_tables": "true",
                "condition_value": true
              },
              {
                "attaching_conditions_to_tables": {
                  "original_condition": "true",
                  "attached_conditions_computation": [],
                  "attached_conditions_summary": []
                }
              },
              {
                "refine_plan": []
              }
            ]
          }
        },
        {
          "join_execution": {
            "select#": 1,
            "steps": []
          }
        }
      ]
    }
  }
]
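
For comparing traces between two MySQL versions, a small helper that lists the top-level optimizer phases per query can be handy. The sketch below assumes the file layout shown in the example above (a list of objects with lowercase `query` and `trace` keys); `summarize_traces` is a hypothetical helper, not part of this PR, and the sample queries are abbreviated stand-ins.

```python
import json


def summarize_traces(entries):
    """Return a (query, [top-level step names]) pair for each trace entry."""
    summary = []
    for entry in entries:
        steps = entry.get("trace", {}).get("steps", [])
        # Each step is a single-key object such as {"join_optimization": {...}}
        names = [name for step in steps for name in step]
        summary.append((entry.get("query"), names))
    return summary


# Abbreviated sample in the same shape as the example file above
sample = json.loads("""
[
  {"query": "call some_procedure()", "trace": {"steps": []}},
  {"query": "SELECT 1", "trace": {"steps": [
      {"join_optimization": {"select#": 1, "steps": []}},
      {"join_execution": {"select#": 1, "steps": []}}
  ]}}
]
""")

for query, phases in summarize_traces(sample):
    print(query, phases)
```

Running the same helper on the files produced under 5.7 and 8.0 gives two short lists that are easier to diff than the full traces.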

One important change in this PR is a73ba15. As a consequence, keyword parameters MUST be passed by name if used. I think I fixed all the call sites in 14531d1.
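
As an illustration of what "MUST be named" means for callers, here is a sketch that assumes the change amounts to making optional parameters keyword-only (the content of a73ba15 is not shown here). `FakeMySQL` is a hypothetical stand-in, not the DIRAC class, and its signature is an assumption.

```python
class FakeMySQL:
    """Hypothetical stand-in used only to show keyword-only parameters."""

    def _query(self, cmd, *, conn=None):
        # Everything after the bare '*' can only be passed by name
        return {"cmd": cmd, "conn": conn}


db = FakeMySQL()

# Passing the parameter by name works
ok = db._query("SELECT 1", conn="my-connection")

# Passing it positionally raises a TypeError with an explicit message
try:
    db._query("SELECT 1", "my-connection")
except TypeError as exc:
    error = str(exc)
    print(error)
```

A call site that still passes the parameter positionally fails immediately with a clear `TypeError`, rather than misbehaving silently.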

BEGINRELEASENOTES
*Core
NEW: Introduce DIRAC_MYSQL_OPTIMIZER_TRACES_PATH environment variable for advanced optimization of MySQL calls
CHANGE: calls to the MySQL class with keyword parameters MUST now be named

ENDRELEASENOTES

@chaen chaen force-pushed the v8.0_FEAT_mysqlOptimizer branch from 86cf375 to 90906e4 Compare August 19, 2022 21:05
@chaen chaen marked this pull request as ready for review August 20, 2022 10:26
chaen commented Aug 20, 2022

I think it is ready for review. It has the potential to break a few things, but the error message would be quite clear. I could not find any issue. I'll add something to the wiki about it anyway.

@chrisburr chrisburr merged commit 19bc77e into DIRACGrid:integration Aug 24, 2022
@DIRACGridBot DIRACGridBot added the sweep:ignore Prevent sweeping from being ran for this PR label Aug 24, 2022
.. code-block:: bash

    cd ${DIRAC_MYSQL_OPTIMIZER_TRACES_PATH}
    c=0; for i in $(ls); do newFn=$(echo $i | sed -E "s/_trace_[0-9]+.[0-9]+_(.*)/_trace_${c}_\1/g"); mv $i $newFn; c=$(( c + 1 )); done
Suggested change
c=0; for i in $(ls); do newFn=$(echo $i | sed -E "s/_trace_[0-9]+.[0-9]+_(.*)/_trace_${c}_\1/g"); mv $i $newFn; c=$(( c + 1 )); done
c=0; for i in *; do newFn=$(echo $i | sed -E "s/_trace_[0-9]+.[0-9]+_(.*)/_trace_${c}_\1/g"); mv $i $newFn; c=$(( c + 1 )); done
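
For anyone who prefers to avoid the shell loop, the same renumbering can be sketched in Python. This assumes file names of the form `<something>_trace_<timestamp with a decimal point>_<rest>`, as implied by the `sed` expression. `renumber_traces` is a hypothetical helper, and unlike the shell version it escapes the dot in the pattern.

```python
import re
from pathlib import Path

# Matches "_trace_<digits>.<digits>_<rest>" as in the sed expression
TRACE_RE = re.compile(r"_trace_[0-9]+\.[0-9]+_(.*)")


def renumber_traces(directory):
    """Replace the timestamp in each trace file name by a running counter."""
    new_names = []
    # sorted() materializes the listing before any file is renamed
    for counter, path in enumerate(sorted(Path(directory).iterdir())):
        new_name = TRACE_RE.sub(
            lambda match: f"_trace_{counter}_{match.group(1)}", path.name
        )
        if new_name != path.name:
            path.rename(path.with_name(new_name))
        new_names.append(new_name)
    return new_names
```

As in the shell loop, the counter advances for every directory entry, whether or not its name matches the pattern.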

* A list of dictionaries, one per trace for the specific call:
* ``{ "Query": <query executed>, "Trace" : <optimizer analysis>}`` if all is fine
* ``{"Error": <the error>}`` in case something goes wrong. See the lower in the code
Suggested change
* ``{"Error": <the error>}`` in case something goes wrong. See the lower in the code
* ``{"Error": <the error>}`` in case something goes wrong. See below in this function

Assuming this is what you meant (lines 294ff)

* ``{"Error": <the error>}`` in case something goes wrong. See the lower in the code
for the description of errors
Add something like this, (if I understand correctly)?

Suggested change
To make use of this functionality in a database implementation, calls to :func:`~DIRAC.Core.Utilities.MySQL.MySQL._query`, :func:`~DIRAC.Core.Utilities.MySQL.MySQL._update`, and :func:`~DIRAC.Core.Utilities.MySQL.MySQL.executeStoredProcedure` must explicitly make use of the ``conn`` keyword argument.
