@chungeun-choi
Contributor

@chungeun-choi chungeun-choi commented Aug 23, 2023

Overview

Related issue: #430

When using the pymysqlreplication package, the InnoDB history list length on the MySQL server kept growing under the following circumstances:

  • MySQL version 8.x or higher.
  • Continuous execution of DML (Data Manipulation Language) statements.

As the history list length grew, SELECT queries gradually slowed down.

For details, see the following reference:

https://minervadb.xyz/troubleshooting-innodb-history-length-with-hung-mysql-transaction/

I investigated the affected code and made the necessary modifications. The performance before and after the fix can be compared using monitoring tools such as Prometheus and MySQL Exporter.
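For a quick check without a full monitoring stack, the history list length can also be read from the `Status` text returned by `SHOW ENGINE INNODB STATUS`, which contains a line of the form `History list length N`. The following is a minimal sketch of extracting that number. The function name `parse_history_list_length` and the sample text are illustrative assumptions and are not part of pymysqlreplication.

```python
import re
from typing import Optional

# Matches the "History list length <N>" line in the TRANSACTIONS section
# of the text returned by SHOW ENGINE INNODB STATUS.
_HISTORY_RE = re.compile(r"^History list length\s+(\d+)\s*$", re.MULTILINE)


def parse_history_list_length(innodb_status: str) -> Optional[int]:
    """Return the history list length found in the status text, or None."""
    match = _HISTORY_RE.search(innodb_status)
    return int(match.group(1)) if match else None


# Made-up excerpt used only to exercise the parser; real output is much longer.
SAMPLE_STATUS = """\
------------
TRANSACTIONS
------------
Trx id counter 48213
History list length 1523
"""

if __name__ == "__main__":
    print(parse_history_list_length(SAMPLE_STATUS))
```

Sampling this value periodically while DML is running should show whether it keeps climbing (the behavior described above) or stays roughly flat.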

Fixed

  • Modified code

    def __get_table_information(self, schema, table):
        for i in range(1, 3):
            try:
                if not self.__connected_ctl:
                    self.__connect_to_ctl()
                cur = self._ctl_connection.cursor()
                cur.execute("""
                    SELECT
                        COLUMN_NAME, COLLATION_NAME, CHARACTER_SET_NAME,
                        COLUMN_COMMENT, COLUMN_TYPE, COLUMN_KEY, ORDINAL_POSITION,
                        DATA_TYPE, CHARACTER_OCTET_LENGTH
                    FROM
                        information_schema.columns
                    WHERE
                        table_schema = %s AND table_name = %s
                    ORDER BY ORDINAL_POSITION
                    """, (schema, table))
                result = cur.fetchall()
                cur.close()
                return result
            ...
            ...

Performance Comparison After the Code Modifications

  • Before modification

    image

  • After modification

    image

@chungeun-choi
Contributor Author

We also found that part of the problem can be solved by enabling autocommit on the control connection, so we added that setting:

    def __connect_to_ctl(self):
        if not self._ctl_connection_settings:
            self._ctl_connection_settings = dict(self.__connection_settings)
        self._ctl_connection_settings["db"] = "information_schema"
        self._ctl_connection_settings["cursorclass"] = DictCursor
        self._ctl_connection_settings["autocommit"] = True  # Changed
        self._ctl_connection = self.pymysql_wrapper(**self._ctl_connection_settings)
        self._ctl_connection._get_table_information = self.__get_table_information
        self.__connected_ctl = True

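A likely reason the autocommit setting matters: PyMySQL connections default to `autocommit=False`, so a metadata SELECT on the control connection implicitly opens a transaction that stays open until a commit or rollback. Under InnoDB's default REPEATABLE READ isolation, that long-lived read view can keep old row versions from being purged, which would match the growing history list length. The sketch below isolates just the settings-derivation step as a plain function. The name `build_ctl_settings` is hypothetical, and the `cursorclass` entry is left out so the example runs without PyMySQL installed.

```python
from typing import Any, Dict


def build_ctl_settings(connection_settings: Dict[str, Any]) -> Dict[str, Any]:
    """Derive control-connection settings from the stream connection settings.

    Hypothetical helper for illustration only; it is not part of the library.
    """
    # Copy first so the settings used by the replication stream stay untouched.
    ctl_settings = dict(connection_settings)
    # Table metadata is read from information_schema.
    ctl_settings["db"] = "information_schema"
    # End each statement's transaction immediately instead of leaving it open.
    ctl_settings["autocommit"] = True
    return ctl_settings


if __name__ == "__main__":
    # Placeholder credentials, assumed for the example.
    base = {"host": "127.0.0.1", "port": 3306, "user": "repl", "passwd": ""}
    print(build_ctl_settings(base))
```

Working on a copy means only the control connection gets `autocommit`; the caller's original settings dictionary is left unchanged.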
@dongwook-chan
Collaborator

This PR makes it feasible for major corporations handling extensive traffic to utilize this library. It addresses potential query response lags and reduces the strain on database servers. One of the standout features of the replication protocol is its minimal impact on database performance, and this PR ensures the library capitalizes on that strength.

@julien-duponchelle
Owner

Great, thanks for the investigation. Small fix but big problem indeed.

@julien-duponchelle julien-duponchelle merged commit 4c2dcf2 into julien-duponchelle:main Aug 23, 2023