Skip to content

MDEV-31669: use XXH3 as a digest instead of md5#4573

Open
MohamedM216 wants to merge 1 commit intoMariaDB:mainfrom
MohamedM216:feat-use-sha2-256/MDEV-31669
Open

MDEV-31669: use XXH3 as a digest instead of md5#4573
MohamedM216 wants to merge 1 commit intoMariaDB:mainfrom
MohamedM216:feat-use-sha2-256/MDEV-31669

Conversation

@MohamedM216
Copy link
Copy Markdown
Contributor

@MohamedM216 MohamedM216 commented Jan 21, 2026

Jira Issue number for this PR: MDEV-31669

I followed mysql implementation as suggested in jira.

PERFORMANCE SCHEMA MD5 DIGEST NEEDS TO CHANGE DIGEST FOR FIPS
COMPLIANCE
Before this fix, DIGEST hashes are computed using the MD5 hash.
MD5 is not FIPS compliant.
This fix replaces the MD5 hash with SHA256 (which is compliant).
As a result, DIGEST columns are changed from VARCHAR(32) to VARCHAR(64)

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jan 22, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a preliminary review. Right now there's no strong opposition against removing MD5 from this. But the choice of a replacement hash function is yet to be determined. I can see the logic in Sergei's argument that using a crypto hash for a non-crypto purpose is kind of a mismatch. But I will leave that discussion for the final review now.

First of all: please make sure the change compiles and runs all buildbot test in a satisfactory way.

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 899f3b3 to 3e7b460 Compare January 22, 2026 22:33
@MohamedM216 MohamedM216 requested a review from gkodinov January 23, 2026 04:39
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please switch this to XXH3, as suggested by Sergei. I will then do a full review of the result once this is done.

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 3e7b460 to 241cb0c Compare February 18, 2026 15:32
@MohamedM216 MohamedM216 changed the title MDEV-31669: use sha2-256 as a digest instead of md5 MDEV-31669: use XXH3 as a digest instead of md5 Feb 18, 2026
@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 241cb0c to ea765b3 Compare February 19, 2026 12:43
@MohamedM216
Copy link
Copy Markdown
Contributor Author

Apologies, I accidentally closed this PR.
I will reopen it shortly.

@MohamedM216 MohamedM216 reopened this Feb 19, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for switching to XXhash.

OK to push from me, given that you add a commit message to the commit that follows CODING_STANDARDS.md.

Please wait for the final review.

@gkodinov gkodinov requested a review from vuvova February 19, 2026 14:47
@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from ea765b3 to 233436e Compare February 19, 2026 23:28
@MohamedM216
Copy link
Copy Markdown
Contributor Author

Hi @gkodinov , I've updated the commit message. Thanks!

Copy link
Copy Markdown
Member

@vuvova vuvova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main comment here: I've looked at xxhash.h that you're using and noticed that it has 128-bit hash. This is the same width as MD5 so less changes and also same chance of collison, 64-bit hash has it much higher.

could you try with XXH3_128bit please?


--disable_query_log
create table test._digests(d varchar(32) primary key);
create table test._digests(d varchar(16) primary key);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's keep it at 32 here. It'd be good to know that existing applications (that query and, perhaps, temporarily store these digests) will continue to work

compute_md5_hash(md5,
(const char *) digest_storage->m_token_array,
digest_storage->m_byte_count);
XXH64_hash_t res = XXH3_64bits(digest_storage->m_token_array,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, perhaps we should use XXH3_128bits? it exists in the same file, so as easy to use. but has the same width as md5, so less changes and more importantly same chance of collisions as before.

@MohamedM216
Copy link
Copy Markdown
Contributor Author

OK, I'll work on that. Thanks for the feedback!

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch 3 times, most recently from bc14604 to 8ad7bd6 Compare February 26, 2026 23:42
@MohamedM216 MohamedM216 requested a review from vuvova February 26, 2026 23:55
Copy link
Copy Markdown
Member

@vuvova vuvova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Changes are small and from the user point of view nothing changes whatsoever. Except that the digest is faster and it no longer fails under strict FIPS mode.

just a couple of very minimal changes and I'll approve.

begin
declare digest_exists tinyint;
if length(digest) != 32 or conv(digest, 16, 10) = 0 then
if length(digest) != 32 or digest is null then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change probably can be reverted completely

uint m_schema_name_length;
/** Column DIGEST. */
char m_digest[COL_DIGEST_SIZE];
char m_digest[DIGEST_HASH_TO_STRING_LENGTH + 1];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space?

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 8ad7bd6 to 1a0ae81 Compare February 28, 2026 19:34
@MohamedM216 MohamedM216 requested a review from vuvova February 28, 2026 19:36
@vuvova
Copy link
Copy Markdown
Member

vuvova commented Feb 28, 2026

@MohamedM216, thanks, looks good. This is what happens now: MDEV-31669 is moved to the "In-Testing" status, if everything is fine, a tester will approve it and it'll be pushed into main in time for the 13.0.1 release which is planned for the beginning of May.

There is nothing you need to to anymore, unless testing finds bugs.

@MohamedM216
Copy link
Copy Markdown
Contributor Author

👍

@mariadb-RoelVandePaar
Copy link
Copy Markdown

mariadb-RoelVandePaar commented Mar 16, 2026

Small typo (s/128/32/):

/* XXH3_128bits = 16 bytes of binary = 128 printable characters */

Should read, for example:

/* XXH3_128bits: 128 Bits = 16 Bytes Binary = 32 Printable characters */

Another small typo:

  /* Comchpute digest hash of the tokens received. */

Should read:

  /* Compute digest hash of the tokens received. */

Also, in the commit message,

- Following MySQL implementation as mentioned in the PR description

Does not seem applicable.

@mariadb-RoelVandePaar
Copy link
Copy Markdown

@MohamedM216 Hi! Testing is ongoing, especially in the area of performance, ref MDEV-31669. Please see my last comment from above & please also add a MTR testcase. For example, you could show how before the patch a given MD5 hash is expected while after the patch a XXH3 hash is present. For an example/some ideas see this comment (it's a long testcase; there is no need for that, a simple sanity level check will do!). Thank you

@mariadb-RoelVandePaar
Copy link
Copy Markdown

@MohamedM216 Hi! Please see this message. Thank you!

@MohamedM216
Copy link
Copy Markdown
Contributor Author

Hi @mariadb-RoelVandePaar , I’ve seen the latest updates, that’s great news! Thank you for taking the time to test the PR, I really appreciate it. I’ve been a bit busy over the past few days, but I’ll resume working on the remaining tasks shortly.

@mariadb-RoelVandePaar
Copy link
Copy Markdown

Thank you!

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 1a0ae81 to 1520fc7 Compare April 4, 2026 06:07
@MohamedM216
Copy link
Copy Markdown
Contributor Author

Changes applied. Looking forward to your review.

@MohamedM216 MohamedM216 requested a review from vuvova April 4, 2026 06:09
Copy link
Copy Markdown
Member

@vuvova vuvova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, please, remove this new test. I'll better revert 485773a instead. But separately, not as a part of your PR.

- XXH is significantly faster than both MD5 and SHA-256
- Here we use the 128-bit XXH3
- Keep DIGEST_HASH_SIZE at 16 bytes (same as MD5)
- Reduce COL_DIGEST_SIZE to 32 chars
- Keep DIGEST columns as VARCHAR(32)
@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 1520fc7 to 437fa81 Compare April 4, 2026 10:04
@MohamedM216 MohamedM216 requested a review from vuvova April 4, 2026 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

4 participants