
Optimize message memmoves#501

Merged
anarthal merged 16 commits into boostorg:develop from vsoulgard:feature/462-optimize-packet-read
Mar 29, 2026

Conversation

@vsoulgard
Contributor

Solves #462

Instead of memmoving everything on every call to message_reader::prepare_buffer(), we can use an Exponential Moving Average (EMA) to predict the next message size. If the predicted message size fits into the buffer, we can avoid the memmove.

Callgrind screenshots for boost_mysql_bench_one_big_row_boost: before (big_row_v1) and after (big_row_v4).

Before:
  Total instructions: 1,184,758,337
  Memcpy instructions: 1,030,331,155

After:
  Total instructions: 154,287,054 (7.6× less)
  Memcpy instructions: 485,896 (2120× less)

The EMA is almost free to compute and has high predictive power.
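
The "almost free" aspect comes from the fact that an EMA with a power-of-two smoothing factor needs no multiplication or division. As a minimal sketch of the idea (the class name and interface are hypothetical, not the PR's actual code, and alpha = 1/8 is an assumed value):

```cpp
#include <cstddef>

// Hypothetical illustration of an integer-only EMA size predictor.
class message_size_predictor
{
    std::size_t ema_{0};

public:
    // ema = (7/8) * ema + (1/8) * size, using only shifts and adds
    void on_message(std::size_t size) noexcept
    {
        ema_ = ema_ - (ema_ >> 3) + (size >> 3);
    }

    // Predicted size of the next message
    std::size_t predicted_size() const noexcept { return ema_; }
};
```

Feeding it a stream of similarly sized messages converges the prediction towards that size within a few dozen updates.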

@vsoulgard vsoulgard marked this pull request as draft March 17, 2026 02:55
@vsoulgard vsoulgard marked this pull request as ready for review March 17, 2026 06:08
@anarthal
Collaborator

Hi! Thanks for your PR.

I think this is a step in the right direction. I'm not convinced about the heuristic that decides whether we memmove or not, though. Consider:

  • The one_big_row benchmark. The average row size is around 90kB. The predicted packet size is between 2 and 8kB however. This is because of the metadata packets that get sent before the row.
  • A connection pool shared between several logical sessions. I don't expect much relation between the packet sizes for each session.

So I'm not convinced about the actual packet size being predictable.

I think that this approach is working well for the one_big_row benchmark because it's preventing memmoves until the free area gets smaller than 2kB.

I'd explore the possibility of approaching this as a cost/benefit analysis, instead. If we decide to memmove, we gain buffer_.reserved_size() at the expense of memmoving buffer_.current_message_size() + buffer_.pending_size(). We can explore heuristics like:

  • If we would be memmoving very few bytes, do the memmove.
  • If a memmove would prevent a reallocation, memmove.
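
A minimal sketch of such a cost/benefit check, with hypothetical parameter names (not the actual read_buffer interface) and an arbitrary threshold:

```cpp
#include <cstddef>

// Decide whether compacting (memmoving) the buffer is worth it.
// All parameters are hypothetical stand-ins for illustration.
bool should_compact(
    std::size_t reserved,   // bytes reclaimed by compacting (the "gain")
    std::size_t to_move,    // bytes that would be memmoved (the "cost")
    std::size_t free_space, // bytes currently free at the end of the buffer
    std::size_t required    // bytes needed for the next read
)
{
    // Heuristic 1: moving very few bytes is cheap, so just do it.
    constexpr std::size_t cheap_move_threshold = 512; // arbitrary value
    if (to_move <= cheap_move_threshold) return true;

    // Heuristic 2: compact if that alone would avoid a reallocation.
    if (required > free_space && required <= free_space + reserved) return true;

    return false;
}
```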

We also need to consider the buffer's limit size. A message that fits the limit shouldn't cause an error, regardless of how many bytes we had in the reserved area.

Thanks for working on this. Let me know if you need any help.

Regards,
Rubén.

@vsoulgard
Contributor Author

Thanks for the review! You are right: trying to calculate average message size is unsuitable when working with MySQL protocol and connection pools.

I pushed a new solution that follows your suggestions:

  • Memmove if the amount of data is small.
  • Memmove if it could prevent a reallocation.

However, the benchmarks aren't showing the same improvement as with my previous suggestion. I did some more investigation, and here is why: my previous EMA-based solution would trigger a reallocation that allocated a buffer larger than the incoming row size. This extra capacity allowed additional memmoves to be avoided. In the standard boost_mysql_bench_one_big_row_boost benchmark, however, the buffer size ends up exactly the same as the size of the received row. This leaves no spare capacity, forcing a memmove every time.

@vsoulgard vsoulgard changed the title from "Optimize message memmoves by predicting next message size" to "Optimize message memmoves" Mar 22, 2026
@vsoulgard vsoulgard requested a review from anarthal March 22, 2026 20:02
@anarthal
Collaborator

This looks more sound to me, thanks.

The "would prevent reallocation" condition seems to be missing a piece. We're currently checking whether the required space is greater than free_size(), which answers "would we need a reallocation if we don't compact?". But if the requested size is big enough, we might still need a reallocation even if we compact.

I think what you propose in #502 is sound. Do you want to implement it as part of this PR?

Thanks,
Rubén.

@vsoulgard
Contributor Author

If the requested size is big enough to still cause a reallocation, then compacting the buffer reduces the reallocation cost. That's why it seems quite logical to me.

Yes, I would love to implement #502 proposal in this PR.

@anarthal
Collaborator

If the requested size is big enough to still cause a reallocation, then compacting the buffer reduces the reallocation cost. That's why it seems quite logical to me.

Fair. Go ahead with this approach.

@vsoulgard
Contributor Author

read_buffer now grows to powers of two.

Combined, these changes reduce boost_mysql_bench_one_big_row_boost to 147k instructions!
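
Growing to powers of two makes repeated growth amortized constant time. A simple loop-based sketch of such a policy (the function name is hypothetical and overflow handling is omitted for brevity):

```cpp
#include <cstddef>

// Round the capacity up to the smallest power of two that fits `required`.
// Hypothetical illustration; overflow handling omitted.
std::size_t grow_capacity(std::size_t current, std::size_t required)
{
    std::size_t cap = current ? current : 1;
    while (cap < required)
        cap *= 2; // doubling gives amortized O(1) growth
    return cap;
}
```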

@anarthal
Collaborator

Looks good. A couple of comments:

  • The next-power-of-two implementation looks pretty complex, and instantiating templates is heavyweight at compile time. Is there a meaningful performance difference vs. just a while loop?
  • We need unit tests to cover the next power of two algorithm, buffer growth and the message reader.

@vsoulgard
Contributor Author

Yes, there is a performance difference. Since the template implementation is branchless, it's a lot faster; see:
https://quick-bench.com/q/k0PMQJ0MtFPZxU70zU_v_B4zr4A
https://godbolt.org/z/E1d5Gefsa

I will come back later with new unit tests!
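
For reference, the branchless technique typically works by smearing the highest set bit downwards with shift-or steps, then adding one. A non-template sketch of that idea (not the PR's actual implementation; assumes n >= 1):

```cpp
#include <cstdint>

// Round n up to the next power of two via branchless bit-smearing.
// Assumes n >= 1; n == 0 would wrap around and yield 0.
std::uint64_t next_power_of_two(std::uint64_t n)
{
    --n;            // so that exact powers of two map to themselves
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;   // all bits below the highest set bit are now 1
    return n + 1;
}
```

Since C++20, std::bit_ceil in &lt;bit&gt; provides this operation directly.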

@vsoulgard
Contributor Author

Added new unit tests and an overflow assertion in next_power_of_two.

constexpr std::size_t num_tests = sizeof(test_sizes) / sizeof(test_sizes[0]);

// Test that buffer capacity grows to powers of two
for (std::size_t i = 0; i < num_tests; ++i)
Collaborator


Can we use range for here?
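
For illustration, a range-for version of the quoted loop removes both the index and the manual num_tests computation (the test_sizes values here are hypothetical stand-ins for the actual test data):

```cpp
#include <cstddef>

// Hypothetical test data for illustration.
constexpr std::size_t test_sizes[] = {1, 100, 512, 4096, 70000};

// Range-for visits each test size without explicit indexing.
std::size_t count_tested()
{
    std::size_t count = 0;
    for (std::size_t size : test_sizes)
    {
        (void)size; // the real test would exercise the buffer here
        ++count;
    }
    return count;
}
```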

@anarthal
Collaborator

Thanks for this, it's looking really good!

@anarthal
Collaborator

I'm gonna merge #504 first and re-run the coverage jobs for this PR and #503.

@vsoulgard
Contributor Author

Done!

@vsoulgard vsoulgard requested a review from anarthal March 28, 2026 19:57
@anarthal
Collaborator

anarthal commented Mar 28, 2026

Could you please merge develop into this and the other PR so we get the coverage report?

@codecov

codecov bot commented Mar 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.16%. Comparing base (c20a041) to head (4a6028c).
⚠️ Report is 1 commit behind head on develop.


@@           Coverage Diff            @@
##           develop     #501   +/-   ##
========================================
  Coverage    99.15%   99.16%           
========================================
  Files          144      145    +1     
  Lines         7370     7387   +17     
========================================
+ Hits          7308     7325   +17     
  Misses          62       62           
Files with missing lines Coverage Δ
...de/boost/mysql/impl/internal/next_power_of_two.hpp 100.00% <100.00%> (ø)
...oost/mysql/impl/internal/sansio/message_reader.hpp 100.00% <100.00%> (ø)
...e/boost/mysql/impl/internal/sansio/read_buffer.hpp 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


@anarthal anarthal merged commit 5d9ff8f into boostorg:develop Mar 29, 2026
6 checks passed
@anarthal
Collaborator

Merged! Thanks a lot for working on this.

