Explicitly forget the zero remaining elements in vec::IntoIter::fold(). #148486

Open
kpreid wants to merge 1 commit into rust-lang:main from kpreid:vec-iter-drop

@kpreid
Contributor

@kpreid kpreid commented Nov 4, 2025

[Original description:] This seems to help LLVM notice that dropping the elements in the destructor of IntoIter is not necessary. In cases where it doesn’t help, it should be cheap since it is just one assignment.

This PR adds a function to vec::IntoIter which is used by fold() and spec_extend(), when those operations complete, to forget the zero remaining elements and only deallocate the allocation, ensuring that there will never be a useless loop to drop zero remaining elements when the iterator is dropped.
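For readers who want the idea in code form, here is a minimal sketch on a toy iterator. It is not the code in this PR or in `library/alloc`: `ToyIntoIter` and its fields are hypothetical, it assumes non-zero-sized element types and the global allocator, and the body of `forget_remaining_elements_and_dealloc` is only a guess based on the description ("just one assignment").

```rust
use std::mem::{self, ManuallyDrop};
use std::ptr;

/// Toy stand-in for `vec::IntoIter<T>` (hypothetical; non-ZST `T` only).
pub struct ToyIntoIter<T> {
    buf: *mut T, // start of the allocation
    cap: usize,  // capacity of the allocation, in elements
    ptr: *mut T, // next element to yield
    end: *mut T, // one past the last element not yet yielded
}

impl<T> ToyIntoIter<T> {
    pub fn new(v: Vec<T>) -> Self {
        assert!(mem::size_of::<T>() != 0, "sketch does not handle ZSTs");
        // Take over the allocation without running Vec's destructor.
        let mut v = ManuallyDrop::new(v);
        let buf = v.as_mut_ptr();
        // SAFETY: `len` elements fit inside the allocation.
        let end = unsafe { buf.add(v.len()) };
        ToyIntoIter { buf, cap: v.capacity(), ptr: buf, end }
    }

    /// Assumed shape of the new helper: state that nothing remains, then
    /// let `Drop` free the allocation.
    fn forget_remaining_elements_and_dealloc(mut self) {
        self.end = self.ptr; // the single assignment
        // `self` is dropped here; its drop loop now covers zero elements.
    }

    pub fn fold<B>(mut self, init: B, mut f: impl FnMut(B, T) -> B) -> B {
        let mut acc = init;
        while self.ptr != self.end {
            // SAFETY: `ptr` is in bounds and points at an unyielded element.
            let item = unsafe { ptr::read(self.ptr) };
            // Advance first, so a panic in `f` cannot cause a double drop.
            self.ptr = unsafe { self.ptr.add(1) };
            acc = f(acc, item);
        }
        self.forget_remaining_elements_and_dealloc();
        acc
    }
}

impl<T> Drop for ToyIntoIter<T> {
    fn drop(&mut self) {
        unsafe {
            // Drop whatever was not yielded (possibly zero elements).
            let remaining = self.end.offset_from(self.ptr) as usize;
            ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.ptr, remaining));
            // Rebuild an empty Vec so the allocation is freed.
            drop(Vec::from_raw_parts(self.buf, 0, self.cap));
        }
    }
}
```

After the loop in `fold`, `ptr == end` already holds, so the assignment changes nothing at run time. The hope is only that it makes the "zero elements left" fact easier for the optimizer to see; whether it does would have to be checked in the generated code.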

This is my first ever attempt at this kind of codegen micro-optimization in the standard library, so please let me know what should go into the PR or what sort of additional systematic testing might indicate this is a good or bad idea.

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue), and T-libs (Relevant to the library team, which will review and decide on the PR/issue) labels Nov 4, 2025
@rustbot
Collaborator

rustbot commented Nov 4, 2025

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@Kobzol
Member

Kobzol commented Nov 4, 2025

@bors try @rust-timer queue


rust-bors bot added a commit that referenced this pull request Nov 4, 2025
Explicitly forget the zero remaining elements in `vec::IntoIter::fold()`.
rustbot added the S-waiting-on-perf (Status: Waiting on a perf run to be completed) label Nov 4, 2025
@rust-bors
Contributor

rust-bors bot commented Nov 4, 2025

☀️ Try build successful (CI)
Build commit: ae97583 (ae975837374d0ef9f9abcb691803628d975e8fb2, parent: e5efc336720901420a8891dcdb67ca0a475dc03c)

@rust-timer
Collaborator

Finished benchmarking commit (ae97583): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.6% | [0.5%, 0.6%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (primary 3.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.3% | [3.3%, 3.3%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 3.3% | [3.3%, 3.3%] | 1 |

Cycles

Results (secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.7% | [-2.7%, -2.7%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary 0.0%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.1%] | 5 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 1 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.1%, 0.1%] | 6 |

Bootstrap: 473.413s -> 473.632s (0.05%)
Artifact size: 390.72 MiB -> 390.71 MiB (-0.00%)

rustbot removed the S-waiting-on-perf label Nov 5, 2025
@lqd
Member

lqd commented Nov 5, 2025

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Nov 5, 2025
Explicitly forget the zero remaining elements in `vec::IntoIter::fold()`.
rustbot added the S-waiting-on-perf label Nov 5, 2025
@rust-bors
Contributor

rust-bors bot commented Nov 5, 2025

☀️ Try build successful (CI)
Build commit: 48bf163 (48bf1634d442e6680a26e6e6305239e51914dbb3, parent: 53efb3d4f3b67d189a0c72eb475a52017d79d609)

@cyb0124

cyb0124 commented Nov 5, 2025

Hi, I posted the URLO topic that led to this.

I found another case of unnecessary drop_in_place call in my project. This time it's from Option::get_or_insert_with. Stripped it down to this:

#[derive(Default)]
pub struct A {
    _a: Option<Box<Option<A>>>,
}

pub fn test(slot: &mut Option<A>) {
    slot.get_or_insert_default();
}

Here https://godbolt.org/z/dPTb3r89T, test calls drop_in_place<Option<A>> right after it just checked it's a None.

Could you fix this as well while you're at it?
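One way to picture a possible fix for the case above, as a hedged sketch only: it assumes the `drop_in_place` call comes from an assignment of the form `*slot = Some(..)`, which drops the old value before storing the new one. The helper name `get_or_insert_default_forgetting` is hypothetical and is not a standard library API. It moves the old `None` out with `mem::replace` and discards it with `mem::forget`, so no destructor is requested for it.

```rust
use std::mem;

#[derive(Default)]
pub struct A {
    _a: Option<Box<Option<A>>>,
}

/// Hypothetical helper (not part of std): like `Option::get_or_insert_default`,
/// but never asks for the old contents of `slot` to be dropped.
pub fn get_or_insert_default_forgetting<T: Default>(slot: &mut Option<T>) -> &mut T {
    if slot.is_none() {
        // Move the old value (known to be `None`) out and forget it,
        // instead of assigning over it, which would drop it in place.
        let old = mem::replace(slot, Some(T::default()));
        mem::forget(old);
    }
    // SAFETY: `slot` is `Some` here: either it already was, or it was just filled.
    unsafe { slot.as_mut().unwrap_unchecked() }
}

pub fn test(slot: &mut Option<A>) {
    get_or_insert_default_forgetting(slot);
}
```

Whether LLVM then omits the `drop_in_place<Option<A>>` call would still need to be confirmed by looking at the generated assembly, for example on godbolt.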

@kpreid
Contributor Author

kpreid commented Nov 5, 2025

@cyb0124 That looks quite doable, but it is a completely different part of the code and so should not go in this PR.

Question: What is your particular goal in this area? Are you looking for code size reduction, execution time reduction, lack of spurious panic paths, or something else? Do you have a specific program you are trying to optimize?

@rust-timer
Collaborator

Finished benchmarking commit (48bf163): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.4%, 0.4%] | 1 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 2 |
| Improvements ✅ (primary) | -0.5% | [-0.5%, -0.5%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.1% | [-0.5%, 0.4%] | 2 |

Max RSS (memory usage)

Results (secondary -3.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.9% | [2.9%, 2.9%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -4.0% | [-5.7%, -1.9%] | 11 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (primary -4.0%, secondary -1.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.2% | [3.0%, 6.1%] | 5 |
| Improvements ✅ (primary) | -4.0% | [-4.0%, -4.0%] | 1 |
| Improvements ✅ (secondary) | -5.5% | [-9.6%, -1.7%] | 6 |
| All ❌✅ (primary) | -4.0% | [-4.0%, -4.0%] | 1 |

Binary size

Results (primary 0.1%, secondary 0.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.4%] | 10 |
| Regressions ❌ (secondary) | 0.2% | [0.2%, 0.2%] | 1 |
| Improvements ✅ (primary) | -0.2% | [-0.2%, -0.2%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.2%, 0.4%] | 11 |

Bootstrap: 473.384s -> 475.775s (0.51%)
Artifact size: 390.72 MiB -> 391.11 MiB (0.10%)

rustbot removed the S-waiting-on-perf label Nov 5, 2025
@cyb0124

cyb0124 commented Nov 6, 2025

> Question: What is your particular goal in this area? Are you looking for code size reduction, execution time reduction, lack of spurious panic paths, or something else? Do you have a specific program you are trying to optimize?

It's "lack of spurious panic paths". The description of this post and this post are pretty much exactly what I need. From what I can find, panicking in the destructor seems to be the best option for now, and it'd be nice to know statically that such panics are never reached. I know this is impossible to prove statically in general without new syntactic restrictions in the language, but cases like "get_or_insert_with shouldn't call destructor" and "into_iter().for_each(f) shouldn't call destructor if f doesn't unwind" should be simple enough.
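To make the "panicking in the destructor" pattern concrete, here is a small sketch. The type `MustConsume` and the function `consume_all` are hypothetical names for illustration; the posts referred to above may describe a different design.

```rust
use std::mem;

/// Hypothetical value that must be handed to `consume()` and never
/// simply dropped.
pub struct MustConsume {
    value: u32,
}

impl MustConsume {
    pub fn new(value: u32) -> Self {
        MustConsume { value }
    }

    /// The only intended way to dispose of the value.
    pub fn consume(self) -> u32 {
        let value = self.value;
        // Skip the panicking destructor; the value was handled properly.
        mem::forget(self);
        value
    }
}

impl Drop for MustConsume {
    fn drop(&mut self) {
        // Reaching this means some code path dropped the value implicitly.
        panic!("MustConsume dropped without calling consume()");
    }
}

/// Consumes every element through `into_iter().for_each(..)`. At run time no
/// element is ever dropped by the iterator, because each one is moved into
/// the closure and consumed there.
pub fn consume_all(items: Vec<MustConsume>) -> u32 {
    let mut total = 0;
    items.into_iter().for_each(|item| total += item.consume());
    total
}
```

At run time `consume_all` never reaches the panic. The open question in this thread is whether the compiled code still contains a (dead) loop that could call the panicking destructor when the iterator itself is dropped.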

@kpreid
Contributor Author

kpreid commented Nov 6, 2025

> Finished benchmarking commit (48bf163): comparison URL.

Well, that looks … good-ish? Seems unfortunate that this change increases binary size. I am not familiar with how or whether one might investigate that in the context of the rustc test suite.

> It's "lack of spurious panic paths". … I know this is impossible to prove statically in general without new syntactic restrictions in the language, but cases like "get_or_insert_with shouldn't call destructor" and "into_iter().for_each(f) shouldn't call destructor if f doesn't unwind" should be simple enough.

The thing I would caution you about is that you — and people working on the standard library trying to help — may still find this a Sisyphean task, where it's never really complete enough to actually write the program you want and have it stay free of panic paths under maintenance.

@Mark-Simulacrum
Member

r? scottmcm

rustbot assigned scottmcm and unassigned Mark-Simulacrum Nov 8, 2025
wesleywiser removed the T-compiler label Jan 15, 2026
@rustbot
Collaborator

rustbot commented Mar 4, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@kpreid
Contributor Author

kpreid commented Mar 4, 2026

Rebased to fix conflict, and I also added a few more uses in spec_extend[_front]s that I missed the first time around.

…()` and `spec_extend()`.

Adds internal `vec::IntoIter::forget_remaining_elements_and_dealloc()`,
which is used by `fold()` and `spec_extend()`, when those operations
complete, to forget the zero remaining elements and only deallocate the
allocation, ensuring that there will never be a useless loop to drop
zero remaining elements when the iterator is dropped.

Labels

S-waiting-on-review, T-libs


10 participants