

I'm trying to get into #rustlang. I've already rewritten a small program and it was an okay experience but today, I hit something weird.

I have a LinkedList and need to iterate over it and, if I find any item that meets some criteria, remove it from the list.

Using C++ iterators, this would be O(N).

Rust's linked_list has an issue open for more than 4 years (!) to add a remove function to linked lists. But even with that function, you'll need an index of the item to be removed and cannot (as far as I can tell) use the iterator you already have from scanning the list.

So in Rust I would need to spend O(N) to scan the list and an additional O(N) for each item I need removed.

Or is there a faster way to do that?


In reply to Martin

Technically that's also O(n) (O(2n) = O(n)).

I don't see much Rust code using `LinkedList`. It's probably more common to just use `Vec`, which in many cases will end up performing better overall, regardless of language.

In reply to Ian Douglas Scott

The list grows very dynamically; that's why I want to use LinkedList: adding, removing and moving items within the list is normally cheap with a linked list.
In reply to Martin

I've been writing Rust for 5 years and professionally for 2 years, and I've never seen a linked list used outside of some very specific use cases.

Yes, Rust's LinkedList is very limited and you will most likely fight the borrow checker.

Are you sure you need a linked list? In most cases a Vec will be more efficient, and you won't have as many borrow-checking issues.

If the order of elements doesn't matter, you can even use Vec::swap_remove for O(1) removal of elements.

In reply to Soso

Doesn't Vec::retain create a whole new copy of the collection, i.e. I'll need twice the memory at that point?
In reply to Martin

No, retain does not re-allocate. It modifies the Vec in place.
In reply to Soso

Okay, I just read that it works in place. Still, all the elements after a removed element must be copied.
In reply to Soso

I don't want to use Vec because:

  • I don't want the collection to hog more memory than it actually needs.
  • I want to be able to cheaply extend the collection.
  • I want to be able to cheaply remove items from arbitrary locations within the collection.
In reply to Martin

LinkedList will likely use more memory. Each LinkedList element needs:

  • 2 additional pointers,
  • additional metadata for the allocation (size depends on the allocator).

Pushing an element to a Vec is faster than to a LinkedList because it doesn't always need to allocate.
Removing from arbitrary locations can be done through swap_remove. But even remove is likely to be faster than a linked list, because it doesn't need to go through the allocator at all.

In reply to Soso

There are very few algorithms that need a linked list for performance.
And most of the time they need specific properties from a particular linked list implementation, so they end up implementing their own. As a result, the standard library version never got much attention, which is a bit disappointing for the algorithms that actually need it.
In reply to Soso

I cannot use swap_remove because the ordering in the list matters. My use case: https://friends.mbober.de/display/b22eb8e8-4167-19df-204d-582842795229

As far as I know, common C++ vector implementations do not release memory when removing an element unless many elements were removed, since reducing the capacity of a vector means allocating new memory and copying all remaining elements, which is quite expensive. I assume Rust's Vec does the same. That is what I meant by "hogging" memory.

I'm also exploring Rust for embedded applications. Performance and memory efficiency matter to me, and I was told that Rust is as good as C++ in this regard.

In reply to Martin

It looks like an LRU cache. There are multiple crates already implementing this.

Looking at the most popular, it uses a custom unsafe linked list implementation to be able to hold multiple pointers to the same node, which would not be possible with a safe API as the standard library intends to do.

Vec does not deallocate when removing elements, but you can force that with shrink_to_fit.

In reply to Soso

You could also implement a linked list on top of a Vec, storing indices instead of pointers. It's probably more efficient with regard to cache locality, and it reduces the need for many small allocations.

In embedded you want to limit allocations as much as you can. It often means preallocating all the memory you need at the very beginning of the program. Linked lists are not very friendly for that pattern.

In reply to Martin

@sgued Just a general piece of advice:

Don't confuse "As good as C++ in this regard" with "Works the same way as C++" - while the characteristics can be very similar, the way you'll want to implement things are often quite different.

@Soso
In reply to Martin

@sgued You could consider using or taking inspiration from the LRU used by the Servo browser https://github.com/servo/uluru

It uses a fixed-size array to store all items, using u16s to create a linked list out of the array indices of your items. Since it's backed by a constant-size array, you know exactly how much memory it uses (well, unless your items are heap pointers like Vec, String etc.) - in this case: (your item + 2*u16)*n + 2*u16, which is also going to be more memory-efficient than a linked list using pointers on any architecture where u16 is smaller than usize.

The fact it uses a static array also makes the crate work in no_std environments without an allocator (embedded).

@Soso
In reply to Martin

You can do that with the (unfortunately unstable) cursor API, see LinkedList::cursor_front and LinkedList::cursor_back. The cursor API is more powerful than Iterator, but is specific to LinkedList AIUI.
In reply to George Macon

Yeah, I'm aware of that cursor API and even ChatGPT guided me towards that, but I don't want to do unstable at this point.
In reply to Martin

Totally fair. IIRC, the cursor API was prototyped in https://crates.io/crates/linked-list, which isn't the standard-library version, but at least doesn't require a nightly compiler.
In reply to Martin

Do you have to remove it from that list, or can you iterate and create the filtered list? I'm unsure if you can .collect() into a LinkedList (assuming you really want one)
In reply to brettwitty

That sounds like a lot of unnecessary memory allocations to me.
In reply to Martin

Possibly! Depends on the context. Try and see, you might be surprised. The costs of these things can be a lot more complicated under the hood than a napkin calculation. Good luck!
In reply to Martin

The "faster way" depends on your exact use-case. If you intend to remove several items at once sporadically, then [Vec::retain()](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.retain) might be performant enough - especially if you're performing full reads of the resulting vector.
In reply to Epic Eric :thinkhappy:

In terms of which collection will best suit your needs, I recommend reading through the doc section "When Should You Use Which Collection?": https://doc.rust-lang.org/std/collections/#when-should-you-use-which-collection
In reply to Epic Eric :thinkhappy:

That guide explicitly says I should use a LinkedList when:

You want to efficiently split and append lists.


Which is exactly why I want to use a linked list. Vec seems like it would do a lot of unnecessary memory allocation for that.

In reply to Martin

Fair enough! Then the best way forward would be using nightly Rust in order to enable LinkedList features like remove and retain.
In reply to Martin

BTW what I want to do is a cache that stores a key/value pair (both are strings). When an entry is hit, I want to move it to the front of the list.

When there is no hit, a network request is issued to fetch the value for the key. The key/value pair is then emplaced at the front of the list. If the list is then over the maximum capacity, the item at the back shall be removed.

In reply to Martin

huh, i never even knew Rust had a linked list in the stdlib. i wonder what the usecase is over Vec/VecDeque - the docs only say to prefer Vec/VecDeque "almost always" without any info on when to actually use a linked list
In reply to laund

In my experience, linked lists are great when the container grows and shrinks dynamically, or when you want to move items around within the collection while the rest of the items keep their order. The downside is expensive random access to elements, so lists are better suited for use cases where you have to iterate through the collection anyway.

Vectors are better for random access to elements and when you know the size of the collection beforehand. It's also cheap to insert an element at the end if the vector's current capacity can hold it. Inserting elements anywhere other than the end, or removing an element that is not at the back, is expensive, since all later elements must be shifted.

In reply to Martin

i mean i kinda understand that, i just found it interesting the docs don't state this.

Though i kinda wanna do a benchmark of linked list inserting/removing elements in the middle vs `vec = vec.into_iter().filter(keep_condition).collect()` - how many elements do i need to filter out for a Vec to be faster? how much slower is iteration of linked lists?

there's an old benchmark by matklad which inserts random numbers into a sorted vec/list (requiring tons of insertions in the middle) and claims Vec is still far more performant at this.

Vec: 33365 μs
List: 184282 μs
https://github.com/matklad/vec-vs-list
but this was 8 years ago with a different linked list.

In reply to laund

@laund I also usually end up with something like a vec, because the cache effects on NUMA systems often make it faster "in practice" even when on-paper memory moves are supposed to make it worse. A convenient middle ground might be a VecDeque, which has cheap appends at the front. For your LRU case that seems interesting. It still does moves for middle-element deletions, though, but again, depending on the hardware, it might still win regardless.