> I eventually ended up abusing the unsafe code in Vec to get what I wanted: Tha...

civility · on Feb 10, 2017

I was pretty tired last night when I wrote that previous reply to you, and I don't think I stated my complaints very clearly. In several of the places where I quote you below, it's not your fault you didn't understand what I was getting at. Generally, I was trying to be concise to make my point, but it still turned into a wall of text, and I don't think I succeeded at either goal (making my point or being concise). Sorry in advance, but this message is a wall of text too.

> Yeah, I thought you were implementing a vector or a doubly linked list or something, not a matrix.

I was trying several things. For example, I got an immutable finger tree working (comparable difficulty to linked lists), but I was trying to describe the pain points, not where I succeeded. Please remember, these were all learning exercises for me, so I wasn't concerned about whether or not there was an existing library that did what I was trying to do - I wanted to learn how to write that kind of library.

Early on, I did intend to re-implement Vec. That is non-trivial in Rust and C++ for reasons I think we might both agree on, although honestly I did wish it was easy in Rust. I didn't think that should involve unsafe code, but it really seems like it does, so life goes on.

Later, I wanted to build a Matrix type wrapping around a low level array of memory. I did not want to wrap a Matrix around a Vec because I don't need a growable storage area, and I don't think I should have to pay for a capacity and current length field when I already keep track of the number of rows and columns. Reading the docs, it seemed like what I wanted was a "slice" to act as my array. I figured Vec must use a slice internally, so I should be able to do that as well! Diving through the Vec source code, I see RawVec, Unique, NonZero<*const T> and finally a pointer. There's also something in there about PhantomData which I suspect is related to the phantom type stuff I've read about in OCaml, but I figured I didn't really need to understand that right away. Translating to pseudo C++, this looks something like:

    struct Vec<T> {
        struct RawVec<T> {
            struct Unique<T> {
                struct NonZero<const T*> {
                    tuple<T*> value;
                } pointer;
                struct PhantomData<T> {
                } _marker;
            } ptr;
            usize cap;
        } buf;
        usize len;
    }

Piecing this together, my first thought was "crap, why is something as simple as Vec so complicated?". My second thought was "uh oh, where's the slice?". My next thought was, "well hmm, I can copy all that stuff", followed up by, "but a lot of those are marked as unstable, so I probably shouldn't do that".

So I poked around all over the docs and asked some questions about how to get a slice, and it seemed like I would need a Box to hold it. Even today, I really don't know if that's what I wanted, but mostly I gathered that you can't create one from a runtime chosen size without some unsafe code somewhere. That's when I found Vec::into_boxed_slice() which seemed to do almost what I wanted. I didn't want a Vec, and the implementation of that method has an unsafe block, so I used it and moved on. That's the first code snippet I included in the previous message.

In a previous post, you said you were interested in finding out what causes new Rust users to stumble. For me, I got hung up trying to figure out how to create a fixed sized array. Please don't say, "just use a Vec!" - that really misses the point. I'm still not sure what type I should use, but if it is a Box<[T]>, I would love a function like:

    slice::from_func<T>(len: usize, f: &Fn(usize) -> T) -> Box<[T]>

where f is called once for each element to provide a value. If something else is the right choice, imagine a different return type.
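
A minimal sketch of what such a helper could look like in safe code, assuming the Box<[T]> return type. The name `boxed_from_fn` is hypothetical and not part of the standard library, and the closure is taken by value as a generic `FnMut` rather than as `&Fn`:

```rust
// Hypothetical helper, not a standard library function.
// Builds a fixed-size boxed slice by calling `f` once per index.
fn boxed_from_fn<T, F>(len: usize, f: F) -> Box<[T]>
where
    F: FnMut(usize) -> T,
{
    // A range is already an iterator, so it can be mapped directly.
    // The intermediate Vec is converted into a boxed slice at the end.
    (0..len).map(f).collect::<Vec<T>>().into_boxed_slice()
}
```

Called as `boxed_from_fn(4, |i| i * i)`, this would yield a boxed slice holding the squares of the indices.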

> That code isn't unsafe. It uses the [T] DST, which isn't a common type, but it's safe code.

Maybe DSTs are what I was looking for, but I see the docs are in the "nomicon", and I'm not sure that even existed when I was trying to figure out the stuff above. I will read it in the near future though.

> [the num crate] does give you the Float and Signed traits, which get you exactly what you need here? You may need to cast to float for integers. It's not great, but it should be good enough.

I didn't really want to dive into the ways I think the num crate is broken, but here goes. Floating point numbers and signed integers are not the only thing I want to call .abs() on. For instance, .abs() is relevant to complex numbers too (as are exp() and many others). Unfortunately, you can't implement Float or Signed for complex numbers because those traits have many other methods which don't make any sense for complex numbers. As an example, Signed requires .is_positive() and Float requires .min_value().
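
To make the complaint concrete, here is a sketch of a narrower, single-purpose trait. Everything in it (`Abs`, `abs_value`, `Complex`, `sum_of_magnitudes`) is hypothetical and written for illustration; none of it is the num crate's API:

```rust
// Hypothetical single-purpose trait; not taken from the num crate.
trait Abs {
    type Output;
    fn abs_value(&self) -> Self::Output;
}

impl Abs for f64 {
    type Output = f64;
    fn abs_value(&self) -> f64 {
        self.abs()
    }
}

impl Abs for i32 {
    type Output = i32;
    fn abs_value(&self) -> i32 {
        self.abs()
    }
}

// Illustrative complex type, defined here only for the example.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Abs for Complex {
    // The magnitude of a complex number is a real number.
    type Output = f64;
    fn abs_value(&self) -> f64 {
        self.re.hypot(self.im)
    }
}

// Generic code asks for `Abs` alone, with no `is_positive` or `min_value`.
fn sum_of_magnitudes<T: Abs<Output = f64>>(xs: &[T]) -> f64 {
    xs.iter().map(|x| x.abs_value()).sum()
}
```

With the operation split out like this, a complex type can opt in without having to invent answers for methods that make no sense for it.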

Let's dodge the topic about calling .abs() on unsigned types, but in generic code that's really not as silly as it sounds. The num crate is trying to solve an ugly problem, and I don't think it's made the right trade offs. Really it isn't good enough, and I'm glad you guys removed it from the std library before version 1.0 instead of committing to it for the foreseeable future.

> I mean, operators have the same conflicting impl problem regardless of how you implement them (as traits or otherwise).

I don't see the conflict. These impls are fine:

    impl Mul<f64> for Matrix<f64>
    impl Mul<Matrix<f64>> for f64

And I can write a macro to call on any type I want. Try to use generics instead of a macro though, and the second one breaks.
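
A sketch of the macro approach, assuming an illustrative `Matrix` backed by a boxed slice (the struct layout and the macro name `impl_scalar_mul` are made up for this example). The generic form that gets rejected is shown only as a comment:

```rust
use std::ops::Mul;

// Illustrative matrix type; the fields are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Box<[T]>,
}

// Generates both `Matrix<T> * T` and `T * Matrix<T>` for each concrete scalar type.
macro_rules! impl_scalar_mul {
    ($($t:ty),*) => {$(
        impl Mul<$t> for Matrix<$t> {
            type Output = Matrix<$t>;
            fn mul(self, rhs: $t) -> Matrix<$t> {
                let data: Vec<$t> = self.data.iter().map(|&x| x * rhs).collect();
                Matrix { rows: self.rows, cols: self.cols, data: data.into_boxed_slice() }
            }
        }

        impl Mul<Matrix<$t>> for $t {
            type Output = Matrix<$t>;
            fn mul(self, rhs: Matrix<$t>) -> Matrix<$t> {
                let data: Vec<$t> = rhs.data.iter().map(|&x| self * x).collect();
                Matrix { rows: rhs.rows, cols: rhs.cols, data: data.into_boxed_slice() }
            }
        }
    )*};
}

impl_scalar_mul!(f32, f64);

// The generic scalar-on-the-left version is the one that fails coherence:
//
//     impl<T: Mul<Output = T> + Copy> Mul<Matrix<T>> for T { ... }
//
// rustc rejects it as an orphan-rule violation (error E0210), because the
// bare type parameter `T` appears in the `Self` position.
```

The macro has to be invoked for every scalar type up front, which is exactly the limitation being described: a downstream user's own numeric type gets the right-hand operator only if they re-run the macro themselves.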

> C++ solves this with overload resolution rules. Rust doesn't want that, and instead doesn't allow overlaps to occur in the first place.

I understand why Rust doesn't want SFINAE, and I understand why coherence is valuable, but you shouldn't believe something like SFINAE is required to make generic operators work. I've read nikomatsakis's blog posts several times, and it seems like he almost went with the "covered" rule. If the "covered" rule was applied to binary operator traits, but the current (nuanced /cough) rules applied everywhere else, I think the kind of generic operators I want to write would pass coherence. I'm also not saying this is the only way it could work, but I am saying that generic binary operators really should work symmetrically.

> I think C++ has a similar asymmetry when it comes to choosing between overlapping operator overloads when the overlap is in the pre or in the post type

That might be true if you define the operator as a member, so don't do that. If you declare it as a standalone function, these are really very general and symmetric:

    template<class A, class B>
    auto operator *(const A& a, const Matrix<B>& b)
        -> Matrix<decltype(A() * B())>
    { /* implementation */ }

    template<class A, class B>
    auto operator *(const Matrix<A>& a, const B& b)
        -> Matrix<decltype(A() * B())>
    { /* implementation */ }

> But here's the thing -- you're not actually doing anything with these.

Please don't assume that because I provided short examples of things which don't work like I think they should that you have any idea what I do. I'm not some novice who gets a kick out of finding ways to break the language. I understand why you might think that, but you're wrong, and it's condescending. I was trying to create the tools I would need for the kind of work I do, and I stumbled in several places. I'd like to ignore this part of your reply and get back to the rest of it.

> You're designing a library for general use, but you're designing it with the concerns for use in C++.

No. I was trying to design it with my understanding of idiomatic Rust, and I was trying to learn the language. We're only discussing C++ because it's the topic of this thread and because Rust aspires to be an alternative to it.

> Asymmetric operators are considered bad in C++. Not in Rust; many operators are asymmetric.

You state that like it was a well reasoned design decision instead of an oversight or unfortunate consequence, and I don't believe that is true. If you read nikomatsakis's blog post from my point of view, it looks like it was basically an accidental casualty because it wasn't in his list of use cases. Why wouldn't you want them to be symmetric? The num crate uses macros to implement symmetric operations on complex numbers in exactly the way I've described above. If asymmetry is something valuable, why would they do that?

> But ultimately, how much do asymmetric operators really affect you when programming? Just reorder the things. No big deal.

Sure, I could get by in Forth or Scheme too if they met my other requirements. However, processing arrays of numeric data is something I do nearly every single day. I translate equations I know and new ones I learn from published papers all the time. The order matters for clarity, and not all operations are commutative. Why have operators at all if you aren't trying to represent math notation?

Look, it's fine if you don't want Rust to appeal to numerical programmers like me - you don't owe me anything. You seemed interested in what I found lacking, so I tried to share. Honestly, I had already assumed these kinds of things won't be fixed, I've already suffered the learning process for how to work around them, and maybe they'll be revisited in Rust 2.0.

> I can make similar complaints about how hard it is to use algebraic datatypes in C++.

Yes, implementing ADTs in C++ is terrible. This is an area where Rust shines, and I greatly prefer sum types to classes and inheritance.

Manishearth · on Feb 10, 2017

> Piecing this together, my first thought was "crap, why is something as simple as Vec so complicated?".

This is because it uses some reusable primitives in the stdlib. A standalone vec can be done in a much simpler way. NonZero isn't necessary, it just enables optimizations. PhantomData is for variance stuff (explained in the nomicon) and drop order, which are sort of niche but interesting things. The variance problem in this case is only about being able to allow things like a Vec of a borrowed reference (so not including it just means that you can't use the vec for more niche things). The drop order part is necessary for safety in situations involving arenas and whatnot, but this is again one of those things you need to think about in C++ too.

The nomicon does build up a vec impl from scratch (https://doc.rust-lang.org/stable/nomicon/vec.html) and starts with a simple impl and slowly adds optimizations and refactorings. It depends on knowledge from the rest of the nomicon, however.

> and the implementation of that method has an unsafe block

Ah, I see, when you said "abusing the unsafe code" I thought you meant you were actually using unsafe code. Almost all stdlib things eventually drill down to unsafe calls so using a safely-wrapped API like into_boxed_slice is OK. That's what I mean by "that code isn't unsafe" :)

> For me, I got hung up trying to figure out how to create a fixed sized array. Please don't say, "just use a Vec!" - that really misses the point. I'm still not sure what type I should use, but if it is a Box<[T]>, I would love a function like

Box<[T]> is basically it, though it's a more obscure type. Most newcomers would just use Vec, which is really fine, but if you are more acquainted with the language there is nothing wrong with using a boxed DST, so you should use it. I wish we could get type level integers so that you can write generic types over [T; n] though.
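
As a rough sketch of that choice, here is a row-major matrix that stores its elements in a Box<[T]>. The names (`Matrix`, `filled`, `storage_words`) are assumptions made for this example. A boxed slice is a pointer plus a length, whereas Vec also carries a capacity, so it is typically one machine word larger:

```rust
use std::mem::size_of;
use std::ops::Index;

// Illustrative row-major matrix; names are assumptions for this sketch.
struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Box<[T]>, // pointer + length, no capacity field
}

impl<T: Clone> Matrix<T> {
    // Allocates rows * cols elements, each a clone of `value`.
    fn filled(rows: usize, cols: usize, value: T) -> Self {
        Matrix { rows, cols, data: vec![value; rows * cols].into_boxed_slice() }
    }
}

impl<T> Index<(usize, usize)> for Matrix<T> {
    type Output = T;
    fn index(&self, (r, c): (usize, usize)) -> &T {
        assert!(r < self.rows && c < self.cols, "index out of bounds");
        &self.data[r * self.cols + c]
    }
}

// Size, in machine words, of the boxed slice and of Vec.
fn storage_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<Box<[f64]>>() / word, size_of::<Vec<f64>>() / word)
}
```

The construction still goes through a temporary Vec, but the stored field does not keep the extra bookkeeping around.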

Generally the stdlib doesn't include functions that are simple compositions of others, and since you can do something like `(0..n).iter().map(|_| func()).collect().into_boxed_slice()` such a function probably wouldn't exist. But it's not that clear cut; if you propose it, it could happen! DSTs don't get used much in your average Rust code, so this is an area of the stdlib that could get more convenience functions.

> Maybe DSTs are what I was looking for, but I see the docs are in the "nomicon",

Yeah, DSTs are a more advanced feature of Rust. I'd prefer to wait for type level integers than bring them out to the forefront.

> but here goes. Floating point numbers and signed integers...

Good points; hadn't thought of that. If you have the time/inclination, I'd love to see an alternative traits lib better suited for this purpose.

> Try to use generics instead of a macro though, and the second one breaks.

So there's no conflict in the code written the way it is right now, but other blanket impls from other crates may conflict, basically.

> I understand why Rust doesn't want SFINAE,

Not talking about SFINAE; just talking about overload resolution (SFINAE is something built on top of it)

> If the "covered" rule was applied to binary operator traits, but the current (nuanced /cough) rules applied everywhere else, I think the kind of generic operators I want to write would pass coherence.

This is interesting. I think you would still have a problem with some kinds of blanket impls that currently are allowed on operators, but the ones you have listed would work.

Ultimately it's a tradeoff, though. The covered rule reduces some of the power of genericness of the RHS of operator overloads and balances it out. E.g. right now `impl<T: MaybeSomeBoundHere> Add<T> for Foo` works, but it doesn't by the covered rule. That's a pretty useful impl to have.

It might be possible to introduce a coherence escape hatch like `#[fundamental]` to be used with the operator traits. I'm not sure.

> If you declare it as a standalone function, these are really very general and symmetric:

Oh, forgot you can do that :)

> Please don't assume that because I provided short examples of things which don't work like I think they should that you have any idea what I do

I apologize. I inferred this from "all I wanted to do was create a freshman level data structure", which has the implication of "I can design this abstraction easily in C++, why not Rust".

Sorry about that :)

> You state that like it was a well reasoned design decision instead of an oversight or unfortunate consequence, and I don't believe that is true. If you read nikomatsakis's blog post from my point of view, it looks like it was basically an accidental casualty because it wasn't in his list of use cases.

I do think it's an unfortunate consequence. I think it's a tradeoff, and operator symmetry was forgone so that other things could exist. It's an unfortunate consequence of a well reasoned design decision where it was part of a tradeoff that was not decided in its favor. I don't think it was an oversight; these things were discussed extensively and operators were some of the main examples used, because operators are the primary example of traits with type parameters in Rust (and thus great fodder for coherence discussions).

> Look, it's fine if you don't want Rust to appeal to numerical programmers like me

I do! :) I used to be a physics person, and did try to use Rust for my numerical programming. It was ... okay (this was many years ago, before some of the numerical inference features -- explicit literal types were hell). It's improved since then. I recognize that it's not the greatest for numerical programming (I still prefer Mathematica, though I don't do much of that anymore anyway).

I think specialization (the "final form", not the current status) will help address your issues a lot. Also, type level integers should exist, I have some scratch proposals for them; but I keep getting bogged down in making it work with things like variadic generics (I feel that a type level integer system should not be designed separately from whatever gets used to make it possible to operate generically on tuples as is done in some functional programming languages.)

> The order matters for clarity, and not all operations are commutative.

This is a great point. Ultimately macros pretty much are your solution here, which is not a great situation. Specialization would help, again.

civility · on Feb 10, 2017

I'm glad you replied. I was beginning to worry we were going too far into "agree to disagree" territory. I'm at work now, but I'd like to respond to a few of your items above this weekend.

We're getting pretty deep into a Hacker News thread about an almost unrelated topic, and the formatting options here are limited. Is there a better forum to have this kind of discussion? Some of it seems relevant to Rust internals, but I don't know if it's welcome there or not.

Manishearth · on Feb 11, 2017

Really just posting on users.rust-lang.org (or /r/rust) about your issues would be nice. In particular if you're interested in creating a new num traits crate I recommend creating a separate post about that focused on the issues you came across and a sketch of what you'd prefer to see.

civility · on Feb 11, 2017

I'll put together a post on the users forum about num crate traits, but it'll probably be a day or two. In the meantime, a few replies to some of the other items above:

> I wish we could get type level integers so that you can write generic types over [T; n] though.

Yes, that would be very useful. I use fixed sized matrices for things like Kalman filters from time to time. These aren't usually the 3x3 or 4x4 kinds of matrices you see in the graphics world. For instance, they might be 6x9 or 12x4 in some specific case. It makes a huge difference in performance if they can be stack allocated (Eigen provides a template specialization for this).

For other problems, I use very large vectors and matrices, and those should be heap allocated to keep from blowing the stack. In those cases, the allocation time is usually dwarfed by the O(N^2) or O(N^3) algorithms anyways.

> since you can do something like

    (0..n).iter().map(|_| func()).collect().into_boxed_slice()

I just tried this, but rustc version 1.15.1 can't find the .iter() method for the Range. I'm assuming it's a small change (which I'd really like to see if you're willing), but that's quite a stack of legos you've snapped together there :-)

Let's add that to the list of things a new user like myself stumbles on: Even knowing I wanted a boxed slice, I'm not sure I would piece together "let's take a range, convert it to an iterator, map a function over each item, and collect that into a Vec so that I can extract the boxed slice I want".

Does that create and then copy (possibly large) temporaries? Walking through the code, I see it calls RawVec::shrink_to_fit() - which looks like it's possibly a no-op if the capacity is the right size. Then it calls Unique::read() - which looks like a memcpy. I honestly don't know if this does make copies, but if it does, that cost can be significant sometimes.

> just talking about overload resolution (SFINAE is something built on top of it)

I think Rust already dodges 90% of that problem by not providing implicit conversions (a good thing, IMO). However, really all I was trying to say is I don't believe you need to copy C++'s approach for generic operator traits and functions to work like I think they can/should in Rust. I don't understand the details to know if you could fix things and maintain backwards compatibility, but it's a false dichotomy to say that Rust's (current) way and C++'s way are the only possibilities.

> E.g. right now `impl<T: MaybeSomeBoundHere> Add<T> for Foo` works, but it doesn't by the covered rule. That's a pretty useful impl to have.

I think you're referring to the table in the orphan impls post:

    +-------------------------------------------------+---+---+---+---+---+
    | Impl Header                                     | O | C | S | F | E |
    +-------------------------------------------------+---+---+---+---+---+
    | impl<T> Add<T> for MyBigInt                     | X |   | X | X |   |
    | impl<U> Add<MyBigInt> for U                     |   |   |   |   |   |

I honestly don't know if either of those should be allowed! They both seem very presumptuous and not at all in the spirit of avoiding implicit conversions. Let's instantiate T with a String, a File, or a HashTable - I don't see how adding MyBigInt could possibly make sense on either the left or the right. Maybe they make sense with the right bounds added.

I think it's a very different thing when the user of your crate explicitly instantiates your type with one of their choosing. If I had any say, my contribution to the use-case list would look like this:

    +----------------------------------------------------------+---+
    | Impl Header                                              | ? |
    +----------------------------------------------------------+---+
    | impl<T> Add<T> for MyType<T>                             | X |
    | impl<U> Add<MyType<U>> for U                             | X |
    | impl<T> Sub<T> for MyType<T>                             | X |
    | impl<U> Sub<MyType<U>> for U                             | X |
    | impl<T> Mul<T> for MyType<T>                             | X |
    | impl<U> Mul<MyType<U>> for U                             | X |
      ... and so on for 20 or 30 more lines :-)

When MyType is parameterized like this, I'm declaring something stronger, and I don't think it should introduce a coherence problem.

Manishearth · on Feb 11, 2017

> I just tried this, but rustc version 1.15.1 can't find the .iter() method for the Range. I'm assuming it's a small change (which I'd really like to see if you're willing), but that's quite a stack of legos you've snapped together there :-)

Yeah, ranges are iterators already; you don't need to create iterators out of them.

`let boxslice = (0..10).map(|_| func()).collect::<Vec<_>>().into_boxed_slice();` is something that will actually compile. The turbofish `::<Vec<_>>` is necessary because `collect()` can collect into arbitrary containers (like HashSets) and we need to tell it which one to collect into. A two-liner `let myvec: Vec<_> = ....collect(); let boxslice = myvec.into_boxed_slice();` would also work and wouldn't need the turbofish.

In case of functions returning Clone types, you can just do `vec![func(); n].into_boxed_slice();`. My example was the fully generic one that would be suitable for implementing a function in the stdlib, not exactly what you might use -- I didn't expect you to be able to piece it together :). For your purposes just using the vec syntax is fine, and would work for most types.

Using ranges as iterators is basically the go-to pattern for "iterate n times", for future reference.

> Does that create and then copy (possibly large) temporaries? Walking through the code, I see it calls RawVec::shrink_to_fit() - which looks like it's possibly a no-op if the capacity is the right size. Then it calls Unique::read() - which looks like a memcpy. I honestly don't know if this does make copies, but if it does, that cost can be significant sometimes.

In this case .collect() is operating on an ExactSizeIterator (runtime known length) so it uses Vec::with_capacity and shrink_to_fit would be a noop. In general .collect().into_boxed_slice() may do a shrink (which involves copying) if it operates on iterators of a-priori unknown length. This is not one of those cases. At most you may have a copy involved of each element when it is returned from func() and placed into the vector. I suspect it can get optimized out.

vec![func(); n] will call func once and then fill the vector by calling .clone(). Usually that cost is about the same as calling func() n times.
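
A small sketch that makes the difference in call counts visible. The helper `count_calls` and its use of a `Cell` counter are assumptions made for this example:

```rust
use std::cell::Cell;

// Returns (calls made by the map/collect form, calls made by the vec! form).
fn count_calls(n: usize) -> (usize, usize) {
    let calls = Cell::new(0);
    let func = || {
        calls.set(calls.get() + 1);
        String::from("x")
    };

    // One call to `func` per element.
    let a: Box<[String]> = (0..n).map(|_| func()).collect::<Vec<_>>().into_boxed_slice();
    let mapped = calls.get();

    // One call in total; the other elements are produced by `.clone()`.
    calls.set(0);
    let b: Box<[String]> = vec![func(); n].into_boxed_slice();
    let repeated = calls.get();

    // Both forms end up with the same contents.
    assert_eq!(a, b);
    (mapped, repeated)
}
```

So the `vec!` form only fits when every element should start out as a clone of the same value; if each element needs its own call (for example, a function of the index), the map/collect form is the one to use.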

> Let's instantiate T with a String, a File, or a HashTable - I don't see how adding MyBigInt could possibly make sense on either the left or the right. Maybe they make sense with the right bounds added.

Yeah, that's why I had a bound there. I personally feel these impls make sense, both for traits and for operators. Perhaps more for non-operator traits.

I think your use case is a good one, and it's possible that the covered rule could be made to work with the current rules. I don't know. It would be nice to see a post exploring these possibilities. It might be worth looking at how #[fundamental] works (https://github.com/rust-lang/rfcs/blob/1f5d3a9512ba08390a222...) -- it's a coherence escape hatch put on Box<T> and some other stdlib types which makes a tradeoff: Box<T> and some other types can be used in certain places in impls without breaking coherence, but the stdlib is not allowed to make certain trait impls on Box without it being a breaking change (unless the trait is introduced in the same release as the impl). The operator traits may have a solution in a similar spirit -- restrict how the stdlib may use them, but open up more possibilities outside. It's possible that this may not even be necessary; the current coherence rules are still conservative and could be extended. I don't think I'm the right person to really help you figure this out, however; I recommend posting about this on the internals forum.

(I'm not sure if this discussion is over, but if it isn't I think it makes more sense to continue over email. username@gmail.com. Fine to continue here if you don't want to use email for whatever reason)

civility · on Feb 11, 2017

Those new examples work nicely. I'll have to remember the word "turbofish" :-)

> (I'm not sure if this discussion is over, but if it isn't I think it makes more sense to continue over email. username@gmail.com. Fine to continue here if you don't want to use email for whatever reason)

Nah, I think we're at a good stopping point. I'll post the num traits topic on users, and the operator coherence one on internals, so maybe you will jump in there. I'm generally pretty private online, so I wouldn't take you up on the offer to continue in email. However, you've been really helpful and patient, and I sincerely appreciate it. Thank you again.