4
\$\begingroup\$

I am using an offset variable to track the iteration position, and modifying / returning it. Is the below considered bad practice in Rust? I ask this because I don't see dereferencing very often in code I have seen.

fn get_next_u16(input: &[u8], offset: &mut usize) -> u16 {
    let next = ((input[*offset+1] as u16) << 8) +
                (input[*offset]   as u16);
    *offset += 2;
    next
}

fn main() {
    let input = [1, 2, 3, 4];
    let mut offset = 0;
    let first_u16 = get_next_u16(&input, &mut offset);
    let second_u16 = get_next_u16(&input, &mut offset);
    println!("first: {}", first_u16);
    println!("second: {}", second_u16);
}

The other way I think of doing this is to reassign to the offset variable like:

fn get_next_u16(input: &[u8], offset: usize) -> (usize, u16) {
   (offset+2, ((input[offset+1] as u16) << 8) +
               (input[offset]   as u16))
}

fn main() {
    let input = [1, 2, 3, 4];
    let offset = 0;
    let (offset, first_u16) = get_next_u16(&input, offset);
    let (offset, second_u16) = get_next_u16(&input, offset);
    println!("first: {}", first_u16);
    println!("second: {}", second_u16);
}

This looks cleaner but now you have to reassign to offset all the time.

Perhaps there is another way to do this also?

\$\endgroup\$

3 Answers 3

4
\$\begingroup\$

Comparing your variants

Your first variant is a C-style variant. In C, we only have one return value, not multiple, so poin­ters are the only way to pass more than a single piece of information to the user. Un­fort­u­na­tely, those pointers sometimes get used in a wrong way, which isn't that easy with Rust. Still the fol­lowing code is somewhat repetitive, isn't it?

let first_u16 = get_next_u16(&input, &mut offset);
let second_u16 = get_next_u16(&input, &mut offset);
let third_u16 = get_next_u16(&input, &mut offset);
let fourth_u16 = get_next_u16(&input, &mut offset);
....
let one_hundred_u16 = get_next_u16(&input, &mut offset);

input and offset are tightly coupled, which shows that there is something amiss. A struct­ured way to handle both at the same time would be great. Or we could search some closure somewhere else.

The latter variant clearly shows the user that there is more information to work with: The usize shouldn't get ignored! After all, it's part of the output. But alas, that's still error prone, since we now can accidentally use the same offset twice:

let (offset2,  first_u16) = get_next_u16(&input, offset);
let (_      , second_u16) = get_next_u16(&input, offset);     // whoops!

That's due to the fact that usize implements the Copy trait.

Iterate with iterators

But let's take a step back: you want a safe way to get the next u16 repeatedly. You want an iterator. Slices already provide a fitting iterator called chunks. Therefore, we can get two u8s easily. All we need to do is to map those chunks:

fn to_u16(input: &[u8]) -> u16 {
    ((input[1] as u16) << 8) + (input[0] as u16)
}

fn main() {
    let input = [1, 2, 3, 4];
    let output: Vec<_> = input.chunks(2).map(to_u16).collect();
    println!("{:?}", output);
}

Note that this will panic if the input doesn't contain an even number of u8s. You can use the unstable chunks_exact feature or return an Option<u16> instead.

Other remarks

Use rustfmt to format your code. That way your code looks similar to the one written by others.

\$\endgroup\$
2
\$\begingroup\$

That you really want is to keep a state (the offset in this current case). In both the codes you give, offset is an "embarrassment": you do not know what to do with it, because it is an implementation detail, not something you want to be bothered by.

The best solution for you is to hide this offset in a struct and call a method of this struct. At each call, the offset is updated, and you get the result. Iterators are exactly for this kind of usage:

struct U16Generator<'a> {
    input: &'a [u8],
    offset: usize,
}

impl<'a> U16Generator<'a> {
    fn new(input: &'a [u8]) -> Self {
        U16Generator { input, offset: 0 }
    }
}

impl<'a> Iterator for U16Generator<'a> {
    type Item = u16;

    fn next(&mut self) -> Option<Self::Item> {
        match (self.input.get(self.offset), self.input.get(self.offset + 1)) {
            (Some(u1), Some(u2)) => {
                self.offset += 2;
                Some(((*u2 as u16) << 8) + *u1 as u16)
            },
            _ => None,
        }
    }
}

#[test]
fn test() {
    let mut gen = U16Generator::new(&[1, 2, 3, 4]);

    assert_eq!(gen.next(), Some(513));
    assert_eq!(gen.next(), Some(1027));
}

One put everything in the struct (both the collection and the offset), and at each call of next, one update the offset then return the right value.


Note that this answer is educational. Obviously, for such a simple use case, this is better to use the Zeta's solution: input.chunks(2).map(to_u16).

\$\endgroup\$
0
2
\$\begingroup\$

A slice is a pointer and a length. The pointer doesn't necessarily have to point to the first item of an array! Thus, instead of maintaining an offset, you can update the slice itself.

fn get_next_u16(input: &mut &[u8]) -> u16 {
    let next = ((input[1] as u16) << 8) +
                (input[0] as u16);
    *input = &input[2..];
    next
}

fn main() {
    let mut input = &[1u8, 2, 3, 4] as &[_];
    let first_u16 = get_next_u16(&mut input);
    let second_u16 = get_next_u16(&mut input);
    println!("first: {}", first_u16);
    println!("second: {}", second_u16);
}
\$\endgroup\$
1

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.