I am writing a byte array to a file from Java in big-endian byte order. Now I need to read that file from a C++ program.

The byte array I write to the file is made up of three parts, as described below:

short employeeId = 32767;
long lastModifiedDate = 1379811105109L;
byte[] attributeValue = os.toByteArray();

I write employeeId, lastModifiedDate and attributeValue together into a single byte array and write the resulting byte array to a file. My C++ program will then read that byte array back from the file and deserialize it to extract employeeId, lastModifiedDate and attributeValue.

Below is my working Java code, which writes the byte array to a file in big-endian format:

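import java.io.File;
import java.io.FileOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

import org.apache.commons.io.IOUtils; // third-party: Apache Commons IO

public class ByteBufferTest {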

    public static void main(String[] args) {

        String text = "Byte Array Test For Big Endian";
        byte[] attributeValue = text.getBytes();

        long lastModifiedDate = 1289811105109L;
        short employeeId = 32767;

        int size = 2 + 8 + 4 + attributeValue.length; // short is 2 bytes, long 8 and int 4

        ByteBuffer bbuf = ByteBuffer.allocate(size); 
        bbuf.order(ByteOrder.BIG_ENDIAN);

        bbuf.putShort(employeeId);
        bbuf.putLong(lastModifiedDate);
        bbuf.putInt(attributeValue.length);
        bbuf.put(attributeValue);

        bbuf.rewind();

        // best approach is copy the internal buffer
        byte[] bytesToStore = new byte[size];
        bbuf.get(bytesToStore);

        writeFile(bytesToStore);

    }

    /**
     * Write the file in Java
     * @param byteArray
     */
    public static void writeFile(byte[] byteArray) {

        File file = new File("bytebuffertest");

        // try-with-resources closes (and flushes) the stream even if the write fails
        try (FileOutputStream output = new FileOutputStream(file)) {
            IOUtils.write(byteArray, output);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

Now I need to read the byte array back from that same file in the C++ program below and deserialize it to extract employeeId, lastModifiedDate and attributeValue. I am not sure what the best approach is on the C++ side. Below is the code I have so far:

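#include <algorithm>  // std::swap before C++11
#include <fstream>
#include <iostream>
#include <stdint.h>   // uint16_t, uint32_t, uint64_t
#include <utility>    // std::swap since C++11

using std::cout;
using std::endl;

int main() {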

    std::ifstream myfile("bytebuffertest", std::ios::binary);

    if (myfile.is_open()) {

        uint16_t employeeId;
        uint64_t lastModifiedDate;
        uint32_t attributeLength;

        char buffer[8]; // sized for the biggest read we want to do

        // read two bytes (will be in the wrong order)
        myfile.read(buffer, 2);

        // swap the bytes
        std::swap(buffer[0], buffer[1]);

        // only now convert bytes to an integer
        employeeId = *reinterpret_cast<uint16_t*>(buffer);

        cout<< employeeId <<endl;

        // read eight bytes (will be in the wrong order)
        myfile.read(buffer, 8);

        // swap the bytes
        std::swap(buffer[0], buffer[7]);
        std::swap(buffer[1], buffer[6]);
        std::swap(buffer[2], buffer[5]);
        std::swap(buffer[3], buffer[4]);

        // only now convert bytes to an integer
        lastModifiedDate = *reinterpret_cast<uint64_t*>(buffer);

        cout<< lastModifiedDate <<endl;

        // read 4 bytes (will be in the wrong order)
        myfile.read(buffer, 4);

        // swap the bytes
        std::swap(buffer[0], buffer[3]);
        std::swap(buffer[1], buffer[2]);

        // only now convert bytes to an integer
        attributeLength = *reinterpret_cast<uint32_t*>(buffer);

        cout<< attributeLength <<endl;

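        // FIXME: buffer holds only 8 bytes, so this read overflows it whenever
        // attributeLength > 8; the payload needs its own buffer of attributeLength bytes
        myfile.read(buffer, attributeLength);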


        // now I am not sure how should I get the actual attribute value here?

        //close the stream:
        myfile.close();
    }

    else
        cout << "Unable to open file";

    return 0;
}

Can anybody take a look at the C++ code and suggest how to improve it? It does not look very efficient to me. Is there a better way to deserialize the byte array and extract the relevant information on the C++ side?

1 Answer

The code isn't portable to big-endian machines: it swaps the bytes unconditionally, which is only correct on a little-endian host. I'll use C syntax, since I'm more familiar with that than C++.

If you have endian.h, you can use the functions in there; if not, you should have arpa/inet.h which defines functions for swapping network byte order (big-endian) to host byte order, but lacks a function for 64-bit values. Look for either be16toh (from endian.h) or ntohs (from arpa/inet.h) and friends.
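
If only arpa/inet.h is available, one way to cover the missing 64-bit case is to convert the two 32-bit halves separately with ntohl and then combine them. The sketch below is in C++ rather than C, since that is what the question targets; load_be64 is a name made up for this illustration, not a library function.

```cpp
#include <arpa/inet.h>  // ntohl; POSIX, not part of standard C++
#include <cstdint>
#include <cstring>

// Hypothetical helper: decodes 8 big-endian bytes into a host-order value.
std::uint64_t load_be64(const unsigned char* bytes) {
    std::uint32_t high;
    std::uint32_t low;
    // memcpy avoids alignment and aliasing problems with the raw buffer
    std::memcpy(&high, bytes, 4);      // first four bytes: most significant half
    std::memcpy(&low, bytes + 4, 4);   // last four bytes: least significant half
    // ntohl converts each half from network (big-endian) to host order
    return (static_cast<std::uint64_t>(ntohl(high)) << 32) | ntohl(low);
}
```

Because ntohl is a no-op on big-endian hosts, the same function should give the same result on either kind of machine.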

Why not read directly into the values:

fread((void *)&employeeId, sizeof(employeeId), 1, file);
employeeId = be16toh(employeeId);

Since you can manipulate pointers in C, you just need to provide a universal pointer (void *) to the read function where it should place the results. The & operator takes the address of a value. Once that is done, you can manipulate the value directly, as above.

Using this Java test code:

import java.io.*;

public class write {
  public static void main(String... args) throws Exception {
    final FileOutputStream file = new FileOutputStream("java.dat");
    final DataOutputStream data = new DataOutputStream(file);

    final long time = System.currentTimeMillis();
    final short value = 32219;

    //  fill a table with a..z0..9
    final byte[] table = new byte[36];
    int index = 0;
    for (int i = 0; i < 26; i++) {
      table[index++] = (byte)(i + 'a');
    }
    for (int i = 0 ; i < 10; i++) {
      table[index++] = (byte)(i + '0');
    }

    data.writeLong(time);
    data.writeShort(value);
    data.writeInt(table.length);
    data.write(table);
    data.close();

    System.out.format("wrote time: %d%n  value: %d%n  length: %d%n  table:%n", time, value, table.length);
    for (int i = 0; i < table.length; i++) {
      System.out.format("%c ", (char)table[i]);
    }
    System.out.println();
  }
}

The output from this code is:

wrote time: 1380743479723
  value: 32219
  length: 36
  table:
a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 

You can read the values in with this C code:

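#include <endian.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  int64_t time;
  int16_t value;
  int32_t length;
  uint8_t *array;

  FILE *in = fopen("java.dat", "rb");
  if (in == NULL) {
    perror("java.dat");
    return 1;
  }

  /* read the fields in the order Java wrote them: long, short, int */
  if (fread(&time, sizeof(time), 1, in) != 1 ||
      fread(&value, sizeof(value), 1, in) != 1 ||
      fread(&length, sizeof(length), 1, in) != 1) {
    fprintf(stderr, "unexpected end of file\n");
    fclose(in);
    return 1;
  }

  /* the file is big-endian; convert each field to host byte order */
  time = (int64_t)be64toh((uint64_t)time);
  value = (int16_t)be16toh((uint16_t)value);
  length = (int32_t)be32toh((uint32_t)length);

  /* the payload is raw bytes, so it needs no byte swapping */
  array = malloc(length > 0 ? (size_t)length : 1);
  if (length < 0 || array == NULL ||
      fread(array, 1, (size_t)length, in) != (size_t)length) {
    fprintf(stderr, "could not read %" PRId32 " payload bytes\n", length);
    free(array);
    fclose(in);
    return 1;
  }

  fclose(in);

  printf("time: %" PRId64 "\nvalue: %d\narray length: %" PRId32 "\narray:\n", time, value, length);
  for (int32_t i = 0; i < length; i++) {
    printf("%c ", array[i]);
  }
  printf("\n");

  free(array);
  return 0;
}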

I compiled this on Ubuntu x64 with clang. Its output was:

./a.out
time: 1380743479723
value: 32219
array length: 36
array:
a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 

Keep in mind that Java's byte, short, int, and long are all signed; the only unsigned integral type in Java is char (16 bits).

Thanks for the suggestion. The big problem is that I am not a C++ developer; I am mainly a Java developer, which is why I am running into so many problems. After a lot of reading I was able to write that bit of C++ code. Can you help me with a simple example, based on my code, of how to deserialize on the C++ side? –  lining Oct 2 '13 at 6:25
    
Thanks for the help.. –  lining Oct 2 '13 at 17:37
    
Thanks for the edit. One quick question: does this C++ code work with my Java example, given the way it writes to the file? –  lining Oct 2 '13 at 21:24
    
You are writing a short, long, and int; I'm writing a long, short, int. So it's out of order. I just noticed you're specifying BIG_ENDIAN for your ByteBuffer. Can you not just use LITTLE_ENDIAN on output? –  WeaponsGrade Oct 2 '13 at 21:34
    
I guess big-endian is the preferred byte order when dealing with cross-platform issues, right? That's why I was following it. Is there any way to use my C++ code to deserialize it properly? I am able to extract the attribute length, but I am not sure how to read the attribute value back in my C++ code. –  lining Oct 2 '13 at 21:41
