Streambuffer and string manipulation

Question

One of my colleagues told me that you can use streambuffer for std::string, and instead of using string id in the for loop, you can declare it outside.

I am not sure whether this will make any difference or not.

void getRecord(uint64_t currentTime, const std::vector<uint32_t> &shards) {

    try {

        currentTime = (currentTime / 1000) * 1000;

        uint64_t start = (currentTime / (60 * 60 * 1000 * 24))  % 3;

        string query;
        Cassandra::Result result;

        std::string base(boost::lexical_cast<std::string>(currentTime) + ".");

        for (std::vector<uint32_t>::const_iterator it = shards.begin() ; it != shards.end(); ++it) {

            string id = base + boost::lexical_cast<std::string>(*it); // use stream buffer

            query = "select name, value from keyspace.hour_"+boost::lexical_cast<std::string>(start)+" where id ='"+id+"';";

            result = Cassandra::execute_query(query);

        }
    }
}

I am coming from a Java background, so I'm not sure which one will be more efficient compared to the other, as C++ is all about memory allocation. Is there any way to optimize the above code, even slightly?

UPDATE:-

For me what I am trying to optimize is the other part leaving Cassandra query part as it is..

I know there might be few minimal optimization that I am supposed to make but if it is worth doing it, then I might do it now then later figuring it out.

Below is my full code. Again I am mainly worried about my other string manipulation instead of Cassandra query part - I have removed some extra code which was not needed..

void getData(uint64_t currentTime, const std::vector<uint32_t> &shards) {

    uint16_t id;
    uint64_t lmd;
    uint32_t length;
    const char* binary_value;
    string column_key;
    string bt;
    uint64_t user_id;
    string colo;

    bool flag = false;

    try {

        currentTime = (currentTime / 1000) * 1000;

        uint64_t start = (currentTime / (60 * 60 * 1000 * 24))  % 3;

        string query;
        Cassandra::Result result;

        std::string base(boost::lexical_cast<std::string>(currentTime) + ".");

        for (std::vector<uint32_t>::const_iterator it = shards.begin() ; it != shards.end(); ++it) {

            string id = base + boost::lexical_cast<std::string>(*it); // use stream buffer

            query = "select name, value from keyspace.hour_"+boost::lexical_cast<std::string>(start)+" where id ='"+id+"';";

            result = Cassandra::execute_query(query);

            while (result && result.value()->next()) {

                char* key = 0;
                for (size_t i = 0; i < result.value()->column_count(); ++i) {
                    cql::cql_byte_t* data = NULL;
                    cql::cql_int_t size = 0;
                    result.value()->get_data(i, &data, size);

                    if (!flag) {
                        key = reinterpret_cast<char*>(data);
                        flag = true;
                    } else {

                        int index=0;
                        id = Mgr::get_uint16((&data[index])); 
                        index += 2;

                        lmd = Mgr::get_uint64((&data[index])); 
                        index += 8;

                        length = Mgr::get_uint32((&data[index])); 
                        index += 4;

                        binary_value = reinterpret_cast<const char *>(&data[index]);
                        flag = false;
                    }
                }

                if(key) {
                    vector<string> res;
                    char* p;
                    char* totken = strtok_r(key, ".", &p);
                    while(totken != NULL)
                    {
                        res.push_back(totken);
                        totken = strtok_r(NULL, ".", &p);
                    }

                    column_key= res[0];
                    bt= res[1];
                    user_id= atoi(res[2].c_str());
                    colo = res[3];
                }         
            }
        }
    } catch (const std::exception& ex) {

    }
}

I would not care about optimizing the string code (write it in the most easy way to read and thus maintain). Be far the slowest part of that code is the line: Cassandra::execute_query(query); by many orders of magnitude. — Loki Astari, Dec 4 '13 at 5:14
I'd be more worried about sql injection (granted you fully control what can go in here but this is a bad sign for other parts of the code), does Cassandra have parameterized queries? — ratchet freak, Dec 4 '13 at 8:39
For me what I am trying to optimize is the other part leaving Cassandra query part as it is.. I know there might be few minimal optimization that I am supposed to make but if it is worth doing it, then I might do it now then later figuring it out. — user2809564, Dec 4 '13 at 20:09
In that case, you really need to update your question to show what it's really doing. Right now your method does nothing (it does a lot of work and it throws it all away) and the entire method can be removed.... — rolfl♦, Dec 4 '13 at 20:14

rolfl · Answer 1 · 2013-12-04 03:30:11Z

This looks like incomplete code to me.... what do you do with result once you have it?

Still, I look at what your code does, and can't help but think your priorities are wrong in this instance.....

sure, streambuffer may hypothetically be faster than your String manipulation, but, really, who cares? (Note: I say 'who cares' from the context of the work you are doing).

Running any database query is going to take much, much longer than the string concatenation operations you are doing.....

When Cassandra gets your query, it is going to strip it down to components, analyze them for validity, relate them to database structures, optimize the query to use the best possible access plans, and then execute the query....

I would guess that the work you are doing in your method is probably 'orders of magnitude' less than what Cassandra is going to do...

In this instance, for your code specifically, I would recommend that you use the most readable and maintainable mechanisms. If you want to use streambuffers then do it, but whatever you do the motivation should not, in this case, be because of performance.

Read up on Amdahl's Law ... ;-)

For me what I am trying to optimize is the other part leaving Cassandra query part as it is.. I know there might be few minimal optimization that I am supposed to make but if it is worth doing it, then I might do it now then later figuring it out such as the way I am doing for string manipulation and other minor things if any.. — user2809564, Dec 4 '13 at 20:10

cstura · Answer 2 · 2013-12-04 12:55:15Z

up vote 0 down vote

how about this: (ps: remember to #include )

void getRecord(uint64_t currentTime, const std::vector<uint32_t> &shards) {

    try {

        currentTime = (currentTime / 1000) * 1000;

        uint64_t start = (currentTime / (60 * 60 * 1000 * 24))  % 3;

        ostringstream query;
        Cassandra::Result result;

        std::string base(boost::lexical_cast<std::string>(currentTime) + ".");

        for (std::vector<uint32_t>::const_iterator it = shards.begin() ; it != shards.end(); ++it) {

            ostringstream id;
            id << base << boost::lexical_cast<std::string>(*it);
            if(!query.str().empty()) {
               query << " union all ";
            }

            query << "select name, value from keyspace.hour_" \
                  << boost::lexical_cast<std::string>(start) << \
                  " where id ='" << id << "';";

        }

        result = Cassandra::execute_query(query.str());
    }

edited Dec 4 '13 at 12:55

answered Dec 4 '13 at 10:35

cstura
11

2

Please do not just post code. Post explanations along with the code. – Bobby Dec 4 '13 at 16:21

The big optimization here goes along with what was said by: rolfl because the big problem is what the database is going to do with the query. so other than using an ostringstream to build the string over the iterator this code runs a single query that returns all of the results in one go, making the database go through the process of looking up the data once. – cstura Dec 4 '13 at 16:23

I have updated the question with more details..!! – user2809564 Dec 4 '13 at 21:27

add comment

asked	6 months ago
viewed	197 times
active	6 months ago

current community

your communities

more stack exchange communities

Streambuffer and string manipulation

2 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged c++ optimization strings or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Streambuffer and string manipulation

2 Answers

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged c++ optimization strings or ask your own question.

Related

Hot Network Questions