To import the data contained in the file myfile.txt, which has the form:

Label[0] = 0.980252
Label[1] = -nan
Label[2] = -nan
Label[3] = -nan
Label[4] = 0.664706
Label[5] = -nan
Label[6] = -nan
Label[7] = -nan
Label[8] = -nan
Label[9] = -nan
Label[10] = -nan
Label[11] = 0.800183
Label[12] = -nan
Label[13] = -nan
Label[14] = -nan
Label[15] = 0
Mean Data = 15

I wrote the following code:

import numpy as np

with open('myfile.txt', 'r') as file_txt_original:
    data = file_txt_original.read()

    data = data.replace('Mean Data', '-1')
    data = data.replace('Label[', '')
    data = data.replace(']', '')
    data = data.replace(' = ', ', ')

    file_txt_original.close()

with open('new_file.txt', 'w') as file_txt_copy:

    file_txt_copy.write(data)
    file_txt_copy.close()

my_array = np.loadtxt('new_file.txt', delimiter=',')

It works, but this still seems like quite a tricky solution to me. Any suggestions for improving this code without so many replacements or without saving an additional structure?


I don't quite get why the data is written out into a new file again; it would be more "typical" to parse each line and create the array simultaneously.

That said, the only other thing I'd like to point out is that the close calls on the file objects aren't necessary: you (absolutely correctly) already opened the files in with blocks, so the close method is called automatically when each block is exited.
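To illustrate the point about with blocks, here is a minimal sketch. It writes a throwaway file into a temporary directory (an assumption made only so the example is self-contained) and then checks the file object's closed attribute after the block has ended.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on myfile.txt.
path = os.path.join(tempfile.mkdtemp(), 'myfile.txt')
with open(path, 'w') as handle:
    handle.write('Label[0] = 0.980252\n')

with open(path, 'r') as file_txt_original:
    data = file_txt_original.read()

# The with block has already closed the file; no explicit close() is needed.
print(file_txt_original.closed)
```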

Edit:

Okay, so for clarification, I mean something like the following:

import re

import numpy as np

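with open('myfile.txt', 'r') as file_txt_original:
    my_array = np.array([])

    for line in file_txt_original:
        # Raw string so the backslashes reach the regex engine unchanged.
        matches = re.match(r"Label\[(\d+)\] = (.*)", line)
        if matches:
            index, value = matches.groups()
            index = int(index)
            # Grow the array in place when the index is out of range.
            if index >= my_array.size:
                my_array.resize(index + 1)
            # float() also accepts '-nan'.
            my_array[index] = float(value)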

Obviously it would be much better to know the size of the array from the start, or maybe to collect things into a list and only allocate the array at the end.
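The list-based variant could look roughly like the sketch below. The helper name parse_labels and the choice to leave missing indices as NaN are assumptions made for illustration, not part of the original answer.

```python
import re

import numpy as np

# Matches lines such as "Label[4] = 0.664706"; other lines are skipped.
LABEL_PATTERN = re.compile(r"Label\[(\d+)\] = (.*)")


def parse_labels(lines):
    """Collect (index, value) pairs first, then allocate the array once."""
    entries = []
    for line in lines:
        match = LABEL_PATTERN.match(line)
        if match:
            index, value = match.groups()
            # float() accepts 'nan' and '-nan' as well as ordinary numbers.
            entries.append((int(index), float(value)))

    if not entries:
        return np.array([])

    size = max(index for index, _ in entries) + 1
    # Indices that never appear in the input stay NaN (an assumption of this sketch).
    result = np.full(size, np.nan)
    for index, value in entries:
        result[index] = value
    return result


sample = [
    "Label[0] = 0.980252\n",
    "Label[1] = -nan\n",
    "Label[2] = 0.664706\n",
    "Mean Data = 15\n",
]
my_array = parse_labels(sample)
```

Because the function only iterates over lines, an open file object can be passed in place of the sample list.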

    
Yes, thanks. I tried to follow your suggestion, but I cannot find any function that loads the string 'data' into a numpy array after the modifications without first saving it to a txt file. Maybe I am missing something easy... – SeF Mar 9 at 10:43
    
See the edit for a clarification on what I meant to say. – ferada Mar 9 at 12:27
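On the question raised in the first comment: np.loadtxt also accepts a file-like object, so the modified string can be wrapped in io.StringIO instead of being written to disk. The following is a minimal sketch of that idea; the sample text stands in for the contents of myfile.txt and is an assumption made for illustration.

```python
import io

import numpy as np

# Sample text in the same format as the question's file.
raw = (
    "Label[0] = 0.980252\n"
    "Label[1] = -nan\n"
    "Label[2] = 0.664706\n"
    "Mean Data = 15\n"
)

# Same replacements as in the question, except that ' = ' becomes ','
# (no trailing space) so the fields carry no stray whitespace.
data = (raw.replace('Mean Data', '-1')
           .replace('Label[', '')
           .replace(']', '')
           .replace(' = ', ','))

# io.StringIO wraps the string in a file-like object that np.loadtxt can read.
my_array = np.loadtxt(io.StringIO(data), delimiter=',')
```

The result is a two-column array: the first column holds the label index (with -1 marking the "Mean Data" row) and the second column holds the value.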

You can chain the replace calls right after reading the file, which makes the code easier to read:

data = (file_txt_original.read()
        .replace('Mean Data', '-1')
        .replace('Label[', '')
        .replace(']', '')
        .replace(' = ', ', '))