15

I have to convert a numpy array of floats to a string (to store in a SQL DB) and then also convert the same string back into a numpy float array.

This is how I'm converting the array to a string (based on this article):

VIstring = ''.join(['%.5f,' % num for num in VI])
VIstring = VIstring[:-1]  # Get rid of the last comma

So firstly, this does work, but is it a good way to go? Is there a better way to get rid of that last comma? Or can I get the join method to insert the commas for me?

Then secondly, and more importantly, is there a clever way to get from the string back to a float array?

Here is an example of the array and the string:

VI
array([ 17.95024446,  17.51670904,  17.08894626,  16.66695611,
        16.25073861,  15.84029374,  15.4356215 ,  15.0367219 ,
        14.64359494,  14.25624062,  13.87465893,  13.49884988,
        13.12881346,  12.76454968,  12.40605854,  12.00293814,
        11.96379322,  11.96272486,  11.96142533,  11.96010489,
        11.95881595,  12.26924591,  12.67548634,  13.08158864,
        13.4877041 ,  13.87701221,  14.40238245,  14.94943786,
        15.49364166,  16.03681428,  16.5498035 ,  16.78362298,
        16.90331119,  17.02299387,  17.12193689,  17.09448654,
        17.00066063,  16.9300633 ,  16.97229868,  17.2169709 ,  17.75368411])

VIstring
'17.95024,17.51671,17.08895,16.66696,16.25074,15.84029,15.43562,15.03672,14.64359,14.25624,13.87466,13.49885,13.12881,12.76455,12.40606,12.00294,11.96379,11.96272,11.96143,11.96010,11.95882,12.26925,12.67549,13.08159,13.48770,13.87701,14.40238,14.94944,15.49364,16.03681,16.54980,16.78362,16.90331,17.02299,17.12194,17.09449,17.00066,16.93006,16.97230,17.21697,17.75368'

Oh yes, and the loss of precision from the %.5f is totally fine: these values are interpolated, but the original points only have 4 decimal places of precision, so I don't need to beat that. When recovering the numpy array, I'm happy to get only 5 decimal places of precision.

2
  • 1
    You might check out the numpy savetxt and loadtxt functions Commented May 10, 2013 at 13:13
  • @MattAnderson Is there a way to use those to put the text straight into strings and load it straight out of strings in memory rather than using files? Commented May 10, 2013 at 13:13

3 Answers

26

First, use join with the comma as the separator, which avoids the trailing-comma issue:

VIstring = ','.join(['%.5f' % num for num in VI])

Then, to read it back, use numpy.fromstring (here with numpy imported as np):

np.fromstring(VIstring, sep=',')
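
The two steps above can be put together as a round trip. This is only a minimal sketch: the three sample values are taken from the question's array, and the names VIback and VIsplit are made up for illustration. It also shows a split-based alternative for parsing, in case you would rather not rely on fromstring.

```python
import numpy as np

# A short sample of the array from the question.
VI = np.array([17.95024446, 17.51670904, 17.08894626])

# Array -> string: join puts commas between items only,
# so there is no trailing comma to strip afterwards.
VIstring = ','.join('%.5f' % num for num in VI)

# String -> array: passing sep=',' makes fromstring parse text.
VIback = np.fromstring(VIstring, sep=',')

# Alternative parse that uses only str.split and np.array.
VIsplit = np.array(VIstring.split(','), dtype=float)

print(VIstring)
print(VIback)
```

Both parsed arrays hold the values rounded to 5 decimal places, which is the precision the question says is acceptable.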
2
  • Very nice function suggestion @Boud. Commented Oct 9, 2016 at 2:58
  • You're welcome @Pramit : pandas is powerful enough that it makes users forget numpy features underneath Commented Oct 9, 2016 at 3:01
10

If any string representation will do (not necessarily CSV), you could try this, which I have been using:

import numpy, json

# arr is some numpy.ndarray
s = json.dumps(arr.tolist())
arrback = numpy.array(json.loads(s))

It works for most common datatypes, since tolist() turns the array into nested Python lists of plain numbers that json can serialise.
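
As a rough illustration of the JSON approach, here is a small round trip on a 2-D array. The sample values come from the question; the 2x2 shape is just an assumption chosen to show that nesting survives. Note that the dtype is not stored in the string, so for example a float32 array would come back as float64.

```python
import json
import numpy as np

# Hypothetical 2x2 sample built from values in the question.
arr = np.array([[17.95024446, 17.51670904],
                [17.08894626, 16.66695611]])

# Array -> nested Python lists -> JSON text.
s = json.dumps(arr.tolist())

# JSON text -> nested Python lists -> array.
arrback = np.array(json.loads(s))

print(s)
print(arrback.shape)
```

Unlike the '%.5f' formatting, this keeps the full float64 values, at the cost of a longer string.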

1
  • 1
    +1 This is pretty cool, especially if you need to keep the precision. Commented May 10, 2013 at 13:40
10
>>> import numpy  as np
>>> from io import StringIO  # cStringIO exists only on Python 2
>>> VI = np.array([ 17.95024446,  17.51670904,  17.08894626,  16.66695611,
        16.25073861,  15.84029374,  15.4356215 ,  15.0367219 ,
        14.64359494,  14.25624062,  13.87465893,  13.49884988,
        13.12881346,  12.76454968,  12.40605854,  12.00293814,
        11.96379322,  11.96272486,  11.96142533,  11.96010489,
        11.95881595,  12.26924591,  12.67548634,  13.08158864,
        13.4877041 ,  13.87701221,  14.40238245,  14.94943786,
        15.49364166,  16.03681428,  16.5498035 ,  16.78362298,
        16.90331119,  17.02299387,  17.12193689,  17.09448654,
        17.00066063,  16.9300633 ,  16.97229868,  17.2169709 ,  17.75368411])
>>> s = StringIO()
>>> np.savetxt(s, VI, fmt='%.5f', newline=",")
>>> s.getvalue()
'17.95024,17.51671,17.08895,16.66696,16.25074,15.84029,15.43562,15.03672,14.64359,14.25624,13.87466,13.49885,13.12881,12.76455,12.40606,12.00294,11.96379,11.96272,11.96143,11.96010,11.95882,12.26925,12.67549,13.08159,13.48770,13.87701,14.40238,14.94944,15.49364,16.03681,16.54980,16.78362,16.90331,17.02299,17.12194,17.09449,17.00066,16.93006,16.97230,17.21697,17.75368,'
>>> np.fromstring(s.getvalue(), sep=',')
array([ 17.95024,  17.51671,  17.08895,  16.66696,  16.25074,  15.84029,
        15.43562,  15.03672,  14.64359,  14.25624,  13.87466,  13.49885,
        13.12881,  12.76455,  12.40606,  12.00294,  11.96379,  11.96272,
        11.96143,  11.9601 ,  11.95882,  12.26925,  12.67549,  13.08159,
        13.4877 ,  13.87701,  14.40238,  14.94944,  15.49364,  16.03681,
        16.5498 ,  16.78362,  16.90331,  17.02299,  17.12194,  17.09449,
        17.00066,  16.93006,  16.9723 ,  17.21697,  17.75368])
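
A minimal Python 3 sketch of the same in-memory buffer idea, assuming io.StringIO as the buffer and a recent NumPy. It also strips the trailing comma that newline=',' leaves behind, so the stored string matches the format in the question. The names text and VIback are made up for illustration.

```python
import numpy as np
from io import StringIO

# A short sample of the array from the question.
VI = np.array([17.95024446, 17.51670904, 17.08894626])

# Write the array into an in-memory text buffer instead of a file.
buf = StringIO()
np.savetxt(buf, VI, fmt='%.5f', newline=',')

# savetxt ends every item with the newline string, so drop the last comma.
text = buf.getvalue().rstrip(',')

# Parse the text back into a float array.
VIback = np.fromstring(text, sep=',')

print(text)
print(VIback)
```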
3
  • ah, set the string as the file buffer...way to go. Knew there should be some clever way there Commented May 10, 2013 at 13:24
  • This is pretty similar to method 5 from the link I posted, I suppose I should have noticed it. Thanks. I'm going to stick with Boud's method probably Commented May 10, 2013 at 13:48
  • @Dan not really, since all of the operations in my code are performed at the C level, so it's likely to be faster. It also avoids reinventing the wheel by using numpy functions. Commented May 10, 2013 at 13:52
