
I have a large array in which col[0] is the day, col[1] the month, col[2] the year and col[3] the hours (the latter is a float whose fractional part encodes the minutes and seconds). What is the most efficient way to convert these columns into an array of datetimes?

Update: I tinkered with the dt.datetime function so that it handles array input as well as fractional years, months and so on. I haven't tested this thoroughly yet, and there are probably more elegant ways to do it, but here goes.

from __future__ import division

def getrem(input):
    "this function yields the value behind the decimal point"
    import numpy as np
    output=abs(input-np.fix(input))
    return output

def datenum(Yr,Mo=1,Da=1,Hr=0,Mi=0,Se=0,Ms=0):
    "this function works as regular datetime.datetime, but allows for float input"
    import numpy as np    
    import datetime as dt
    import calendar

    #correct faulty zero input
    if Mo<1:
        Mo+=1
    if Da<1:
        Da+=1        

    #distribute the year fraction over days    
    if  getrem(Yr)>0:
        if calendar.isleap(np.floor(Yr)):
            fac=366       
        else:
            fac=365               
        Da=Da+getrem(Yr)*fac
        Yr=int(Yr)
    #if months exceeds 12, pump to years         
    while int(Mo)>12:
        Yr=Yr+1
        Mo=Mo-12
    #distribute fractional months to days              
    if getrem(Mo)>0:
        Da=Da+getrem(Mo)*calendar.monthrange(int(Yr),int(Mo))[1]
        Mo=int(Mo)
    #datetime input for 28 days always works excess is pumped to timedelta    
    if Da>28:
        extraDa=Da-28
        Da=28
    else:
        extraDa=0 
    # sometimes input is such that you get 0 day or month values, this fixes this anomaly           
    if int(Da)==0:
        Da+=1
    if int(Mo)==0:
        Mo+=1

    #datetime calculation           
    mytime=dt.datetime(int(Yr),int(Mo),int(Da))+dt.timedelta(days=extraDa+getrem(Da),hours=Hr,minutes=Mi,seconds=Se,microseconds=Ms)
    return mytime    

def araydatenum(*args):
    mydatetimes=[datenum(*[a.squeeze()[x] for a in args]) for x in range(len(args[0].squeeze()))]
    return mydatetimes 
4  
By datetimes do you mean Python datetime.datetime objects (for which you'll need an array of dtype='object'), or do you mean NumPy datetime64 objects introduced in NumPy 1.7? –  unutbu Nov 4 '13 at 14:22
    
either format will do –  tragewombat Nov 6 '13 at 8:36
1  
Do you really have fractional years and months? Sounds like a pain. :-) BTW, timedelta will handle the extra hours, minutes, and seconds, saving you at least the part of pushing up the remainders. –  ratatoskr Nov 7 '13 at 16:47
    
Fractional years are a common occurrence, as are fractional days and sometimes hours. Fractional months are pretty rare. You're right about timedelta, I might tweak it a bit further still. –  tragewombat Nov 8 '13 at 9:23

1 Answer

up vote 2 down vote accepted

Can't speak to the most efficient, but it can be done easily like this:

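import datetime as dt
# cast the date fields to int: rows of a float array hold floats, which dt.datetime rejects
mydatetimes = [dt.datetime(int(x[2]), int(x[1]), int(x[0])) + dt.timedelta(hours=float(x[3])) for x in myarray]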

This creates a regular Python list, not a NumPy array. Wrap the right-hand side in numpy.array( ... ) to make it an array with dtype=object.
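
For reference, here is a self-contained sketch of the same loop that produces an object array directly. The sample values in myarray are made up for illustration, and the sketch assumes the array has a float dtype (because of the hours column), which is why the date fields are cast to int before they reach dt.datetime.

```python
import datetime as dt

import numpy as np

# Hypothetical sample data; columns are day, month, year, fractional hours.
myarray = np.array([
    [4.0, 11.0, 2013.0, 14.5],
    [31.0, 12.0, 1999.0, 23.75],
])

# dt.datetime only accepts integers, so cast the date columns explicitly.
# timedelta takes care of the fractional hours (minutes and seconds).
mydatetimes = np.array(
    [
        dt.datetime(int(x[2]), int(x[1]), int(x[0]))
        + dt.timedelta(hours=float(x[3]))
        for x in myarray
    ],
    dtype=object,
)
```

Each element of mydatetimes is an ordinary datetime.datetime, so the usual attributes (.year, .hour and so on) are available per element, but NumPy cannot do vectorized date arithmetic on an object array.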

1  
A variation would be np.apply_along_axis(foo,1,myarray), where foo is a simple function wrapper around your dt.datetime.... But it's actually slower than your np.array([foo(x) for x in myarray]). For custom actions like this, straightforward Python is often the best starting point. –  hpaulj Nov 4 '13 at 23:19
    
This works (see edit update above), but I still wonder if a for loop (or apply_along_axis) cannot be avoided. –  tragewombat Nov 7 '13 at 15:31
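
On avoiding the loop: since the asker said either format will do, one possibility is to build NumPy datetime64 values with unit arithmetic instead of datetime.datetime objects. The following is only a sketch under assumptions: it needs NumPy 1.7 or later, the function name columns_to_datetime64 and the sample data are hypothetical, the date columns must hold whole numbers, and the hours are rounded to the nearest millisecond.

```python
import numpy as np


def columns_to_datetime64(arr):
    """Convert (day, month, year, fractional hours) columns to datetime64[ms].

    Hypothetical helper for illustration; assumes whole-number date columns.
    """
    day = arr[:, 0].astype(np.int64)
    month = arr[:, 1].astype(np.int64)
    year = arr[:, 2].astype(np.int64)
    hours = arr[:, 3]

    # An integer cast to datetime64[Y] is read as "years since 1970".
    years = (year - 1970).astype('datetime64[Y]')
    # Step down to month resolution, then add the zero-based month offset.
    months = years.astype('datetime64[M]') + (month - 1).astype('timedelta64[M]')
    # Step down to day resolution, then add the zero-based day offset.
    dates = months.astype('datetime64[D]') + (day - 1).astype('timedelta64[D]')

    # Fractional hours -> whole milliseconds (rounded), as a timedelta64.
    millis = np.round(hours * 3600 * 1000).astype(np.int64).astype('timedelta64[ms]')
    return dates.astype('datetime64[ms]') + millis


# Hypothetical sample data; columns are day, month, year, fractional hours.
myarray = np.array([
    [4.0, 11.0, 2013.0, 14.5],
    [31.0, 12.0, 1999.0, 23.75],
])
result = columns_to_datetime64(myarray)
```

Note that this differs from the list comprehension in one respect: an out-of-range day (for example day 31 in a 30-day month) silently rolls over into the next month here, whereas dt.datetime raises a ValueError.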
