Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I wrote a function split-seq-by-n which accepts a sequence and a number and splits the sequence into subsequences of length n (the last subsequence getting the remaining elements). It works on all subclasses of 'sequence, among others 'string, 'cons, 'vector etc.

;; Splits sequence into subsequences of length n
(defun split-seq-by-n (seq n)
  (labels ((seq-split (seq n &optional acc orig-n)
         (cond ((zerop (length seq)) (nreverse acc))
           ((zerop n) (seq-split seq 
                          orig-n
                          (cons (subseq seq 0 0) acc)
                          orig-n))
           (t (seq-split (subseq seq 1)
                      (1- n) 
                      (cons (concatenate (class-of seq) 
                             (if acc (car acc) (subseq seq 0 0))
                             (list (elt seq 0)))
                        (cdr acc))
                      orig-n)))))
    (seq-split seq n nil n)))

Any comments, improvements, critiques welcome.

share|improve this question

2 Answers 2

It looks okay for a recursion exercise, but it's naive code. Not usable in 'production'.

  • it conses like mad. It creates a lot of intermediate garbage. Splitting a string involves making lots of smaller strings.
  • for lists it is not efficient, too
  • CONCATENATE in loops or recursive functions is a code smell
  • the recursive implementation with the subroutine is good style in a functional language, but makes it hard to debug in the real world. For example, how do you TRACE the internal function?
  • Common Lisp does not guarantee tail call optimization (TCO). Individual implementations do support it. Still, for general portable code it might be preferable to code loop.
share|improve this answer

When writing functions for manipulating sequences, it's typically easier to write a version that handles the list case specially, and then uses the generic sequence functions for everything else (since they'll probably be fine for vectors). Taking that approach, I'd write the following code. The main interface, split-seq-by-n takes the sequence and the n, and calls either split-list-by-n (optimized for lists) or split-sequence-by-n (which works for generic sequences, and so is probably efficient for vectors):

(defun split-seq-by-n (sequence n)
  (if (listp sequence) 
      (split-list-by-n sequence n)
      (split-sequence-by-n sequence n)))

(defun split-sequence-by-n (sequence n)
  (loop 
     :with length := (length sequence)
     :for start :from 0 :by n :below length
     :collecting (subseq vector start (min length (+ start n)))))

(defun split-list-by-n (list n)
  (do ((nn (1- n) (1- nn))
       (part '())
       (parts '()))
      ((endp list) 
       (nreverse (if (endp part) parts
                     (list* (nreverse part) parts))))
    (push (pop list) part)
    (when (zerop nn)
      (push (nreverse part) parts)
      (setf part '()
            nn n))))

Since the generic sequence functions should be fine for sequences that support random indexing, split-sequence-by-n is pretty straightforward. The only bit that's a little complex it caching the length at first so that we have a maximum value for the end value in subseq. split-list-by-n is a bit more complex. It iterates over the list just once, allocating only as much new structure as is strictly necessary.

share|improve this answer
    
that's nice. I particularly like your use of list* instead of cons. –  Wojciech Gac Mar 11 '14 at 23:37
    
@WojciechGac When I'm working with conses as lists, I use first, rest, and list*. When I'm working with them as trees, I use car, cdr, and cons. –  Joshua Taylor Mar 11 '14 at 23:55

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.