Tell me more ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

the documentation of math.combinatorics states that all functions return lazy sequences.

However if I try to run subsets with a lot of data

(last (combinatorics/subsets (range 20)))
;OutOfMemoryError Java heap space  clojure.lang.RT.cons (RT.java:559)

I get an OutOfMemory Error.

Running

(last (range))

burns cpu but doesn't return an error.

Clojure doesn't seem to "hold on the head" like explained in this so question

Could someone explain me why this is happening and how I can use bigger ranges in subsets

Update

It seems to work on some peoples computers as the comments suggest. So I will post my system configuration

I run a Mac (10.8.3) and installed clojure (1.5.1) with homebrew.

My Java Version is

% java -version
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06-451-11M4406)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01-451, mixed mode)

I didn't change any of the default settings. I also reinstalled all dependencies, by deleting the ~/.m2 folder

My projects.clj

And the command I used was this

% lein repl
nREPL server started on port 61774
REPL-y 0.1.10
Clojure 1.5.1
=> (require 'clojure.math.combinatorics)
nil
=> (last (clojure.math.combinatorics/subsets (range 20)))
OutOfMemoryError Java heap space  clojure.lang.RT.cons (RT.java:570)
or
OutOfMemoryError Java heap space  clojure.math.combinatorics/index-combinations/fn--1148/step--1164 (combinatorics.clj:64)

I tested the problem on a colleagues laptop and he had the same issue, but he was on a mac, too.

share|improve this question
Works fine without OOM error.. which clojure version you are using? – Ankur Apr 24 at 15:19
Works for me using Clojure 1.5.1 on Linux with default JVM settings. – user100464 Apr 24 at 16:14
clojure 1.5.1 on a mac (installed with homebrew). I updated the problem description with more details. – innotune Apr 27 at 12:12
@Ankur user100464 What happens when you run (last (clojure.math.combinatorics/subsets (range 1000)))? What -Xmx settings do you have? – innotune Apr 29 at 9:30

This question has an open bounty worth +50 reputation from innotune ending tomorrow.

This question has not received enough attention.

3 Answers

You want to compute the power set of a set with 1000 elements? You know that's going to have 2^1000 elements, right? That is so large I can't even find a good way to describe how enormous it is. If you're trying to work with such a set, and you can do so lazily, your problem won't be memory: it will be computation time. Let's say you have a supercomputer with infinite memory, capable of processing a trillion items per nanosecond: that's 10^21 items processed per second, or about 10^29 items per year. Even this supercomputer will take much, much longer than the lifetime of the universe to work through the items of (subsets (range 1000)).

So I'd say, stop worrying about the memory usage of this collection, and work on an algorithm that doesn't involve walking through sequences with more elements than there are atoms in the universe.

share|improve this answer
You're absolutely right. Computing a big solution set with subsets will take ages and is not a viable option in a real application. However having a version of subsets that is not prone to constant memory usage would be important for applications with a more realistic use case, too. – innotune yesterday

The issue is that subsets uses mapcat, and mapcat is not lazy enough as it uses apply which realizes and holds some of the elements to be concatenated. See a very nice explanation here. Using the lazier mapcat version of that link in subsets should fix the issue:

(defn my-mapcat
   [f coll]
   (lazy-seq
     (if (not-empty coll)
      (concat
      (f (first coll))
     (my-mapcat f (rest coll))))))

(defn subsets
  "All the subsets of items"
  [items]
  (my-mapcat (fn [n] (clojure.math.combinatorics/combinations items n))
  (range (inc (count items)))))

 (last (subsets (range 50))) ;; this will take hours to compute, good luck with it!
share|improve this answer
I expected the library to have constant memory usage. I can increased Xmx to 4gb, but this can barely handle 50 Elements in the range. I want to create subsets of sets with 1000+ elements. – innotune Apr 29 at 9:52
what version of java are you using? – dAni Apr 29 at 12:09
I'm running java version "1.6.0_45" on a Mac OS X in 64bit – innotune Apr 29 at 14:52
I have updated the answer. Hope it helps. – dAni Apr 29 at 21:39

In the absence of command line arguments, the startup heap size parameters of a JVM are determined by various ergonomics

The defaults (JDK 6) are

   
   initial heap size    memory / 64
   maximum heap size    MIN(memory / 4, 1GB)

but you can force an absolute value using the -Xmx and -Xms args You can find more detail here

share|improve this answer
Setting xmx isn't the issue. I try to understand why this function takes more memory than it should. – innotune Apr 29 at 16:36

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.