Tell me more ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Hey all, I've got a clojure function here that's meant to parse a string of form

"DDXXXYYY"

where DD is meant to be discarded, and XXX and YYY are string representations of integers. Here's my code:

(defn split-id [tileid]
  (map #(Integer/parseInt %)
       (map (partial apply str)
            (partition 3 (drop 2 tileid)))))

Or, written with the threading macro:

(defn split-id [tileid]
  (map #(Integer/parseInt %)
       (map (partial apply str)
            (->> tileid (drop 2) (partition 3)))))

Does anyone have any recommendations for a simple way to do this? It feels like clojure.contrib.string should have a partition function that takes a string and an integer. Looking forward to your suggestions!

UPDATE:

I think I've come up with a fairly good block of code for the above function, which I had to extend a bit. I'd love comments and suggestions on the final block, here.

(defn parse-ints
  "Converts all strings in the supplied collection to their integer
  representation."
  [coll]
  (map #(Integer/parseInt %) coll))

(def res-map
  ^{:doc "Map between the second digit in a MODIS TileID
metadata value and the corresponding resolution."}
  {:1 "1000"
   :2 "500"
   :4 "250"})

(defn tileid->res
  "Returns a string representation of the resolution referenced by the
supplied MODIS TileID."
  [tileid]
  (res-map (keyword (subs tileid 1 2))))

(defn tileid->xy
  "Extracts integer representations of the MODIS X and Y coordinates
referenced by the supplied MODIS TileID."
  [tileid]
  (parse-ints
   (map (partial apply str)
        (partition 3 (subs tileid 2)))))

(defn split-id
  "Returns a sequence containing the resolution, X and Y
  coordinates (on the MODIS grid) referenced by the supplied MODIS
  TileID."
  [tileid]
  (flatten
   ((juxt tileid->res
          tileid->xy) tileid)))
share|improve this question

4 Answers

up vote 4 down vote accepted

Alternative solution is to keep it really simple:

(defn extract-int [s start end]
  (Integer/parseInt (subs s start end)))

(defn split-id [s]
  (list
    (extract-int s 2 5)
    (extract-int s 5 8)))

I believe that hardcoding two calls to extract-int is a simpler and more elegant way to meet the requirement than a solution which involves a map. Sure it's nice to use abstraction with higher-order functions, but YAGNI principle applies.

share|improve this answer

Just a though, perhaps you're looking at this the wrong way. This is a parsing question, and simple parses are often best executed through regular expressions. Clojure has reader macros and special functions to work with Java regexes.

share|improve this answer

Hey all -- I write up a few versions of the above problem for the guys on my team. I'll post it here in case anyone's interested -- this is for folks who don't know clojure, so apologies for the in-line explanations, if they're not needed. Please see my update to the original question for the code I ended up using, along with docstrings.


(to my team)

Guys, here are a few ways to accomplish the task solved by tileid->xy, shown above. I'm not sure I'm happy with the final version, so I've left these here for us to peruse together. Enjoy!

code:

(fn [tileid]
  (parse-ints
   (map (partial subs tileid) [2 5] [5 8])))

comments:

This first one is... misleading! Here, we're actually calling (subs tileid...) in the first elements of each of the other vectors, not on each vector in turn. I wouldn't go with something like this; it works, but it's confusing.

code:

(fn [tileid]
  (parse-ints
   (map (partial apply subs tileid) [[2 5] [5 8]])))

comments:

This is a little bit better, but not really. Now we have a vector of vectors -- partial takes the stuff after it, and builds a function that takes less arguments than that stuff needs. (in this case, we get: the function (apply subs tileid ...), where the ellipsis can be any group of arguments. Since we're mapping that, we get [2 5] and [5 8] fed in in turn.)

code:

(fn [tileid]
  (parse-ints
   ((juxt #(subs % 2 5) #(subs % 5 8)) tileid)))

comments:

Another way to do the same thing. In this case, juxt gives us a sequence that is the juxtaposition of all of the functions. When we send in an argument, we get the juxtaposition of the results -- in this case, ((subs tileid 2 5) (subs tileid 5 8)), where subs is substring (as above).

code:

(fn [tileid]
  (parse-ints
   (map (partial apply str)
        (partition 3 (s/drop 2 tileid)))))

comments:

This is the same as the next one; it actually compiles to the same byte code, I'm just not using the threading macro here. I don't know if the threading macro makes the code more clear, in this case, but it's a good example of little tweaks for readability, independent of performance.

code:

(fn [tileid]
  (parse-ints
   (map (partial apply str)
        (->> tileid
             (s/drop 2)
             (partition 3)))))

comments:

The referenced one, with the threading macro. As you can see, it takes each thing, evaluates it, and inserts it as the last thing in the next entry. So, here we have tileid inserted, making (s/drop 2 tileid). The whole thing expands to the code above. BE CAREFUL with this! It's a really good solution, but only when it makes things clear -- not when it's used as a way to think iteratively. There's usually some transformation on a list that you can do to get the same effects as the threading macro. (Once we all learn to code the Clojure Way, this stuff will be fine :)

code:

(defn tileid->res
  "Returns a string representation of the resolution referenced by the
supplied MODIS TileID."
  [tileid]
  (res-map (keyword (subs tileid 1 2))))

(fn [tileid]
  (parse-ints
   (map (partial apply str)
        (partition 3 (subs tileid 2)))))

comments:

I ended up going with this bad boy, because it looked closest to my other function, tileid->res. Looking at them together. See how they're of similar? This is nice, as it helps relate the concepts.

share|improve this answer

You could try the following, using the Java substring method and a vector of offsets:

(defn split-id [tileid]
  (map 
    #(Integer. (.substring tileid % (+ % 3)) ) 
    [2 5]))
share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.