Ruby Script to modify and split CSVs

I have written the following script, which is pointed at a CSV through its file name, splits the file into "drops" (for mailings), and performs a couple of operations on each one. As of now it works, but it seems fairly slow when I use it on any significant number of records (>5000). There is also a section where I need to seed each file with semi-static data, and the portion of the code where I store that data is just ugly. I would welcome suggested improvements from an approach, style, logic, or really any other perspective.

What follows are the functions for seeding the file, for creating and writing the CSVs, and for transforming the rows for output (the seeding portion is in the following gist). Part of the challenge in making this is that the headers might not always be the same or in the same order, so I need some way of comparing an "ideal" set of headers to what is actually there, or so I think. As of right now the code is tragically uncommented, so I'll work on editing comments in soon.

def create_output_csvs(source_file, start_WO, start_title, po_number)
  # Prep variables for use
  drop_sizes = read_drop_sizes
  dealer_pin = read_dealer_pin
  purl = read_dealer_purl
  purl = ".#{purl}" if purl[0] != '.'
  drop = 0
  current_title = start_title
  current_drop_number = start_WO
  stop_at = drop_sizes[drop] - 1
  start_at = 0
  pin_seq = 100_099

  ipd_head = header_to_ipd_header(source_file)
  write_file = "#{current_drop_number} for import.csv"
  CSV.foreach(source_file, headers: true).each_with_index do |row, i|

    write_header_to(ipd_head, write_file) if i == start_at

    pin_seq += 1
    pin = "#{dealer_pin}-#{pin_seq}"

    CSV.open(write_file, 'a') { |out| out << transform_row(row, pin, purl) }

    if i == stop_at
      ipd_seed(write_file, current_drop_number, pin_seq, dealer_pin, purl)
      drop += 1
      create_ipd_pallet_flag(current_drop_number, po_number, current_title)

      if drop < drop_sizes.size
        current_drop_number = next_work_order_number(current_drop_number)
        current_title = next_job_title(current_title)
        write_file = "#{current_drop_number} for import.csv"
        pin_seq = 100_100 * (drop + 1)
        pin = "#{dealer_pin}-#{pin_seq}"
        start_at += drop_sizes[drop]
        stop_at += drop_sizes[drop]
        nav_to_next_drop_folder("#{current_drop_number} #{current_title}")
      end
    end
    drop == drop_sizes.length ? break : true
  end
end
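
One likely contributor to the slowness is that `CSV.open(write_file, 'a')` reopens the output file for every row. The following is a minimal sketch of an alternative, not a drop-in replacement: it assumes positive drop sizes, uses a hypothetical helper name (`split_into_drops`) and placeholder output names (`drop_1.csv`, ...), and leaves out the seeding, PIN, and folder navigation steps. Each output file is opened once per drop.

```ruby
require 'csv'

# Hypothetical helper (not part of the original script): writes consecutive
# drops of the given sizes, opening each output file exactly once.
# Assumes every entry in drop_sizes is a positive integer.
def split_into_drops(source_file, drop_sizes, out_dir)
  paths = []
  CSV.open(source_file, headers: true) do |input|
    drop_sizes.each_with_index do |size, drop|
      row = input.shift
      break if row.nil? # source exhausted before this drop started

      path = File.join(out_dir, "drop_#{drop + 1}.csv") # placeholder name
      CSV.open(path, 'w') do |out|
        out << row.headers # header line, taken from the source file
        written = 0
        while row
          out << row.fields
          written += 1
          break if written == size

          row = input.shift # nil at end of input ends the loop
        end
      end
      paths << path
    end
  end
  paths
end
```

Called as `split_into_drops('list.csv', [2500, 2500], '.')`, this would return the paths of the files it wrote; the per-row transformation could be applied where `row.fields` is written.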

Individual row transformations:

def transform_row(row, pin, purl)
  tmphead = row.headers
  lname_i = tmphead.find_index { |l| /lname.*|last.*/i=~l }
  fname_i = tmphead.find_index { |l| /fname.*|first.*/i=~l }
  first = row[fname_i].capitalize.gsub(/\s+/, '')
  last = row[lname_i].capitalize.gsub(/\s+/, '')
  sfx_i = tmphead.find_index { |l| /sfx.*|suffix.*|sufx.*/i =~ l }
  mi_i = tmphead.find_index { |l| /mi.*|mname/i =~ l }

  row[lname_i] = full_name(first, last, row[mi_i], row[sfx_i])

  row[mi_i] = salutat(first, last, row[sfx_i])
  row[fname_i] = first

  full_purl = "#{first}#{last}#{purl}".downcase

  address2_i = tmphead.find_index { |l| /add.*2/i =~ l }

  if address2_i
    address = "#{row[(address2_i - 1)]} #{row[address2_i]}"
    row[address2_i - 1] = address
    row.delete(address2_i)
  end

  row << pin
  row << full_purl
  row.delete_if { |h| col_blacklist?(h[0]) }
  row
end
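
Because `transform_row` runs the `find_index` regex lookups again for every row, one option for the "ideal versus actual" header comparison is to resolve the column positions once per file. This is only a sketch under assumptions: the constant `COLUMN_PATTERNS`, the method `resolve_columns`, and the anchored patterns are hypothetical and are stricter than the unanchored regexes above (for example, `/mi.*/` matches any header that merely contains "mi").

```ruby
# Hypothetical pattern table: logical column name => regex covering the
# header spellings seen in the source files. The patterns are assumptions.
COLUMN_PATTERNS = {
  first: /\A(fname|first)/i,
  last: /\A(lname|last)/i,
  middle: /\A(mi\z|mname|middle)/i,
  suffix: /\A(sfx|sufx|suffix)/i
}.freeze

# Returns a hash of logical name => column index, with nil for any
# column that is missing from the given headers.
def resolve_columns(headers, patterns = COLUMN_PATTERNS)
  patterns.transform_values do |pattern|
    headers.find_index { |header| pattern =~ header.to_s }
  end
end
```

The result could be computed once from the header line and passed to `transform_row` in place of the per-row lookups; a `nil` entry makes a missing column explicit before `row[nil]` is ever attempted.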

Any suggestions for improvement would be appreciated, but primarily I am interested in reworking the code so that it's more readable/maintainable as well as speeding it up in general.

edited May 5 at 17:51

asked May 5 at 16:33

Soviet_Jesus
527

asked	4 months ago
viewed	40 times

current community

your communities

more stack exchange communities

Ruby Script to modify and split CSVs

Your Answer

Browse other questions tagged beginner ruby csv or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Ruby Script to modify and split CSVs

Know someone who can answer? Share a link to this question via email, Google+, Twitter, or Facebook.

Your Answer

Sign up or log in

Post as a guest

Browse other questions tagged beginner ruby csv or ask your own question.

Related

Hot Network Questions