I am trying to write a bash script. In a directory I have 2 fastq files:

A-122-3.BH7WBVADXX.lane_1_P1_I24.hg19.sequence.fastq

A-122-3.BH7WBVADXX.lane_1_P2_I24.hg19.sequence.fastq

I just want to loop over the P1 files, with something like this:

for f in *_P1*
do
   SOMETHING
done

Now, in the SOMETHING part I want to define some variables that I will use later in my code. I need to extract these values from the filename string: A-122-3.BH7WBVADXX.lane_1_P1_I24.hg19.sequence.fastq

I need ID = A-122-3-BH7WBVADXX-1

I need PU = BH7WBVADXX

I need LB = A-122-3

Then I will solve it further.

NOTE: The filenames are not all the same length. The A-122-3 part varies between samples, and the I24 part varies too. Thanks.

1 Answer (accepted)

Assuming that every filename you are processing has the same length and that each substring has the same length, you can split at fixed character offsets. Also, I'm not sure where the -1 part of the ID comes from, so I assume you get it from lane_1.

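for file in *_P1*
do
  # Offsets match the sample name only:
  # chars 0-6 are LB, chars 8-17 are PU, char 24 is the lane number.
  lb=${file:0:7}
  pu=${file:8:10}
  id=$lb-$pu-${file:24:1}

  echo "id=$id pu=$pu lb=$lb"
done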
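
To illustrate why fixed offsets only fit names shaped exactly like the sample, here is a small sketch of ${var:offset:length} applied to two names. The fixed_pu helper and the second, longer filename are made up for the illustration and are not from the question.

```shell
#!/usr/bin/env bash
# ${var:offset:length} counts characters from 0, so a fixed offset is only
# right when the field starts at the same position in every filename.
sample='A-122-3.BH7WBVADXX.lane_1_P1_I24.hg19.sequence.fastq'
longer='AB-1220-31.BH7WBVADXX.lane_1_P1_I7.hg19.sequence.fastq'  # hypothetical name

# fixed_pu is a hypothetical helper: it cuts 10 characters starting at offset 8.
fixed_pu() {
  local file=$1
  printf '%s\n' "${file:8:10}"
}

fixed_pu "$sample"   # prints BH7WBVADXX
fixed_pu "$longer"   # prints the wrong slice, because PU now starts at offset 11
```

Substring expansion is a bash feature, so this needs bash rather than a plain POSIX sh.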

Update

This should work for names of varying length, provided the dots, the .lane_ marker and the _P marker remain consistent:

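for file in *_P1*
do
  # LB: everything before the first dot
  lb=${file%%.*}

  # PU: drop ".lane_" and what follows, then drop LB and its dot
  pu=${file%%.lane_*}
  pu=${pu#*.}

  # Lane number: drop "_P" and what follows, then keep the text after the last "_"
  num=${file%%_P*}
  num=${num##*_}

  id="$lb-$pu-$num"

  echo "id=$id pu=$pu lb=$lb"
done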
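
If you want to reuse this, the same expansions can be wrapped in a function that first checks the name has the expected shape. This is only a sketch: parse_fastq_name is a hypothetical helper name, the space-separated "ID PU LB" output format is my own choice, and the guard pattern is deliberately loose (it would not notice a dot inside the LB part, for example).

```shell
#!/usr/bin/env bash
# parse_fastq_name is a hypothetical helper, not part of the original answer.
# It prints "ID PU LB" for names shaped like LB.PU.lane_N_P..., and
# returns 1 for anything that lacks those markers.
parse_fastq_name() {
  local file=$1 lb pu num

  # Guard: require the dots and underscores the expansions rely on.
  case $file in
    *.*.lane_*_P*) ;;
    *) echo "unexpected filename: $file" >&2; return 1 ;;
  esac

  lb=${file%%.*}        # text before the first dot
  pu=${file%%.lane_*}   # drop ".lane_" and everything after it
  pu=${pu#*.}           # drop the LB part and its dot
  num=${file%%_P*}      # drop "_P" and everything after it
  num=${num##*_}        # keep what follows the last remaining underscore

  printf '%s-%s-%s %s %s\n' "$lb" "$pu" "$num" "$pu" "$lb"
}

parse_fastq_name 'A-122-3.BH7WBVADXX.lane_1_P1_I24.hg19.sequence.fastq'
# prints: A-122-3-BH7WBVADXX-1 BH7WBVADXX A-122-3
```

Inside the loop you could call it as fields=$(parse_fastq_name "$file") || continue, so that files with an unexpected name are skipped instead of producing bogus values.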
Hi, I'm afraid the filenames do not all have the same length. I should have mentioned that before; fixing the question now. You are correct, the 1 comes from lane_1 –  user3138373 Apr 8 '14 at 20:48
@user3138373, updated. –  Graeme Apr 8 '14 at 21:13
