dd: multiple input files

Question

I need to concatenate chunks from two files.

If I needed to concatenate the whole files, I could simply do:

cat file1 file2 > output

But I need to skip the first 1 MB of the first file, and I only want 10 MB from the second file. Sounds like a job for dd:

dd if=file1 bs=1M count=99 skip=1 of=temp1
dd if=file2 bs=1M count=10 of=temp2
cat temp1 temp2 > final_output

Is it possible to do this in one step, i.e., without saving the intermediate results? Can I use multiple input files in dd?

meuh · Answer 1 · 2016-05-02 11:36:39Z

dd can write to stdout too.

( dd if=file1 bs=1M count=99 skip=1
  dd if=file2 bs=1M count=10  ) > final_output
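
To check this on throwaway data before running it on large files, here is a scaled-down sketch. It assumes a scratch directory from mktemp -d, 1 KiB blocks instead of 1 MiB, and made-up file contents; the fill helper is hypothetical and not part of the answer.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# fill N CHAR: write N KiB of CHAR to stdout (hypothetical helper)
fill() { dd if=/dev/zero bs=1024 count="$1" 2>/dev/null | tr '\0' "$2"; }

{ fill 1 A; fill 3 B; } > file1   # 1 KiB to skip, then 3 KiB to keep
{ fill 2 C; fill 2 D; } > file2   # 2 KiB to keep, then 2 KiB to ignore

# Same shape as the answer: both dd processes inherit the one redirected stdout.
( dd if=file1 bs=1024 count=3 skip=1
  dd if=file2 bs=1024 count=2 ) > final_output 2>/dev/null

wc -c < final_output                # size of the result in bytes
tr -d 'BC' < final_output | wc -c   # bytes that are neither B nor C
```

The first number should be 5120 (3 KiB from file1 plus 2 KiB from file2) and the second 0, meaning none of the skipped A bytes or the trailing D bytes made it into the output.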

This is probably the best way. The output file isn't closed/reopened (like it is with oflag=append conv=notrunc), so filesystems that do delayed allocation (like XFS) are least likely to decide the file is done being written when there's still more to go. – Peter Cordes yesterday

@PeterCordes that's a good point, but as long as dd isn't asked to sync, delayed allocation shouldn't kick in immediately anyway (unless memory is tight in which case neither method will postpone allocation). – Stephen Kitt yesterday

@StephenKitt: You're probably right. I was thinking of XFS's speculative preallocation, where it does need to specially detect the close/reopen access pattern (sometimes seen for log files). – Peter Cordes yesterday

In shells like bash and mksh that don't optimize out the fork for the last command in a subshell, you can make it slightly more efficient by replacing the subshell with a command group. For other shells, it shouldn't matter, and the subshell approach might even be slightly more efficient as the shell doesn't need to save and restore stdout. – Stéphane Chazelas yesterday

Stephen Kitt · Answer 2 · 2016-05-02 08:18:19Z

I don't think you can easily read multiple files in a single dd invocation, but you can append to build the output file in several steps:

dd if=file1 bs=1M count=99 skip=1 of=final_output
dd if=file2 bs=1M count=10 of=final_output oflag=append conv=notrunc

You need to specify both conv=notrunc and oflag=append. The first avoids truncating the output, the second starts writing from the end of the existing file.
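
A tiny sketch of why both flags matter, assuming GNU dd (oflag=append is not in POSIX dd) and hypothetical four-byte and two-byte input files in a scratch directory:

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'AAAA' > part1
printf 'BB'   > part2

# Both flags: the second dd keeps the existing data and writes at the end.
dd if=part1 of=out bs=1 2>/dev/null
dd if=part2 of=out bs=1 oflag=append conv=notrunc 2>/dev/null
cat out; echo     # AAAABB

# Neither flag: the second dd truncates the file before writing.
dd if=part1 of=out2 bs=1 2>/dev/null
dd if=part2 of=out2 bs=1 2>/dev/null
cat out2; echo    # BB

# conv=notrunc alone: nothing is truncated, but writing starts at offset 0.
dd if=part1 of=out3 bs=1 2>/dev/null
dd if=part2 of=out3 bs=1 conv=notrunc 2>/dev/null
cat out3; echo    # BBAA
```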

Stéphane Chazelas · Answer 3 · 2016-05-02 16:04:44Z

Bear in mind that dd is a raw interface to the read(), write() and lseek() system calls. You can only use it reliably to extract chunks of data from regular files, block devices and some character devices (like /dev/urandom), that is, files for which read(fd, buf, size) is guaranteed to return size as long as the end of the file is not reached.

For pipes, sockets and most character devices (like ttys), you have no such guarantee unless you do read()s of size 1, or use the GNU dd extension iflag=fullblock.
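
The short-read behaviour on pipes can be seen with a small sketch. It assumes a dd that supports iflag=fullblock (GNU dd) and uses a hypothetical slow writer that delivers two bytes one second apart; the result of the first case depends on timing, so treat it as typical rather than guaranteed.

```shell
# Hypothetical slow writer: two bytes, delivered one second apart.
slow() { printf a; sleep 1; printf b; }

# Plain dd: its single read() returns as soon as the first byte is available,
# and count=1 stops dd after that one short block.
# The trailing cat drains the leftover byte so the writer is not killed by SIGPIPE.
plain=$(slow | { dd bs=2 count=1 2>/dev/null; cat >/dev/null; } | wc -c)

# iflag=fullblock: dd keeps calling read() until the 2-byte block is full.
full=$(slow | { dd bs=2 count=1 iflag=fullblock 2>/dev/null; cat >/dev/null; } | wc -c)

echo "plain dd copied $plain byte(s), fullblock copied $full"
```

Plain dd typically reports 1 byte here even though a 2-byte block was requested, while the fullblock run reports 2.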

So either:

{
  gdd < file1 bs=1M iflag=fullblock count=99 skip=1
  gdd < file2 bs=1M iflag=fullblock count=10
} > final_output

Or:

M=1048576
{
  dd < file1 bs=1 count="$((99*M))" skip="$M"
  dd < file2 bs=1 count="$((10*M))"
} > final_output

Or with shells with builtin support for a seek operator like ksh93:

M=1048576
{
  command /opt/ast/bin/head -c "$((99*M))" < file1 <#((M))
  command /opt/ast/bin/head -c "$((10*M))" < file2
} > final_output

Or zsh (assuming your head supports the -c option here):

zmodload zsh/system &&
{
  sysseek 1048576 && head -c 99M &&
  head -c 10M < file2
} < file1 > final_output

Do you really need the quotes? Won't the result always be an integer? – Steven Penny yesterday

@StevenPenny, leaving the expansion unquoted asks the shell to split+glob it, which wouldn't make any sense here. The splitting is done on the current value of $IFS, irrespective of the content of the variable/expansion. See also "Security implications of forgetting to quote a variable in bash/POSIX shells". – Stéphane Chazelas 17 hours ago
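
A small sketch of that point for bash or a POSIX sh (zsh does not split unquoted expansions by default). IFS=0 is a contrived assumption standing in for "some earlier code changed IFS"; the variable names are hypothetical.

```shell
M=1048576

# Each test runs in a command substitution so the odd IFS does not leak out.
unquoted=$(IFS=0; set -- $((10*M)); echo "$#")    # 10485760 is split at each 0
quoted=$(IFS=0; set -- "$((10*M))"; echo "$#")    # quoting keeps it as one word

echo "unquoted: $unquoted word(s), quoted: $quoted word(s)"
```

In bash or dash this should report 2 words for the unquoted case (1 and 48576) and 1 for the quoted case, so an unquoted count= argument would reach dd as count=1 followed by a stray 48576.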

@Stéphane Chazelas - in the first example, you are using gdd instead of dd. Is that a typo, or is that intentional? – Martin Vegter 16 hours ago

agc · Answer 4 · 2016-05-02 15:47:24Z

With a bashism, and a functionally "useless use of cat", but closest to the syntax the OP uses:

cat <(dd if=file1 bs=1M count=99 skip=1) \
    <(dd if=file2 bs=1M count=10) \
   > final_output

(That being said, Stephen Kitt's answer seems to be the most efficient possible method.)

answered by agc, edited by Stephen Kitt

Strictly speaking, <(...) is a kshism which both zsh and bash copied. – Stéphane Chazelas yesterday
