2

I have several XML files named TC_Circle1, TC_Circle2, TC_Point1, etc. in a directory, and I want to use a script to update the start and stop dates in each file. The start and stop dates are inside <start> and <stop> tags in each file.

I had a script that worked when we were using Sun machines but it is not working on the new HP Linux machines. It doesn't show any errors and doesn't change the dates. I need help getting it to work in Linux. The script:

#!/usr/local/bin/perl
$numArgs = @ARGV;
if ($numArgs != 2) 
{
print "Usage: replace_default_date.pl DEFAULT_START_DATE DEFAULT_STOP_DATE\n";
}

@filenames = `ls TC*`;
chomp(@filenames);
foreach $file (@filenames)
{
  open(REGFILE, "$file") || die "Cannot open |$file|";
  @lines = <REGFILE>; 
  close(REGFILE);

  open(WRITEFILE), ">$file") || die "Cannot open |$file|";

  foreach $line (@lines)
  {
    if ($line =~ /DEFAULT_START_DATE/)
    {
      $newline = "  " . $ARGV[0];
      print WRITEFILE "$newline\n";
    }
    elsif ($line =~ /DEFAULT_STOP_DATE/)
    {
      $newline = "  " . $ARGV[1];
      print WRITEFILE "$newline\n";
    }
    else 
    {
      print WRITEFILE "$line\n";
    }
  }
  close  (WRITEFILE);
}

Here's how the files to be modified look at the beginning:

<RequestSomething xmlns="http://something.com/accessservice">
   <period xmlns="">
     <start>2013-03-06T00:00:00</start>
     <stop>2013-03-07T00:00:00</stop>
   </period>
    ... The rest of the xml file...
 </RequestSomething>

Thanks in advance, Crystal

5
  • can you show us a (short!) example of the input data?
    – mirod
    Commented May 8, 2013 at 16:29
  • 1
    Very dangerous. If this program dies after starting the write, you lose the file. It should write to a temp file and then replace.
    – stark
    Commented May 8, 2013 at 16:34
  • The script is quite noobishly written, and does not treat the files as XML. It also won't run two times on the same file. Are you sure the original has 'ls TC*' with single quotes, not with backquotes, which would be required to run the string as a shell command?
    – amon
    Commented May 8, 2013 at 16:39
  • amon -- Well, I am a noob... since it seems to be a simple text change, I didn't think I needed to do anything special for an XML file. Do I? stark -- The files are copied to a new directory before this script is run so that I don't lose everything.
    Commented May 8, 2013 at 16:54
  • @user2363115, check out my answer. I think it's pretty good for what you want. I tried to explain everything, so hopefully you'll be able to follow.
    – Steve P.
    Commented May 8, 2013 at 17:04

3 Answers

1

There are several problems with your script.

1) There is a compile error because of an extra closing parenthesis:

open(WRITEFILE), ">$file") || die "Cannot open |$file|";

should be written as

open(WRITEFILE, ">$file") || die "Cannot open |$file|";

2) You should use backticks instead of single quotes in

@filenames = 'ls TC*';

otherwise the @filenames will just contain the string 'ls TC*' instead of the actual list of the filenames:

@filenames = `ls TC*`;

3) Are you sure that the path to the Perl interpreter is /usr/local/bin/perl? (Run which perl from the command line to check the path.) If it is different, the first line of the script should be changed to match.

4) The script will never work on the XML data you showed us since it is designed to replace lines that contain the strings DEFAULT_START_DATE and DEFAULT_STOP_DATE (with dates provided as arguments to the script). These strings do not appear in the data you showed us.

However, the script would work if the XML file is something like this:

<RequestSomething xmlns="http://something.com/accessservice">   
    <period xmlns="">
      <start>
          DEFAULT_START_DATE     
      </start>
      <stop>
          DEFAULT_STOP_DATE
      </stop>
    </period>
     ... The rest of the xml file...
 </RequestSomething>

I hope this will help you to get it to work, but anyway I would recommend that you rewrite the script because it uses a very unreliable and dangerous way of changing XML files.
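
One way such a rewrite could look, as a minimal sketch rather than a definitive fix. It assumes each date sits on a single line as <start>...</start> or <stop>...</stop>, as in the sample from the question, and that the files match TC* in the current directory. The helper name update_dates is made up for illustration. It uses glob instead of shelling out to ls, and it writes to a temporary file that replaces the original only after the write has succeeded:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite the <start> and <stop> values in one file.
sub update_dates {
    my ($file, $start, $stop) = @_;

    open my $in, '<', $file or die "Cannot open $file: $!";

    # The temp file is created next to the original so that rename
    # stays on the same filesystem.
    my ($out, $tmp) = tempfile("$file.XXXXXX");

    while (my $line = <$in>) {
        # Replace only the text between the tags; everything else is kept.
        $line =~ s{<start>.*?</start>}{<start>$start</start>};
        $line =~ s{<stop>.*?</stop>}{<stop>$stop</stop>};
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot write $tmp: $!";

    # Swap the finished copy into place.
    rename $tmp, $file or die "Cannot replace $file: $!";
}

# Run only when both dates are given on the command line.
if (@ARGV == 2) {
    update_dates($_, @ARGV) for glob 'TC*';
}
```

Note that File::Temp creates its files with restrictive permissions, so the replaced file may end up with different permissions than the original; adjust with chmod if that matters.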

0

Since what you want to do is relatively simple, you don't truly need to treat it as an .xml. I'm going to treat it as you were, so as to avoid confusion. For the way you're doing it, it seems as though Tie::File is a great option. For example:

test.xml:

<RequestSomething xmlns="http://something.com/accessservice">
   <period xmlns="">
     <start>2013-03-06T00:00:00</start>
     <stop>2013-03-07T00:00:00</stop>
   </period>
    ... The rest of the xml file...
 </RequestSomething>

Code:

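use strict;
use warnings;
use Tie::File;

# The new start and stop dates are expected as the two arguments.
die "Usage: $0 START_DATE STOP_DATE\n" unless @ARGV == 2;

# Each element of @ra is one line of the file; changes are written back.
my @ra;
tie @ra, 'Tie::File', "test.xml" or die "Cannot tie test.xml: $!";

for my $i (0 .. $#ra)
{
    # $1 holds the whitespace before the tag, so indentation is preserved.
    if ($ra[$i] =~ /(\s*)<start>.*<\/start>/)
    {
        $ra[$i] = "$1<start>$ARGV[0]</start>";
    }
    elsif ($ra[$i] =~ /(\s*)<stop>.*<\/stop>/)
    {
        $ra[$i] = "$1<stop>$ARGV[1]</stop>";
    }
}

# Flush pending changes and release the file.
untie @ra;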

Using Tie::File, you can open your file and use an array to access and modify its contents. The pattern (\s*)<stop>.*<\/stop> does the following: (\s*) captures all whitespace before the tag into $1, and <stop>.*<\/stop> looks for the stop tags with any run of non-newline characters between them. Once we know that we're on the correct line, we simply change that line by modifying the array, which, as I said, directly changes the file. We put the $1 in there to preserve indentation.

Here is the new test.xml after I executed perl test.pl 1am 2pm:

<RequestSomething xmlns="http://something.com/accessservice">
   <period xmlns="">
     <start>1am</start>
     <stop>2pm</stop>
   </period>
    ... The rest of the xml file...
 </RequestSomething>

You can extend this to go through all of the necessary files; just make sure that you untie the array after every file (untie @ra;) before tying it to the next one. Don't reset it with @ra=(); while it is still tied, because that would empty the file. Good luck. Hope this helps!

EDIT: see the comment about untie; you should do that, too.
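
Going through every file might look like the following sketch. It assumes the files match TC* in the current directory (as in the question) and that each date tag sits on one line; update_file is a made-up helper name. Each file is untied before the next one is tied:

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: update the dates in one file through a tied array.
sub update_file {
    my ($file, $start, $stop) = @_;

    tie my @lines, 'Tie::File', $file or die "Cannot tie $file: $!";

    for my $i (0 .. $#lines) {
        # $1 is the whitespace before the tag, kept to preserve indentation.
        if ($lines[$i] =~ /(\s*)<start>.*<\/start>/) {
            $lines[$i] = "$1<start>$start</start>";
        }
        elsif ($lines[$i] =~ /(\s*)<stop>.*<\/stop>/) {
            $lines[$i] = "$1<stop>$stop</stop>";
        }
    }

    # Flush changes and release the file before moving on to the next one.
    untie @lines;
}

# Run only when both dates are given on the command line.
if (@ARGV == 2) {
    update_file($_, @ARGV) for glob 'TC*';
}
```

Because the tied array lives inside the helper, there is nothing to reset between files.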

4
  • PS: Not sure how familiar with perl you are, but if you don't have Tie::File installed, this won't work. To install it, simply go into cmd or terminal and open the cpan shell by typing "cpan" (no quotes). Once cpan is open type: "install Tie::File" and press enter. Then your perl module will install itself.
    – Steve P.
    Commented May 8, 2013 at 17:14
  • 1
    I think it's best to always untie @array.
    – chrsblck
    Commented May 8, 2013 at 17:30
  • While the OP is installing things from cpan, now would be a good time to install a XML parser. eg. XML::LibXML or XML::Parser.
    – chrsblck
    Commented May 8, 2013 at 17:33
  • @chrsblck, yeah, I didn't tell her to do that because this is a really simple task and that may just complicate things for her. Obviously, I agree that something like that should be used if we truly want to do stuff with multiple fields in the .xml, but for her purposes, this should suffice (I made an edit about untie, thanks).
    – Steve P.
    Commented May 8, 2013 at 17:44
0

Why don't you use an XML parser? Can't you install from the CPAN on that machine?

You could use XML::Simple, if the file isn't big, or XML::Twig otherwise — though callback handlers may be tricky if you're not used to them.

I am showing you an easy way with XML::XPath.

use strict;
use warnings;
use XML::XPath;
use DateTime;

my $xp = XML::XPath->new(filename => 'input.xml');

# Set the start date to now and the stop date to one day later.
$xp->setNodeText('/RequestSomething/period/start', DateTime->now->strftime("%FT%T"));
$xp->setNodeText('/RequestSomething/period/stop', DateTime->now->add(days=>1)->strftime("%FT%T"));

# Write the modified document to a new file.
open my $fh, '>', 'output.xml' or die "Cannot open output.xml: $!";
print $fh $xp->getNodeAsXML();
close $fh;

I used DateTime to set the current date but you can of course do without it.
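
For instance, the same two timestamps can be produced with the core POSIX module instead of DateTime. This is only a sketch: the variable names are illustrative, and it uses UTC via gmtime, which is also what DateTime->now uses by default.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my $format = '%Y-%m-%dT%H:%M:%S';   # same layout as "%FT%T"
my $now    = time;

# Current time, and the same time one day (24 hours) later, both in UTC.
my $start = strftime($format, gmtime($now));
my $stop  = strftime($format, gmtime($now + 24 * 60 * 60));

print "$start\n$stop\n";
```

The two strings can then be passed to setNodeText in place of the DateTime expressions, or replaced by $ARGV[0] and $ARGV[1] if the dates should come from the command line.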
