2

I need to get the child node data values of nodes with a given name in an XML file, using a Perl script. I am using XML::LibXML::Simple.

A code snippet is shown below:

my $booklist = XMLin(path);

foreach my $book (@{$booklist->{detail}}) {
    print $book->{name} . "\n";
}

And the XML file looks like the following:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
<book>
<detail label='label1' status='active' type='none'>
<name>book1</name>
</detail >
<detail label='label2' status='active' type='none'>
<name>book2</name>
</detail >
</book>
</booklist>

When I run the above code, I get the following error message: "Not an ARRAY reference"

Can anyone please help me?

  • Do you want the text book1, book2 as output? Commented Jun 27, 2013 at 7:20
  • Can you please explain what output you want? Commented Jun 27, 2013 at 7:23
  • Yes, I want the book1 and book2 text as output. Commented Jun 27, 2013 at 7:25

5 Answers

2

Below is a solution using XML::Simple and the same XMLin call as in the question.

use strict;
use warnings;
use XML::Simple;

my $booklist = XMLin($ARGV[0], KeyAttr => [], ForceArray => qr/detail/);

foreach my $book (@{$booklist->{book}->{detail}}) {
    print $book->{name} . "\n";
}

The important part is the options passed to XMLin: ForceArray forces the "detail" subnodes to be represented as an array, and KeyAttr => [] turns off the default folding of elements into a hash keyed by their name.
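
To see why both options matter, here is a minimal sketch. The inline XML string and the helper name detail_names are assumptions made for illustration, not part of the answer above. With only one detail element, XML::Simple would normally hand back a hash rather than an array; ForceArray keeps the loop working either way.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Illustrative input with a single <detail>; XMLin also accepts an XML string.
my $xml = <<'XML';
<booklist>
  <book>
    <detail label='label1' status='active' type='none'>
      <name>book1</name>
    </detail>
  </book>
</booklist>
XML

# Hypothetical helper: return the <name> text of every <detail>.
sub detail_names {
    my ($source) = @_;
    my $booklist = XMLin(
        $source,
        KeyAttr    => [],           # no folding into a hash keyed by name
        ForceArray => ['detail'],   # detail is always an array reference
    );
    return map { $_->{name} } @{ $booklist->{book}{detail} };
}

print "$_\n" for detail_names($xml);
```

Passing a file name instead of the string should behave the same way, since XMLin treats its first argument as XML text only when it looks like markup.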

A good quick start for XML::Simple is the documentation on CPAN: http://metacpan.org/pod/XML::Simple

  • Well done. But linking anyone to the XML::Simple docs is a mistake--they are pathetic.
    – 7stud
    Commented Jun 27, 2013 at 8:08
1

I think something like this would work, using XML::Twig:

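use strict;
use warnings;
use XML::Twig;

my $text = join '', <DATA>;
my $story_file = XML::Twig->new(
    twig_handlers   => { 'name' => \&name },   # call name() for every <name> element
    keep_atts_order => 1,                      # a twig option, not a handler; needs Tie::IxHash
    pretty_print    => 'indented',
);
$story_file->parse($text);

# Handler: receives the twig and the <name> element that was just parsed.
sub name {
    my ($story_file, $name) = @_;
    print $name->text, "\n";
}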

__END__
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
<book>
<detail label='label1' status='active' type='none'>
<name>book1</name>
</detail >
<detail label='label2' status='active' type='none'>
<name>book2</name>
</detail >
</book>
</booklist>
  • I tried your code, and when I tried to install the XML::Twig package using the ppm command I got the following error message: "Downloading XML-Twig-3.32...redirect Downloading XML-Twig-3.32...failed 401 Authorization Required ppm install failed: 401 Authorization Required" Commented Jun 27, 2013 at 7:50
  • why keep_atts_order? It's not needed here.
    – mirod
    Commented Jun 27, 2013 at 7:56
  • XML::Twig seems to be available only to paying customers for a lot of Perl versions, see code.activestate.com/ppm/XML-Twig. Using the repositories listed in PPM::Repositories (metacpan.org/module/JDB/PPM-Repositories-0.19/Repositories.pm) I see it in bribes.org/perl/ppm . I am not familiar with ActiveState Perl, but you should be able to add new repositories to it so you can get XML::Twig.
    – mirod
    Commented Jun 27, 2013 at 8:04
1

From the XML::Simple docs:

The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.

The major problems with this module are the large number of options and the arbitrary ways in which these options interact - often with unexpected results.

Anyway.

In your code, you are skimming over the fact that the booklist contains books which contain details. The booklist has no immediate details. Here is a short solution using XML::LibXML:

use strict; use warnings; use 5.010; use XML::LibXML;

my $dom = XML::LibXML->load_xml(IO => \*DATA) or die "Can't load";

for my $detail ($dom->findnodes('/booklist/book/detail')) {
    say $detail->findvalue('./name');
}

__DATA__
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
    <detail label='label1' status='active' type='none'>
      <name>book1</name>
    </detail >
    <detail label='label2' status='active' type='none'>
      <name>book2</name>
    </detail >
  </book>
</booklist>

As you can see in the XPath expression /booklist/book/detail, we first have to look into the book before finding the details. Of course, this could be shortened to //detail, which matches detail elements anywhere in the document.

In general, if a data structure isn't what it seems, you should dump it, e.g.
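
As a rough sketch of the two XPath forms side by side (the inline XML string and the helper name texts_for are assumptions for illustration), both expressions below select the same name nodes in this document:

```perl
use strict;
use warnings;
use XML::LibXML;

# Illustrative copy of the sample document as a string.
my $xml = <<'XML';
<booklist>
  <book>
    <detail label='label1' status='active' type='none'>
      <name>book1</name>
    </detail>
    <detail label='label2' status='active' type='none'>
      <name>book2</name>
    </detail>
  </book>
</booklist>
XML

my $dom = XML::LibXML->load_xml(string => $xml);

# Hypothetical helper: text content of every node matched by an XPath expression.
sub texts_for {
    my ($xpath) = @_;
    return map { $_->textContent } $dom->findnodes($xpath);
}

# Full path from the root element.
print join(', ', texts_for('/booklist/book/detail/name')), "\n";
# Shorter form: any <detail> anywhere in the document.
print join(', ', texts_for('//detail/name')), "\n";
```

The full path is stricter: it would skip a detail element that appears somewhere other than directly under a book, whereas //detail would still match it.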

use Data::Dumper;
print Dumper $booklist;

This would output:

$VAR1 = {
  'book' => {
    'detail' => {
      'book2' => {
        'status' => 'active',
        'type' => 'none',
        'label' => 'label2'
      },
      'book1' => {
        'status' => 'active',
        'type' => 'none',
        'label' => 'label1'
      }
    }
  }
};

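So the book1 and book2 strings are now keys in a nested hash. This happens because, by default, XML::Simple folds elements that carry a name, key or id into a hash keyed by that value (the KeyAttr option). Do yourself a favour, and stop using the most complicated XML module on CPAN, the “XML::Simple”.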

  • I updated my code as above and ran the script, and got the error message "Can't locate object method "load_xml" via package "XML::LibXML" (perhaps you forgot to load "XML::LibXML"?)". I then used the ppm command to install the package (ppm install XML::LibXML) and got the error message "ppm install failed: Can't find any package that provide XML::LibXML". Commented Jun 27, 2013 at 7:47
  • There is a XML::LibXML package in the PPM repos. Therefore, it should work. Are you sure you've copied the whole code snippet, incl. use XML::LibXML? It ran under XML::LibXML v2.0018, perl5 v16.3 for me (although I don't use active perl)
    – amon
    Commented Jun 27, 2013 at 7:57
1

When you write:

@{ $booklist->{detail} }

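...you are saying that $booklist->{detail} holds an array reference, and that you want Perl to dereference it into an array (that is what the '@' does). If the value is a hash reference instead, Perl dies with "Not an ARRAY reference".

Don't use <name> as a tag. With its default settings, XML::Simple treats name (along with key and id) as a key for folding a list of elements into a hash, so the parsed structure is not what you would expect. Here is an example: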
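
One way to guard such a dereference is to check what kind of reference you actually have. The following is only a sketch: the helper as_list and the two sample structures are made up for illustration and are not part of XML::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a value that may be undef, a single item,
# or an array reference into a plain list.
sub as_list {
    my ($value) = @_;
    return ()        unless defined $value;
    return @{$value} if ref $value eq 'ARRAY';
    return ($value);
}

# Two shapes a parser might hand back for the same element.
my $one  = { detail => { name => 'book1' } };
my $many = { detail => [ { name => 'book1' }, { name => 'book2' } ] };

for my $data ($one, $many) {
    # Safe in both cases: no "Not an ARRAY reference" error.
    print join(',', map { $_->{name} } as_list($data->{detail})), "\n";
}
```

This only papers over the single-versus-many difference; it does not undo the folding by name described next.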

1)

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
      <bname>book1</bname>
  </book>
  <book>
      <bname>book2</bname>
  </book>
</booklist>

use strict;   
use warnings;   
use 5.016;  

use XML::Simple;
use Data::Dumper;



my $booklist = XMLin('xml.xml');
print Dumper($booklist);


--output:--

$VAR1 = {
          'book' => [
                    {
                      'bname' => 'book1'
                    },
                    {
                      'bname' => 'book2'
                    }
                  ]
        };

2) Now look at what happens when you use a <name> tag:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
      <name>book1</name>
  </book>
  <book>
      <name>book2</name>
  </book>
</booklist>

--output:--
$VAR1 = {
          'book' => {
                    'book2' => {},
                    'book1' => {}
                  }
        };

So with your original example, with <name> renamed to <bname>:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>

    <detail label='label1' status='active' type='none'>
      <bname>book1</bname>
    </detail>

    <detail label='label2' status='active' type='none'>
      <bname>book2</bname>
    </detail>

  </book>
</booklist>


--output:--
$VAR1 = {
          'book' => {
                    'detail' => [
                                {
                                  'bname' => 'book1',
                                  'status' => 'active',
                                  'label' => 'label1',
                                  'type' => 'none'
                                },
                                {
                                  'bname' => 'book2',
                                  'status' => 'active',
                                  'label' => 'label2',
                                  'type' => 'none'
                                }
                              ]
                  }
        };

And to get all the bname tags, you can do this:

use strict;   
use warnings;   
use 5.016;  

use XML::Simple;
use Data::Dumper;

my $booklist = XMLin('xml.xml');
my $aref = $booklist->{book}{detail};

for my $href (@$aref) {
    say $href->{bname};
}


--output:--
book1
book2
  • Sorry, I had updated my code incorrectly. Can you please check it now? Commented Jun 27, 2013 at 7:11
  • $booklist->{detail} returns more than one value, so I store them in an array and print them. I'm not sure about this, though; if you have another idea, please share it. Commented Jun 27, 2013 at 7:20
  • Is there any other way to get the name values? Commented Jun 27, 2013 at 7:21
0

Yet another way using XML::Rules (assuming the point is to get stuff in 'detail' rather than just print content of 'name'):

use XML::Rules;
my @rules = (
  detail => sub {
    print "$_[1]{name}\n";
    return;
  },
  name => 'content',
  _default => undef,
);

my $xr = XML::Rules->new(rules => \@rules);
$xr->parsefile("tmp.xml");
