I would like to know how I can split my data from the following format:

<datas>
 <data>
  <name>Name1</name>
 </data>
 <data>
  <name>Name2</name>
 </data>
</datas>

to the following format:

<data><name>Name1</name></data>
<data><name>Name2</name></data>

The parsed data would be passed to a Python script as follows (quoted, so that the shell does not treat < and > as redirections):

 python script.py '<data><name>Name1</name></data>'
 python script.py '<data><name>Name2</name></data>'

I have tried commands like:

echo 'cat /datas/data' | xmllint --shell file.xml

but how can I pass the output in the desired format to the Python script?

Is the formatting important, or do you just want to extract all <data> tags and the lower tags? – Kusalananda yesterday
    
Thank you for helping me with the formatting, Kusalananda. – Aryise yesterday
    
The format is important, because I am passing the formatted data to a Python script as an argument. – Aryise yesterday

It would be better if the Python script parsed the XML (using an XML parser) and extracted the bits it needed... – Kusalananda yesterday

Oh! I thought the format you mentioned was <data><name>; there can be newlines after the data tag. The Python script uses xml.etree.ElementTree. – Aryise yesterday
Accepted answer (5 votes):

I would preprocess the data with XMLStarlet:

$ xml sel -t -c '/datas/data' -nl data.xml
<data>
  <name>Name1</name>
 </data><data>
  <name>Name2</name>
 </data>

Then it depends on how your Python script wants to read this data. Hopefully, it's from a file or from standard input...
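If the Python side is free to do the parsing itself, the splitting could also be done with xml.etree.ElementTree, which the asker mentions using. The following is only a minimal sketch under that assumption: the function name compact_records is hypothetical, and it assumes the <data> elements are direct children of the root element, as in the sample input.

```python
import xml.etree.ElementTree as ET


def compact_records(xml_text, tag="data"):
    """Return each direct child <tag> of the root as a one-line XML string.

    Hypothetical helper: assumes the records are direct children of the
    root element, as in the sample <datas> document.
    """
    root = ET.fromstring(xml_text)
    records = []
    for elem in root.findall(tag):
        # Drop whitespace-only text and tails so that indentation and
        # newlines from the source file do not appear in the output.
        for node in elem.iter():
            if node.text is not None and not node.text.strip():
                node.text = None
            if node.tail is not None and not node.tail.strip():
                node.tail = None
        records.append(ET.tostring(elem, encoding="unicode"))
    return records
```

Each returned string could then be printed on its own line, or handed to whatever code needs one record at a time.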

    
Hmm, does xml sel work on Mac OS? What do I need to install to run the xml command? :) – Aryise yesterday
    
@Aryise I'm working on the command line on a MacBook Air running El Capitan. – Kusalananda yesterday
    
@Aryise I'm using XMLStarlet from NetBSD's pkgsrc package system, but I believe it's available through Homebrew as well. – Kusalananda yesterday
    
oh my.... I am using Mac OS X Yosemite. They tell me: -bash: xml: command not found :( – Aryise yesterday
    
@Aryise You will have to install it through some means. Homebrew and MacPorts have many good utilities for Mac, and both have XMLStarlet. – Kusalananda yesterday

I'd use XSLT.

The XSLT stylesheet looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/datas">
  <xsl:apply-templates select="data"/>
</xsl:template>

<xsl:template match="data">
  <data><name><xsl:value-of select="./name"/></name></data><xsl:text>&#xa;</xsl:text>
</xsl:template>

</xsl:stylesheet>

For the transformation, use the program xsltproc.

Say your input file is named in.xml and the XSLT stylesheet is named in.xsl.

Then the call is:

 xsltproc in.xsl in.xml

output:

<?xml version="1.0"?>
<data><name>Name1</name></data>
<data><name>Name2</name></data>
    
Is there a way to do this without modifying the XML file? I don't think I am allowed to modify it. I can only change the execution flow. Currently, I am using a .sh script to schedule the robot's runs. Now the robot should run in a separate instance for each <data> tag, so I can only change the script. :( – Aryise yesterday
    
You don't have to modify your input file. The XML code in my example is the stylesheet you have to provide. In my example, your data is in.xml and the stylesheet is in.xsl. Just copy the example XML code into in.xsl and it should work. – murphy yesterday
    
Oh, I see. Although it takes an additional step, it can be a backup plan if I really can't parse the XML directly into the form I want. I would need to automate copying the data from the XML to the XSL, because the XML file that is used is dynamic: for instance, today's run may use a.xml, but the next run may use b.xml or c.xml. – Aryise yesterday
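Since the stylesheet above only matches element names and never mentions a file name, the same in.xsl should be reusable for a.xml, b.xml or c.xml; only the file argument to xsltproc changes. To start one script run per output line, a small driver could look like the sketch below. This is an assumption-laden illustration, not part of the answer: run_per_record is a hypothetical name, and the command list (for example the interpreter plus script.py) is supplied by the caller.

```python
import subprocess


def run_per_record(lines, command):
    """Run `command` once per record line, passing the record as one argument.

    Hypothetical helper. `lines` is any iterable of text lines, for
    example the output of `xsltproc in.xsl a.xml`; `command` is a list
    such as [sys.executable, "script.py"].
    """
    outputs = []
    for line in lines:
        record = line.strip()
        # Skip blank lines and the XML declaration that xsltproc emits.
        if not record or record.startswith("<?xml"):
            continue
        # A list of arguments avoids the shell, so < and > in the record
        # are not interpreted as redirections.
        result = subprocess.run(
            command + [record],
            capture_output=True,
            text=True,
            check=True,
        )
        outputs.append(result.stdout)
    return outputs
```

In a real driver, the lines would come from sys.stdin or from the captured output of xsltproc, and each record arrives in the script as sys.argv[1].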
