Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I learned Bash a million years ago. I just wrote this simple script used to get the first lot of HTML comments from a file, and spit it out in order to create a README.md file.

It is just. So. Ugly. I read bits and pieces over the years, and I am sure it can be improved so much...

Here we go:

#!/bin/bash

IFS=''
active='0';
cat  hot-form-validator.html | while read "line";do

  echo $line | grep '\-\->' > /dev/null
  if [ $active = '1' -a $? = '0' ];then 
    exit 0;
  fi;

  suppress=0;
  echo $line | grep '^ *@' > /dev/null
  if [ $? = '0' ];then suppress='1'; fi;


  if [ $active = '1' -a $suppress = '0' ];then echo $line;fi;

  echo $line | grep "<!--" > /dev/null
  if [ $? = '0' ];then active='1'; fi;

done

Questions:

  • Is there a better way to do grep and then check $?? Back in the day it was the way to go, but...

  • Should active be a proper number rather than a string with a number? I know, it could be anything... but having a string that can be 0 or 1 just feels wrong.

  • Is there a better way to preserve spaces, rather than zapping IFS?

  • Any more pearls of wisdom, other than quitting my (short lived) career of bash scripter?

share|improve this question

A bug

  • If your html contains a backspace character e.g

    <p>The special character \n is a way to include read line</p>

your code interprets it a

The special character n is a way to include read line.

to avoid this, add "-r" switch to your read line

cat hot-form-validator.html | while read -r "line";do

  • You want to check if your file exists before reading it
if [ -f $filename ]; then
#do something
else
#do something
fi
share|improve this answer

A Bash loop is not the best approach. If you are doing a lot of line-by-line processing, and invoking grep a lot, then awk would be a more appropriate tool.

awk '/<!--/ { ACTIVE = 1; next }
     /-->/  { exit }
     ACTIVE { print }' < hot-form-validator.html

But line-by-line processing is not an appropriate way to parse HTML. xsltproc, for example, could do a proper job of extracting the first comment in an HTML file:

xsltproc --html first-comment.xsl hot-form-validator.html

… where first-comment.xml contains:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/"><xsl:value-of select="//comment()[1]"/></xsl:template>
</xsl:stylesheet>
share|improve this answer
    
I don't really want turn xsltproc a requirement to generate the documentation -- especially since I control the contents of the starting file. But, I am SOLD on AWK! What's the best resource to learn it? – Merc yesterday
    
But also... what about the script in the original post? What if I really wanted to improve it? – Merc yesterday
    
This tutorial seems good. Or, if you are a masochist, you could try to figure it all out from the man page. – 200_success yesterday

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.