
I am combining all my JavaScript into one neat file in order to reduce HTTP requests! I'm stuck removing the comments /* comments */ and // comments. My level is far below minification or parsing stuff. I know how to make macaroni strings. Anything more complex than that you will not find in my computer or kitchen, so:

QUESTION

While combining everything into one file, I want to remove all comments.
What is the correct regex for this?

<?php
header('Content-type: text/javascript');    
$offset = 60 * 60 * 24; // Cache for a day
header ('Cache-Control: max-age=' . $offset . ', must-revalidate');
header ('Expires: ' . gmdate ("D, d M Y H:i:s", time() + $offset) . ' GMT');

ob_start("compress");
function compress($buffer) {

# NOT SURE, not all new lines are removed??
# remove tabs, spaces, newlines, etc.
$buffer = str_replace(array("\r\n", "\r", "\t", '  ', '    '), '', $buffer);  

# WORKS !!!
# remove comments / XXXXXXX
$buffer = preg_replace('(// .+)', '', $buffer);

######################################################################## 
# !! STUCK HERE !! OUTPUT FILE LOOKS OK BUT WEBSITE DOESNT LOAD OK IF THIS IS ON
# remove comments / * XXX  (enters etc) XXXX  * /
# $buffer = preg_replace('#/\*(?:.(?!/)|[^\*](?=/)|(?<!\*)/)*\*/#s', '', $buffer);
########################################################################        

return $buffer;
}

include('../file1.js');
include('../file2.js');  
ob_end_flush();
?>

It would be great if it would catch and delete the following:

/* XXXX */

/* 
  XXXX
  XXXX
*/

That's all! I can't get it to work no matter what regex I use, even with this incredible tool, where I found the right match to be:

RegExp: /\/\*(?:.(?!/)|[^\*](?=/)|(?<!\*)/)*\*\//gs
pattern: \/\*(?:.(?!/)|[^\*](?=/)|(?<!\*)/)*\*\/
flags: gs

http://gskinner.com/RegExr/


2 Answers

up vote 4 down vote accepted

Regular expressions aren't a reliable or efficient way to remove JavaScript comments. You need a string parser and minifier. See http://razorsharpcode.blogspot.com/2010/02/lightweight-javascript-and-css.html
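To give an idea of what a "string parser" means here, below is a minimal sketch of a character-by-character scanner. It is not taken from the linked article; the name stripComments is made up for illustration, and the sketch assumes the input only needs single- and double-quoted strings protected. Regex literals and template literals are not recognised, so a real minifier is still the safer choice.

```javascript
// Hypothetical helper: removes /* */ and // comments while copying
// quoted string literals through untouched.
// Known gap: regex literals and template literals are NOT recognised.
function stripComments(source) {
  let out = "";
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];

    if (ch === '"' || ch === "'") {
      // Copy a quoted string verbatim, honouring backslash escapes.
      const quote = ch;
      let j = i + 1;
      while (j < source.length && source[j] !== quote) {
        if (source[j] === "\\") j++; // skip the escaped character
        j++;
      }
      out += source.slice(i, j + 1);
      i = j + 1;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip to the closing */ (or to the end if unterminated)
      // and leave one space so neighbouring tokens do not merge.
      const end = source.indexOf("*/", i + 2);
      out += " ";
      i = end === -1 ? source.length : end + 2;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip up to, but not including, the newline.
      while (i < source.length && source[i] !== "\n") i++;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

console.log(stripComments("x/* one\n two */y"));   // x y
console.log(stripComments("url += '//' + host;")); // url += '//' + host;
```

Because the scanner knows when it is inside a string, a '//' inside quotes is left alone, which a plain pattern match cannot guarantee.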

If you insist on regex patterns, think about how you would parse this simple code that contains no JavaScript comments at all:

var regex=/(ftp|http|https):\/\//; alert('hello, world'); return regex;

Notice the double slash before alert(). A naive parser that relies on regular expressions will treat valid JavaScript code as a comment!
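The failure is easy to reproduce. The sketch below runs a deliberately naive pattern (an assumption for illustration, not the pattern from the question) over the snippet above and shows that everything from the end of the regex literal onward is thrown away:

```javascript
// The one-line snippet from above; it contains no comments at all.
const code = String.raw`var regex=/(ftp|http|https):\/\//; alert('hello, world'); return regex;`;

// Hypothetical naive pattern: treat every "//" as the start of a line comment.
const naiveLineComment = /\/\/.*/g;

const stripped = code.replace(naiveLineComment, "");
console.log(stripped);
// var regex=/(ftp|http|https):\/\
```

The last escaped slash and the closing delimiter of the regex literal sit next to each other, so the pattern sees "//" and deletes the alert() call and the return statement along with it.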

I second this suggestion; @Sam: If you're trying to optimize your responses, you should really consider using minifiers in place of custom comment removers. Google Closure Compiler, YUI Compressor or Uglify JS should be your go-to tools. – Krof Drakula Dec 10 '10 at 7:31
Thanks folks, but OMG that looks scary difficult! My main goal was to reduce HTTP requests. While at it, I thought why not remove the comments. That's really OK for me for now, for real. What I value now is not the most perfect/fastest, but something I understand and can grow with – Sam Dec 10 '10 at 8:40
@Sam: Usually I am a major advocate of regexp, but comments are overly complex to remove from javascript, definitely a job for a minimizing parser. – Orbling Dec 10 '10 at 15:03
Thanks all, I've changed my question! – Sam Dec 10 '10 at 17:16

I hit a scenario where I load JavaScript code over the network via XHR and need to clean the code before eval'ing it. I'm using this code:

var cleanCode = function cleanCode(input) {
  // Strip // comments, then leftover newlines and carriage returns:
  return input.replace(/\/\/[^"'].*?\n/g, '') /* //comments, but ignore a "//" if there is a quote after it */
              .replace(/\n/g, '')             /* Leftover newlines */
              .replace(/\r/g, '');            /* Carriage returns  */
};

I tested using this and the only snag I had was a comment like:

//'key': 'value'

So I had to change that code to the inline comment:

/*'key': 'value'*/ //LEAVE as inline comment, please

This code doesn't get altered:

url += '//' + ...

Feel free to edit/suggest edits to make this more readable.
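For anyone who wants to see the behaviour described above, here is a small check. The function is copied from this answer; the inputs and the variable name host are made up for illustration (the original line is elided with "...").

```javascript
// cleanCode as posted in this answer.
var cleanCode = function cleanCode(input) {
  return input.replace(/\/\/[^"'].*?\n/g, '')
              .replace(/\n/g, '')
              .replace(/\r/g, '');
};

// An ordinary trailing comment is removed.
var plain = cleanCode("a = 1; // note\nb = 2;\n");
console.log(plain);      // a = 1; b = 2;

// A '//' directly followed by a quote is left alone (hypothetical variable host).
var urlLine = cleanCode("url += '//' + host;\n");
console.log(urlLine);    // url += '//' + host;

// The snag: a comment starting with a quote survives, and once the newline
// is stripped it swallows the code that used to be on the next line.
var quoted = cleanCode("//'key': 'value'\nb = 2;\n");
console.log(quoted);     // //'key': 'value'b = 2;

// A "//" inside a string is treated as a comment and the string is cut off.
var inString = cleanCode('s = "1 // 2";\nb = 2;\n');
console.log(inString);   // s = "1 b = 2;
```

The last two cases are why this only works on input whose formatting you control.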

This method is highly unreliable, because it relies on the assumption that // is always the beginning of a comment. Examples of failure: "Python's division: 1 // 2" and /* foo(); // some comment */ – Rob W Dec 8 '12 at 8:48
Yes, this code is extremely fragile/non-robust/unreliable. You need to have well formatted javascript. The worst case is when you have a // style comment start with a ' or ", like: //"you dope" – Devin G Rhode Dec 8 '12 at 10:08
