I am new in java. I am getting java Stack overflow Exception in regex strHindiText. What should I do for that?
try {
// This regex convert the pattern "{\fldrslt {\fcs1 \ab\af24 \fcs0 ऩ}{"
// into "{\fldrslt {\fcs1 \ab\af24 \fcs0 ऩ}}}{"
// strHindiText = strHindiText.replaceAll("\\{(\\\\fldrslt[ ])\\{((\\\\\\S+[ ])+)((\\s*&#\\d+;\\s*(-|,|/|\\(|\\)|\"|;|\\.|'|<|>|:|\\?)*)+)\\}\\{","{$1{$2$4}}}{");
// This regex convert the pattern "{\fcs0 \af0 ऩ{ or {\fcs0 \af0 *\tab ऩ{"
// into "{\fcs0 \af0 ऩ }{"
strHindiText = strHindiText.replaceAll("\\{\\s*((\\\\\\S+[ ](\\*)?)+\\s*)(-|,|/|\\(|\\)|\"|;|\\.|'|<|>|:|\\?)*[ ]*(((&#\\d+;)[ ]*(-|,|/|\\(|\\)|\"|;|\\.|'|<|>|:|\\?)*[ ]*)+)\\{", "{$1 $4$5 }{");
// This regex convert the pattern "{ऩ \fcs0 \af0 {"
// into "{ऩ \fcs0 \af0 }{"
strHindiText = strHindiText.replaceAll("\\{\\s*(((&#\\d+;)[ ]*(-|,|/|\\(|\\)|\"|;|\\.|'|<|>|:|\\?)*[ ]*)+)[ ]*((\\\\\\S+[ ])+)\\{", "{$1 $5 }{");
} catch(StackOverflowError er) {
System.out.println("Third try Block StackOverflowError in regex pattern to reform the rtf tags................");
er.printStackTrace();
// throw er;
}
Whenever these strHindiText contain large data it gives an java stackoverflow exception:
java.lang.StackOverflowError
2013-08-08 15:35:07,743 ERROR [STDERR] (http-127.0.0.1-80-9) at java.util.regex.Pattern$Curly.match0(Pattern.java:3754)
2013-08-08 15:35:07,743 ERROR [STDERR] (http-127.0.0.1-80-9) at java.util.regex.Pattern$Curly.match(Pattern.java:3744)
2013-08-08 15:35:07,744 ERROR [STDERR] (http-127.0.0.1-80-9) at java.util.regex.Pattern$GroupTail.match(Pattern.java:4227)
2013-08-08 15:35:07,744 ERROR [STDERR] (http-127.0.0.1-80-9) at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3366)
2013-08-08 15:35:07,745 ERROR [STDERR] (http-127.0.0.1-80-9) at java.util.regex.Pattern$Curly.match0(Pattern.java:3782)
2013-08-08 15:35:07,745 ERROR [STDERR] (http-127.0.0.1-80-9) at java.util.regex.Pattern$Curly.match(Pattern.java:3744)
My strHindiText data is:
`{\rtlch\fcs1 \af1\afs18 \ltrch\fcs0 \f1\fs18\cf21\insrsid13505584 भोपाल  । \par }\pard\plain \ltrpar\s16\ql \li0\ri0\sb100\sa100\sbauto1\saauto1\sl240\slmult0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid13505584 \cbpat20 \rtlch\fcs1 \af0\afs24\alang1025 \ltrch\fcs0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\rtlch\fcs1 \ab\af1\afs18 \ltrch\fcs0 \cs21\b\f1\fs18\cf21\insrsid13505584 अन्वेषण करें  :}{\rtlch\fcs1 \af1\afs18 \ltrch\fcs0 \f1\fs18\cf21\insrsid13505584 \par भोपाल , मध्य प्रदेश की राजधानी प्राकृतिक सुंद`
|
are probably causing recursive calls, resulting in the stackoverflow. Regex stuff is complicated in general, and your regex is big. I'm not surprised. – keyser Aug 8 '13 at 10:02a|b|c
) to use the alternative notation:[abc]
, this should make the regex clearer, and you just need to escape the closing bracket and no other character. Also, it looks like you want to do something that regexes aren't good for - parsing - for something that isn't text but has a higher ordering. – Tassos Bassoukos Aug 8 '13 at 10:33RegEx
for such enormous parsings.. it's not very performant, since the regex expression compiles every time you try to match a string. – GGrec Aug 8 '13 at 11:13