Character.java (JDK Core, java.lang)

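The class comment in the listing that follows distinguishes methods taking a <code>char</code> from methods taking an <code>int</code> code point. A minimal sketch of that difference, assuming a JDK 5 or later runtime; the class name SurrogateDemo and its sample() helper are illustrative and not part of the original source:

```java
public class SurrogateDemo {
    // Hypothetical helper: builds a string holding the single
    // supplementary character U+2F81A (the example used in the class comment).
    static String sample() {
        return new String(Character.toChars(0x2F81A));
    }

    public static void main(String[] args) {
        String s = sample();
        // One code point, stored as two UTF-16 code units (a surrogate pair).
        System.out.println("code units:  " + s.length());
        System.out.println("code points: " + s.codePointCount(0, s.length()));
        // The char overload only sees the lone high surrogate.
        System.out.println("isLetter(char): " + Character.isLetter(s.charAt(0)));
        // The int overload sees the whole supplementary code point.
        System.out.println("isLetter(int):  " + Character.isLetter(s.codePointAt(0)));
    }
}
```

When iterating over text that may contain supplementary characters, stepping by Character.charCount(codePoint) rather than by one keeps surrogate pairs together.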
/*
 * Copyright 2002-2006 Sun Microsystems, Inc.  All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Sun designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Sun in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
 */

package java.lang;

import java.util.Map;
import java.util.HashMap;
import java.util.Locale;

/**
 * The <code>Character</code> class wraps a value of the primitive
 * type <code>char</code> in an object. An object of type
 * <code>Character</code> contains a single field whose type is
 * <code>char</code>.
 * <p>
 * In addition, this class provides several methods for determining
 * a character's category (lowercase letter, digit, etc.) and for converting
 * characters from uppercase to lowercase and vice versa.
 * <p>
 * Character information is based on the Unicode Standard, version 4.0.
 * <p>
 * The methods and data of class <code>Character</code> are defined by
 * the information in the <i>UnicodeData</i> file that is part of the
 * Unicode Character Database maintained by the Unicode
 * Consortium. This file specifies various properties including name
 * and general category for every defined Unicode code point or
 * character range.
 * <p>
 * The file and its description are available from the Unicode Consortium at:
 * <ul>
 * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
 * </ul>
 *
 * <h4><a name="unicode">Unicode Character Representations</a></h4>
 *
 * <p>The <code>char</code> data type (and therefore the value that a
 * <code>Character</code> object encapsulates) is based on the
 * original Unicode specification, which defined characters as
 * fixed-width 16-bit entities. The Unicode standard has since been
 * changed to allow for characters whose representation requires more
 * than 16 bits.  The range of legal <em>code point</em>s is now
 * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
 * (Refer to the <a
 * href="http://www.unicode.org/reports/tr27/#notation"><i>
 * definition</i></a> of the U+<i>n</i> notation in the Unicode
 * standard.)
 *
 * <p>The set of characters from U+0000 to U+FFFF is sometimes
 * referred to as the <em>Basic Multilingual Plane (BMP)</em>. <a
 * name="supplementary">Characters</a> whose code points are greater
 * than U+FFFF are called <em>supplementary character</em>s.  The Java
 * 2 platform uses the UTF-16 representation in <code>char</code>
 * arrays and in the <code>String</code> and <code>StringBuffer</code>
 * classes. In this representation, supplementary characters are
 * represented as a pair of <code>char</code> values, the first from
 * the <em>high-surrogates</em> range (&#92;uD800-&#92;uDBFF), the
 * second from the <em>low-surrogates</em> range
 * (&#92;uDC00-&#92;uDFFF).
 *
 * <p>A <code>char</code> value, therefore, represents Basic
 * Multilingual Plane (BMP) code points, including the surrogate
 * code points, or code units of the UTF-16 encoding. An
 * <code>int</code> value represents all Unicode code points,
 * including supplementary code points. The lower (least significant)
 * 21 bits of <code>int</code> are used to represent Unicode code
 * points and the upper (most significant) 11 bits must be zero.
 * Unless otherwise specified, the behavior with respect to
 * supplementary characters and surrogate <code>char</code> values is
 * as follows:
 *
 * <ul>
 * <li>The methods that only accept a <code>char</code> value cannot support
 * supplementary characters. They treat <code>char</code> values from the
 * surrogate ranges as undefined characters. For example,
 * <code>Character.isLetter('&#92;uD840')</code> returns <code>false</code>, even though
 * this specific value, if followed by any low-surrogate value in a string,
 * would represent a letter.
 *
 * <li>The methods that accept an <code>int</code> value support all
 * Unicode characters, including supplementary characters. For
 * example, <code>Character.isLetter(0x2F81A)</code> returns
 * <code>true</code> because the code point value represents a letter
 * (a CJK ideograph).
 * </ul>
 *
 * <p>In the Java SE API documentation, <em>Unicode code point</em> is
 * used for character values in the range between U+0000 and U+10FFFF,
 * and <em>Unicode code unit</em> is used for 16-bit
 * <code>char</code> values that are code units of the <em>UTF-16</em>
 * encoding. For more information on Unicode terminology, refer to the
 * <a href="http://www.unicode.org/glossary/">Unicode Glossary</a>.
 *
 * @author  Lee Boynton
 * @author  Guy Steele
 * @author  Akira Tanaka
 * @since   1.0
 */
public final class Character extends Object implements
        java.io.Serializable, Comparable<Character> {
    /**
     * The minimum radix available for conversion to and from strings.
     * The constant value of this field is the smallest value permitted
     * for the radix argument in radix-conversion methods such as the
     * <code>digit</code> method, the <code>forDigit</code>
     * method, and the <code>toString</code> method of class
     * <code>Integer</code>.
     *
     * @see     java.lang.Character#digit(char, int)
     * @see     java.lang.Character#forDigit(int, int)
     * @see     java.lang.Integer#toString(int, int)
     * @see     java.lang.Integer#valueOf(java.lang.String)
     */
    public static final int MIN_RADIX = 2;

    /**
     * The maximum radix available for conversion to and from strings.
     * The constant value of this field is the largest value permitted
     * for the radix argument in radix-conversion methods such as the
     * <code>digit</code> method, the <code>forDigit</code>
     * method, and the <code>toString</code> method of class
     * <code>Integer</code>.
     *
     * @see     java.lang.Character#digit(char, int)
     * @see     java.lang.Character#forDigit(int, int)
     * @see     java.lang.Integer#toString(int, int)
     * @see     java.lang.Integer#valueOf(java.lang.String)
     */
    public static final int MAX_RADIX = 36;

    /**
     * The constant value of this field is the smallest value of type
     * <code>char</code>, <code>'&#92;u0000'</code>.
     *
     * @since   1.0.2
     */
    public static final char MIN_VALUE = '\u0000';

    /**
     * The constant value of this field is the largest value of type
     * <code>char</code>, <code>'&#92;uFFFF'</code>.
     *
     * @since   1.0.2
     */
    public static final char MAX_VALUE = '\uffff';

    /**
     * The <code>Class</code> instance representing the primitive type
     * <code>char</code>.
     *
     * @since   1.1
     */
    public static final Class<Character> TYPE = Class
            .getPrimitiveClass("char");

    /*
     * Normative general types
     */

    /*
     * General character types
     */

    /**
     * General category "Cn" in the Unicode specification.
     * @since   1.1
     */
    public static final byte UNASSIGNED = 0;

    /**
     * General category "Lu" in the Unicode specification.
     * @since   1.1
     */
    public static final byte UPPERCASE_LETTER = 1;

    /**
     * General category "Ll" in the Unicode specification.
     * @since   1.1
     */
    public static final byte LOWERCASE_LETTER = 2;

    /**
     * General category "Lt" in the Unicode specification.
     * @since   1.1
     */
    public static final byte TITLECASE_LETTER = 3;

    /**
     * General category "Lm" in the Unicode specification.
     * @since   1.1
     */
    public static final byte MODIFIER_LETTER = 4;

    /**
     * General category "Lo" in the Unicode specification.
     * @since   1.1
     */
    public static final byte OTHER_LETTER = 5;

    /**
     * General category "Mn" in the Unicode specification.
     * @since   1.1
     */
    public static final byte NON_SPACING_MARK = 6;

    /**
     * General category "Me" in the Unicode specification.
     * @since   1.1
     */
    public static final byte ENCLOSING_MARK = 7;

    /**
     * General category "Mc" in the Unicode specification.
     * @since   1.1
     */
    public static final byte COMBINING_SPACING_MARK = 8;

    /**
     * General category "Nd" in the Unicode specification.
     * @since   1.1
     */
    public static final byte DECIMAL_DIGIT_NUMBER = 9;

    /**
     * General category "Nl" in the Unicode specification.
     * @since   1.1
     */
    public static final byte LETTER_NUMBER = 10;

    /**
     * General category "No" in the Unicode specification.
     * @since   1.1
     */
    public static final byte OTHER_NUMBER = 11;

    /**
     * General category "Zs" in the Unicode specification.
     * @since   1.1
     */
    public static final byte SPACE_SEPARATOR = 12;

    /**
     * General category "Zl" in the Unicode specification.
     * @since   1.1
     */
    public static final byte LINE_SEPARATOR = 13;

    /**
     * General category "Zp" in the Unicode specification.
     * @since   1.1
     */
    public static final byte PARAGRAPH_SEPARATOR = 14;

    /**
     * General category "Cc" in the Unicode specification.
     * @since   1.1
     */
    public static final byte CONTROL = 15;

    /**
     * General category "Cf" in the Unicode specification.
     * @since   1.1
     */
    public static final byte FORMAT = 16;

    /**
     * General category "Co" in the Unicode specification.
     * @since   1.1
     */
    public static final byte PRIVATE_USE = 18;

    /**
     * General category "Cs" in the Unicode specification.
     * @since   1.1
     */
    public static final byte SURROGATE = 19;

    /**
     * General category "Pd" in the Unicode specification.
     * @since   1.1
     */
    public static final byte DASH_PUNCTUATION = 20;

    /**
     * General category "Ps" in the Unicode specification.
     * @since   1.1
     */
    public static final byte START_PUNCTUATION = 21;

    /**
     * General category "Pe" in the Unicode specification.
     * @since   1.1
     */
    public static final byte END_PUNCTUATION = 22;

    /**
     * General category "Pc" in the Unicode specification.
     * @since   1.1
     */
    public static final byte CONNECTOR_PUNCTUATION = 23;

    /**
     * General category "Po" in the Unicode specification.
     * @since   1.1
     */
    public static final byte OTHER_PUNCTUATION = 24;

    /**
     * General category "Sm" in the Unicode specification.
     * @since   1.1
     */
    public static final byte MATH_SYMBOL = 25;

    /**
     * General category "Sc" in the Unicode specification.
     * @since   1.1
     */
    public static final byte CURRENCY_SYMBOL = 26;

    /**
     * General category "Sk" in the Unicode specification.
     * @since   1.1
     */
    public static final byte MODIFIER_SYMBOL = 27;

    /**
     * General category "So" in the Unicode specification.
     * @since   1.1
     */
    public static final byte OTHER_SYMBOL = 28;

    /**
     * General category "Pi" in the Unicode specification.
     * @since   1.4
     */
    public static final byte INITIAL_QUOTE_PUNCTUATION = 29;

    /**
     * General category "Pf" in the Unicode specification.
     * @since   1.4
     */
    public static final byte FINAL_QUOTE_PUNCTUATION = 30;

    /**
     * Error flag. Use int (code point) to avoid confusion with U+FFFF.
     */
    static final int ERROR = 0xFFFFFFFF;

    /**
     * Undefined bidirectional character type. Undefined <code>char</code>
     * values have undefined directionality in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_UNDEFINED = -1;

    /**
     * Strong bidirectional character type "L" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT = 0;

    /**
     * Strong bidirectional character type "R" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT = 1;

    /**
     * Strong bidirectional character type "AL" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC = 2;

    /**
     * Weak bidirectional character type "EN" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_EUROPEAN_NUMBER = 3;

    /**
     * Weak bidirectional character type "ES" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR = 4;

    /**
     * Weak bidirectional character type "ET" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR = 5;

    /**
     * Weak bidirectional character type "AN" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_ARABIC_NUMBER = 6;

    /**
     * Weak bidirectional character type "CS" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR = 7;

    /**
     * Weak bidirectional character type "NSM" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_NONSPACING_MARK = 8;

    /**
     * Weak bidirectional character type "BN" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL = 9;

    /**
     * Neutral bidirectional character type "B" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR = 10;

    /**
     * Neutral bidirectional character type "S" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR = 11;

    /**
     * Neutral bidirectional character type "WS" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_WHITESPACE = 12;

    /**
     * Neutral bidirectional character type "ON" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_OTHER_NEUTRALS = 13;

    /**
     * Strong bidirectional character type "LRE" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING = 14;

    /**
     * Strong bidirectional character type "LRO" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE = 15;

    /**
     * Strong bidirectional character type "RLE" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING = 16;

    /**
     * Strong bidirectional character type "RLO" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE = 17;

    /**
     * Weak bidirectional character type "PDF" in the Unicode specification.
     * @since 1.4
     */
    public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT = 18;

    /**
     * The minimum value of a Unicode high-surrogate code unit in the
     * UTF-16 encoding. A high-surrogate is also known as a
     * <i>leading-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MIN_HIGH_SURROGATE = '\uD800';

    /**
     * The maximum value of a Unicode high-surrogate code unit in the
     * UTF-16 encoding. A high-surrogate is also known as a
     * <i>leading-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MAX_HIGH_SURROGATE = '\uDBFF';

    /**
     * The minimum value of a Unicode low-surrogate code unit in the
     * UTF-16 encoding. A low-surrogate is also known as a
     * <i>trailing-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MIN_LOW_SURROGATE = '\uDC00';

    /**
     * The maximum value of a Unicode low-surrogate code unit in the
     * UTF-16 encoding. A low-surrogate is also known as a
     * <i>trailing-surrogate</i>.
     *
     * @since 1.5
     */
    public static final char MAX_LOW_SURROGATE = '\uDFFF';

    /**
     * The minimum value of a Unicode surrogate code unit in the UTF-16 encoding.
     *
     * @since 1.5
     */
    public static final char MIN_SURROGATE = MIN_HIGH_SURROGATE;

    /**
     * The maximum value of a Unicode surrogate code unit in the UTF-16 encoding.
     *
     * @since 1.5
     */
    public static final char MAX_SURROGATE = MAX_LOW_SURROGATE;

    /**
     * The minimum value of a supplementary code point.
     *
     * @since 1.5
     */
    public static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000;

    /**
     * The minimum value of a Unicode code point.
     *
     * @since 1.5
     */
    public static final int MIN_CODE_POINT = 0x000000;

    /**
     * The maximum value of a Unicode code point.
     *
     * @since 1.5
     */
    public static final int MAX_CODE_POINT = 0x10ffff;

    /**
     * Instances of this class represent particular subsets of the Unicode
     * character set.  The only family of subsets defined in the
     * <code>Character</code> class is <code>{@link Character.UnicodeBlock
     * UnicodeBlock}</code>.  Other portions of the Java API may define other
     * subsets for their own purposes.
     *
     * @since 1.2
     */
    public static class Subset {

        private String name;

        /**
         * Constructs a new <code>Subset</code> instance.
         *
         * @exception NullPointerException if name is <code>null</code>
         * @param  name  The name of this subset
         */
        protected Subset(String name) {
            if (name == null) {
                throw new NullPointerException("name");
            }
            this.name = name;
        }

        /**
         * Compares two <code>Subset</code> objects for equality.
         * This method returns <code>true</code> if and only if
         * <code>this</code> and the argument refer to the same
         * object; since this method is <code>final</code>, this
         * guarantee holds for all subclasses.
         */
        public final boolean equals(Object obj) {
            return (this == obj);
        }

        /**
         * Returns the standard hash code as defined by the
         * <code>{@link Object#hashCode}</code> method.  This method
         * is <code>final</code> in order to ensure that the
         * <code>equals</code> and <code>hashCode</code> methods will
         * be consistent in all subclasses.
         */
        public final int hashCode() {
            return super.hashCode();
        }

        /**
         * Returns the name of this subset.
         */
        public final String toString() {
            return name;
        }
    }

    /**
     * A family of character subsets representing the character blocks in the
     * Unicode specification. Character blocks generally define characters
     * used for a specific script or purpose. A character is contained by
     * at most one Unicode block.
     *
     * @since 1.2
     */
    public static final class UnicodeBlock extends Subset {

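        // Maps upper-cased block identifiers and aliases to their blocks.
        private static Map<String, UnicodeBlock> map =
                new HashMap<String, UnicodeBlock>();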

        /**
         * Creates a UnicodeBlock with the given identifier name.
         * This name must be the same as the block identifier.
         */
        private UnicodeBlock(String idName) {
            super(idName);
            map.put(idName.toUpperCase(Locale.US), this);
        }

        /**
         * Creates a UnicodeBlock with the given identifier name and
         * alias name.
         */
        private UnicodeBlock(String idName, String alias) {
            this(idName);
            map.put(alias.toUpperCase(Locale.US), this);
        }

        /**
         * Creates a UnicodeBlock with the given identifier name and
         * alias names.
         */
        private UnicodeBlock(String idName, String[] aliasName) {
            this(idName);
            if (aliasName != null) {
                for (int x = 0; x < aliasName.length; ++x) {
                    map.put(aliasName[x].toUpperCase(Locale.US), this);
                }
            }
        }

        /**
         * Constant for the "Basic Latin" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BASIC_LATIN = new UnicodeBlock(
                "BASIC_LATIN", new String[] { "Basic Latin",
                        "BasicLatin" });

        /**
         * Constant for the "Latin-1 Supplement" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_1_SUPPLEMENT = new UnicodeBlock(
                "LATIN_1_SUPPLEMENT", new String[] {
                        "Latin-1 Supplement", "Latin-1Supplement" });

        /**
         * Constant for the "Latin Extended-A" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_EXTENDED_A = new UnicodeBlock(
                "LATIN_EXTENDED_A", new String[] { "Latin Extended-A",
                        "LatinExtended-A" });

        /**
         * Constant for the "Latin Extended-B" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_EXTENDED_B = new UnicodeBlock(
                "LATIN_EXTENDED_B", new String[] { "Latin Extended-B",
                        "LatinExtended-B" });

        /**
         * Constant for the "IPA Extensions" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock IPA_EXTENSIONS = new UnicodeBlock(
                "IPA_EXTENSIONS", new String[] { "IPA Extensions",
                        "IPAExtensions" });

        /**
         * Constant for the "Spacing Modifier Letters" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SPACING_MODIFIER_LETTERS = new UnicodeBlock(
                "SPACING_MODIFIER_LETTERS", new String[] {
                        "Spacing Modifier Letters",
                        "SpacingModifierLetters" });

        /**
         * Constant for the "Combining Diacritical Marks" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock COMBINING_DIACRITICAL_MARKS = new UnicodeBlock(
                "COMBINING_DIACRITICAL_MARKS", new String[] {
                        "Combining Diacritical Marks",
                        "CombiningDiacriticalMarks" });

        /**
         * Constant for the "Greek and Coptic" Unicode character block.
         * <p>
         * This block was previously known as the "Greek" block.
         *
         * @since 1.2
         */
        public static final UnicodeBlock GREEK = new UnicodeBlock(
                "GREEK", new String[] { "Greek and Coptic",
                        "GreekandCoptic" });

        /**
         * Constant for the "Cyrillic" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CYRILLIC = new UnicodeBlock(
                "CYRILLIC");

        /**
         * Constant for the "Armenian" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARMENIAN = new UnicodeBlock(
                "ARMENIAN");

        /**
         * Constant for the "Hebrew" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HEBREW = new UnicodeBlock(
0749:                        "HEBREW");
0750:
0751:                /**
0752:                 * Constant for the "Arabic" Unicode character block.
0753:                 * @since 1.2
0754:                 */
0755:                public static final UnicodeBlock ARABIC = new UnicodeBlock(
0756:                        "ARABIC");
0757:
        /**
         * Constant for the "Devanagari" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock DEVANAGARI = new UnicodeBlock(
                "DEVANAGARI");

        /**
         * Constant for the "Bengali" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BENGALI = new UnicodeBlock(
                "BENGALI");

        /**
         * Constant for the "Gurmukhi" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GURMUKHI = new UnicodeBlock(
                "GURMUKHI");

        /**
         * Constant for the "Gujarati" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GUJARATI = new UnicodeBlock(
                "GUJARATI");

        /**
         * Constant for the "Oriya" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ORIYA = new UnicodeBlock(
                "ORIYA");

        /**
         * Constant for the "Tamil" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock TAMIL = new UnicodeBlock(
                "TAMIL");

        /**
         * Constant for the "Telugu" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock TELUGU = new UnicodeBlock(
                "TELUGU");

        /**
         * Constant for the "Kannada" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock KANNADA = new UnicodeBlock(
                "KANNADA");

        /**
         * Constant for the "Malayalam" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MALAYALAM = new UnicodeBlock(
                "MALAYALAM");

        /**
         * Constant for the "Thai" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock THAI = new UnicodeBlock("THAI");

        /**
         * Constant for the "Lao" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LAO = new UnicodeBlock("LAO");

        /**
         * Constant for the "Tibetan" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock TIBETAN = new UnicodeBlock(
                "TIBETAN");

        /**
         * Constant for the "Georgian" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GEORGIAN = new UnicodeBlock(
                "GEORGIAN");

        /**
         * Constant for the "Hangul Jamo" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HANGUL_JAMO = new UnicodeBlock(
                "HANGUL_JAMO", new String[] { "Hangul Jamo",
                        "HangulJamo" });

        /**
         * Constant for the "Latin Extended Additional" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LATIN_EXTENDED_ADDITIONAL = new UnicodeBlock(
                "LATIN_EXTENDED_ADDITIONAL", new String[] {
                        "Latin Extended Additional",
                        "LatinExtendedAdditional" });

        /**
         * Constant for the "Greek Extended" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GREEK_EXTENDED = new UnicodeBlock(
                "GREEK_EXTENDED", new String[] { "Greek Extended",
                        "GreekExtended" });

        /**
         * Constant for the "General Punctuation" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GENERAL_PUNCTUATION = new UnicodeBlock(
                "GENERAL_PUNCTUATION", new String[] {
                        "General Punctuation", "GeneralPunctuation" });

        /**
         * Constant for the "Superscripts and Subscripts" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SUPERSCRIPTS_AND_SUBSCRIPTS = new UnicodeBlock(
                "SUPERSCRIPTS_AND_SUBSCRIPTS", new String[] {
                        "Superscripts and Subscripts",
                        "SuperscriptsandSubscripts" });

        /**
         * Constant for the "Currency Symbols" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CURRENCY_SYMBOLS = new UnicodeBlock(
                "CURRENCY_SYMBOLS", new String[] { "Currency Symbols",
                        "CurrencySymbols" });

        /**
         * Constant for the "Combining Diacritical Marks for Symbols" Unicode character block.
         * <p>
         * This block was previously known as "Combining Marks for Symbols".
         * @since 1.2
         */
        public static final UnicodeBlock COMBINING_MARKS_FOR_SYMBOLS = new UnicodeBlock(
                "COMBINING_MARKS_FOR_SYMBOLS", new String[] {
                        "Combining Diacritical Marks for Symbols",
                        "CombiningDiacriticalMarksforSymbols",
                        "Combining Marks for Symbols",
                        "CombiningMarksforSymbols" });

        /**
         * Constant for the "Letterlike Symbols" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock LETTERLIKE_SYMBOLS = new UnicodeBlock(
                "LETTERLIKE_SYMBOLS", new String[] {
                        "Letterlike Symbols", "LetterlikeSymbols" });

        /**
         * Constant for the "Number Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock NUMBER_FORMS = new UnicodeBlock(
                "NUMBER_FORMS", new String[] { "Number Forms",
                        "NumberForms" });

        /**
         * Constant for the "Arrows" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARROWS = new UnicodeBlock(
                "ARROWS");

        /**
         * Constant for the "Mathematical Operators" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MATHEMATICAL_OPERATORS = new UnicodeBlock(
                "MATHEMATICAL_OPERATORS", new String[] {
                        "Mathematical Operators",
                        "MathematicalOperators" });

        /**
         * Constant for the "Miscellaneous Technical" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MISCELLANEOUS_TECHNICAL = new UnicodeBlock(
                "MISCELLANEOUS_TECHNICAL", new String[] {
                        "Miscellaneous Technical",
                        "MiscellaneousTechnical" });

        /**
         * Constant for the "Control Pictures" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CONTROL_PICTURES = new UnicodeBlock(
                "CONTROL_PICTURES", new String[] { "Control Pictures",
                        "ControlPictures" });

        /**
         * Constant for the "Optical Character Recognition" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock OPTICAL_CHARACTER_RECOGNITION = new UnicodeBlock(
                "OPTICAL_CHARACTER_RECOGNITION", new String[] {
                        "Optical Character Recognition",
                        "OpticalCharacterRecognition" });

        /**
         * Constant for the "Enclosed Alphanumerics" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ENCLOSED_ALPHANUMERICS = new UnicodeBlock(
                "ENCLOSED_ALPHANUMERICS", new String[] {
                        "Enclosed Alphanumerics",
                        "EnclosedAlphanumerics" });

        /**
         * Constant for the "Box Drawing" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BOX_DRAWING = new UnicodeBlock(
                "BOX_DRAWING", new String[] { "Box Drawing",
                        "BoxDrawing" });

        /**
         * Constant for the "Block Elements" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BLOCK_ELEMENTS = new UnicodeBlock(
                "BLOCK_ELEMENTS", new String[] { "Block Elements",
                        "BlockElements" });

        /**
         * Constant for the "Geometric Shapes" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock GEOMETRIC_SHAPES = new UnicodeBlock(
                "GEOMETRIC_SHAPES", new String[] { "Geometric Shapes",
                        "GeometricShapes" });

        /**
         * Constant for the "Miscellaneous Symbols" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock MISCELLANEOUS_SYMBOLS = new UnicodeBlock(
                "MISCELLANEOUS_SYMBOLS",
                new String[] { "Miscellaneous Symbols",
                        "MiscellaneousSymbols" });

        /**
         * Constant for the "Dingbats" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock DINGBATS = new UnicodeBlock(
                "DINGBATS");

        /**
         * Constant for the "CJK Symbols and Punctuation" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_SYMBOLS_AND_PUNCTUATION = new UnicodeBlock(
                "CJK_SYMBOLS_AND_PUNCTUATION", new String[] {
                        "CJK Symbols and Punctuation",
                        "CJKSymbolsandPunctuation" });

        /**
         * Constant for the "Hiragana" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HIRAGANA = new UnicodeBlock(
                "HIRAGANA");

        /**
         * Constant for the "Katakana" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock KATAKANA = new UnicodeBlock(
                "KATAKANA");

        /**
         * Constant for the "Bopomofo" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock BOPOMOFO = new UnicodeBlock(
                "BOPOMOFO");

        /**
         * Constant for the "Hangul Compatibility Jamo" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HANGUL_COMPATIBILITY_JAMO = new UnicodeBlock(
                "HANGUL_COMPATIBILITY_JAMO", new String[] {
                        "Hangul Compatibility Jamo",
                        "HangulCompatibilityJamo" });

        /**
         * Constant for the "Kanbun" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock KANBUN = new UnicodeBlock(
                "KANBUN");

        /**
         * Constant for the "Enclosed CJK Letters and Months" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ENCLOSED_CJK_LETTERS_AND_MONTHS = new UnicodeBlock(
                "ENCLOSED_CJK_LETTERS_AND_MONTHS", new String[] {
                        "Enclosed CJK Letters and Months",
                        "EnclosedCJKLettersandMonths" });

        /**
         * Constant for the "CJK Compatibility" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_COMPATIBILITY = new UnicodeBlock(
                "CJK_COMPATIBILITY", new String[] {
                        "CJK Compatibility", "CJKCompatibility" });

        /**
         * Constant for the "CJK Unified Ideographs" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS = new UnicodeBlock(
                "CJK_UNIFIED_IDEOGRAPHS", new String[] {
                        "CJK Unified Ideographs",
                        "CJKUnifiedIdeographs" });

        /**
         * Constant for the "Hangul Syllables" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HANGUL_SYLLABLES = new UnicodeBlock(
                "HANGUL_SYLLABLES", new String[] { "Hangul Syllables",
                        "HangulSyllables" });

        /**
         * Constant for the "Private Use Area" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock PRIVATE_USE_AREA = new UnicodeBlock(
                "PRIVATE_USE_AREA", new String[] { "Private Use Area",
                        "PrivateUseArea" });

        /**
         * Constant for the "CJK Compatibility Ideographs" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_COMPATIBILITY_IDEOGRAPHS = new UnicodeBlock(
                "CJK_COMPATIBILITY_IDEOGRAPHS", new String[] {
                        "CJK Compatibility Ideographs",
                        "CJKCompatibilityIdeographs" });

        /**
         * Constant for the "Alphabetic Presentation Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ALPHABETIC_PRESENTATION_FORMS = new UnicodeBlock(
                "ALPHABETIC_PRESENTATION_FORMS", new String[] {
                        "Alphabetic Presentation Forms",
                        "AlphabeticPresentationForms" });

        /**
         * Constant for the "Arabic Presentation Forms-A" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARABIC_PRESENTATION_FORMS_A = new UnicodeBlock(
                "ARABIC_PRESENTATION_FORMS_A", new String[] {
                        "Arabic Presentation Forms-A",
                        "ArabicPresentationForms-A" });

        /**
         * Constant for the "Combining Half Marks" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock COMBINING_HALF_MARKS = new UnicodeBlock(
                "COMBINING_HALF_MARKS", new String[] {
                        "Combining Half Marks", "CombiningHalfMarks" });

        /**
         * Constant for the "CJK Compatibility Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock CJK_COMPATIBILITY_FORMS = new UnicodeBlock(
                "CJK_COMPATIBILITY_FORMS", new String[] {
                        "CJK Compatibility Forms",
                        "CJKCompatibilityForms" });

        /**
         * Constant for the "Small Form Variants" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SMALL_FORM_VARIANTS = new UnicodeBlock(
                "SMALL_FORM_VARIANTS", new String[] {
                        "Small Form Variants", "SmallFormVariants" });

        /**
         * Constant for the "Arabic Presentation Forms-B" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock ARABIC_PRESENTATION_FORMS_B = new UnicodeBlock(
                "ARABIC_PRESENTATION_FORMS_B", new String[] {
                        "Arabic Presentation Forms-B",
                        "ArabicPresentationForms-B" });

        /**
         * Constant for the "Halfwidth and Fullwidth Forms" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock HALFWIDTH_AND_FULLWIDTH_FORMS = new UnicodeBlock(
                "HALFWIDTH_AND_FULLWIDTH_FORMS", new String[] {
                        "Halfwidth and Fullwidth Forms",
                        "HalfwidthandFullwidthForms" });

        /**
         * Constant for the "Specials" Unicode character block.
         * @since 1.2
         */
        public static final UnicodeBlock SPECIALS = new UnicodeBlock(
                "SPECIALS");

        /**
         * @deprecated As of J2SE 5, use {@link #HIGH_SURROGATES},
         *             {@link #HIGH_PRIVATE_USE_SURROGATES}, and
         *             {@link #LOW_SURROGATES}. These new constants match
         *             the block definitions of the Unicode Standard.
         *             The {@link #of(char)} and {@link #of(int)} methods
         *             return the new constants, not SURROGATES_AREA.
         */
        @Deprecated
        public static final UnicodeBlock SURROGATES_AREA = new UnicodeBlock(
                "SURROGATES_AREA");

        /**
         * Constant for the "Syriac" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock SYRIAC = new UnicodeBlock(
                "SYRIAC");

        /**
         * Constant for the "Thaana" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock THAANA = new UnicodeBlock(
                "THAANA");

        /**
         * Constant for the "Sinhala" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock SINHALA = new UnicodeBlock(
                "SINHALA");

        /**
         * Constant for the "Myanmar" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock MYANMAR = new UnicodeBlock(
                "MYANMAR");

        /**
         * Constant for the "Ethiopic" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock ETHIOPIC = new UnicodeBlock(
                "ETHIOPIC");

        /**
         * Constant for the "Cherokee" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock CHEROKEE = new UnicodeBlock(
                "CHEROKEE");

        /**
         * Constant for the "Unified Canadian Aboriginal Syllabics" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS = new UnicodeBlock(
                "UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS", new String[] {
                        "Unified Canadian Aboriginal Syllabics",
                        "UnifiedCanadianAboriginalSyllabics" });

        /**
         * Constant for the "Ogham" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock OGHAM = new UnicodeBlock(
                "OGHAM");

        /**
         * Constant for the "Runic" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock RUNIC = new UnicodeBlock(
                "RUNIC");

        /**
         * Constant for the "Khmer" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock KHMER = new UnicodeBlock(
                "KHMER");

        /**
         * Constant for the "Mongolian" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock MONGOLIAN = new UnicodeBlock(
                "MONGOLIAN");

        /**
         * Constant for the "Braille Patterns" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock BRAILLE_PATTERNS = new UnicodeBlock(
                "BRAILLE_PATTERNS", new String[] { "Braille Patterns",
                        "BraillePatterns" });

        /**
         * Constant for the "CJK Radicals Supplement" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock CJK_RADICALS_SUPPLEMENT = new UnicodeBlock(
                "CJK_RADICALS_SUPPLEMENT", new String[] {
                        "CJK Radicals Supplement",
                        "CJKRadicalsSupplement" });

        /**
         * Constant for the "Kangxi Radicals" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock KANGXI_RADICALS = new UnicodeBlock(
                "KANGXI_RADICALS", new String[] { "Kangxi Radicals",
                        "KangxiRadicals" });

        /**
         * Constant for the "Ideographic Description Characters" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock IDEOGRAPHIC_DESCRIPTION_CHARACTERS = new UnicodeBlock(
                "IDEOGRAPHIC_DESCRIPTION_CHARACTERS", new String[] {
                        "Ideographic Description Characters",
                        "IdeographicDescriptionCharacters" });

        /**
         * Constant for the "Bopomofo Extended" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock BOPOMOFO_EXTENDED = new UnicodeBlock(
                "BOPOMOFO_EXTENDED", new String[] {
                        "Bopomofo Extended", "BopomofoExtended" });

        /**
         * Constant for the "CJK Unified Ideographs Extension A" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A = new UnicodeBlock(
                "CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A", new String[] {
                        "CJK Unified Ideographs Extension A",
                        "CJKUnifiedIdeographsExtensionA" });

        /**
         * Constant for the "Yi Syllables" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock YI_SYLLABLES = new UnicodeBlock(
                "YI_SYLLABLES", new String[] { "Yi Syllables",
                        "YiSyllables" });

        /**
         * Constant for the "Yi Radicals" Unicode character block.
         * @since 1.4
         */
        public static final UnicodeBlock YI_RADICALS = new UnicodeBlock(
                "YI_RADICALS", new String[] { "Yi Radicals",
                        "YiRadicals" });

        /**
         * Constant for the "Cyrillic Supplementary" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock CYRILLIC_SUPPLEMENTARY = new UnicodeBlock(
                "CYRILLIC_SUPPLEMENTARY", new String[] {
                        "Cyrillic Supplementary",
                        "CyrillicSupplementary" });

        /**
         * Constant for the "Tagalog" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAGALOG = new UnicodeBlock(
                "TAGALOG");

        /**
         * Constant for the "Hanunoo" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock HANUNOO = new UnicodeBlock(
                "HANUNOO");

        /**
         * Constant for the "Buhid" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock BUHID = new UnicodeBlock(
                "BUHID");

        /**
         * Constant for the "Tagbanwa" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAGBANWA = new UnicodeBlock(
                "TAGBANWA");

        /**
         * Constant for the "Limbu" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock LIMBU = new UnicodeBlock(
                "LIMBU");

        /**
         * Constant for the "Tai Le" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock TAI_LE = new UnicodeBlock(
                "TAI_LE", new String[] { "Tai Le", "TaiLe" });

        /**
         * Constant for the "Khmer Symbols" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock KHMER_SYMBOLS = new UnicodeBlock(
                "KHMER_SYMBOLS", new String[] { "Khmer Symbols",
                        "KhmerSymbols" });

        /**
         * Constant for the "Phonetic Extensions" Unicode character block.
         * @since 1.5
         */
        public static final UnicodeBlock PHONETIC_EXTENSIONS = new UnicodeBlock(
1404:                        "PHONETIC_EXTENSIONS", new String[] {
1405:                                "Phonetic Extensions", "PhoneticExtensions" });
1406:
                /**
                 * Constant for the "Miscellaneous Mathematical Symbols-A" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A = new UnicodeBlock(
                        "MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A", new String[] {
                                "Miscellaneous Mathematical Symbols-A",
                                "MiscellaneousMathematicalSymbols-A" });

                /**
                 * Constant for the "Supplemental Arrows-A" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock SUPPLEMENTAL_ARROWS_A = new UnicodeBlock(
                        "SUPPLEMENTAL_ARROWS_A",
                        new String[] { "Supplemental Arrows-A",
                                "SupplementalArrows-A" });

                /**
                 * Constant for the "Supplemental Arrows-B" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock SUPPLEMENTAL_ARROWS_B = new UnicodeBlock(
                        "SUPPLEMENTAL_ARROWS_B",
                        new String[] { "Supplemental Arrows-B",
                                "SupplementalArrows-B" });

                /**
                 * Constant for the "Miscellaneous Mathematical Symbols-B" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B = new UnicodeBlock(
                        "MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B", new String[] {
                                "Miscellaneous Mathematical Symbols-B",
                                "MiscellaneousMathematicalSymbols-B" });

                /**
                 * Constant for the "Supplemental Mathematical Operators" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock SUPPLEMENTAL_MATHEMATICAL_OPERATORS = new UnicodeBlock(
                        "SUPPLEMENTAL_MATHEMATICAL_OPERATORS", new String[] {
                                "Supplemental Mathematical Operators",
                                "SupplementalMathematicalOperators" });

                /**
                 * Constant for the "Miscellaneous Symbols and Arrows" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock MISCELLANEOUS_SYMBOLS_AND_ARROWS = new UnicodeBlock(
                        "MISCELLANEOUS_SYMBOLS_AND_ARROWS", new String[] {
                                "Miscellaneous Symbols and Arrows",
                                "MiscellaneousSymbolsandArrows" });

                /**
                 * Constant for the "Katakana Phonetic Extensions" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock KATAKANA_PHONETIC_EXTENSIONS = new UnicodeBlock(
                        "KATAKANA_PHONETIC_EXTENSIONS", new String[] {
                                "Katakana Phonetic Extensions",
                                "KatakanaPhoneticExtensions" });

                /**
                 * Constant for the "Yijing Hexagram Symbols" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock YIJING_HEXAGRAM_SYMBOLS = new UnicodeBlock(
                        "YIJING_HEXAGRAM_SYMBOLS", new String[] {
                                "Yijing Hexagram Symbols",
                                "YijingHexagramSymbols" });

                /**
                 * Constant for the "Variation Selectors" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock VARIATION_SELECTORS = new UnicodeBlock(
                        "VARIATION_SELECTORS", new String[] {
                                "Variation Selectors", "VariationSelectors" });

                /**
                 * Constant for the "Linear B Syllabary" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock LINEAR_B_SYLLABARY = new UnicodeBlock(
                        "LINEAR_B_SYLLABARY", new String[] {
                                "Linear B Syllabary", "LinearBSyllabary" });

                /**
                 * Constant for the "Linear B Ideograms" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock LINEAR_B_IDEOGRAMS = new UnicodeBlock(
                        "LINEAR_B_IDEOGRAMS", new String[] {
                                "Linear B Ideograms", "LinearBIdeograms" });

                /**
                 * Constant for the "Aegean Numbers" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock AEGEAN_NUMBERS = new UnicodeBlock(
                        "AEGEAN_NUMBERS", new String[] { "Aegean Numbers",
                                "AegeanNumbers" });

                /**
                 * Constant for the "Old Italic" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock OLD_ITALIC = new UnicodeBlock(
                        "OLD_ITALIC",
                        new String[] { "Old Italic", "OldItalic" });

                /**
                 * Constant for the "Gothic" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock GOTHIC = new UnicodeBlock(
                        "GOTHIC");

                /**
                 * Constant for the "Ugaritic" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock UGARITIC = new UnicodeBlock(
                        "UGARITIC");

                /**
                 * Constant for the "Deseret" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock DESERET = new UnicodeBlock(
                        "DESERET");

                /**
                 * Constant for the "Shavian" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock SHAVIAN = new UnicodeBlock(
                        "SHAVIAN");

                /**
                 * Constant for the "Osmanya" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock OSMANYA = new UnicodeBlock(
                        "OSMANYA");

                /**
                 * Constant for the "Cypriot Syllabary" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock CYPRIOT_SYLLABARY = new UnicodeBlock(
                        "CYPRIOT_SYLLABARY", new String[] {
                                "Cypriot Syllabary", "CypriotSyllabary" });

                /**
                 * Constant for the "Byzantine Musical Symbols" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock BYZANTINE_MUSICAL_SYMBOLS = new UnicodeBlock(
                        "BYZANTINE_MUSICAL_SYMBOLS", new String[] {
                                "Byzantine Musical Symbols",
                                "ByzantineMusicalSymbols" });

                /**
                 * Constant for the "Musical Symbols" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock MUSICAL_SYMBOLS = new UnicodeBlock(
                        "MUSICAL_SYMBOLS", new String[] { "Musical Symbols",
                                "MusicalSymbols" });

                /**
                 * Constant for the "Tai Xuan Jing Symbols" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock TAI_XUAN_JING_SYMBOLS = new UnicodeBlock(
                        "TAI_XUAN_JING_SYMBOLS", new String[] {
                                "Tai Xuan Jing Symbols", "TaiXuanJingSymbols" });

                /**
                 * Constant for the "Mathematical Alphanumeric Symbols" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock MATHEMATICAL_ALPHANUMERIC_SYMBOLS = new UnicodeBlock(
                        "MATHEMATICAL_ALPHANUMERIC_SYMBOLS", new String[] {
                                "Mathematical Alphanumeric Symbols",
                                "MathematicalAlphanumericSymbols" });

                /**
                 * Constant for the "CJK Unified Ideographs Extension B" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B = new UnicodeBlock(
                        "CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B", new String[] {
                                "CJK Unified Ideographs Extension B",
                                "CJKUnifiedIdeographsExtensionB" });

                /**
                 * Constant for the "CJK Compatibility Ideographs Supplement" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT = new UnicodeBlock(
                        "CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT",
                        new String[] {
                                "CJK Compatibility Ideographs Supplement",
                                "CJKCompatibilityIdeographsSupplement" });

                /**
                 * Constant for the "Tags" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock TAGS = new UnicodeBlock("TAGS");

                /**
                 * Constant for the "Variation Selectors Supplement" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock VARIATION_SELECTORS_SUPPLEMENT = new UnicodeBlock(
                        "VARIATION_SELECTORS_SUPPLEMENT", new String[] {
                                "Variation Selectors Supplement",
                                "VariationSelectorsSupplement" });

                /**
                 * Constant for the "Supplementary Private Use Area-A" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A = new UnicodeBlock(
                        "SUPPLEMENTARY_PRIVATE_USE_AREA_A", new String[] {
                                "Supplementary Private Use Area-A",
                                "SupplementaryPrivateUseArea-A" });

                /**
                 * Constant for the "Supplementary Private Use Area-B" Unicode character block.
                 * @since 1.5
                 */
                public static final UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_B = new UnicodeBlock(
                        "SUPPLEMENTARY_PRIVATE_USE_AREA_B", new String[] {
                                "Supplementary Private Use Area-B",
                                "SupplementaryPrivateUseArea-B" });

                /**
                 * Constant for the "High Surrogates" Unicode character block.
                 * This block represents codepoint values in the high surrogate
                 * range: 0xD800 through 0xDB7F
                 *
                 * @since 1.5
                 */
                public static final UnicodeBlock HIGH_SURROGATES = new UnicodeBlock(
                        "HIGH_SURROGATES", new String[] { "High Surrogates",
                                "HighSurrogates" });

                /**
                 * Constant for the "High Private Use Surrogates" Unicode character block.
                 * This block represents codepoint values in the high surrogate
                 * range: 0xDB80 through 0xDBFF
                 *
                 * @since 1.5
                 */
                public static final UnicodeBlock HIGH_PRIVATE_USE_SURROGATES = new UnicodeBlock(
                        "HIGH_PRIVATE_USE_SURROGATES", new String[] {
                                "High Private Use Surrogates",
                                "HighPrivateUseSurrogates" });

                /**
                 * Constant for the "Low Surrogates" Unicode character block.
                 * This block represents codepoint values in the low surrogate
                 * range: 0xDC00 through 0xDFFF
                 *
                 * @since 1.5
                 */
                public static final UnicodeBlock LOW_SURROGATES = new UnicodeBlock(
                        "LOW_SURROGATES", new String[] { "Low Surrogates",
                                "LowSurrogates" });

                private static final int[] blockStarts = { 0x0000, // Basic Latin
                        0x0080, // Latin-1 Supplement
                        0x0100, // Latin Extended-A
                        0x0180, // Latin Extended-B
                        0x0250, // IPA Extensions
                        0x02B0, // Spacing Modifier Letters
                        0x0300, // Combining Diacritical Marks
                        0x0370, // Greek and Coptic
                        0x0400, // Cyrillic
                        0x0500, // Cyrillic Supplementary
                        0x0530, // Armenian
                        0x0590, // Hebrew
                        0x0600, // Arabic
                        0x0700, // Syriac
                        0x0750, // unassigned
                        0x0780, // Thaana
                        0x07C0, // unassigned
                        0x0900, // Devanagari
                        0x0980, // Bengali
                        0x0A00, // Gurmukhi
                        0x0A80, // Gujarati
                        0x0B00, // Oriya
                        0x0B80, // Tamil
                        0x0C00, // Telugu
                        0x0C80, // Kannada
                        0x0D00, // Malayalam
                        0x0D80, // Sinhala
                        0x0E00, // Thai
                        0x0E80, // Lao
                        0x0F00, // Tibetan
                        0x1000, // Myanmar
                        0x10A0, // Georgian
                        0x1100, // Hangul Jamo
                        0x1200, // Ethiopic
                        0x1380, // unassigned
                        0x13A0, // Cherokee
                        0x1400, // Unified Canadian Aboriginal Syllabics
                        0x1680, // Ogham
                        0x16A0, // Runic
                        0x1700, // Tagalog
                        0x1720, // Hanunoo
                        0x1740, // Buhid
                        0x1760, // Tagbanwa
                        0x1780, // Khmer
                        0x1800, // Mongolian
                        0x18B0, // unassigned
                        0x1900, // Limbu
                        0x1950, // Tai Le
                        0x1980, // unassigned
                        0x19E0, // Khmer Symbols
                        0x1A00, // unassigned
                        0x1D00, // Phonetic Extensions
                        0x1D80, // unassigned
                        0x1E00, // Latin Extended Additional
                        0x1F00, // Greek Extended
                        0x2000, // General Punctuation
                        0x2070, // Superscripts and Subscripts
                        0x20A0, // Currency Symbols
                        0x20D0, // Combining Diacritical Marks for Symbols
                        0x2100, // Letterlike Symbols
                        0x2150, // Number Forms
                        0x2190, // Arrows
                        0x2200, // Mathematical Operators
                        0x2300, // Miscellaneous Technical
                        0x2400, // Control Pictures
                        0x2440, // Optical Character Recognition
                        0x2460, // Enclosed Alphanumerics
                        0x2500, // Box Drawing
                        0x2580, // Block Elements
                        0x25A0, // Geometric Shapes
                        0x2600, // Miscellaneous Symbols
                        0x2700, // Dingbats
                        0x27C0, // Miscellaneous Mathematical Symbols-A
                        0x27F0, // Supplemental Arrows-A
                        0x2800, // Braille Patterns
                        0x2900, // Supplemental Arrows-B
                        0x2980, // Miscellaneous Mathematical Symbols-B
                        0x2A00, // Supplemental Mathematical Operators
                        0x2B00, // Miscellaneous Symbols and Arrows
                        0x2C00, // unassigned
                        0x2E80, // CJK Radicals Supplement
                        0x2F00, // Kangxi Radicals
                        0x2FE0, // unassigned
                        0x2FF0, // Ideographic Description Characters
                        0x3000, // CJK Symbols and Punctuation
                        0x3040, // Hiragana
                        0x30A0, // Katakana
                        0x3100, // Bopomofo
                        0x3130, // Hangul Compatibility Jamo
                        0x3190, // Kanbun
                        0x31A0, // Bopomofo Extended
                        0x31C0, // unassigned
                        0x31F0, // Katakana Phonetic Extensions
                        0x3200, // Enclosed CJK Letters and Months
                        0x3300, // CJK Compatibility
                        0x3400, // CJK Unified Ideographs Extension A
                        0x4DC0, // Yijing Hexagram Symbols
                        0x4E00, // CJK Unified Ideographs
                        0xA000, // Yi Syllables
                        0xA490, // Yi Radicals
                        0xA4D0, // unassigned
                        0xAC00, // Hangul Syllables
                        0xD7B0, // unassigned
                        0xD800, // High Surrogates
                        0xDB80, // High Private Use Surrogates
                        0xDC00, // Low Surrogates
                        0xE000, // Private Use
                        0xF900, // CJK Compatibility Ideographs
                        0xFB00, // Alphabetic Presentation Forms
                        0xFB50, // Arabic Presentation Forms-A
                        0xFE00, // Variation Selectors
                        0xFE10, // unassigned
                        0xFE20, // Combining Half Marks
                        0xFE30, // CJK Compatibility Forms
                        0xFE50, // Small Form Variants
                        0xFE70, // Arabic Presentation Forms-B
                        0xFF00, // Halfwidth and Fullwidth Forms
                        0xFFF0, // Specials
                        0x10000, // Linear B Syllabary
                        0x10080, // Linear B Ideograms
                        0x10100, // Aegean Numbers
                        0x10140, // unassigned
                        0x10300, // Old Italic
                        0x10330, // Gothic
                        0x10350, // unassigned
                        0x10380, // Ugaritic
                        0x103A0, // unassigned
                        0x10400, // Deseret
                        0x10450, // Shavian
                        0x10480, // Osmanya
                        0x104B0, // unassigned
                        0x10800, // Cypriot Syllabary
                        0x10840, // unassigned
                        0x1D000, // Byzantine Musical Symbols
                        0x1D100, // Musical Symbols
                        0x1D200, // unassigned
                        0x1D300, // Tai Xuan Jing Symbols
                        0x1D360, // unassigned
                        0x1D400, // Mathematical Alphanumeric Symbols
                        0x1D800, // unassigned
                        0x20000, // CJK Unified Ideographs Extension B
                        0x2A6E0, // unassigned
                        0x2F800, // CJK Compatibility Ideographs Supplement
                        0x2FA20, // unassigned
                        0xE0000, // Tags
                        0xE0080, // unassigned
                        0xE0100, // Variation Selectors Supplement
                        0xE01F0, // unassigned
                        0xF0000, // Supplementary Private Use Area-A
                        0x100000, // Supplementary Private Use Area-B
                };

                private static final UnicodeBlock[] blocks = { BASIC_LATIN,
                        LATIN_1_SUPPLEMENT, LATIN_EXTENDED_A, LATIN_EXTENDED_B,
                        IPA_EXTENSIONS, SPACING_MODIFIER_LETTERS,
                        COMBINING_DIACRITICAL_MARKS, GREEK, CYRILLIC,
                        CYRILLIC_SUPPLEMENTARY, ARMENIAN, HEBREW, ARABIC,
                        SYRIAC, null, THAANA, null, DEVANAGARI, BENGALI,
                        GURMUKHI, GUJARATI, ORIYA, TAMIL, TELUGU, KANNADA,
                        MALAYALAM, SINHALA, THAI, LAO, TIBETAN, MYANMAR,
                        GEORGIAN, HANGUL_JAMO, ETHIOPIC, null, CHEROKEE,
                        UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS, OGHAM, RUNIC,
                        TAGALOG, HANUNOO, BUHID, TAGBANWA, KHMER, MONGOLIAN,
                        null, LIMBU, TAI_LE, null, KHMER_SYMBOLS, null,
                        PHONETIC_EXTENSIONS, null, LATIN_EXTENDED_ADDITIONAL,
                        GREEK_EXTENDED, GENERAL_PUNCTUATION,
                        SUPERSCRIPTS_AND_SUBSCRIPTS, CURRENCY_SYMBOLS,
                        COMBINING_MARKS_FOR_SYMBOLS, LETTERLIKE_SYMBOLS,
                        NUMBER_FORMS, ARROWS, MATHEMATICAL_OPERATORS,
                        MISCELLANEOUS_TECHNICAL, CONTROL_PICTURES,
                        OPTICAL_CHARACTER_RECOGNITION, ENCLOSED_ALPHANUMERICS,
                        BOX_DRAWING, BLOCK_ELEMENTS, GEOMETRIC_SHAPES,
                        MISCELLANEOUS_SYMBOLS, DINGBATS,
                        MISCELLANEOUS_MATHEMATICAL_SYMBOLS_A,
                        SUPPLEMENTAL_ARROWS_A, BRAILLE_PATTERNS,
                        SUPPLEMENTAL_ARROWS_B,
                        MISCELLANEOUS_MATHEMATICAL_SYMBOLS_B,
                        SUPPLEMENTAL_MATHEMATICAL_OPERATORS,
                        MISCELLANEOUS_SYMBOLS_AND_ARROWS, null,
                        CJK_RADICALS_SUPPLEMENT, KANGXI_RADICALS, null,
                        IDEOGRAPHIC_DESCRIPTION_CHARACTERS,
                        CJK_SYMBOLS_AND_PUNCTUATION, HIRAGANA, KATAKANA,
                        BOPOMOFO, HANGUL_COMPATIBILITY_JAMO, KANBUN,
                        BOPOMOFO_EXTENDED, null, KATAKANA_PHONETIC_EXTENSIONS,
                        ENCLOSED_CJK_LETTERS_AND_MONTHS, CJK_COMPATIBILITY,
                        CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A,
                        YIJING_HEXAGRAM_SYMBOLS, CJK_UNIFIED_IDEOGRAPHS,
                        YI_SYLLABLES, YI_RADICALS, null, HANGUL_SYLLABLES,
                        null, HIGH_SURROGATES, HIGH_PRIVATE_USE_SURROGATES,
                        LOW_SURROGATES, PRIVATE_USE_AREA,
                        CJK_COMPATIBILITY_IDEOGRAPHS,
                        ALPHABETIC_PRESENTATION_FORMS,
                        ARABIC_PRESENTATION_FORMS_A, VARIATION_SELECTORS, null,
                        COMBINING_HALF_MARKS, CJK_COMPATIBILITY_FORMS,
                        SMALL_FORM_VARIANTS, ARABIC_PRESENTATION_FORMS_B,
                        HALFWIDTH_AND_FULLWIDTH_FORMS, SPECIALS,
                        LINEAR_B_SYLLABARY, LINEAR_B_IDEOGRAMS, AEGEAN_NUMBERS,
                        null, OLD_ITALIC, GOTHIC, null, UGARITIC, null,
                        DESERET, SHAVIAN, OSMANYA, null, CYPRIOT_SYLLABARY,
                        null, BYZANTINE_MUSICAL_SYMBOLS, MUSICAL_SYMBOLS, null,
                        TAI_XUAN_JING_SYMBOLS, null,
                        MATHEMATICAL_ALPHANUMERIC_SYMBOLS, null,
                        CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B, null,
                        CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT, null, TAGS,
                        null, VARIATION_SELECTORS_SUPPLEMENT, null,
                        SUPPLEMENTARY_PRIVATE_USE_AREA_A,
                        SUPPLEMENTARY_PRIVATE_USE_AREA_B };
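
                /*
                 * Editorial sketch, not part of the JDK source. The two tables above
                 * are parallel arrays: entry i of blocks describes the range that
                 * begins at blockStarts[i], and null marks an unassigned range. A
                 * lookup over them presumably depends on both arrays having the same
                 * length and on the start offsets being strictly ascending. The class
                 * ParallelTableCheck and the method isConsistent below are
                 * hypothetical names, shown only to make that assumption concrete.
                 */

```java
class ParallelTableCheck {
    /**
     * Returns true if the two arrays could serve as a start-offset table and
     * its parallel value table: non-empty, equal length, strictly ascending starts.
     * Null values are allowed; they stand for unassigned ranges.
     */
    static boolean isConsistent(int[] starts, Object[] values) {
        if (starts.length == 0 || starts.length != values.length) {
            return false;
        }
        for (int i = 1; i < starts.length; i++) {
            if (starts[i] <= starts[i - 1]) {
                return false; // out of order or duplicate start offset
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] starts = { 0x0700, 0x0750, 0x0780 };
        Object[] values = { "Syriac", null, "Thaana" };
        System.out.println(isConsistent(starts, values));
    }
}
```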

                /**
                 * Returns the object representing the Unicode block containing the
                 * given character, or <code>null</code> if the character is not a
                 * member of a defined block.
                 *
                 * <p><b>Note:</b> This method cannot handle <a
                 * href="Character.html#supplementary"> supplementary
                 * characters</a>. To support all Unicode characters,
                 * including supplementary characters, use the {@link
                 * #of(int)} method.
                 *
                 * @param   c  The character in question
                 * @return  The <code>UnicodeBlock</code> instance representing the
                 *          Unicode block of which this character is a member, or
                 *          <code>null</code> if the character is not a member of any
                 *          Unicode block
                 */
                public static UnicodeBlock of(char c) {
                    return of((int) c);
                }
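
                /*
                 * Editorial sketch, not part of the JDK source. It illustrates the
                 * note above: a supplementary character occupies two char values, so
                 * of(char) only ever sees one surrogate half, while of(int) sees the
                 * whole code point. The class UnicodeBlockOfSketch and the helper
                 * blockAt are hypothetical names; the example character U+1D11E
                 * (MUSICAL SYMBOL G CLEF) is chosen here for illustration.
                 */

```java
class UnicodeBlockOfSketch {
    /** Returns the block of the code point that starts at index i of s. */
    static Character.UnicodeBlock blockAt(String s, int i) {
        return Character.UnicodeBlock.of(s.codePointAt(i));
    }

    public static void main(String[] args) {
        // U+1D11E lies outside the Basic Multilingual Plane, so the string
        // holds a surrogate pair (two char values).
        String clef = new String(Character.toChars(0x1D11E));

        // Looking at the first char alone classifies only the surrogate half.
        System.out.println(Character.UnicodeBlock.of(clef.charAt(0)));

        // Looking at the full code point classifies the character itself.
        System.out.println(blockAt(clef, 0));
    }
}
```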
1910:
1911:                /**
1912:                 * Returns the object representing the Unicode block
1913:                 * containing the given character (Unicode code point), or
1914:                 * <code>null</code> if the character is not a member of a
1915:                 * defined block.
1916:                 *
1917:                 * @param   codePoint the character (Unicode code point) in question.
1918:                 * @return  The <code>UnicodeBlock</code> instance representing the
1919:                 *          Unicode block of which this character is a member, or
1920:                 *          <code>null</code> if the character is not a member of any
1921:                 *          Unicode block
1922:                 * @exception IllegalArgumentException if the specified
1923:                 * <code>codePoint</code> is an invalid Unicode code point.
1924:                 * @see Character#isValidCodePoint(int)
1925:                 * @since   1.5
1926:                 */
1927:                public static UnicodeBlock of(int codePoint) {
1928:                    if (!isValidCodePoint(codePoint)) {
1929:                        throw new IllegalArgumentException();
1930:                    }
1931:
1932:                    int top, bottom, current;
1933:                    bottom = 0;
1934:                    top = blockStarts.length;
1935:                    current = top / 2;
1936:
1937:                    // invariant: top > current >= bottom && codePoint >= unicodeBlockStarts[bottom]
1938:                    while (top - bottom > 1) {
1939:                        if (codePoint >= blockStarts[current]) {
1940:                            bottom = current;
1941:                        } else {
1942:                            top = current;
1943:                        }
1944:                        current = (top + bottom) / 2;
1945:                    }
1946:                    return blocks[current];
1947:                }
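
The lookup above is a binary search over the sorted block start values. The following is a minimal standalone sketch of the same loop, not the JDK implementation itself. The class name `BlockLookupSketch`, the method `blockOf`, and the five-entry tables are hypothetical stand-ins for the private `blockStarts` and `blocks` arrays, which are defined outside this part of the listing.

```java
final class BlockLookupSketch {
    // Hypothetical, abbreviated tables. A null name marks a range this sketch does not cover.
    static final int[] STARTS = { 0x0000, 0x0080, 0x0100, 0x0180, 0x0250 };
    static final String[] NAMES = {
        "Basic Latin", "Latin-1 Supplement", "Latin Extended-A", "Latin Extended-B", null
    };

    // Returns the name of the range containing codePoint, or null if the sketch has no entry.
    static String blockOf(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("invalid code point: " + codePoint);
        }
        int bottom = 0;
        int top = STARTS.length;
        int current = top / 2;
        // Invariant: top > current >= bottom && codePoint >= STARTS[bottom]
        while (top - bottom > 1) {
            if (codePoint >= STARTS[current]) {
                bottom = current;   // the answer is at current or later
            } else {
                top = current;      // the answer is strictly before current
            }
            current = (top + bottom) / 2;
        }
        return NAMES[current];
    }

    public static void main(String[] args) {
        System.out.println(blockOf('A'));
        System.out.println(blockOf(0x00E9));
        System.out.println(blockOf(0x1F600));
    }
}
```

The loop narrows the half-open window `[bottom, top)` until it holds a single entry, so it ends at the last start value that is not greater than the code point.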
1948:
1949:                /**
1950:                 * Returns the UnicodeBlock with the given name. Block
1951:                 * names are determined by The Unicode Standard. The file
1952:                 * Blocks-&lt;version&gt;.txt defines blocks for a particular
1953:                 * version of the standard. The {@link Character} class specifies
1954:                 * the version of the standard that it supports.
1955:                 * <p>
1956:                 * This method accepts block names in the following forms:
1957:                 * <ol>
1958:                 * <li> Canonical block names as defined by the Unicode Standard.
1959:                 * For example, the standard defines a "Basic Latin" block. Therefore, this
1960:                 * method accepts "Basic Latin" as a valid block name. The documentation of 
1961:                 * each UnicodeBlock provides the canonical name.
1962:                 * <li>Canonical block names with all spaces removed. For example, "BasicLatin"
1963:                 * is a valid block name for the "Basic Latin" block.
1964:                 * <li>The text representation of each constant UnicodeBlock identifier.
1965:                 * For example, this method will return the {@link #BASIC_LATIN} block if
1966:                 * provided with the "BASIC_LATIN" name. This form replaces all spaces and
1967:                 * hyphens in the canonical name with underscores.
1968:                 * </ol>
1969:                 * Finally, character case is ignored for all of the valid block name forms.
1970:                 * For example, "BASIC_LATIN" and "basic_latin" are both valid block names.
1971:                 * The en_US locale's case mapping rules are used to provide case-insensitive
1972:                 * string comparisons for block name validation.
1973:                 * <p>
1974:                 * If the Unicode Standard changes block names, both the previous and
1975:                 * current names will be accepted.
1976:                 *
1977:                 * @param blockName A <code>UnicodeBlock</code> name.
1978:                 * @return The <code>UnicodeBlock</code> instance identified
1979:                 *         by <code>blockName</code>
1980:                 * @throws IllegalArgumentException if <code>blockName</code> is an
1981:                 *         invalid name
1982:                 * @throws NullPointerException if <code>blockName</code> is null
1983:                 * @since 1.5
1984:                 */
1985:                public static final UnicodeBlock forName(String blockName) {
1986:                    UnicodeBlock block = (UnicodeBlock) map.get(blockName
1987:                            .toUpperCase(Locale.US));
1988:                    if (block == null) {
1989:                        throw new IllegalArgumentException();
1990:                    }
1991:                    return block;
1992:                }
1993:            }
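
The name forms that `forName` accepts can be exercised with a short usage sketch. The class `ForNameDemo` and its helper methods are hypothetical and exist only for illustration; the calls themselves use the public `Character.UnicodeBlock` API documented above.

```java
final class ForNameDemo {
    // Returns true when the given name resolves to the BASIC_LATIN block.
    static boolean isBasicLatin(String name) {
        return Character.UnicodeBlock.forName(name) == Character.UnicodeBlock.BASIC_LATIN;
    }

    // Returns true when forName rejects the name with IllegalArgumentException.
    static boolean isRejected(String name) {
        try {
            Character.UnicodeBlock.forName(name);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] names = { "Basic Latin", "BasicLatin", "BASIC_LATIN", "basic_latin" };
        for (String name : names) {
            System.out.println(name + " -> " + isBasicLatin(name));
        }
        System.out.println("No Such Block rejected: " + isRejected("No Such Block"));
    }
}
```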
1994:
1995:            /**
1996:             * The value of the <code>Character</code>.
1997:             *
1998:             * @serial
1999:             */
2000:            private final char value;
2001:
2002:            /** use serialVersionUID from JDK 1.0.2 for interoperability */
2003:            private static final long serialVersionUID = 3786198910865385080L;
2004:
2005:            /**
2006:             * Constructs a newly allocated <code>Character</code> object that
2007:             * represents the specified <code>char</code> value.
2008:             *
2009:             * @param  value   the value to be represented by the 
2010:             *                  <code>Character</code> object.
2011:             */
2012:            public Character(char value) {
2013:                this.value = value;
2014:            }
2015:
2016:            private static class CharacterCache {
2017:                private CharacterCache() {
2018:                }
2019:
2020:                static final Character[] cache = new Character[127 + 1];
2021:
2022:                static {
2023:                    for (int i = 0; i < cache.length; i++)
2024:                        cache[i] = new Character((char) i);
2025:                }
2026:            }
2027:
2028:            /**
2029:             * Returns a <tt>Character</tt> instance representing the specified
2030:             * <tt>char</tt> value.
2031:             * If a new <tt>Character</tt> instance is not required, this method
2032:             * should generally be used in preference to the constructor
2033:             * {@link #Character(char)}, as this method is likely to yield
2034:             * significantly better space and time performance by caching
2035:             * frequently requested values.
2036:             *
2037:             * @param  c a char value.
2038:             * @return a <tt>Character</tt> instance representing <tt>c</tt>.
2039:             * @since  1.5
2040:             */
2041:            public static Character valueOf(char c) {
2042:                if (c <= 127) { // values 0..127 are always served from the cache
2043:                    return CharacterCache.cache[(int) c];
2044:                }
2045:                return new Character(c);
2046:            }
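
The effect of the cache can be observed from calling code. This is a small sketch with a hypothetical class `ValueOfDemo`. It relies only on the behaviour shown above: values up to 127 come from the cache, while larger values carry no identity guarantee and should be compared with `equals`.

```java
final class ValueOfDemo {
    // True when two valueOf calls for the same char return the same object.
    static boolean sameInstance(char c) {
        return Character.valueOf(c) == Character.valueOf(c);
    }

    // True when two valueOf calls for the same char return equal objects.
    static boolean equalValue(char c) {
        return Character.valueOf(c).equals(Character.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println("cached 'a': " + sameInstance('a'));
        // Above 127 only equality is guaranteed, so compare with equals().
        System.out.println("equal 0x00E9: " + equalValue((char) 0x00E9));
    }
}
```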
2047:
2048:            /**
2049:             * Returns the value of this <code>Character</code> object.
2050:             * @return  the primitive <code>char</code> value represented by
2051:             *          this object.
2052:             */
2053:            public char charValue() {
2054:                return value;
2055:            }
2056:
2057:            /**
2058:             * Returns a hash code for this <code>Character</code>.
2059:             * @return  a hash code value for this object.
2060:             */
2061:            public int hashCode() {
2062:                return (int) value;
2063:            }
2064:
2065:            /**
2066:             * Compares this object against the specified object.
2067:             * The result is <code>true</code> if and only if the argument is not
2068:             * <code>null</code> and is a <code>Character</code> object that
2069:             * represents the same <code>char</code> value as this object.
2070:             *
2071:             * @param   obj   the object to compare with.
2072:             * @return  <code>true</code> if the objects represent the same
2073:             *          <code>char</code> value; <code>false</code> otherwise.
2074:             */
2075:            public boolean equals(Object obj) {
2076:                if (obj instanceof Character) {
2077:                    return value == ((Character) obj).charValue();
2078:                }
2079:                return false;
2080:            }
2081:
2082:            /**
2083:             * Returns a <code>String</code> object representing this
2084:             * <code>Character</code>'s value.  The result is a string of
2085:             * length 1 whose sole component is the primitive
2086:             * <code>char</code> value represented by this
2087:             * <code>Character</code> object.
2088:             *
2089:             * @return  a string representation of this object.
2090:             */
2091:            public String toString() {
2092:                char[] buf = { value };
2093:                return String.valueOf(buf);
2094:            }
2095:
2096:            /**
2097:             * Returns a <code>String</code> object representing the
2098:             * specified <code>char</code>.  The result is a string of length
2099:             * 1 consisting solely of the specified <code>char</code>.
2100:             *
2101:             * @param c the <code>char</code> to be converted
2102:             * @return the string representation of the specified <code>char</code>
2103:             * @since 1.4
2104:             */
2105:            public static String toString(char c) {
2106:                return String.valueOf(c);
2107:            }
2108:
2109:            /**
2110:             * Determines whether the specified code point is a valid Unicode
2111:             * code point value in the range of <code>0x0000</code> to
2112:             * <code>0x10FFFF</code> inclusive. This method is equivalent to
2113:             * the expression:
2114:             *
2115:             * <blockquote><pre>
2116:             * codePoint >= 0x0000 && codePoint <= 0x10FFFF
2117:             * </pre></blockquote>
2118:             *
2119:             * @param  codePoint the Unicode code point to be tested
2120:             * @return <code>true</code> if the specified code point value
2121:             * is a valid code point value;
2122:             * <code>false</code> otherwise.
2123:             * @since  1.5
2124:             */
2125:            public static boolean isValidCodePoint(int codePoint) {
2126:                return codePoint >= MIN_CODE_POINT
2127:                        && codePoint <= MAX_CODE_POINT;
2128:            }
2129:
2130:            /**
2131:             * Determines whether the specified character (Unicode code point)
2132:             * is in the supplementary character range. The method call is
2133:             * equivalent to the expression:
2134:             * <blockquote><pre>
2135:             * codePoint >= 0x10000 && codePoint <= 0x10FFFF
2136:             * </pre></blockquote>
2137:             *
2138:             * @param  codePoint the character (Unicode code point) to be tested
2139:             * @return <code>true</code> if the specified character is in the Unicode
2140:             *         supplementary character range; <code>false</code> otherwise.
2141:             * @since  1.5
2142:             */
2143:            public static boolean isSupplementaryCodePoint(int codePoint) {
2144:                return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
2145:                        && codePoint <= MAX_CODE_POINT;
2146:            }
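
The two range checks above split the `int` range into three groups. A minimal sketch of that classification follows; the class `CodePointKind` and its labels are hypothetical.

```java
final class CodePointKind {
    // Labels an int as "invalid", "BMP", or "supplementary".
    static String classify(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "invalid";
        }
        return Character.isSupplementaryCodePoint(codePoint) ? "supplementary" : "BMP";
    }

    public static void main(String[] args) {
        int[] samples = { -1, 0x41, 0xFFFF, 0x10000, 0x10FFFF, 0x110000 };
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + " -> " + classify(cp));
        }
    }
}
```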
2147:
2148:            /**
2149:             * Determines if the given <code>char</code> value is a
2150:             * high-surrogate code unit (also known as <i>leading-surrogate
2151:             * code unit</i>). Such values do not represent characters by
2152:             * themselves, but are used in the representation of <a
2153:             * href="#supplementary">supplementary characters</a> in the
2154:             * UTF-16 encoding.
2155:             *
2156:             * <p>This method returns <code>true</code> if and only if
2157:             * <blockquote><pre>ch >= '&#92;uD800' && ch <= '&#92;uDBFF'
2158:             * </pre></blockquote>
2159:             * is <code>true</code>.
2160:             *
2161:             * @param   ch   the <code>char</code> value to be tested.
2162:             * @return  <code>true</code> if the <code>char</code> value
2163:             *          is between '&#92;uD800' and '&#92;uDBFF' inclusive;
2164:             *          <code>false</code> otherwise.
2165:             * @see     java.lang.Character#isLowSurrogate(char)
2166:             * @see     Character.UnicodeBlock#of(int)
2167:             * @since   1.5
2168:             */
2169:            public static boolean isHighSurrogate(char ch) {
2170:                return ch >= MIN_HIGH_SURROGATE && ch <= MAX_HIGH_SURROGATE;
2171:            }
2172:
2173:            /**
2174:             * Determines if the given <code>char</code> value is a
2175:             * low-surrogate code unit (also known as <i>trailing-surrogate code
2176:             * unit</i>). Such values do not represent characters by themselves,
2177:             * but are used in the representation of <a
2178:             * href="#supplementary">supplementary characters</a> in the UTF-16 encoding.
2179:             *
2180:             * <p> This method returns <code>true</code> if and only if
2181:             * <blockquote><pre>ch >= '&#92;uDC00' && ch <= '&#92;uDFFF'
2182:             * </pre></blockquote> is <code>true</code>.
2183:             *
2184:             * @param   ch   the <code>char</code> value to be tested.
2185:             * @return  <code>true</code> if the <code>char</code> value
2186:             *          is between '&#92;uDC00' and '&#92;uDFFF' inclusive;
2187:             *          <code>false</code> otherwise.
2188:             * @see java.lang.Character#isHighSurrogate(char)
2189:             * @since   1.5
2190:             */
2191:            public static boolean isLowSurrogate(char ch) {
2192:                return ch >= MIN_LOW_SURROGATE && ch <= MAX_LOW_SURROGATE;
2193:            }
2194:
2195:            /**
2196:             * Determines whether the specified pair of <code>char</code>
2197:             * values is a valid surrogate pair. This method is equivalent to
2198:             * the expression:
2199:             * <blockquote><pre>
2200:             * isHighSurrogate(high) && isLowSurrogate(low)
2201:             * </pre></blockquote>
2202:             *
2203:             * @param  high the high-surrogate code value to be tested
2204:             * @param  low the low-surrogate code value to be tested
2205:             * @return <code>true</code> if the specified high and
2206:             * low-surrogate code values represent a valid surrogate pair;
2207:             * <code>false</code> otherwise.
2208:             * @since  1.5
2209:             */
2210:            public static boolean isSurrogatePair(char high, char low) {
2211:                return isHighSurrogate(high) && isLowSurrogate(low);
2212:            }
2213:
2214:            /**
2215:             * Determines the number of <code>char</code> values needed to
2216:             * represent the specified character (Unicode code point). If the
2217:             * specified character is equal to or greater than 0x10000, then
2218:             * the method returns 2. Otherwise, the method returns 1.
2219:             *
2220:             * <p>This method does not check that the specified character is a
2221:             * valid Unicode code point. The caller must validate the
2222:             * character value using {@link #isValidCodePoint(int) isValidCodePoint}
2223:             * if necessary.
2224:             *
2225:             * @param   codePoint the character (Unicode code point) to be tested.
2226:             * @return  2 if the character is a valid supplementary character; 1 otherwise.
2227:             * @see     #isSupplementaryCodePoint(int)
2228:             * @since   1.5
2229:             */
2230:            public static int charCount(int codePoint) {
2231:                return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT ? 2 : 1;
2232:            }
2233:
2234:            /**
2235:             * Converts the specified surrogate pair to its supplementary code
2236:             * point value. This method does not validate the specified
2237:             * surrogate pair. The caller must validate it using {@link
2238:             * #isSurrogatePair(char, char) isSurrogatePair} if necessary.
2239:             *
2240:             * @param  high the high-surrogate code unit
2241:             * @param  low the low-surrogate code unit
2242:             * @return the supplementary code point composed from the
2243:             *         specified surrogate pair.
2244:             * @since  1.5
2245:             */
2246:            public static int toCodePoint(char high, char low) {
2247:                return ((high - MIN_HIGH_SURROGATE) << 10)
2248:                        + (low - MIN_LOW_SURROGATE)
2249:                        + MIN_SUPPLEMENTARY_CODE_POINT;
2250:            }
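
The arithmetic in `toCodePoint` can be checked by hand. The sketch below repeats the formula with the constant values written out, assuming the values stated in the Javadoc above (0xD800, 0xDC00, 0x10000); the class `SurrogateMath` and the method `combine` are hypothetical names.

```java
final class SurrogateMath {
    // Same formula as toCodePoint, with the constants spelled out.
    static int combine(char high, char low) {
        int highBits = (high - 0xD800) << 10; // upper 10 bits of the offset
        int lowBits = low - 0xDC00;           // lower 10 bits of the offset
        return highBits + lowBits + 0x10000;  // shift the offset past the BMP
    }

    public static void main(String[] args) {
        char high = (char) 0xD83D;
        char low = (char) 0xDE00;
        // 0x3D << 10 = 0xF400, plus 0x200 = 0xF600, plus 0x10000 = 0x1F600
        System.out.println(Integer.toHexString(combine(high, low)));
    }
}
```

Like `toCodePoint`, the sketch does not validate its arguments; passing values outside the surrogate ranges gives a meaningless result.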
2251:
2252:            /**
2253:             * Returns the code point at the given index of the
2254:             * <code>CharSequence</code>. If the <code>char</code> value at
2255:             * the given index in the <code>CharSequence</code> is in the
2256:             * high-surrogate range, the following index is less than the
2257:             * length of the <code>CharSequence</code>, and the
2258:             * <code>char</code> value at the following index is in the
2259:             * low-surrogate range, then the supplementary code point
2260:             * corresponding to this surrogate pair is returned. Otherwise,
2261:             * the <code>char</code> value at the given index is returned.
2262:             *
2263:             * @param seq a sequence of <code>char</code> values (Unicode code
2264:             * units)
2265:             * @param index the index to the <code>char</code> values (Unicode
2266:             * code units) in <code>seq</code> to be converted
2267:             * @return the Unicode code point at the given index
2268:             * @exception NullPointerException if <code>seq</code> is null.
2269:             * @exception IndexOutOfBoundsException if the value
2270:             * <code>index</code> is negative or not less than
2271:             * {@link CharSequence#length() seq.length()}.
2272:             * @since  1.5
2273:             */
2274:            public static int codePointAt(CharSequence seq, int index) {
2275:                char c1 = seq.charAt(index++);
2276:                if (isHighSurrogate(c1)) {
2277:                    if (index < seq.length()) {
2278:                        char c2 = seq.charAt(index);
2279:                        if (isLowSurrogate(c2)) {
2280:                            return toCodePoint(c1, c2);
2281:                        }
2282:                    }
2283:                }
2284:                return c1;
2285:            }
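
A common use of `codePointAt` is walking a sequence one code point at a time, advancing by `charCount` after each read. A minimal sketch, with a hypothetical class `CodePointWalk`:

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointWalk {
    // Collects the code points of seq in order; unpaired surrogates are kept as-is.
    static List<Integer> codePoints(CharSequence seq) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < seq.length()) {
            int cp = Character.codePointAt(seq, i);
            out.add(cp);
            i += Character.charCount(cp); // 2 for a surrogate pair, 1 otherwise
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "a" + new String(Character.toChars(0x1F600)) + "b";
        for (int cp : codePoints(text)) {
            System.out.println(Integer.toHexString(cp));
        }
    }
}
```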
2286:
2287:            /**
2288:             * Returns the code point at the given index of the
2289:             * <code>char</code> array. If the <code>char</code> value at
2290:             * the given index in the <code>char</code> array is in the
2291:             * high-surrogate range, the following index is less than the
2292:             * length of the <code>char</code> array, and the
2293:             * <code>char</code> value at the following index is in the
2294:             * low-surrogate range, then the supplementary code point
2295:             * corresponding to this surrogate pair is returned. Otherwise,
2296:             * the <code>char</code> value at the given index is returned.
2297:             *
2298:             * @param a the <code>char</code> array
2299:             * @param index the index to the <code>char</code> values (Unicode
2300:             * code units) in the <code>char</code> array to be converted
2301:             * @return the Unicode code point at the given index
2302:             * @exception NullPointerException if <code>a</code> is null.
2303:             * @exception IndexOutOfBoundsException if the value
2304:             * <code>index</code> is negative or not less than
2305:             * the length of the <code>char</code> array.
2306:             * @since  1.5
2307:             */
2308:            public static int codePointAt(char[] a, int index) {
2309:                return codePointAtImpl(a, index, a.length);
2310:            }
2311:
2312:            /**
2313:             * Returns the code point at the given index of the
2314:             * <code>char</code> array, where only array elements with
2315:             * <code>index</code> less than <code>limit</code> can be used. If
2316:             * the <code>char</code> value at the given index in the
2317:             * <code>char</code> array is in the high-surrogate range, the
2318:             * following index is less than the <code>limit</code>, and the
2319:             * <code>char</code> value at the following index is in the
2320:             * low-surrogate range, then the supplementary code point
2321:             * corresponding to this surrogate pair is returned. Otherwise,
2322:             * the <code>char</code> value at the given index is returned.
2323:             *
2324:             * @param a the <code>char</code> array
2325:             * @param index the index to the <code>char</code> values (Unicode
2326:             * code units) in the <code>char</code> array to be converted
2327:             * @param limit the index after the last array element that can be used in the
2328:             * <code>char</code> array
2329:             * @return the Unicode code point at the given index
2330:             * @exception NullPointerException if <code>a</code> is null.
2331:             * @exception IndexOutOfBoundsException if the <code>index</code>
2332:             * argument is negative or not less than the <code>limit</code>
2333:             * argument, or if the <code>limit</code> argument is negative or
2334:             * greater than the length of the <code>char</code> array.
2335:             * @since  1.5
2336:             */
2337:            public static int codePointAt(char[] a, int index, int limit) {
2338:                if (index >= limit || limit < 0 || limit > a.length) {
2339:                    throw new IndexOutOfBoundsException();
2340:                }
2341:                return codePointAtImpl(a, index, limit);
2342:            }
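
The `limit` argument matters when a surrogate pair straddles the end of the usable region. The sketch below, with a hypothetical class `LimitDemo`, reads the same two-element array with two different limits.

```java
final class LimitDemo {
    // One supplementary character stored as a surrogate pair.
    static final char[] PAIR = { (char) 0xD83D, (char) 0xDE00 };

    // Reads the code point at index 0, using only elements below limit.
    static int withLimit(int limit) {
        return Character.codePointAt(PAIR, 0, limit);
    }

    public static void main(String[] args) {
        // With limit 2 the low surrogate is visible, so the pair is combined.
        System.out.println(Integer.toHexString(withLimit(2)));
        // With limit 1 the low surrogate is out of reach, so the high surrogate is returned alone.
        System.out.println(Integer.toHexString(withLimit(1)));
    }
}
```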
2343:
2344:            static int codePointAtImpl(char[] a, int index, int limit) {
2345:                char c1 = a[index++];
2346:                if (isHighSurrogate(c1)) {
2347:                    if (index < limit) {
2348:                        char c2 = a[index];
2349:                        if (isLowSurrogate(c2)) {
2350:                            return toCodePoint(c1, c2);
2351:                        }
2352:                    }
2353:                }
2354:                return c1;
2355:            }
2356:
2357:            /**
2358:             * Returns the code point preceding the given index of the
2359:             * <code>CharSequence</code>. If the <code>char</code> value at
2360:             * <code>(index - 1)</code> in the <code>CharSequence</code> is in
2361:             * the low-surrogate range, <code>(index - 2)</code> is not
2362:             * negative, and the <code>char</code> value at <code>(index -
2363:             * 2)</code> in the <code>CharSequence</code> is in the
2364:             * high-surrogate range, then the supplementary code point
2365:             * corresponding to this surrogate pair is returned. Otherwise,
2366:             * the <code>char</code> value at <code>(index - 1)</code> is
2367:             * returned.
2368:             *
2369:             * @param seq the <code>CharSequence</code> instance
2370:             * @param index the index following the code point that should be returned
2371:             * @return the Unicode code point value before the given index.
2372:             * @exception NullPointerException if <code>seq</code> is null.
2373:             * @exception IndexOutOfBoundsException if the <code>index</code>
2374:             * argument is less than 1 or greater than {@link
2375:             * CharSequence#length() seq.length()}.
2376:             * @since  1.5
2377:             */
2378:            public static int codePointBefore(CharSequence seq, int index) {
2379:                char c2 = seq.charAt(--index);
2380:                if (isLowSurrogate(c2)) {
2381:                    if (index > 0) {
2382:                        char c1 = seq.charAt(--index);
2383:                        if (isHighSurrogate(c1)) {
2384:                            return toCodePoint(c1, c2);
2385:                        }
2386:                    }
2387:                }
2388:                return c2;
2389:            }
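
`codePointBefore` supports the mirror-image traversal, from the end of a sequence toward the start. A minimal sketch, with a hypothetical class `ReverseWalk`:

```java
import java.util.ArrayList;
import java.util.List;

final class ReverseWalk {
    // Collects the code points of seq from last to first.
    static List<Integer> reversed(CharSequence seq) {
        List<Integer> out = new ArrayList<>();
        int i = seq.length();
        while (i > 0) {
            int cp = Character.codePointBefore(seq, i);
            out.add(cp);
            i -= Character.charCount(cp); // step back over one or two chars
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "a" + new String(Character.toChars(0x1F600)) + "b";
        for (int cp : reversed(text)) {
            System.out.println(Integer.toHexString(cp));
        }
    }
}
```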
2390:
2391:            /**
2392:             * Returns the code point preceding the given index of the
2393:             * <code>char</code> array. If the <code>char</code> value at
2394:             * <code>(index - 1)</code> in the <code>char</code> array is in
2395:             * the low-surrogate range, <code>(index - 2)</code> is not
2396:             * negative, and the <code>char</code> value at <code>(index -
2397:             * 2)</code> in the <code>char</code> array is in the
2398:             * high-surrogate range, then the supplementary code point
2399:             * corresponding to this surrogate pair is returned. Otherwise,
2400:             * the <code>char</code> value at <code>(index - 1)</code> is
2401:             * returned.
2402:             *
2403:             * @param a the <code>char</code> array
2404:             * @param index the index following the code point that should be returned
2405:             * @return the Unicode code point value before the given index.
2406:             * @exception NullPointerException if <code>a</code> is null.
2407:             * @exception IndexOutOfBoundsException if the <code>index</code>
2408:             * argument is less than 1 or greater than the length of the
2409:             * <code>char</code> array
2410:             * @since  1.5
2411:             */
2412:            public static int codePointBefore(char[] a, int index) {
2413:                return codePointBeforeImpl(a, index, 0);
2414:            }
2415:
2416:            /**
2417:             * Returns the code point preceding the given index of the
2418:             * <code>char</code> array, where only array elements with
2419:             * <code>index</code> greater than or equal to <code>start</code>
2420:             * can be used. If the <code>char</code> value at <code>(index -
2421:             * 1)</code> in the <code>char</code> array is in the
2422:             * low-surrogate range, <code>(index - 2)</code> is not less than
2423:             * <code>start</code>, and the <code>char</code> value at
2424:             * <code>(index - 2)</code> in the <code>char</code> array is in
2425:             * the high-surrogate range, then the supplementary code point
2426:             * corresponding to this surrogate pair is returned. Otherwise,
2427:             * the <code>char</code> value at <code>(index - 1)</code> is
2428:             * returned.
2429:             *
2430:             * @param a the <code>char</code> array
2431:             * @param index the index following the code point that should be returned
2432:             * @param start the index of the first array element in the
2433:             * <code>char</code> array
2434:             * @return the Unicode code point value before the given index.
2435:             * @exception NullPointerException if <code>a</code> is null.
2436:             * @exception IndexOutOfBoundsException if the <code>index</code>
2437:             * argument is not greater than the <code>start</code> argument or
2438:             * is greater than the length of the <code>char</code> array, or
2439:             * if the <code>start</code> argument is negative or not less than
2440:             * the length of the <code>char</code> array.
2441:             * @since  1.5
2442:             */
2443:            public static int codePointBefore(char[] a, int index, int start) {
2444:                if (index <= start || start < 0 || start >= a.length) {
2445:                    throw new IndexOutOfBoundsException();
2446:                }
2447:                return codePointBeforeImpl(a, index, start);
2448:            }
2449:
2450:            static int codePointBeforeImpl(char[] a, int index, int start) {
2451:                char c2 = a[--index];
2452:                if (isLowSurrogate(c2)) {
2453:                    if (index > start) {
2454:                        char c1 = a[--index];
2455:                        if (isHighSurrogate(c1)) {
2456:                            return toCodePoint(c1, c2);
2457:                        }
2458:                    }
2459:                }
2460:                return c2;
2461:            }
2462:
2463:            /**
2464:             * Converts the specified character (Unicode code point) to its
2465:             * UTF-16 representation. If the specified code point is a BMP
2466:             * (Basic Multilingual Plane or Plane 0) value, the same value is
2467:             * stored in <code>dst[dstIndex]</code>, and 1 is returned. If the
2468:             * specified code point is a supplementary character, its
2469:             * surrogate values are stored in <code>dst[dstIndex]</code>
2470:             * (high-surrogate) and <code>dst[dstIndex+1]</code>
2471:             * (low-surrogate), and 2 is returned.
2472:             *
2473:             * @param  codePoint the character (Unicode code point) to be converted.
2474:             * @param  dst an array of <code>char</code> in which the
2475:             * <code>codePoint</code>'s UTF-16 value is stored.
2476:             * @param dstIndex the start index into the <code>dst</code>
2477:             * array where the converted value is stored.
2478:             * @return 1 if the code point is a BMP code point, 2 if the
2479:             * code point is a supplementary code point.
2480:             * @exception IllegalArgumentException if the specified
2481:             * <code>codePoint</code> is not a valid Unicode code point.
2482:             * @exception NullPointerException if the specified <code>dst</code> is null.
2483:             * @exception IndexOutOfBoundsException if <code>dstIndex</code>
2484:             * is negative or not less than <code>dst.length</code>, or if
2485:             * <code>dst</code> at <code>dstIndex</code> doesn't have enough
2486:             * array element(s) to store the resulting <code>char</code>
2487:             * value(s). (If <code>dstIndex</code> is equal to
2488:             * <code>dst.length-1</code> and the specified
2489:             * <code>codePoint</code> is a supplementary character, the
2490:             * high-surrogate value is not stored in
2491:             * <code>dst[dstIndex]</code>.)
2492:             * @since  1.5
2493:             */
2494:            public static int toChars(int codePoint, char[] dst, int dstIndex) {
2495:                if (codePoint < 0 || codePoint > MAX_CODE_POINT) {
2496:                    throw new IllegalArgumentException();
2497:                }
2498:                if (codePoint < MIN_SUPPLEMENTARY_CODE_POINT) {
2499:                    dst[dstIndex] = (char) codePoint;
2500:                    return 1;
2501:                }
2502:                toSurrogates(codePoint, dst, dstIndex);
2503:                return 2;
2504:            }
2505:
2506:            /**
2507:             * Converts the specified character (Unicode code point) to its
2508:             * UTF-16 representation stored in a <code>char</code> array. If
2509:             * the specified code point is a BMP (Basic Multilingual Plane or
2510:             * Plane 0) value, the resulting <code>char</code> array has
2511:             * the same value as <code>codePoint</code>. If the specified code
2512:             * point is a supplementary code point, the resulting
2513:             * <code>char</code> array has the corresponding surrogate pair.
2514:             *
2515:             * @param  codePoint a Unicode code point
2516:             * @return a <code>char</code> array having
2517:             *         <code>codePoint</code>'s UTF-16 representation.
2518:             * @exception IllegalArgumentException if the specified
2519:             * <code>codePoint</code> is not a valid Unicode code point.
2520:             * @since  1.5
2521:             */
2522:            public static char[] toChars(int codePoint) {
2523:                if (codePoint < 0 || codePoint > MAX_CODE_POINT) {
2524:                    throw new IllegalArgumentException();
2525:                }
2526:                if (codePoint < MIN_SUPPLEMENTARY_CODE_POINT) {
2527:                    return new char[] { (char) codePoint };
2528:                }
2529:                char[] result = new char[2];
2530:                toSurrogates(codePoint, result, 0);
2531:                return result;
2532:            }
2533:
2534:            static void toSurrogates(int codePoint, char[] dst, int index) {
2535:                int offset = codePoint - MIN_SUPPLEMENTARY_CODE_POINT;
2536:                dst[index + 1] = (char) ((offset & 0x3ff) + MIN_LOW_SURROGATE);
2537:                dst[index] = (char) ((offset >>> 10) + MIN_HIGH_SURROGATE);
2538:            }
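The surrogate arithmetic in `toSurrogates` can be checked by hand. The sketch below is an illustration only: the class `SurrogateSketch` and its `split` method are hypothetical names, and the constants are written out as the values that `MIN_SUPPLEMENTARY_CODE_POINT`, `MIN_HIGH_SURROGATE`, and `MIN_LOW_SURROGATE` are assumed to hold (0x10000, 0xD800, 0xDC00).

```java
import java.util.Arrays;

public class SurrogateSketch {
    // Splits a supplementary code point into {high, low} surrogates using the
    // same arithmetic as toSurrogates. The caller must pass a value in the
    // range 0x10000..0x10FFFF; this sketch does not validate its argument.
    static char[] split(int codePoint) {
        int offset = codePoint - 0x10000;               // assumed MIN_SUPPLEMENTARY_CODE_POINT
        char high = (char) ((offset >>> 10) + 0xD800);  // top 10 bits, assumed MIN_HIGH_SURROGATE
        char low = (char) ((offset & 0x3ff) + 0xDC00);  // low 10 bits, assumed MIN_LOW_SURROGATE
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int cp = 0x1F600;
        char[] pair = split(cp);
        // Compare the hand-computed pair with the public API.
        System.out.println(Arrays.equals(pair, Character.toChars(cp))); // true
        // A BMP code point needs only one char.
        System.out.println(Character.toChars('A').length);              // 1
    }
}
```

For U+1F600 the offset is 0xF600, so the high surrogate is 0xD83D and the low surrogate is 0xDE00.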
2539:
2540:            /**
2541:             * Returns the number of Unicode code points in the text range of
2542:             * the specified char sequence. The text range begins at the
2543:             * specified <code>beginIndex</code> and extends to the
2544:             * <code>char</code> at index <code>endIndex - 1</code>. Thus the
2545:             * length (in <code>char</code>s) of the text range is
2546:             * <code>endIndex-beginIndex</code>. Unpaired surrogates within
2547:             * the text range count as one code point each.
2548:             *
2549:             * @param seq the char sequence
2550:             * @param beginIndex the index to the first <code>char</code> of
2551:             * the text range.
2552:             * @param endIndex the index after the last <code>char</code> of
2553:             * the text range.
2554:             * @return the number of Unicode code points in the specified text
2555:             * range
2556:             * @exception NullPointerException if <code>seq</code> is null.
2557:             * @exception IndexOutOfBoundsException if the
2558:             * <code>beginIndex</code> is negative, or <code>endIndex</code>
2559:             * is larger than the length of the given sequence, or
2560:             * <code>beginIndex</code> is larger than <code>endIndex</code>.
2561:             * @since  1.5
2562:             */
2563:            public static int codePointCount(CharSequence seq, int beginIndex,
2564:                    int endIndex) {
2565:                int length = seq.length();
2566:                if (beginIndex < 0 || endIndex > length
2567:                        || beginIndex > endIndex) {
2568:                    throw new IndexOutOfBoundsException();
2569:                }
2570:                int n = 0;
2571:                for (int i = beginIndex; i < endIndex;) {
2572:                    n++;
2573:                    if (isHighSurrogate(seq.charAt(i++))) {
2574:                        if (i < endIndex && isLowSurrogate(seq.charAt(i))) {
2575:                            i++;
2576:                        }
2577:                    }
2578:                }
2579:                return n;
2580:            }
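A short usage sketch may make the counting rule concrete, in particular the treatment of unpaired surrogates. The class name `CodePointCountSketch` and the helper `count` are hypothetical; only `Character.codePointCount` comes from the class above.

```java
public class CodePointCountSketch {
    // Counts the code points in a whole sequence (hypothetical convenience wrapper).
    static int count(CharSequence seq) {
        return Character.codePointCount(seq, 0, seq.length());
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 encoded as a surrogate pair, then 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());  // 4 chars
        System.out.println(count(s));    // 3 code points
        // A range that cuts the pair in half sees one lone high surrogate.
        System.out.println(Character.codePointCount(s, 1, 2)); // 1
        // A low surrogate followed by a high surrogate does not form a pair.
        System.out.println(count("\uDE00\uD83D"));             // 2
    }
}
```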
2581:
2582:            /**
2583:             * Returns the number of Unicode code points in a subarray of the
2584:             * <code>char</code> array argument. The <code>offset</code>
2585:             * argument is the index of the first <code>char</code> of the
2586:             * subarray and the <code>count</code> argument specifies the
2587:             * length of the subarray in <code>char</code>s. Unpaired
2588:             * surrogates within the subarray count as one code point each.
2589:             *
2590:             * @param a the <code>char</code> array
2591:             * @param offset the index of the first <code>char</code> in the
2592:             * given <code>char</code> array
2593:             * @param count the length of the subarray in <code>char</code>s
2594:             * @return the number of Unicode code points in the specified subarray
2595:             * @exception NullPointerException if <code>a</code> is null.
2596:             * @exception IndexOutOfBoundsException if <code>offset</code> or
2597:             * <code>count</code> is negative, or if <code>offset +
2598:             * count</code> is larger than the length of the given array.
2599:             * @since  1.5
2600:             */
2601:            public static int codePointCount(char[] a, int offset, int count) {
2602:                if (count > a.length - offset || offset < 0 || count < 0) {
2603:                    throw new IndexOutOfBoundsException();
2604:                }
2605:                return codePointCountImpl(a, offset, count);
2606:            }
2607:
2608:            static int codePointCountImpl(char[] a, int offset, int count) {
2609:                int endIndex = offset + count;
2610:                int n = 0;
2611:                for (int i = offset; i < endIndex;) {
2612:                    n++;
2613:                    if (isHighSurrogate(a[i++])) {
2614:                        if (i < endIndex && isLowSurrogate(a[i])) {
2615:                            i++;
2616:                        }
2617:                    }
2618:                }
2619:                return n;
2620:            }
2621:
2622:            /**
2623:             * Returns the index within the given char sequence that is offset
2624:             * from the given <code>index</code> by <code>codePointOffset</code>
2625:             * code points. Unpaired surrogates within the text range given by
2626:             * <code>index</code> and <code>codePointOffset</code> count as
2627:             * one code point each.
2628:             *
2629:             * @param seq the char sequence
2630:             * @param index the index to be offset
2631:             * @param codePointOffset the offset in code points
2632:             * @return the index within the char sequence
2633:             * @exception NullPointerException if <code>seq</code> is null.
2634:             * @exception IndexOutOfBoundsException if <code>index</code>
2635:             *   is negative or larger than the length of the char sequence,
2636:             *   or if <code>codePointOffset</code> is positive and the
2637:             *   subsequence starting with <code>index</code> has fewer than
2638:             *   <code>codePointOffset</code> code points, or if
2639:             *   <code>codePointOffset</code> is negative and the subsequence
2640:             *   before <code>index</code> has fewer than the absolute value
2641:             *   of <code>codePointOffset</code> code points.
2642:             * @since 1.5
2643:             */
2644:            public static int offsetByCodePoints(CharSequence seq, int index,
2645:                    int codePointOffset) {
2646:                int length = seq.length();
2647:                if (index < 0 || index > length) {
2648:                    throw new IndexOutOfBoundsException();
2649:                }
2650:
2651:                int x = index;
2652:                if (codePointOffset >= 0) {
2653:                    int i;
2654:                    for (i = 0; x < length && i < codePointOffset; i++) {
2655:                        if (isHighSurrogate(seq.charAt(x++))) {
2656:                            if (x < length && isLowSurrogate(seq.charAt(x))) {
2657:                                x++;
2658:                            }
2659:                        }
2660:                    }
2661:                    if (i < codePointOffset) {
2662:                        throw new IndexOutOfBoundsException();
2663:                    }
2664:                } else {
2665:                    int i;
2666:                    for (i = codePointOffset; x > 0 && i < 0; i++) {
2667:                        if (isLowSurrogate(seq.charAt(--x))) {
2668:                            if (x > 0 && isHighSurrogate(seq.charAt(x - 1))) {
2669:                                x--;
2670:                            }
2671:                        }
2672:                    }
2673:                    if (i < 0) {
2674:                        throw new IndexOutOfBoundsException();
2675:                    }
2676:                }
2677:                return x;
2678:            }
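One common use of `offsetByCodePoints` is to translate a position counted in code points into a `char` index. The following is a minimal sketch of that idea, not part of the class above; `OffsetSketch` and `codePointAtPosition` are hypothetical names.

```java
public class OffsetSketch {
    // Returns the code point at the n-th code point position (0-based) of seq.
    // Throws IndexOutOfBoundsException if seq has n or fewer code points.
    static int codePointAtPosition(CharSequence seq, int n) {
        int charIndex = Character.offsetByCodePoints(seq, 0, n);
        return Character.codePointAt(seq, charIndex);
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 encoded as a surrogate pair, then 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(Integer.toHexString(codePointAtPosition(s, 1))); // 1f600
        System.out.println((char) codePointAtPosition(s, 2));               // b
        // A negative offset walks backward: two code points before the end
        // is the start of the surrogate pair.
        System.out.println(Character.offsetByCodePoints(s, s.length(), -2)); // 1
    }
}
```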
2679:
2680:            /**
2681:             * Returns the index within the given <code>char</code> subarray
2682:             * that is offset from the given <code>index</code> by
2683:             * <code>codePointOffset</code> code points. The
2684:             * <code>start</code> and <code>count</code> arguments specify a
2685:             * subarray of the <code>char</code> array. Unpaired surrogates
2686:             * within the text range given by <code>index</code> and
2687:             * <code>codePointOffset</code> count as one code point each.
2688:             *
2689:             * @param a the <code>char</code> array
2690:             * @param start the index of the first <code>char</code> of the
2691:             * subarray
2692:             * @param count the length of the subarray in <code>char</code>s
2693:             * @param index the index to be offset
2694:             * @param codePointOffset the offset in code points
2695:             * @return the index within the subarray
2696:             * @exception NullPointerException if <code>a</code> is null.
2697:             * @exception IndexOutOfBoundsException 
2698:             *   if <code>start</code> or <code>count</code> is negative,
2699:             *   or if <code>start + count</code> is larger than the length of
2700:             *   the given array,
2701:             *   or if <code>index</code> is less than <code>start</code> or
2702:             *   larger than <code>start + count</code>,
2703:             *   or if <code>codePointOffset</code> is positive and the text range
2704:             *   starting with <code>index</code> and ending with <code>start
2705:             *   + count - 1</code> has fewer than <code>codePointOffset</code> code
2706:             *   points,
2707:             *   or if <code>codePointOffset</code> is negative and the text range
2708:             *   starting with <code>start</code> and ending with <code>index
2709:             *   - 1</code> has fewer than the absolute value of
2710:             *   <code>codePointOffset</code> code points.
2711:             * @since 1.5
2712:             */
2713:            public static int offsetByCodePoints(char[] a, int start,
2714:                    int count, int index, int codePointOffset) {
2715:                if (count > a.length - start || start < 0 || count < 0
2716:                        || index < start || index > start + count) {
2717:                    throw new IndexOutOfBoundsException();
2718:                }
2719:                return offsetByCodePointsImpl(a, start, count, index,
2720:                        codePointOffset);
2721:            }
2722:
2723:            static int offsetByCodePointsImpl(char[] a, int start, int count,
2724:                    int index, int codePointOffset) {
2725:                int x = index;
2726:                if (codePointOffset >= 0) {
2727:                    int limit = start + count;
2728:                    int i;
2729:                    for (i = 0; x < limit && i < codePointOffset; i++) {
2730:                        if (isHighSurrogate(a[x++])) {
2731:                            if (x < limit && isLowSurrogate(a[x])) {
2732:                                x++;
2733:                            }
2734:                        }
2735:                    }
2736:                    if (i < codePointOffset) {
2737:                        throw new IndexOutOfBoundsException();
2738:                    }
2739:                } else {
2740:                    int i;
2741:                    for (i = codePointOffset; x > start && i < 0; i++) {
2742:                        if (isLowSurrogate(a[--x])) {
2743:                            if (x > start && isHighSurrogate(a[x - 1])) {
2744:                                x--;
2745:                            }
2746:                        }
2747:                    }
2748:                    if (i < 0) {
2749:                        throw new IndexOutOfBoundsException();
2750:                    }
2751:                }
2752:                return x;
2753:            }
2754:
2755:            /**
2756:             * Determines if the specified character is a lowercase character.
2757:             * <p>
2758:             * A character is lowercase if its general category type, provided
2759:             * by <code>Character.getType(ch)</code>, is
2760:             * <code>LOWERCASE_LETTER</code>.
2761:             * <p>
2762:             * The following are examples of lowercase characters:
2763:             * <p><blockquote><pre>
2764:             * a b c d e f g h i j k l m n o p q r s t u v w x y z
2765:             * '&#92;u00DF' '&#92;u00E0' '&#92;u00E1' '&#92;u00E2' '&#92;u00E3' '&#92;u00E4' '&#92;u00E5' '&#92;u00E6' 
2766:             * '&#92;u00E7' '&#92;u00E8' '&#92;u00E9' '&#92;u00EA' '&#92;u00EB' '&#92;u00EC' '&#92;u00ED' '&#92;u00EE'
2767:             * '&#92;u00EF' '&#92;u00F0' '&#92;u00F1' '&#92;u00F2' '&#92;u00F3' '&#92;u00F4' '&#92;u00F5' '&#92;u00F6'
2768:             * '&#92;u00F8' '&#92;u00F9' '&#92;u00FA' '&#92;u00FB' '&#92;u00FC' '&#92;u00FD' '&#92;u00FE' '&#92;u00FF'
2769:             * </pre></blockquote>
2770:             * <p> Many other Unicode characters are lowercase too.
2771:             *
2772:             * <p><b>Note:</b> This method cannot handle <a
2773:             * href="#supplementary"> supplementary characters</a>. To support
2774:             * all Unicode characters, including supplementary characters, use
2775:             * the {@link #isLowerCase(int)} method.
2776:             *
2777:             * @param   ch   the character to be tested.
2778:             * @return  <code>true</code> if the character is lowercase;
2779:             *          <code>false</code> otherwise.
2780:             * @see     java.lang.Character#isUpperCase(char)
2781:             * @see     java.lang.Character#isTitleCase(char)
2782:             * @see     java.lang.Character#toLowerCase(char)
2783:             * @see     java.lang.Character#getType(char)
2784:             */
2785:            public static boolean isLowerCase(char ch) {
2786:                return isLowerCase((int) ch);
2787:            }
2788:
2789:            /**
2790:             * Determines if the specified character (Unicode code point) is a
2791:             * lowercase character.
2792:             * <p>
2793:             * A character is lowercase if its general category type, provided
2794:             * by {@link Character#getType getType(codePoint)}, is
2795:             * <code>LOWERCASE_LETTER</code>.
2796:             * <p>
2797:             * The following are examples of lowercase characters:
2798:             * <p><blockquote><pre>
2799:             * a b c d e f g h i j k l m n o p q r s t u v w x y z
2800:             * '&#92;u00DF' '&#92;u00E0' '&#92;u00E1' '&#92;u00E2' '&#92;u00E3' '&#92;u00E4' '&#92;u00E5' '&#92;u00E6' 
2801:             * '&#92;u00E7' '&#92;u00E8' '&#92;u00E9' '&#92;u00EA' '&#92;u00EB' '&#92;u00EC' '&#92;u00ED' '&#92;u00EE'
2802:             * '&#92;u00EF' '&#92;u00F0' '&#92;u00F1' '&#92;u00F2' '&#92;u00F3' '&#92;u00F4' '&#92;u00F5' '&#92;u00F6'
2803:             * '&#92;u00F8' '&#92;u00F9' '&#92;u00FA' '&#92;u00FB' '&#92;u00FC' '&#92;u00FD' '&#92;u00FE' '&#92;u00FF'
2804:             * </pre></blockquote>
2805:             * <p> Many other Unicode characters are lowercase too.
2806:             *
2807:             * @param   codePoint the character (Unicode code point) to be tested.
2808:             * @return  <code>true</code> if the character is lowercase;
2809:             *          <code>false</code> otherwise.
2810:             * @see     java.lang.Character#isUpperCase(int)
2811:             * @see     java.lang.Character#isTitleCase(int)
2812:             * @see     java.lang.Character#toLowerCase(int)
2813:             * @see     java.lang.Character#getType(int)
2814:             * @since   1.5
2815:             */
2816:            public static boolean isLowerCase(int codePoint) {
2817:                return getType(codePoint) == Character.LOWERCASE_LETTER;
2818:            }
2819:
2820:            /**
2821:             * Determines if the specified character is an uppercase character.
2822:             * <p>
2823:             * A character is uppercase if its general category type, provided by
2824:             * <code>Character.getType(ch)</code>, is <code>UPPERCASE_LETTER</code>.
2825:             * <p>
2826:             * The following are examples of uppercase characters:
2827:             * <p><blockquote><pre>
2828:             * A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
2829:             * '&#92;u00C0' '&#92;u00C1' '&#92;u00C2' '&#92;u00C3' '&#92;u00C4' '&#92;u00C5' '&#92;u00C6' '&#92;u00C7'
2830:             * '&#92;u00C8' '&#92;u00C9' '&#92;u00CA' '&#92;u00CB' '&#92;u00CC' '&#92;u00CD' '&#92;u00CE' '&#92;u00CF'
2831:             * '&#92;u00D0' '&#92;u00D1' '&#92;u00D2' '&#92;u00D3' '&#92;u00D4' '&#92;u00D5' '&#92;u00D6' '&#92;u00D8'
2832:             * '&#92;u00D9' '&#92;u00DA' '&#92;u00DB' '&#92;u00DC' '&#92;u00DD' '&#92;u00DE'
2833:             * </pre></blockquote>
2834:             * <p> Many other Unicode characters are uppercase too.
2835:             *
2836:             * <p><b>Note:</b> This method cannot handle <a
2837:             * href="#supplementary"> supplementary characters</a>. To support
2838:             * all Unicode characters, including supplementary characters, use
2839:             * the {@link #isUpperCase(int)} method.
2840:             *
2841:             * @param   ch   the character to be tested.
2842:             * @return  <code>true</code> if the character is uppercase;
2843:             *          <code>false</code> otherwise.
2844:             * @see     java.lang.Character#isLowerCase(char)
2845:             * @see     java.lang.Character#isTitleCase(char)
2846:             * @see     java.lang.Character#toUpperCase(char)
2847:             * @see     java.lang.Character#getType(char)
2848:             * @since   1.0
2849:             */
2850:            public static boolean isUpperCase(char ch) {
2851:                return isUpperCase((int) ch);
2852:            }
2853:
2854:            /**
2855:             * Determines if the specified character (Unicode code point) is an uppercase character.
2856:             * <p>
2857:             * A character is uppercase if its general category type, provided by
2858:             * {@link Character#getType(int) getType(codePoint)}, is <code>UPPERCASE_LETTER</code>.
2859:             * <p>
2860:             * The following are examples of uppercase characters:
2861:             * <p><blockquote><pre>
2862:             * A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
2863:             * '&#92;u00C0' '&#92;u00C1' '&#92;u00C2' '&#92;u00C3' '&#92;u00C4' '&#92;u00C5' '&#92;u00C6' '&#92;u00C7'
2864:             * '&#92;u00C8' '&#92;u00C9' '&#92;u00CA' '&#92;u00CB' '&#92;u00CC' '&#92;u00CD' '&#92;u00CE' '&#92;u00CF'
2865:             * '&#92;u00D0' '&#92;u00D1' '&#92;u00D2' '&#92;u00D3' '&#92;u00D4' '&#92;u00D5' '&#92;u00D6' '&#92;u00D8'
2866:             * '&#92;u00D9' '&#92;u00DA' '&#92;u00DB' '&#92;u00DC' '&#92;u00DD' '&#92;u00DE'
2867:             * </pre></blockquote>
2868:             * <p> Many other Unicode characters are uppercase too.
2869:             *
2870:             * @param   codePoint the character (Unicode code point) to be tested.
2871:             * @return  <code>true</code> if the character is uppercase;
2872:             *          <code>false</code> otherwise.
2873:             * @see     java.lang.Character#isLowerCase(int)
2874:             * @see     java.lang.Character#isTitleCase(int)
2875:             * @see     java.lang.Character#toUpperCase(int)
2876:             * @see     java.lang.Character#getType(int)
2877:             * @since   1.5
2878:             */
2879:            public static boolean isUpperCase(int codePoint) {
2880:                return getType(codePoint) == Character.UPPERCASE_LETTER;
2881:            }
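The notes above say that the `char` overloads cannot handle supplementary characters. The sketch below illustrates the difference by counting uppercase letters both ways. `CaseSketch` and its two methods are hypothetical names; U+10400 (DESERET CAPITAL LETTER LONG I) is used as an example of a supplementary uppercase letter.

```java
public class CaseSketch {
    // Counts uppercase letters by iterating over code points.
    static int countUpperByCodePoint(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isUpperCase(cp)) {
                n++;
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return n;
    }

    // Counts uppercase letters char by char; surrogates are never uppercase,
    // so supplementary letters are missed.
    static int countUpperByChar(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isUpperCase(s.charAt(i))) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "A" + new String(Character.toChars(0x10400)) + "b";
        System.out.println(countUpperByCodePoint(s)); // 2
        System.out.println(countUpperByChar(s));      // 1
    }
}
```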
2882:
2883:            /**
2884:             * Determines if the specified character is a titlecase character.
2885:             * <p> 
2886:             * A character is a titlecase character if its general
2887:             * category type, provided by <code>Character.getType(ch)</code>,
2888:             * is <code>TITLECASE_LETTER</code>.
2889:             * <p>
2890:             * Some characters look like pairs of Latin letters. For example, there
2891:             * is an uppercase letter that looks like "LJ" and has a corresponding
2892:             * lowercase letter that looks like "lj". A third form, which looks like "Lj",
2893:             * is the appropriate form to use when rendering a word in lowercase
2894:             * with initial capitals, as for a book title.
2895:             * <p>
2896:             * These are some of the Unicode characters for which this method returns
2897:             * <code>true</code>:
2898:             * <ul>
2899:             * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON</code>
2900:             * <li><code>LATIN CAPITAL LETTER L WITH SMALL LETTER J</code>
2901:             * <li><code>LATIN CAPITAL LETTER N WITH SMALL LETTER J</code>
2902:             * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z</code>
2903:             * </ul>
2904:             * <p> Many other Unicode characters are titlecase too.
2905:             *
2906:             * <p><b>Note:</b> This method cannot handle <a
2907:             * href="#supplementary"> supplementary characters</a>. To support
2908:             * all Unicode characters, including supplementary characters, use
2909:             * the {@link #isTitleCase(int)} method.
2910:             *
2911:             * @param   ch   the character to be tested.
2912:             * @return  <code>true</code> if the character is titlecase;
2913:             *          <code>false</code> otherwise.
2914:             * @see     java.lang.Character#isLowerCase(char)
2915:             * @see     java.lang.Character#isUpperCase(char)
2916:             * @see     java.lang.Character#toTitleCase(char)
2917:             * @see     java.lang.Character#getType(char)
2918:             * @since   1.0.2
2919:             */
2920:            public static boolean isTitleCase(char ch) {
2921:                return isTitleCase((int) ch);
2922:            }
2923:
2924:            /**
2925:             * Determines if the specified character (Unicode code point) is a titlecase character.
2926:             * <p> 
2927:             * A character is a titlecase character if its general
2928:             * category type, provided by {@link Character#getType(int) getType(codePoint)},
2929:             * is <code>TITLECASE_LETTER</code>.
2930:             * <p>
2931:             * Some characters look like pairs of Latin letters. For example, there
2932:             * is an uppercase letter that looks like "LJ" and has a corresponding
2933:             * lowercase letter that looks like "lj". A third form, which looks like "Lj",
2934:             * is the appropriate form to use when rendering a word in lowercase
2935:             * with initial capitals, as for a book title.
2936:             * <p>
2937:             * These are some of the Unicode characters for which this method returns
2938:             * <code>true</code>:
2939:             * <ul>
2940:             * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON</code>
2941:             * <li><code>LATIN CAPITAL LETTER L WITH SMALL LETTER J</code>
2942:             * <li><code>LATIN CAPITAL LETTER N WITH SMALL LETTER J</code>
2943:             * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z</code>
2944:             * </ul>
2945:             * <p> Many other Unicode characters are titlecase too.
2946:             *
2947:             * @param   codePoint the character (Unicode code point) to be tested.
2948:             * @return  <code>true</code> if the character is titlecase;
2949:             *          <code>false</code> otherwise.
2950:             * @see     java.lang.Character#isLowerCase(int)
2951:             * @see     java.lang.Character#isUpperCase(int)
2952:             * @see     java.lang.Character#toTitleCase(int)
2953:             * @see     java.lang.Character#getType(int)
2954:             * @since   1.5
2955:             */
2956:            public static boolean isTitleCase(int codePoint) {
2957:                return getType(codePoint) == Character.TITLECASE_LETTER;
2958:            }
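The three-way case distinction described above can be seen with the DZ-with-caron letters U+01C4 (uppercase), U+01C5 (titlecase), and U+01C6 (lowercase). The class `TitleCaseSketch` and its `describe` method are hypothetical names used only for this illustration.

```java
public class TitleCaseSketch {
    // Names the case of a code point using the three predicates from Character.
    static String describe(int codePoint) {
        if (Character.isTitleCase(codePoint)) {
            return "titlecase";
        }
        if (Character.isUpperCase(codePoint)) {
            return "uppercase";
        }
        if (Character.isLowerCase(codePoint)) {
            return "lowercase";
        }
        return "uncased";
    }

    public static void main(String[] args) {
        System.out.println(describe(0x01C4)); // uppercase
        System.out.println(describe(0x01C5)); // titlecase
        System.out.println(describe(0x01C6)); // lowercase
        // The lowercase form maps to the titlecase form, not the uppercase one.
        System.out.println(Character.toTitleCase(0x01C6) == 0x01C5); // true
    }
}
```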
2959:
2960:            /**
2961:             * Determines if the specified character is a digit.
2962:             * <p>
2963:             * A character is a digit if its general category type, provided
2964:             * by <code>Character.getType(ch)</code>, is
2965:             * <code>DECIMAL_DIGIT_NUMBER</code>.
2966:             * <p>
2967:             * Some Unicode character ranges that contain digits:
2968:             * <ul>
2969:             * <li><code>'&#92;u0030'</code> through <code>'&#92;u0039'</code>, 
2970:             *     ISO-LATIN-1 digits (<code>'0'</code> through <code>'9'</code>)
2971:             * <li><code>'&#92;u0660'</code> through <code>'&#92;u0669'</code>,
2972:             *     Arabic-Indic digits
2973:             * <li><code>'&#92;u06F0'</code> through <code>'&#92;u06F9'</code>,
2974:             *     Extended Arabic-Indic digits
2975:             * <li><code>'&#92;u0966'</code> through <code>'&#92;u096F'</code>,
2976:             *     Devanagari digits
2977:             * <li><code>'&#92;uFF10'</code> through <code>'&#92;uFF19'</code>,
2978:             *     Fullwidth digits
2979:             * </ul>
2980:             *
2981:             * Many other character ranges contain digits as well.
2982:             *
2983:             * <p><b>Note:</b> This method cannot handle <a
2984:             * href="#supplementary"> supplementary characters</a>. To support
2985:             * all Unicode characters, including supplementary characters, use
2986:             * the {@link #isDigit(int)} method.
2987:             *
2988:             * @param   ch   the character to be tested.
2989:             * @return  <code>true</code> if the character is a digit;
2990:             *          <code>false</code> otherwise.
2991:             * @see     java.lang.Character#digit(char, int)
2992:             * @see     java.lang.Character#forDigit(int, int)
2993:             * @see     java.lang.Character#getType(char)
2994:             */
2995:            public static boolean isDigit(char ch) {
2996:                return isDigit((int) ch);
2997:            }
2998:
2999:            /**
3000:             * Determines if the specified character (Unicode code point) is a digit.
3001:             * <p>
3002:             * A character is a digit if its general category type, provided
3003:             * by {@link Character#getType(int) getType(codePoint)}, is
3004:             * <code>DECIMAL_DIGIT_NUMBER</code>.
3005:             * <p>
3006:             * Some Unicode character ranges that contain digits:
3007:             * <ul>
3008:             * <li><code>'&#92;u0030'</code> through <code>'&#92;u0039'</code>, 
3009:             *     ISO-LATIN-1 digits (<code>'0'</code> through <code>'9'</code>)
3010:             * <li><code>'&#92;u0660'</code> through <code>'&#92;u0669'</code>,
3011:             *     Arabic-Indic digits
3012:             * <li><code>'&#92;u06F0'</code> through <code>'&#92;u06F9'</code>,
3013:             *     Extended Arabic-Indic digits
3014:             * <li><code>'&#92;u0966'</code> through <code>'&#92;u096F'</code>,
3015:             *     Devanagari digits
3016:             * <li><code>'&#92;uFF10'</code> through <code>'&#92;uFF19'</code>,
3017:             *     Fullwidth digits
3018:             * </ul>
3019:             *
3020:             * Many other character ranges contain digits as well.
3021:             *
3022:             * @param   codePoint the character (Unicode code point) to be tested.
3023:             * @return  <code>true</code> if the character is a digit;
3024:             *          <code>false</code> otherwise.
3025:             * @see     java.lang.Character#forDigit(int, int)
3026:             * @see     java.lang.Character#getType(int)
3027:             * @since   1.5
3028:             */
3029:            public static boolean isDigit(int codePoint) {
3030:                return getType(codePoint) == Character.DECIMAL_DIGIT_NUMBER;
3031:            }
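Because `isDigit` accepts decimal digits from many scripts, it pairs naturally with `Character.digit(int, int)` to recover numeric values. The following is a minimal sketch under that assumption; `DigitSketch` and `parseDecimal` are hypothetical names, and the method ignores signs and overflow for brevity.

```java
public class DigitSketch {
    // Parses a non-empty string of decimal digits from any script into an int.
    // No sign handling and no overflow checking: this is only an illustration.
    static int parseDecimal(String s) {
        if (s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int value = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isDigit(cp)) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            value = value * 10 + Character.digit(cp, 10);
            i += Character.charCount(cp);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseDecimal("42"));            // 42
        System.out.println(parseDecimal("\u0664\u0662"));  // 42 (Arabic-Indic digits)
        System.out.println(parseDecimal("\uFF11\uFF10"));  // 10 (fullwidth digits)
    }
}
```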
3032:
3033:            /**
3034:             * Determines if a character is defined in Unicode.
3035:             * <p>
3036:             * A character is defined if at least one of the following is true:
3037:             * <ul>
3038:             * <li>It has an entry in the UnicodeData file.
3039:             * <li>It has a value in a range defined by the UnicodeData file.
3040:             * </ul>
3041:             *
3042:             * <p><b>Note:</b> This method cannot handle <a
3043:             * href="#supplementary"> supplementary characters</a>. To support
3044:             * all Unicode characters, including supplementary characters, use
3045:             * the {@link #isDefined(int)} method.
3046:             *
3047:             * @param   ch   the character to be tested
3048:             * @return  <code>true</code> if the character has a defined meaning
3049:             *          in Unicode; <code>false</code> otherwise.
3050:             * @see     java.lang.Character#isDigit(char)
3051:             * @see     java.lang.Character#isLetter(char)
3052:             * @see     java.lang.Character#isLetterOrDigit(char)
3053:             * @see     java.lang.Character#isLowerCase(char)
3054:             * @see     java.lang.Character#isTitleCase(char)
3055:             * @see     java.lang.Character#isUpperCase(char)
3056:             * @since   1.0.2
3057:             */
3058:            public static boolean isDefined(char ch) {
3059:                return isDefined((int) ch);
3060:            }
3061:
3062:            /**
3063:             * Determines if a character (Unicode code point) is defined in Unicode.
3064:             * <p>
3065:             * A character is defined if at least one of the following is true:
3066:             * <ul>
3067:             * <li>It has an entry in the UnicodeData file.
3068:             * <li>It has a value in a range defined by the UnicodeData file.
3069:             * </ul>
3070:             *
3071:             * @param   codePoint the character (Unicode code point) to be tested.
3072:             * @return  <code>true</code> if the character has a defined meaning
3073:             *          in Unicode; <code>false</code> otherwise.
3074:             * @see     java.lang.Character#isDigit(int)
3075:             * @see     java.lang.Character#isLetter(int)
3076:             * @see     java.lang.Character#isLetterOrDigit(int)
3077:             * @see     java.lang.Character#isLowerCase(int)
3078:             * @see     java.lang.Character#isTitleCase(int)
3079:             * @see     java.lang.Character#isUpperCase(int)
3080:             * @since   1.5
3081:             */
3082:            public static boolean isDefined(int codePoint) {
3083:                return getType(codePoint) != Character.UNASSIGNED;
3084:            }
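As a small illustration of `isDefined`, the sketch below probes a few code points whose status does not depend on the Unicode version: an ordinary letter, a surrogate, a private-use character, and the noncharacters U+FFFF and U+10FFFF. `DefinedSketch` and `isUsable` are hypothetical names, and the "usable" policy is an assumption made for this example only.

```java
public class DefinedSketch {
    // Hypothetical policy: accept a code point only if it is in range and
    // has a general category other than UNASSIGNED.
    static boolean isUsable(int codePoint) {
        return Character.isValidCodePoint(codePoint) && Character.isDefined(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(isUsable('A'));      // true
        System.out.println(isUsable(0xD800));   // true: surrogates have type SURROGATE
        System.out.println(isUsable(0xE000));   // true: private use
        System.out.println(isUsable(0xFFFF));   // false: noncharacter, unassigned
        System.out.println(isUsable(0x10FFFF)); // false: noncharacter, unassigned
        System.out.println(isUsable(0x110000)); // false: beyond MAX_CODE_POINT
    }
}
```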
3085:
3086:            /**
3087:             * Determines if the specified character is a letter.
3088:             * <p>
3089:             * A character is considered to be a letter if its general
3090:             * category type, provided by <code>Character.getType(ch)</code>,
3091:             * is any of the following:
3092:             * <ul>
3093:             * <li> <code>UPPERCASE_LETTER</code>
3094:             * <li> <code>LOWERCASE_LETTER</code>
3095:             * <li> <code>TITLECASE_LETTER</code>
3096:             * <li> <code>MODIFIER_LETTER</code>
3097:             * <li> <code>OTHER_LETTER</code>
3098:             * </ul>
3099:             *
3100:             * Not all letters have case. Many characters are
3101:             * letters but are neither uppercase nor lowercase nor titlecase.
3102:             *
3103:             * <p><b>Note:</b> This method cannot handle <a
3104:             * href="#supplementary"> supplementary characters</a>. To support
3105:             * all Unicode characters, including supplementary characters, use
3106:             * the {@link #isLetter(int)} method.
3107:             *
3108:             * @param   ch   the character to be tested.
3109:             * @return  <code>true</code> if the character is a letter;
3110:             *          <code>false</code> otherwise.
3111:             * @see     java.lang.Character#isDigit(char)
3112:             * @see     java.lang.Character#isJavaIdentifierStart(char)
3113:             * @see     java.lang.Character#isJavaLetter(char)
3114:             * @see     java.lang.Character#isJavaLetterOrDigit(char)
3115:             * @see     java.lang.Character#isLetterOrDigit(char)
3116:             * @see     java.lang.Character#isLowerCase(char)
3117:             * @see     java.lang.Character#isTitleCase(char)
3118:             * @see     java.lang.Character#isUnicodeIdentifierStart(char)
3119:             * @see     java.lang.Character#isUpperCase(char)
3120:             */
3121:            public static boolean isLetter(char ch) {
3122:                return isLetter((int) ch);
3123:            }
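// Illustration only, not part of the JDK source: a minimal sketch of why the
// note above recommends isLetter(int) for supplementary characters. The class
// name SupplementaryLetterDemo and its two helper methods are hypothetical;
// U+1D400 (MATHEMATICAL BOLD CAPITAL A) is just one convenient sample that
// needs a surrogate pair.

```java
final class SupplementaryLetterDemo {
    /** Counts letters by walking code points, so a surrogate pair is tested as one character. */
    static int countLettersByCodePoint(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isLetter(cp)) {
                count++;
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return count;
    }

    /** Counts letters one char at a time; each half of a surrogate pair is tested alone. */
    static int countLettersByChar(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isLetter(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x1D400));
        System.out.println(countLettersByCodePoint(s)); // prints 2
        System.out.println(countLettersByChar(s));      // prints 1
    }
}
```

// The char-based loop misses the supplementary letter because each surrogate
// has the general category SURROGATE, which is not a letter category.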
3124:
3125:            /**
3126:             * Determines if the specified character (Unicode code point) is a letter.
3127:             * <p>
3128:             * A character is considered to be a letter if its general
3129:             * category type, provided by {@link Character#getType(int) getType(codePoint)},
3130:             * is any of the following:
3131:             * <ul>
3132:             * <li> <code>UPPERCASE_LETTER</code>
3133:             * <li> <code>LOWERCASE_LETTER</code>
3134:             * <li> <code>TITLECASE_LETTER</code>
3135:             * <li> <code>MODIFIER_LETTER</code>
3136:             * <li> <code>OTHER_LETTER</code>
3137:             * </ul>
3138:             *
3139:             * Not all letters have case. Many characters are
3140:             * letters but are neither uppercase nor lowercase nor titlecase.
3141:             *
3142:             * @param   codePoint the character (Unicode code point) to be tested.
3143:             * @return  <code>true</code> if the character is a letter;
3144:             *          <code>false</code> otherwise.
3145:             * @see     java.lang.Character#isDigit(int)
3146:             * @see     java.lang.Character#isJavaIdentifierStart(int)
3147:             * @see     java.lang.Character#isLetterOrDigit(int)
3148:             * @see     java.lang.Character#isLowerCase(int)
3149:             * @see     java.lang.Character#isTitleCase(int)
3150:             * @see     java.lang.Character#isUnicodeIdentifierStart(int)
3151:             * @see     java.lang.Character#isUpperCase(int)
3152:             * @since   1.5
3153:             */
3154:            public static boolean isLetter(int codePoint) {
3155:                return ((((1 << Character.UPPERCASE_LETTER)
3156:                        | (1 << Character.LOWERCASE_LETTER)
3157:                        | (1 << Character.TITLECASE_LETTER)
3158:                        | (1 << Character.MODIFIER_LETTER) | (1 << Character.OTHER_LETTER)) >> getType(codePoint)) & 1) != 0;
3159:            }
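// Illustration only, not part of the JDK source: the method above builds a
// bit mask with one bit per letter category, shifts it right by the category
// value, and tests the lowest bit. The sketch below restates that idea as a
// standalone class so it can be run and compared with Character.isLetter(int).
// The names CategoryMaskDemo, LETTER_MASK and isLetterByMask are hypothetical.

```java
final class CategoryMaskDemo {
    // One bit set for each general category constant that counts as a letter.
    static final int LETTER_MASK =
              (1 << Character.UPPERCASE_LETTER)
            | (1 << Character.LOWERCASE_LETTER)
            | (1 << Character.TITLECASE_LETTER)
            | (1 << Character.MODIFIER_LETTER)
            | (1 << Character.OTHER_LETTER);

    /** Shifts the mask right by the category value and tests the lowest bit. */
    static boolean isLetterByMask(int codePoint) {
        return ((LETTER_MASK >> Character.getType(codePoint)) & 1) != 0;
    }

    public static void main(String[] args) {
        // Latin letters, a digit, an underscore, a titlecase letter,
        // a CJK ideograph, a Roman numeral, and a supplementary letter.
        int[] samples = { 'A', 'z', '7', '_', 0x01C5, 0x4E2D, 0x2160, 0x1D400 };
        for (int cp : samples) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + " mask=" + isLetterByMask(cp)
                    + " jdk=" + Character.isLetter(cp));
        }
    }
}
```

// A single shift and mask replaces five equality comparisons against the
// category value, which is presumably why the source uses this form.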
3160:
3161:            /**
3162:             * Determines if the specified character is a letter or digit.
3163:             * <p>
3164:             * A character is considered to be a letter or digit if either
3165:             * <code>Character.isLetter(char ch)</code> or
3166:             * <code>Character.isDigit(char ch)</code> returns
3167:             * <code>true</code> for the character.
3168:             *
3169:             * <p><b>Note:</b> This method cannot handle <a
3170:             * href="#supplementary"> supplementary characters</a>. To support
3171:             * all Unicode characters, including supplementary characters, use
3172:             * the {@link #isLetterOrDigit(int)} method.
3173:             *
3174:             * @param   ch   the character to be tested.
3175:             * @return  <code>true</code> if the character is a letter or digit;
3176:             *          <code>false</code> otherwise.
3177:             * @see     java.lang.Character#isDigit(char)
3178:             * @see     java.lang.Character#isJavaIdentifierPart(char)
3179:             * @see     java.lang.Character#isJavaLetter(char)
3180:             * @see     java.lang.Character#isJavaLetterOrDigit(char)
3181:             * @see     java.lang.Character#isLetter(char)
3182:             * @see     java.lang.Character#isUnicodeIdentifierPart(char)
3183:             * @since   1.0.2
3184:             */
3185:            public static boolean isLetterOrDigit(char ch) {
3186:                return isLetterOrDigit((int) ch);
3187:            }
3188:
3189:            /**
3190:             * Determines if the specified character (Unicode code point) is a letter or digit.
3191:             * <p>
3192:             * A character is considered to be a letter or digit if either
3193:             * {@link #isLetter(int) isLetter(codePoint)} or
3194:             * {@link #isDigit(int) isDigit(codePoint)} returns
3195:             * <code>true</code> for the character.
3196:             *
3197:             * @param   codePoint the character (Unicode code point) to be tested.
3198:             * @return  <code>true</code> if the character is a letter or digit;
3199:             *          <code>false</code> otherwise.
3200:             * @see     java.lang.Character#isDigit(int)
3201:             * @see     java.lang.Character#isJavaIdentifierPart(int)
3202:             * @see     java.lang.Character#isLetter(int)
3203:             * @see     java.lang.Character#isUnicodeIdentifierPart(int)
3204:             * @since   1.5
3205:             */
3206:            public static boolean isLetterOrDigit(int codePoint) {
3207:                return ((((1 << Character.UPPERCASE_LETTER)
3208:                        | (1 << Character.LOWERCASE_LETTER)
3209:                        | (1 << Character.TITLECASE_LETTER)
3210:                        | (1 << Character.MODIFIER_LETTER)
3211:                        | (1 << Character.OTHER_LETTER) | (1 << Character.DECIMAL_DIGIT_NUMBER)) >> getType(codePoint)) & 1) != 0;
3212:            }
3213:
3214:            /**
3215:             * Determines if the specified character is permissible as the first
3216:             * character in a Java identifier.
3217:             * <p>
3218:             * A character may start a Java identifier if and only if
3219:             * one of the following is true:
3220:             * <ul>
3221:             * <li> {@link #isLetter(char) isLetter(ch)} returns <code>true</code>
3222:             * <li> {@link #getType(char) getType(ch)} returns <code>LETTER_NUMBER</code>
3223:             * <li> ch is a currency symbol (such as "$")
3224:             * <li> ch is a connecting punctuation character (such as "_").
3225:             * </ul>
3226:             *
3227:             * @param   ch the character to be tested.
3228:             * @return  <code>true</code> if the character may start a Java
3229:             *          identifier; <code>false</code> otherwise.
3230:             * @see     java.lang.Character#isJavaLetterOrDigit(char)
3231:             * @see     java.lang.Character#isJavaIdentifierStart(char)
3232:             * @see     java.lang.Character#isJavaIdentifierPart(char)
3233:             * @see     java.lang.Character#isLetter(char)
3234:             * @see     java.lang.Character#isLetterOrDigit(char)
3235:             * @see     java.lang.Character#isUnicodeIdentifierStart(char)
3236:             * @since   1.0.2
3237:             * @deprecated Replaced by isJavaIdentifierStart(char).
3238:             */
3239:            @Deprecated
3240:            public static boolean isJavaLetter(char ch) {
3241:                return isJavaIdentifierStart(ch);
3242:            }
3243:
3244:            /**
3245:             * Determines if the specified character may be part of a Java
3246:             * identifier as other than the first character.
3247:             * <p>
3248:             * A character may be part of a Java identifier if and only if any
3249:             * of the following are true:
3250:             * <ul>
3251:             * <li>  it is a letter
3252:             * <li>  it is a currency symbol (such as <code>'$'</code>)
3253:             * <li>  it is a connecting punctuation character (such as <code>'_'</code>)
3254:             * <li>  it is a digit
3255:             * <li>  it is a numeric letter (such as a Roman numeral character)
3256:             * <li>  it is a combining mark
3257:             * <li>  it is a non-spacing mark
3258:             * <li> <code>isIdentifierIgnorable</code> returns
3259:             * <code>true</code> for the character.
3260:             * </ul>
3261:             *
3262:             * @param   ch the character to be tested.
3263:             * @return  <code>true</code> if the character may be part of a
3264:             *          Java identifier; <code>false</code> otherwise.
3265:             * @see     java.lang.Character#isJavaLetter(char)
3266:             * @see     java.lang.Character#isJavaIdentifierStart(char)
3267:             * @see     java.lang.Character#isJavaIdentifierPart(char)
3268:             * @see     java.lang.Character#isLetter(char)
3269:             * @see     java.lang.Character#isLetterOrDigit(char)
3270:             * @see     java.lang.Character#isUnicodeIdentifierPart(char)
3271:             * @see     java.lang.Character#isIdentifierIgnorable(char)
3272:             * @since   1.0.2
3273:             * @deprecated Replaced by isJavaIdentifierPart(char).
3274:             */
3275:            @Deprecated
3276:            public static boolean isJavaLetterOrDigit(char ch) {
3277:                return isJavaIdentifierPart(ch);
3278:            }
3279:
3280:            /**
3281:             * Determines if the specified character is
3282:             * permissible as the first character in a Java identifier.
3283:             * <p>
3284:             * A character may start a Java identifier if and only if
3285:             * one of the following conditions is true:
3286:             * <ul>
3287:             * <li> {@link #isLetter(char) isLetter(ch)} returns <code>true</code>
3288:             * <li> {@link #getType(char) getType(ch)} returns <code>LETTER_NUMBER</code>
3289:             * <li> ch is a currency symbol (such as "$")
3290:             * <li> ch is a connecting punctuation character (such as "_").
3291:             * </ul>
3292:             *
3293:             * <p><b>Note:</b> This method cannot handle <a
3294:             * href="#supplementary"> supplementary characters</a>. To support
3295:             * all Unicode characters, including supplementary characters, use
3296:             * the {@link #isJavaIdentifierStart(int)} method.
3297:             *
3298:             * @param   ch the character to be tested.
3299:             * @return  <code>true</code> if the character may start a Java identifier;
3300:             *          <code>false</code> otherwise.
3301:             * @see     java.lang.Character#isJavaIdentifierPart(char)
3302:             * @see     java.lang.Character#isLetter(char)
3303:             * @see     java.lang.Character#isUnicodeIdentifierStart(char)
3304:             * @see     javax.lang.model.SourceVersion#isIdentifier(CharSequence)
3305:             * @since   1.1
3306:             */
3307:            public static boolean isJavaIdentifierStart(char ch) {
3308:                return isJavaIdentifierStart((int) ch);
3309:            }
3310:
3311:            /**
3312:             * Determines if the character (Unicode code point) is
3313:             * permissible as the first character in a Java identifier.
3314:             * <p>
3315:             * A character may start a Java identifier if and only if
3316:             * one of the following conditions is true:
3317:             * <ul>
3318:             * <li> {@link #isLetter(int) isLetter(codePoint)}
3319:             *      returns <code>true</code>
3320:             * <li> {@link #getType(int) getType(codePoint)}
3321:             *      returns <code>LETTER_NUMBER</code>
3322:             * <li> the referenced character is a currency symbol (such as "$")
3323:             * <li> the referenced character is a connecting punctuation character
3324:             *      (such as "_").
3325:             * </ul>
3326:             *
3327:             * @param   codePoint the character (Unicode code point) to be tested.
3328:             * @return  <code>true</code> if the character may start a Java identifier;
3329:             *          <code>false</code> otherwise.
3330:             * @see     java.lang.Character#isJavaIdentifierPart(int)
3331:             * @see     java.lang.Character#isLetter(int)
3332:             * @see     java.lang.Character#isUnicodeIdentifierStart(int)
3333:             * @see     javax.lang.model.SourceVersion#isIdentifier(CharSequence)
3334:             * @since   1.5
3335:             */
3336:            public static boolean isJavaIdentifierStart(int codePoint) {
3337:                return CharacterData.of(codePoint).isJavaIdentifierStart(
3338:                        codePoint);
3339:            }
3340:
3341:            /**
3342:             * Determines if the specified character may be part of a Java
3343:             * identifier as other than the first character.
3344:             * <p>
3345:             * A character may be part of a Java identifier if any of the following
3346:             * are true:
3347:             * <ul>
3348:             * <li>  it is a letter
3349:             * <li>  it is a currency symbol (such as <code>'$'</code>)
3350:             * <li>  it is a connecting punctuation character (such as <code>'_'</code>)
3351:             * <li>  it is a digit
3352:             * <li>  it is a numeric letter (such as a Roman numeral character)
3353:             * <li>  it is a combining mark
3354:             * <li>  it is a non-spacing mark
3355:             * <li> <code>isIdentifierIgnorable</code> returns
3356:             * <code>true</code> for the character
3357:             * </ul>
3358:             *
3359:             * <p><b>Note:</b> This method cannot handle <a
3360:             * href="#supplementary"> supplementary characters</a>. To support
3361:             * all Unicode characters, including supplementary characters, use
3362:             * the {@link #isJavaIdentifierPart(int)} method.
3363:             *
3364:             * @param   ch      the character to be tested.
3365:             * @return <code>true</code> if the character may be part of a
3366:             *          Java identifier; <code>false</code> otherwise.
3367:             * @see     java.lang.Character#isIdentifierIgnorable(char)
3368:             * @see     java.lang.Character#isJavaIdentifierStart(char)
3369:             * @see     java.lang.Character#isLetterOrDigit(char)
3370:             * @see     java.lang.Character#isUnicodeIdentifierPart(char)
3371:             * @see     javax.lang.model.SourceVersion#isIdentifier(CharSequence)
3372:             * @since   1.1
3373:             */
3374:            public static boolean isJavaIdentifierPart(char ch) {
3375:                return isJavaIdentifierPart((int) ch);
3376:            }
3377:
3378:            /**
3379:             * Determines if the character (Unicode code point) may be part of a Java
3380:             * identifier as other than the first character.
3381:             * <p>
3382:             * A character may be part of a Java identifier if any of the following
3383:             * are true:
3384:             * <ul>
3385:             * <li>  it is a letter
3386:             * <li>  it is a currency symbol (such as <code>'$'</code>)
3387:             * <li>  it is a connecting punctuation character (such as <code>'_'</code>)
3388:             * <li>  it is a digit
3389:             * <li>  it is a numeric letter (such as a Roman numeral character)
3390:             * <li>  it is a combining mark
3391:             * <li>  it is a non-spacing mark
3392:             * <li> {@link #isIdentifierIgnorable(int)
3393:             * isIdentifierIgnorable(codePoint)} returns <code>true</code> for
3394:             * the character
3395:             * </ul>
3396:             *
3397:             * @param   codePoint the character (Unicode code point) to be tested.
3398:             * @return <code>true</code> if the character may be part of a
3399:             *          Java identifier; <code>false</code> otherwise.
3400:             * @see     java.lang.Character#isIdentifierIgnorable(int)
3401:             * @see     java.lang.Character#isJavaIdentifierStart(int)
3402:             * @see     java.lang.Character#isLetterOrDigit(int)
3403:             * @see     java.lang.Character#isUnicodeIdentifierPart(int)
3404:             * @see     javax.lang.model.SourceVersion#isIdentifier(CharSequence)
3405:             * @since   1.5
3406:             */
3407:            public static boolean isJavaIdentifierPart(int codePoint) {
3408:                return CharacterData.of(codePoint).isJavaIdentifierPart(
3409:                        codePoint);
3410:            }
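// Illustration only, not part of the JDK source: a minimal sketch of how the
// start and part predicates above combine to check a whole string. The class
// JavaIdentifierCheck and the method looksLikeIdentifier are hypothetical
// names. The check is purely lexical and does not reject keywords such as
// "class"; that is an intentional simplification of this sketch.

```java
final class JavaIdentifierCheck {
    /**
     * Returns true if s is non-empty, its first code point may start a Java
     * identifier, and every following code point may be part of one.
     */
    static boolean looksLikeIdentifier(String s) {
        if (s == null || s.length() == 0) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp); // step over surrogate pairs as a unit
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = { "foo1", "_x", "$y", "1abc", "a-b", "" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + looksLikeIdentifier(s));
        }
    }
}
```

// Iterating by code point rather than by char keeps the check correct for
// identifiers that contain supplementary characters.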
3411:
3412:            /**
3413:             * Determines if the specified character is permissible as the
3414:             * first character in a Unicode identifier.
3415:             * <p>
3416:             * A character may start a Unicode identifier if and only if
3417:             * one of the following conditions is true:
3418:             * <ul>
3419:             * <li> {@link #isLetter(char) isLetter(ch)} returns <code>true</code>
3420:             * <li> {@link #getType(char) getType(ch)} returns 
3421:             *      <code>LETTER_NUMBER</code>.
3422:             * </ul>
3423:             *
3424:             * <p><b>Note:</b> This method cannot handle <a
3425:             * href="#supplementary"> supplementary characters</a>. To support
3426:             * all Unicode characters, including supplementary characters, use
3427:             * the {@link #isUnicodeIdentifierStart(int)} method.
3428:             *
3429:             * @param   ch      the character to be tested.
3430:             * @return  <code>true</code> if the character may start a Unicode 
3431:             *          identifier; <code>false</code> otherwise.
3432:             * @see     java.lang.Character#isJavaIdentifierStart(char)
3433:             * @see     java.lang.Character#isLetter(char)
3434:             * @see     java.lang.Character#isUnicodeIdentifierPart(char)
3435:             * @since   1.1
3436:             */
3437:            public static boolean isUnicodeIdentifierStart(char ch) {
3438:                return isUnicodeIdentifierStart((int) ch);
3439:            }
3440:
3441:            /**
3442:             * Determines if the specified character (Unicode code point) is permissible as the
3443:             * first character in a Unicode identifier.
3444:             * <p>
3445:             * A character may start a Unicode identifier if and only if
3446:             * one of the following conditions is true:
3447:             * <ul>
3448:             * <li> {@link #isLetter(int) isLetter(codePoint)}
3449:             *      returns <code>true</code>
3450:             * <li> {@link #getType(int) getType(codePoint)}
3451:             *      returns <code>LETTER_NUMBER</code>.
3452:             * </ul>
3453:             * @param   codePoint the character (Unicode code point) to be tested.
3454:             * @return  <code>true</code> if the character may start a Unicode 
3455:             *          identifier; <code>false</code> otherwise.
3456:             * @see     java.lang.Character#isJavaIdentifierStart(int)
3457:             * @see     java.lang.Character#isLetter(int)
3458:             * @see     java.lang.Character#isUnicodeIdentifierPart(int)
3459:             * @since   1.5
3460:             */
3461:            public static boolean isUnicodeIdentifierStart(int codePoint) {
3462:                return CharacterData.of(codePoint).isUnicodeIdentifierStart(
3463:                        codePoint);
3464:            }
3465:
3466:            /**
3467:             * Determines if the specified character may be part of a Unicode
3468:             * identifier as other than the first character.
3469:             * <p>
3470:             * A character may be part of a Unicode identifier if and only if
3471:             * one of the following statements is true:
3472:             * <ul>
3473:             * <li>  it is a letter
3474:             * <li>  it is a connecting punctuation character (such as <code>'_'</code>)
3475:             * <li>  it is a digit
3476:             * <li>  it is a numeric letter (such as a Roman numeral character)
3477:             * <li>  it is a combining mark
3478:             * <li>  it is a non-spacing mark
3479:             * <li> <code>isIdentifierIgnorable</code> returns
3480:             * <code>true</code> for this character.
3481:             * </ul>
3482:             * 
3483:             * <p><b>Note:</b> This method cannot handle <a
3484:             * href="#supplementary"> supplementary characters</a>. To support
3485:             * all Unicode characters, including supplementary characters, use
3486:             * the {@link #isUnicodeIdentifierPart(int)} method.
3487:             *
3488:             * @param   ch      the character to be tested.
3489:             * @return  <code>true</code> if the character may be part of a 
3490:             *          Unicode identifier; <code>false</code> otherwise.
3491:             * @see     java.lang.Character#isIdentifierIgnorable(char)
3492:             * @see     java.lang.Character#isJavaIdentifierPart(char)
3493:             * @see     java.lang.Character#isLetterOrDigit(char)
3494:             * @see     java.lang.Character#isUnicodeIdentifierStart(char)
3495:             * @since   1.1
3496:             */
3497:            public static boolean isUnicodeIdentifierPart(char ch) {
3498:                return isUnicodeIdentifierPart((int) ch);
3499:            }
3500:
3501:            /**
3502:             * Determines if the specified character (Unicode code point) may be part of a Unicode
3503:             * identifier as other than the first character.
3504:             * <p>
3505:             * A character may be part of a Unicode identifier if and only if
3506:             * one of the following statements is true:
3507:             * <ul>
3508:             * <li>  it is a letter
3509:             * <li>  it is a connecting punctuation character (such as <code>'_'</code>)
3510:             * <li>  it is a digit
3511:             * <li>  it is a numeric letter (such as a Roman numeral character)
3512:             * <li>  it is a combining mark
3513:             * <li>  it is a non-spacing mark
3514:             * <li> <code>isIdentifierIgnorable</code> returns
3515:             * <code>true</code> for this character.
3516:             * </ul>
3517:             * @param   codePoint the character (Unicode code point) to be tested.
3518:             * @return  <code>true</code> if the character may be part of a 
3519:             *          Unicode identifier; <code>false</code> otherwise.
3520:             * @see     java.lang.Character#isIdentifierIgnorable(int)
3521:             * @see     java.lang.Character#isJavaIdentifierPart(int)
3522:             * @see     java.lang.Character#isLetterOrDigit(int)
3523:             * @see     java.lang.Character#isUnicodeIdentifierStart(int)
3524:             * @since   1.5
3525:             */
3526:            public static boolean isUnicodeIdentifierPart(int codePoint) {
3527:                return CharacterData.of(codePoint).isUnicodeIdentifierPart(
3528:                        codePoint);
3529:            }
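// Illustration only, not part of the JDK source: the Javadoc lists above give
// Java identifiers two extras that Unicode identifiers lack, namely currency
// symbols anywhere and connecting punctuation at the start. The sketch below
// prints all four predicates side by side for a few ASCII samples. The class
// IdentifierRulesDemo and the method describe are hypothetical names.

```java
final class IdentifierRulesDemo {
    /** Formats the four identifier predicates for one code point. */
    static String describe(int cp) {
        return new String(Character.toChars(cp))
                + " javaStart=" + Character.isJavaIdentifierStart(cp)
                + " unicodeStart=" + Character.isUnicodeIdentifierStart(cp)
                + " javaPart=" + Character.isJavaIdentifierPart(cp)
                + " unicodePart=" + Character.isUnicodeIdentifierPart(cp);
    }

    public static void main(String[] args) {
        for (int cp : new int[] { 'a', '$', '_', '9' }) {
            System.out.println(describe(cp));
        }
    }
}
```

// For '$' only the two Java predicates return true; for '_' every predicate
// except isUnicodeIdentifierStart returns true.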
3530:
3531:            /**
3532:             * Determines if the specified character should be regarded as
3533:             * an ignorable character in a Java identifier or a Unicode identifier.
3534:             * <p>
3535:             * The following Unicode characters are ignorable in a Java identifier
3536:             * or a Unicode identifier:
3537:             * <ul>
3538:             * <li>ISO control characters that are not whitespace
3539:             * <ul>
3540:             * <li><code>'&#92;u0000'</code> through <code>'&#92;u0008'</code>
3541:             * <li><code>'&#92;u000E'</code> through <code>'&#92;u001B'</code>
3542:             * <li><code>'&#92;u007F'</code> through <code>'&#92;u009F'</code>
3543:             * </ul>
3544:             *
3545:             * <li>all characters that have the <code>FORMAT</code> general
3546:             * category value
3547:             * </ul>
3548:             *
3549:             * <p><b>Note:</b> This method cannot handle <a
3550:             * href="#supplementary"> supplementary characters</a>. To support
3551:             * all Unicode characters, including supplementary characters, use
3552:             * the {@link #isIdentifierIgnorable(int)} method.
3553:             *
3554:             * @param   ch      the character to be tested.
3555:             * @return  <code>true</code> if the character is an ignorable
3556:             *          character that may be part of a Java or Unicode identifier;
3557:             *          <code>false</code> otherwise.
3558:             * @see     java.lang.Character#isJavaIdentifierPart(char)
3559:             * @see     java.lang.Character#isUnicodeIdentifierPart(char)
3560:             * @since   1.1
3561:             */
3562:            public static boolean isIdentifierIgnorable(char ch) {
3563:                return isIdentifierIgnorable((int) ch);
3564:            }
3565:
3566:            /**
3567:             * Determines if the specified character (Unicode code point) should be regarded as
3568:             * an ignorable character in a Java identifier or a Unicode identifier.
3569:             * <p>
3570:             * The following Unicode characters are ignorable in a Java identifier
3571:             * or a Unicode identifier:
3572:             * <ul>
3573:             * <li>ISO control characters that are not whitespace
3574:             * <ul>
3575:             * <li><code>'&#92;u0000'</code> through <code>'&#92;u0008'</code>
3576:             * <li><code>'&#92;u000E'</code> through <code>'&#92;u001B'</code>
3577:             * <li><code>'&#92;u007F'</code> through <code>'&#92;u009F'</code>
3578:             * </ul>
3579:             *
3580:             * <li>all characters that have the <code>FORMAT</code> general
3581:             * category value
3582:             * </ul>
3583:             *
3584:             * @param   codePoint the character (Unicode code point) to be tested.
3585:             * @return  <code>true</code> if the character is an ignorable
3586:             *          character that may be part of a Java or Unicode identifier;
3587:             *          <code>false</code> otherwise.
3588:             * @see     java.lang.Character#isJavaIdentifierPart(int)
3589:             * @see     java.lang.Character#isUnicodeIdentifierPart(int)
3590:             * @since   1.5
3591:             */
3592:            public static boolean isIdentifierIgnorable(int codePoint) {
3593:                return CharacterData.of(codePoint).isIdentifierIgnorable(
3594:                        codePoint);
3595:            }
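// Illustration only, not part of the JDK source: a minimal sketch that removes
// identifier-ignorable code points from a string, which can be handy before
// comparing two identifiers for visible equality. The class IgnorableDemo and
// the method stripIgnorable are hypothetical names. The samples use U+200D
// (ZERO WIDTH JOINER, general category FORMAT) and U+001B (an ISO control in
// the second range listed above).

```java
final class IgnorableDemo {
    /** Returns s with every identifier-ignorable code point removed. */
    static String stripIgnorable(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isIdentifierIgnorable(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String noisy = "to\u200Dtal\u001B";
        System.out.println(noisy.length());                 // prints 7
        System.out.println(stripIgnorable(noisy));          // prints total
        System.out.println(stripIgnorable("a\tb").length()); // prints 3, tab is whitespace and is kept
    }
}
```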
3596:
3597:            /**
3598:             * Converts the character argument to lowercase using case
3599:             * mapping information from the UnicodeData file.
3600:             * <p>
3601:             * Note that
3602:             * <code>Character.isLowerCase(Character.toLowerCase(ch))</code>
3603:             * returns <code>false</code> for some ranges of
3604:             * characters, particularly those that are symbols or ideographs.
3605:             *
3606:             * <p>In general, {@link java.lang.String#toLowerCase()} should be used to map
3607:             * characters to lowercase. <code>String</code> case mapping methods
3608:             * have several benefits over <code>Character</code> case mapping methods.
3609:             * <code>String</code> case mapping methods can perform locale-sensitive
3610:             * mappings, context-sensitive mappings, and 1:M character mappings, whereas
3611:             * the <code>Character</code> case mapping methods cannot.
3612:             *
3613:             * <p><b>Note:</b> This method cannot handle <a
3614:             * href="#supplementary"> supplementary characters</a>. To support
3615:             * all Unicode characters, including supplementary characters, use
3616:             * the {@link #toLowerCase(int)} method.
3617:             *
3618:             * @param   ch   the character to be converted.
3619:             * @return  the lowercase equivalent of the character, if any;
3620:             *          otherwise, the character itself.
3621:             * @see     java.lang.Character#isLowerCase(char)
3622:             * @see     java.lang.String#toLowerCase()
3623:             */
3624:            public static char toLowerCase(char ch) {
3625:                return (char) toLowerCase((int) ch);
3626:            }
3627:
3628:            /**
3629:             * Converts the character (Unicode code point) argument to
3630:             * lowercase using case mapping information from the UnicodeData
3631:             * file.
3632:             *
3633:             * <p> Note that
3634:             * <code>Character.isLowerCase(Character.toLowerCase(codePoint))</code>
3635:             * returns <code>false</code> for some ranges of
3636:             * characters, particularly those that are symbols or ideographs.
3637:             *
3638:             * <p>In general, {@link java.lang.String#toLowerCase()} should be used to map
3639:             * characters to lowercase. <code>String</code> case mapping methods
3640:             * have several benefits over <code>Character</code> case mapping methods.
3641:             * <code>String</code> case mapping methods can perform locale-sensitive
3642:             * mappings, context-sensitive mappings, and 1:M character mappings, whereas
3643:             * the <code>Character</code> case mapping methods cannot.
3644:             *
3645:             * @param   codePoint   the character (Unicode code point) to be converted.
3646:             * @return  the lowercase equivalent of the character (Unicode code
3647:             *          point), if any; otherwise, the character itself.
3648:             * @see     java.lang.Character#isLowerCase(int)
3649:             * @see     java.lang.String#toLowerCase()
3650:             *
3651:             * @since   1.5
3652:             */
3653:            public static int toLowerCase(int codePoint) {
3654:                return CharacterData.of(codePoint).toLowerCase(codePoint);
3655:            }
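// Illustration only, not part of the JDK source: a minimal sketch of
// lowercasing a string one code point at a time with toLowerCase(int), so that
// supplementary characters are mapped as well. The class
// CodePointLowerCaseDemo and the method lowerPerCodePoint are hypothetical
// names; U+10400 (DESERET CAPITAL LETTER LONG I) is used as a sample because
// its lowercase form U+10428 is also supplementary. This remains a 1:1
// mapping and is not a substitute for String.toLowerCase().

```java
final class CodePointLowerCaseDemo {
    /** Lowercases each code point independently with Character.toLowerCase(int). */
    static String lowerPerCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String deseret = new String(Character.toChars(0x10400));
        int mapped = lowerPerCodePoint(deseret).codePointAt(0);
        System.out.println(Integer.toHexString(mapped)); // prints 10428

        // The char overload sees only one surrogate at a time and leaves it unchanged.
        char high = deseret.charAt(0);
        System.out.println(Character.toLowerCase(high) == high); // prints true
    }
}
```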
3656:
3657:            /**
3658:             * Converts the character argument to uppercase using case mapping
3659:             * information from the UnicodeData file.
3660:             * <p>
3661:             * Note that
3662:             * <code>Character.isUpperCase(Character.toUpperCase(ch))</code>
3663:             * returns <code>false</code> for some ranges of
3664:             * characters, particularly those that are symbols or ideographs.
3665:             *
3666:             * <p>In general, {@link java.lang.String#toUpperCase()} should be used to map
3667:             * characters to uppercase. <code>String</code> case mapping methods
3668:             * have several benefits over <code>Character</code> case mapping methods.
3669:             * <code>String</code> case mapping methods can perform locale-sensitive
3670:             * mappings, context-sensitive mappings, and 1:M character mappings, whereas
3671:             * the <code>Character</code> case mapping methods cannot.
3672:             *
3673:             * <p><b>Note:</b> This method cannot handle <a
3674:             * href="#supplementary"> supplementary characters</a>. To support
3675:             * all Unicode characters, including supplementary characters, use
3676:             * the {@link #toUpperCase(int)} method.
3677:             *
3678:             * @param   ch   the character to be converted.
3679:             * @return  the uppercase equivalent of the character, if any;
3680:             *          otherwise, the character itself.
3681:             * @see     java.lang.Character#isUpperCase(char)
3682:             * @see     java.lang.String#toUpperCase()
3683:             */
3684:            public static char toUpperCase(char ch) {
3685:                return (char) toUpperCase((int) ch);
3686:            }
3687:
3688:            /**
3689:             * Converts the character (Unicode code point) argument to
3690:             * uppercase using case mapping information from the UnicodeData
3691:             * file.
3692:             * 
3693:             * <p>Note that
3694:             * <code>Character.isUpperCase(Character.toUpperCase(codePoint))</code>
3695:             * does not always return <code>true</code> for some ranges of
3696:             * characters, particularly those that are symbols or ideographs.
3697:             *
3698:             * <p>In general, {@link java.lang.String#toUpperCase()} should be used to map
3699:             * characters to uppercase. <code>String</code> case mapping methods
3700:             * have several benefits over <code>Character</code> case mapping methods.
3701:             * <code>String</code> case mapping methods can perform locale-sensitive
3702:             * mappings, context-sensitive mappings, and 1:M character mappings, whereas
3703:             * the <code>Character</code> case mapping methods cannot.
3704:             *
3705:             * @param   codePoint   the character (Unicode code point) to be converted.
3706:             * @return  the uppercase equivalent of the character, if any;
3707:             *          otherwise, the character itself.
3708:             * @see     java.lang.Character#isUpperCase(int)
3709:             * @see     java.lang.String#toUpperCase()
3710:             * 
3711:             * @since   1.5
3712:             */
3713:            public static int toUpperCase(int codePoint) {
3714:                return CharacterData.of(codePoint).toUpperCase(codePoint);
3715:            }
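
The Javadoc above contrasts the `Character` case mapping methods with `String.toUpperCase()`. The following is a minimal sketch of that difference, not part of the JDK source: the class name `CaseMappingSketch` and its helper methods are hypothetical, and `Locale.ROOT` is chosen here only to keep the result independent of the default locale.

```java
import java.util.Locale;

// Hypothetical demo class; not part of java.lang.Character.
final class CaseMappingSketch {

    /** Maps a single UTF-16 code unit; a supplementary character does not fit in a char. */
    static char upperChar(char ch) {
        return Character.toUpperCase(ch);
    }

    /** Maps any Unicode code point, including supplementary ones. */
    static int upperCodePoint(int codePoint) {
        return Character.toUpperCase(codePoint);
    }

    /** Maps a whole string, which allows 1:M results; Locale.ROOT avoids locale-specific rules. */
    static String upperString(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // U+00DF (sharp s) has no single-character uppercase form, so it comes back unchanged.
        System.out.println(upperChar('\u00DF') == '\u00DF');
        // The String method can expand the same character to two characters.
        System.out.println(upperString("\u00DF"));
        // U+10428 (a Deseret small letter) is reachable only through the int overload.
        System.out.println(Integer.toHexString(upperCodePoint(0x10428)));
    }
}
```

The `char` overload is enough for text known to stay in the Basic Multilingual Plane; the `int` overload or the `String` method is the safer default otherwise.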
3716:
3717:            /**
3718:             * Converts the character argument to titlecase using case mapping
3719:             * information from the UnicodeData file. If a character has no
3720:             * explicit titlecase mapping and is not itself a titlecase char
3721:             * according to UnicodeData, then the uppercase mapping is
3722:             * returned as an equivalent titlecase mapping. If the
3723:             * <code>char</code> argument is already a titlecase
3724:             * <code>char</code>, the same <code>char</code> value will be
3725:             * returned.
3726:             * <p>
3727:             * Note that
3728:             * <code>Character.isTitleCase(Character.toTitleCase(ch))</code>
3729:             * does not always return <code>true</code> for some ranges of
3730:             * characters.
3731:             *
3732:             * <p><b>Note:</b> This method cannot handle <a
3733:             * href="#supplementary"> supplementary characters</a>. To support
3734:             * all Unicode characters, including supplementary characters, use
3735:             * the {@link #toTitleCase(int)} method.
3736:             *
3737:             * @param   ch   the character to be converted.
3738:             * @return  the titlecase equivalent of the character, if any;
3739:             *          otherwise, the character itself.
3740:             * @see     java.lang.Character#isTitleCase(char)
3741:             * @see     java.lang.Character#toLowerCase(char)
3742:             * @see     java.lang.Character#toUpperCase(char)
3743:             * @since   1.0.2
3744:             */
3745:            public static char toTitleCase(char ch) {
3746:                return (char) toTitleCase((int) ch);
3747:            }
3748:
3749:            /**
3750:             * Converts the character (Unicode code point) argument to titlecase using case mapping
3751:             * information from the UnicodeData file. If a character has no
3752:             * explicit titlecase mapping and is not itself a titlecase char
3753:             * according to UnicodeData, then the uppercase mapping is
3754:             * returned as an equivalent titlecase mapping. If the
3755:             * character argument is already a titlecase
3756:             * character, the same character value will be
3757:             * returned.
3758:             * 
3759:             * <p>Note that
3760:             * <code>Character.isTitleCase(Character.toTitleCase(codePoint))</code>
3761:             * does not always return <code>true</code> for some ranges of
3762:             * characters.
3763:             *
3764:             * @param   codePoint   the character (Unicode code point) to be converted.
3765:             * @return  the titlecase equivalent of the character, if any;
3766:             *          otherwise, the character itself.
3767:             * @see     java.lang.Character#isTitleCase(int)
3768:             * @see     java.lang.Character#toLowerCase(int)
3769:             * @see     java.lang.Character#toUpperCase(int)
3770:             * @since   1.5
3771:             */
3772:            public static int toTitleCase(int codePoint) {
3773:                return CharacterData.of(codePoint).toTitleCase(codePoint);
3774:            }
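
As a hedged illustration of the titlecase rules described above, the sketch below uses the Latin digraph U+01C6, which has a distinct titlecase form U+01C5, and the plain letter `a`, which has none and therefore falls back to its uppercase mapping. The class `TitleCaseSketch` and its methods are hypothetical names introduced for this example.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class TitleCaseSketch {

    /** Returns the titlecase mapping of a code point. */
    static int title(int codePoint) {
        return Character.toTitleCase(codePoint);
    }

    /** Reports whether the titlecase mapping is itself a titlecase letter. */
    static boolean mapsToTitlecaseLetter(int codePoint) {
        return Character.isTitleCase(Character.toTitleCase(codePoint));
    }

    public static void main(String[] args) {
        // U+01C6 has an explicit titlecase mapping to U+01C5.
        System.out.println(Integer.toHexString(title(0x01C6)));
        // 'a' has no titlecase form, so its uppercase mapping is used instead.
        System.out.println((char) title('a'));
        // 'A' is an uppercase letter, not a titlecase letter, so this check fails for 'a'.
        System.out.println(mapsToTitlecaseLetter('a'));
    }
}
```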
3775:
3776:            /**
3777:             * Returns the numeric value of the character <code>ch</code> in the
3778:             * specified radix.
3779:             * <p>
3780:             * If the radix is not in the range <code>MIN_RADIX</code>&nbsp;&lt;=
3781:             * <code>radix</code>&nbsp;&lt;= <code>MAX_RADIX</code> or if the
3782:             * value of <code>ch</code> is not a valid digit in the specified
3783:             * radix, <code>-1</code> is returned. A character is a valid digit
3784:             * if at least one of the following is true:
3785:             * <ul>
3786:             * <li>The method <code>isDigit</code> returns <code>true</code> for the character
3787:             *     and the Unicode decimal digit value of the character (or its
3788:             *     single-character decomposition) is less than the specified radix.
3789:             *     In this case the decimal digit value is returned.
3790:             * <li>The character is one of the uppercase Latin letters
3791:             *     <code>'A'</code> through <code>'Z'</code> and its code is less than
3792:             *     <code>radix&nbsp;+ 'A'&nbsp;-&nbsp;10</code>.
3793:             *     In this case, <code>ch&nbsp;- 'A'&nbsp;+&nbsp;10</code>
3794:             *     is returned.
3795:             * <li>The character is one of the lowercase Latin letters
3796:             *     <code>'a'</code> through <code>'z'</code> and its code is less than
3797:             *     <code>radix&nbsp;+ 'a'&nbsp;-&nbsp;10</code>.
3798:             *     In this case, <code>ch&nbsp;- 'a'&nbsp;+&nbsp;10</code>
3799:             *     is returned.
3800:             * </ul>
3801:             *
3802:             * <p><b>Note:</b> This method cannot handle <a
3803:             * href="#supplementary"> supplementary characters</a>. To support
3804:             * all Unicode characters, including supplementary characters, use
3805:             * the {@link #digit(int, int)} method.
3806:             *
3807:             * @param   ch      the character to be converted.
3808:             * @param   radix   the radix.
3809:             * @return  the numeric value represented by the character in the
3810:             *          specified radix.
3811:             * @see     java.lang.Character#forDigit(int, int)
3812:             * @see     java.lang.Character#isDigit(char)
3813:             */
3814:            public static int digit(char ch, int radix) {
3815:                return digit((int) ch, radix);
3816:            }
3817:
3818:            /**
3819:             * Returns the numeric value of the specified character (Unicode
3820:             * code point) in the specified radix.
3821:             * 
3822:             * <p>If the radix is not in the range <code>MIN_RADIX</code>&nbsp;&lt;=
3823:             * <code>radix</code>&nbsp;&lt;= <code>MAX_RADIX</code> or if the
3824:             * character is not a valid digit in the specified
3825:             * radix, <code>-1</code> is returned. A character is a valid digit
3826:             * if at least one of the following is true:
3827:             * <ul>
3828:             * <li>The method {@link #isDigit(int) isDigit(codePoint)} returns <code>true</code> for the character
3829:             *     and the Unicode decimal digit value of the character (or its
3830:             *     single-character decomposition) is less than the specified radix.
3831:             *     In this case the decimal digit value is returned.
3832:             * <li>The character is one of the uppercase Latin letters
3833:             *     <code>'A'</code> through <code>'Z'</code> and its code is less than
3834:             *     <code>radix&nbsp;+ 'A'&nbsp;-&nbsp;10</code>.
3835:             *     In this case, <code>codePoint&nbsp;- 'A'&nbsp;+&nbsp;10</code>
3836:             *     is returned.
3837:             * <li>The character is one of the lowercase Latin letters
3838:             *     <code>'a'</code> through <code>'z'</code> and its code is less than
3839:             *     <code>radix&nbsp;+ 'a'&nbsp;-&nbsp;10</code>.
3840:             *     In this case, <code>codePoint&nbsp;- 'a'&nbsp;+&nbsp;10</code>
3841:             *     is returned.
3842:             * </ul>
3843:             *
3844:             * @param   codePoint the character (Unicode code point) to be converted.
3845:             * @param   radix   the radix.
3846:             * @return  the numeric value represented by the character in the
3847:             *          specified radix.
3848:             * @see     java.lang.Character#forDigit(int, int)
3849:             * @see     java.lang.Character#isDigit(int)
3850:             * @since   1.5
3851:             */
3852:            public static int digit(int codePoint, int radix) {
3853:                return CharacterData.of(codePoint).digit(codePoint, radix);
3854:            }
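
The digit rules above can be exercised with a small parser. This is a sketch under stated assumptions, not JDK code: the class `DigitSketch` and its `parse` method are hypothetical, `-1` is used as the failure value purely by analogy with `Character.digit`, and overflow of the `long` accumulator is deliberately not handled.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class DigitSketch {

    /**
     * Parses a non-negative number in the given radix, one code point at a time.
     * Returns -1 if the text is empty or contains a code point that is not a
     * valid digit in that radix (which includes the case of an invalid radix).
     */
    static long parse(String text, int radix) {
        if (text.length() == 0) {
            return -1;
        }
        long value = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int digit = Character.digit(codePoint, radix);
            if (digit < 0) {
                return -1; // not a digit in this radix, or radix out of range
            }
            value = value * radix + digit;
            i += Character.charCount(codePoint); // step over surrogate pairs correctly
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16));
        System.out.println(parse("19", 8));
        System.out.println(parse("z", 36));
    }
}
```

Because `Character.digit` accepts any Unicode decimal digit, the parser also accepts non-Latin digits such as the Arabic-Indic ones.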
3855:
3856:            /**
3857:             * Returns the <code>int</code> value that the specified Unicode
3858:             * character represents. For example, the character
3859:             * <code>'&#92;u216C'</code> (the Roman numeral fifty) will return
3860:             * an <code>int</code> with a value of 50.
3861:             * <p>
3862:             * The letters A-Z in their uppercase (<code>'&#92;u0041'</code> through
3863:             * <code>'&#92;u005A'</code>), lowercase
3864:             * (<code>'&#92;u0061'</code> through <code>'&#92;u007A'</code>), and
3865:             * full width variant (<code>'&#92;uFF21'</code> through
3866:             * <code>'&#92;uFF3A'</code> and <code>'&#92;uFF41'</code> through
3867:             * <code>'&#92;uFF5A'</code>) forms have numeric values from 10
3868:             * through 35. This is independent of the Unicode specification,
3869:             * which does not assign numeric values to these <code>char</code>
3870:             * values.
3871:             * <p>
3872:             * If the character does not have a numeric value, then -1 is returned.
3873:             * If the character has a numeric value that cannot be represented as a
3874:             * nonnegative integer (for example, a fractional value), then -2
3875:             * is returned.
3876:             *
3877:             * <p><b>Note:</b> This method cannot handle <a
3878:             * href="#supplementary"> supplementary characters</a>. To support
3879:             * all Unicode characters, including supplementary characters, use
3880:             * the {@link #getNumericValue(int)} method.
3881:             *
3882:             * @param   ch      the character to be converted.
3883:             * @return  the numeric value of the character, as a nonnegative <code>int</code>
3884:             *           value; -2 if the character has a numeric value that is not a
3885:             *          nonnegative integer; -1 if the character has no numeric value.
3886:             * @see     java.lang.Character#forDigit(int, int)
3887:             * @see     java.lang.Character#isDigit(char)
3888:             * @since   1.1
3889:             */
3890:            public static int getNumericValue(char ch) {
3891:                return getNumericValue((int) ch);
3892:            }
3893:
3894:            /**
3895:             * Returns the <code>int</code> value that the specified 
3896:             * character (Unicode code point) represents. For example, the character
3897:             * <code>'&#92;u216C'</code> (the Roman numeral fifty) will return
3898:             * an <code>int</code> with a value of 50.
3899:             * <p>
3900:             * The letters A-Z in their uppercase (<code>'&#92;u0041'</code> through
3901:             * <code>'&#92;u005A'</code>), lowercase
3902:             * (<code>'&#92;u0061'</code> through <code>'&#92;u007A'</code>), and
3903:             * full width variant (<code>'&#92;uFF21'</code> through
3904:             * <code>'&#92;uFF3A'</code> and <code>'&#92;uFF41'</code> through
3905:             * <code>'&#92;uFF5A'</code>) forms have numeric values from 10
3906:             * through 35. This is independent of the Unicode specification,
3907:             * which does not assign numeric values to these <code>char</code>
3908:             * values.
3909:             * <p>
3910:             * If the character does not have a numeric value, then -1 is returned.
3911:             * If the character has a numeric value that cannot be represented as a
3912:             * nonnegative integer (for example, a fractional value), then -2
3913:             * is returned.
3914:             *
3915:             * @param   codePoint the character (Unicode code point) to be converted.
3916:             * @return  the numeric value of the character, as a nonnegative <code>int</code>
3917:             *          value; -2 if the character has a numeric value that is not a
3918:             *          nonnegative integer; -1 if the character has no numeric value.
3919:             * @see     java.lang.Character#forDigit(int, int)
3920:             * @see     java.lang.Character#isDigit(int)
3921:             * @since   1.5
3922:             */
3923:            public static int getNumericValue(int codePoint) {
3924:                return CharacterData.of(codePoint).getNumericValue(codePoint);
3925:            }
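
The three kinds of result described above (a nonnegative value, `-1`, and `-2`) can be told apart as in the following sketch. The class `NumericValueSketch` and the wording of the returned strings are hypothetical and exist only for this illustration.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class NumericValueSketch {

    /** Turns the result of Character.getNumericValue into a readable description. */
    static String describe(int codePoint) {
        int value = Character.getNumericValue(codePoint);
        if (value == -1) {
            return "none";                      // no numeric value at all
        }
        if (value == -2) {
            return "not a nonnegative integer"; // for example, a fraction
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(describe(0x216C)); // Roman numeral fifty
        System.out.println(describe('a'));    // Latin letters count as 10 through 35
        System.out.println(describe(0x00BD)); // vulgar fraction one half
        System.out.println(describe('$'));    // a symbol without a numeric value
    }
}
```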
3926:
3927:            /**
3928:             * Determines if the specified character is ISO-LATIN-1 white space.
3929:             * This method returns <code>true</code> for the following five
3930:             * characters only:
3931:             * <table>
3932:             * <tr><td><code>'\t'</code></td>            <td><code>'&#92;u0009'</code></td>
3933:             *     <td><code>HORIZONTAL TABULATION</code></td></tr>
3934:             * <tr><td><code>'\n'</code></td>            <td><code>'&#92;u000A'</code></td>
3935:             *     <td><code>NEW LINE</code></td></tr>
3936:             * <tr><td><code>'\f'</code></td>            <td><code>'&#92;u000C'</code></td>
3937:             *     <td><code>FORM FEED</code></td></tr>
3938:             * <tr><td><code>'\r'</code></td>            <td><code>'&#92;u000D'</code></td>
3939:             *     <td><code>CARRIAGE RETURN</code></td></tr>
3940:             * <tr><td><code>'&nbsp;'</code></td>  <td><code>'&#92;u0020'</code></td>
3941:             *     <td><code>SPACE</code></td></tr>
3942:             * </table>
3943:             *
3944:             * @param      ch   the character to be tested.
3945:             * @return     <code>true</code> if the character is ISO-LATIN-1 white
3946:             *             space; <code>false</code> otherwise.
3947:             * @see        java.lang.Character#isSpaceChar(char)
3948:             * @see        java.lang.Character#isWhitespace(char)
3949:             * @deprecated Replaced by {@link #isWhitespace(char)}.
3950:             */
3951:            @Deprecated
3952:            public static boolean isSpace(char ch) {
3953:                return (ch <= 0x0020)
3954:                        && (((((1L << 0x0009) | (1L << 0x000A) | (1L << 0x000C)
3955:                                | (1L << 0x000D) | (1L << 0x0020)) >> ch) & 1L) != 0);
3956:            }
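
The body of `isSpace` tests set membership with a bitmask rather than a chain of comparisons. The sketch below restates that technique in isolation; the class `BitmaskSetSketch`, the constant `MEMBERS`, and the method `contains` are hypothetical names, and the code is an illustration rather than the JDK implementation.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class BitmaskSetSketch {

    // One bit per member: tab, line feed, form feed, carriage return, space.
    private static final long MEMBERS =
            (1L << 0x09) | (1L << 0x0A) | (1L << 0x0C) | (1L << 0x0D) | (1L << 0x20);

    /** Returns true if ch is one of the five characters encoded in MEMBERS. */
    static boolean contains(char ch) {
        // The range check matters: a long shift uses only the low six bits of the
        // shift distance, so without it 'I' (0x49) would alias to bit 0x09.
        return ch <= 0x20 && ((MEMBERS >> ch) & 1L) != 0;
    }

    public static void main(String[] args) {
        System.out.println(contains('\t'));
        System.out.println(contains(' '));
        System.out.println(contains('I'));
    }
}
```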
3957:
3958:            /**
3959:             * Determines if the specified character is a Unicode space character.
3960:             * A character is considered to be a space character if and only if
3961:             * it is specified to be a space character by the Unicode standard. This
3962:             * method returns true if the character's general category type is any of
3963:             * the following:
3964:             * <ul>
3965:             * <li> <code>SPACE_SEPARATOR</code>
3966:             * <li> <code>LINE_SEPARATOR</code>
3967:             * <li> <code>PARAGRAPH_SEPARATOR</code>
3968:             * </ul>
3969:             *
3970:             * <p><b>Note:</b> This method cannot handle <a
3971:             * href="#supplementary"> supplementary characters</a>. To support
3972:             * all Unicode characters, including supplementary characters, use
3973:             * the {@link #isSpaceChar(int)} method.
3974:             *
3975:             * @param   ch      the character to be tested.
3976:             * @return  <code>true</code> if the character is a space character; 
3977:             *          <code>false</code> otherwise.
3978:             * @see     java.lang.Character#isWhitespace(char)
3979:             * @since   1.1
3980:             */
3981:            public static boolean isSpaceChar(char ch) {
3982:                return isSpaceChar((int) ch);
3983:            }
3984:
3985:            /**
3986:             * Determines if the specified character (Unicode code point) is a
3987:             * Unicode space character.  A character is considered to be a
3988:             * space character if and only if it is specified to be a space
3989:             * character by the Unicode standard. This method returns true if
3990:             * the character's general category type is any of the following:
3991:             *
3992:             * <ul>
3993:             * <li> {@link #SPACE_SEPARATOR}
3994:             * <li> {@link #LINE_SEPARATOR}
3995:             * <li> {@link #PARAGRAPH_SEPARATOR}
3996:             * </ul>
3997:             *
3998:             * @param   codePoint the character (Unicode code point) to be tested.
3999:             * @return  <code>true</code> if the character is a space character; 
4000:             *          <code>false</code> otherwise.
4001:             * @see     java.lang.Character#isWhitespace(int)
4002:             * @since   1.5
4003:             */
4004:            public static boolean isSpaceChar(int codePoint) {
4005:                return ((((1 << Character.SPACE_SEPARATOR)
4006:                        | (1 << Character.LINE_SEPARATOR) | (1 << Character.PARAGRAPH_SEPARATOR)) >> getType(codePoint)) & 1) != 0;
4007:            }
4008:
4009:            /**
4010:             * Determines if the specified character is white space according to Java.
4011:             * A character is a Java whitespace character if and only if it satisfies
4012:             * one of the following criteria:
4013:             * <ul>
4014:             * <li> It is a Unicode space character (<code>SPACE_SEPARATOR</code>,
4015:             *      <code>LINE_SEPARATOR</code>, or <code>PARAGRAPH_SEPARATOR</code>) 
4016:             *      but is not also a non-breaking space (<code>'&#92;u00A0'</code>,
4017:             *      <code>'&#92;u2007'</code>, <code>'&#92;u202F'</code>).
4018:             * <li> It is <code>'&#92;u0009'</code>, HORIZONTAL TABULATION.
4019:             * <li> It is <code>'&#92;u000A'</code>, LINE FEED.
4020:             * <li> It is <code>'&#92;u000B'</code>, VERTICAL TABULATION.
4021:             * <li> It is <code>'&#92;u000C'</code>, FORM FEED.
4022:             * <li> It is <code>'&#92;u000D'</code>, CARRIAGE RETURN.
4023:             * <li> It is <code>'&#92;u001C'</code>, FILE SEPARATOR.
4024:             * <li> It is <code>'&#92;u001D'</code>, GROUP SEPARATOR.
4025:             * <li> It is <code>'&#92;u001E'</code>, RECORD SEPARATOR.
4026:             * <li> It is <code>'&#92;u001F'</code>, UNIT SEPARATOR.
4027:             * </ul>
4028:             *
4029:             * <p><b>Note:</b> This method cannot handle <a
4030:             * href="#supplementary"> supplementary characters</a>. To support
4031:             * all Unicode characters, including supplementary characters, use
4032:             * the {@link #isWhitespace(int)} method.
4033:             *
4034:             * @param   ch the character to be tested.
4035:             * @return  <code>true</code> if the character is a Java whitespace
4036:             *          character; <code>false</code> otherwise.
4037:             * @see     java.lang.Character#isSpaceChar(char)
4038:             * @since   1.1
4039:             */
4040:            public static boolean isWhitespace(char ch) {
4041:                return isWhitespace((int) ch);
4042:            }
4043:
4044:            /**
4045:             * Determines if the specified character (Unicode code point) is
4046:             * white space according to Java.  A character is a Java
4047:             * whitespace character if and only if it satisfies one of the
4048:             * following criteria:
4049:             * <ul>
4050:             * <li> It is a Unicode space character ({@link #SPACE_SEPARATOR},
4051:             *      {@link #LINE_SEPARATOR}, or {@link #PARAGRAPH_SEPARATOR}) 
4052:             *      but is not also a non-breaking space (<code>'&#92;u00A0'</code>,
4053:             *      <code>'&#92;u2007'</code>, <code>'&#92;u202F'</code>).
4054:             * <li> It is <code>'&#92;u0009'</code>, HORIZONTAL TABULATION.
4055:             * <li> It is <code>'&#92;u000A'</code>, LINE FEED.
4056:             * <li> It is <code>'&#92;u000B'</code>, VERTICAL TABULATION.
4057:             * <li> It is <code>'&#92;u000C'</code>, FORM FEED.
4058:             * <li> It is <code>'&#92;u000D'</code>, CARRIAGE RETURN.
4059:             * <li> It is <code>'&#92;u001C'</code>, FILE SEPARATOR.
4060:             * <li> It is <code>'&#92;u001D'</code>, GROUP SEPARATOR.
4061:             * <li> It is <code>'&#92;u001E'</code>, RECORD SEPARATOR.
4062:             * <li> It is <code>'&#92;u001F'</code>, UNIT SEPARATOR.
4063:             * </ul>
4065:             *
4066:             * @param   codePoint the character (Unicode code point) to be tested.
4067:             * @return  <code>true</code> if the character is a Java whitespace
4068:             *          character; <code>false</code> otherwise.
4069:             * @see     java.lang.Character#isSpaceChar(int)
4070:             * @since   1.5
4071:             */
4072:            public static boolean isWhitespace(int codePoint) {
4073:                return CharacterData.of(codePoint).isWhitespace(codePoint);
4074:            }
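
`isSpaceChar` and `isWhitespace` overlap but are not the same set, as the two criteria lists above show. The following sketch classifies a code point by both tests; the class `SpaceKindsSketch` and the labels it returns are hypothetical and chosen only for this example.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class SpaceKindsSketch {

    /** Classifies a code point by the Unicode space test and the Java whitespace test. */
    static String classify(int codePoint) {
        boolean unicodeSpace = Character.isSpaceChar(codePoint);
        boolean javaWhitespace = Character.isWhitespace(codePoint);
        if (unicodeSpace && javaWhitespace) {
            return "both";
        }
        if (unicodeSpace) {
            return "space only";      // for example, the non-breaking spaces
        }
        if (javaWhitespace) {
            return "whitespace only"; // for example, tab and the other listed controls
        }
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(classify(' '));
        System.out.println(classify(0x00A0)); // NO-BREAK SPACE
        System.out.println(classify(0x0009)); // HORIZONTAL TABULATION
        System.out.println(classify('a'));
    }
}
```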
4075:
4076:            /**
4077:             * Determines if the specified character is an ISO control
4078:             * character.  A character is considered to be an ISO control
4079:             * character if its code is in the range <code>'&#92;u0000'</code>
4080:             * through <code>'&#92;u001F'</code> or in the range
4081:             * <code>'&#92;u007F'</code> through <code>'&#92;u009F'</code>.
4082:             *
4083:             * <p><b>Note:</b> This method cannot handle <a
4084:             * href="#supplementary"> supplementary characters</a>. To support
4085:             * all Unicode characters, including supplementary characters, use
4086:             * the {@link #isISOControl(int)} method.
4087:             *
4088:             * @param   ch      the character to be tested.
4089:             * @return  <code>true</code> if the character is an ISO control character;
4090:             *          <code>false</code> otherwise.
4091:             *
4092:             * @see     java.lang.Character#isSpaceChar(char)
4093:             * @see     java.lang.Character#isWhitespace(char)
4094:             * @since   1.1
4095:             */
4096:            public static boolean isISOControl(char ch) {
4097:                return isISOControl((int) ch);
4098:            }
4099:
4100:            /**
4101:             * Determines if the specified character (Unicode code point) is an ISO control
4102:             * character.  A character is considered to be an ISO control
4103:             * character if its code is in the range <code>'&#92;u0000'</code>
4104:             * through <code>'&#92;u001F'</code> or in the range
4105:             * <code>'&#92;u007F'</code> through <code>'&#92;u009F'</code>.
4106:             *
4107:             * @param   codePoint the character (Unicode code point) to be tested.
4108:             * @return  <code>true</code> if the character is an ISO control character;
4109:             *          <code>false</code> otherwise.
4110:             * @see     java.lang.Character#isSpaceChar(int)
4111:             * @see     java.lang.Character#isWhitespace(int)
4112:             * @since   1.5
4113:             */
4114:            public static boolean isISOControl(int codePoint) {
4115:                return (codePoint >= 0x0000 && codePoint <= 0x001F)
4116:                        || (codePoint >= 0x007F && codePoint <= 0x009F);
4117:            }
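
One common use of the range test above is removing control characters from text. The sketch below shows that use under the assumption that dropping the characters outright is acceptable; the class `ControlFilterSketch` and its method `stripControls` are hypothetical names.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class ControlFilterSketch {

    /** Returns a copy of s without code points for which Character.isISOControl is true. */
    static String stripControls(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            if (!Character.isISOControl(codePoint)) {
                out.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint); // step over surrogate pairs correctly
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Tab is in the first control range, so it is removed.
        System.out.println(stripControls("a\tb"));
        // An ordinary space (0x20) lies just above that range and is kept.
        System.out.println(stripControls("a b"));
    }
}
```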
4118:
4119:            /**
4120:             * Returns a value indicating a character's general category.
4121:             *
4122:             * <p><b>Note:</b> This method cannot handle <a
4123:             * href="#supplementary"> supplementary characters</a>. To support
4124:             * all Unicode characters, including supplementary characters, use
4125:             * the {@link #getType(int)} method.
4126:             *
4127:             * @param   ch      the character to be tested.
4128:             * @return  a value of type <code>int</code> representing the 
4129:             *          character's general category.
4130:             * @see     java.lang.Character#COMBINING_SPACING_MARK
4131:             * @see     java.lang.Character#CONNECTOR_PUNCTUATION
4132:             * @see     java.lang.Character#CONTROL
4133:             * @see     java.lang.Character#CURRENCY_SYMBOL
4134:             * @see     java.lang.Character#DASH_PUNCTUATION
4135:             * @see     java.lang.Character#DECIMAL_DIGIT_NUMBER
4136:             * @see     java.lang.Character#ENCLOSING_MARK
4137:             * @see     java.lang.Character#END_PUNCTUATION
4138:             * @see     java.lang.Character#FINAL_QUOTE_PUNCTUATION
4139:             * @see     java.lang.Character#FORMAT
4140:             * @see     java.lang.Character#INITIAL_QUOTE_PUNCTUATION
4141:             * @see     java.lang.Character#LETTER_NUMBER
4142:             * @see     java.lang.Character#LINE_SEPARATOR
4143:             * @see     java.lang.Character#LOWERCASE_LETTER
4144:             * @see     java.lang.Character#MATH_SYMBOL
4145:             * @see     java.lang.Character#MODIFIER_LETTER
4146:             * @see     java.lang.Character#MODIFIER_SYMBOL
4147:             * @see     java.lang.Character#NON_SPACING_MARK
4148:             * @see     java.lang.Character#OTHER_LETTER
4149:             * @see     java.lang.Character#OTHER_NUMBER
4150:             * @see     java.lang.Character#OTHER_PUNCTUATION
4151:             * @see     java.lang.Character#OTHER_SYMBOL
4152:             * @see     java.lang.Character#PARAGRAPH_SEPARATOR
4153:             * @see     java.lang.Character#PRIVATE_USE
4154:             * @see     java.lang.Character#SPACE_SEPARATOR
4155:             * @see     java.lang.Character#START_PUNCTUATION
4156:             * @see     java.lang.Character#SURROGATE
4157:             * @see     java.lang.Character#TITLECASE_LETTER
4158:             * @see     java.lang.Character#UNASSIGNED
4159:             * @see     java.lang.Character#UPPERCASE_LETTER
4160:             * @since   1.1
4161:             */
4162:            public static int getType(char ch) {
4163:                return getType((int) ch);
4164:            }
4165:
4166:            /**
4167:             * Returns a value indicating a character's general category.
4168:             *
4169:             * @param   codePoint the character (Unicode code point) to be tested.
4170:             * @return  a value of type <code>int</code> representing the 
4171:             *          character's general category.
4172:             * @see     Character#COMBINING_SPACING_MARK COMBINING_SPACING_MARK
4173:             * @see     Character#CONNECTOR_PUNCTUATION CONNECTOR_PUNCTUATION
4174:             * @see     Character#CONTROL CONTROL
4175:             * @see     Character#CURRENCY_SYMBOL CURRENCY_SYMBOL
4176:             * @see     Character#DASH_PUNCTUATION DASH_PUNCTUATION
4177:             * @see     Character#DECIMAL_DIGIT_NUMBER DECIMAL_DIGIT_NUMBER
4178:             * @see     Character#ENCLOSING_MARK ENCLOSING_MARK
4179:             * @see     Character#END_PUNCTUATION END_PUNCTUATION
4180:             * @see     Character#FINAL_QUOTE_PUNCTUATION FINAL_QUOTE_PUNCTUATION
4181:             * @see     Character#FORMAT FORMAT
4182:             * @see     Character#INITIAL_QUOTE_PUNCTUATION INITIAL_QUOTE_PUNCTUATION
4183:             * @see     Character#LETTER_NUMBER LETTER_NUMBER
4184:             * @see     Character#LINE_SEPARATOR LINE_SEPARATOR
4185:             * @see     Character#LOWERCASE_LETTER LOWERCASE_LETTER
4186:             * @see     Character#MATH_SYMBOL MATH_SYMBOL
4187:             * @see     Character#MODIFIER_LETTER MODIFIER_LETTER
4188:             * @see     Character#MODIFIER_SYMBOL MODIFIER_SYMBOL
4189:             * @see     Character#NON_SPACING_MARK NON_SPACING_MARK
4190:             * @see     Character#OTHER_LETTER OTHER_LETTER
4191:             * @see     Character#OTHER_NUMBER OTHER_NUMBER
4192:             * @see     Character#OTHER_PUNCTUATION OTHER_PUNCTUATION
4193:             * @see     Character#OTHER_SYMBOL OTHER_SYMBOL
4194:             * @see     Character#PARAGRAPH_SEPARATOR PARAGRAPH_SEPARATOR
4195:             * @see     Character#PRIVATE_USE PRIVATE_USE
4196:             * @see     Character#SPACE_SEPARATOR SPACE_SEPARATOR
4197:             * @see     Character#START_PUNCTUATION START_PUNCTUATION
4198:             * @see     Character#SURROGATE SURROGATE
4199:             * @see     Character#TITLECASE_LETTER TITLECASE_LETTER
4200:             * @see     Character#UNASSIGNED UNASSIGNED
4201:             * @see     Character#UPPERCASE_LETTER UPPERCASE_LETTER
4202:             * @since   1.5
4203:             */
4204:            public static int getType(int codePoint) {
4205:                return CharacterData.of(codePoint).getType(codePoint);
4206:            }
4207:
4208:            /**
4209:             * Determines the character representation for a specific digit in
4210:             * the specified radix. If the value of <code>radix</code> is not a
4211:             * valid radix, or the value of <code>digit</code> is not a valid
4212:             * digit in the specified radix, the null character
4213:             * (<code>'&#92;u0000'</code>) is returned.
4214:             * <p>
4215:             * The <code>radix</code> argument is valid if it is greater than or
4216:             * equal to <code>MIN_RADIX</code> and less than or equal to
4217:             * <code>MAX_RADIX</code>. The <code>digit</code> argument is valid if
4218:             * <code>0&nbsp;&lt;=&nbsp;digit&nbsp;&lt;&nbsp;radix</code>.
4219:             * <p>
4220:             * If the digit is less than 10, then
4221:             * <code>'0'&nbsp;+ digit</code> is returned. Otherwise, the value
4222:             * <code>'a'&nbsp;+ digit&nbsp;-&nbsp;10</code> is returned.
4223:             *
4224:             * @param   digit   the number to convert to a character.
4225:             * @param   radix   the radix.
4226:             * @return  the <code>char</code> representation of the specified digit
4227:             *          in the specified radix.
4228:             * @see     java.lang.Character#MIN_RADIX
4229:             * @see     java.lang.Character#MAX_RADIX
4230:             * @see     java.lang.Character#digit(char, int)
4231:             */
4232:            public static char forDigit(int digit, int radix) {
4233:                if ((digit >= radix) || (digit < 0)) {
4234:                    return '\0';
4235:                }
4236:                if ((radix < Character.MIN_RADIX)
4237:                        || (radix > Character.MAX_RADIX)) {
4238:                    return '\0';
4239:                }
4240:                if (digit < 10) {
4241:                    return (char) ('0' + digit);
4242:                }
4243:                return (char) ('a' - 10 + digit);
4244:            }
4245:
4246:            /**
4247:             * Returns the Unicode directionality property for the given
4248:             * character.  Character directionality is used to calculate the
4249:             * visual ordering of text. The directionality value of undefined
4250:             * <code>char</code> values is <code>DIRECTIONALITY_UNDEFINED</code>.
4251:             *
4252:             * <p><b>Note:</b> This method cannot handle <a
4253:             * href="#supplementary"> supplementary characters</a>. To support
4254:             * all Unicode characters, including supplementary characters, use
4255:             * the {@link #getDirectionality(int)} method.
4256:             *
4257:             * @param  ch <code>char</code> for which the directionality property 
4258:             *            is requested.
4259:             * @return the directionality property of the <code>char</code> value.
4260:             *
4261:             * @see Character#DIRECTIONALITY_UNDEFINED
4262:             * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT
4263:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT
4264:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
4265:             * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER
4266:             * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
4267:             * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
4268:             * @see Character#DIRECTIONALITY_ARABIC_NUMBER
4269:             * @see Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
4270:             * @see Character#DIRECTIONALITY_NONSPACING_MARK
4271:             * @see Character#DIRECTIONALITY_BOUNDARY_NEUTRAL
4272:             * @see Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR
4273:             * @see Character#DIRECTIONALITY_SEGMENT_SEPARATOR
4274:             * @see Character#DIRECTIONALITY_WHITESPACE
4275:             * @see Character#DIRECTIONALITY_OTHER_NEUTRALS
4276:             * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
4277:             * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
4278:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
4279:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
4280:             * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
4281:             * @since 1.4
4282:             */
4283:            public static byte getDirectionality(char ch) {
4284:                return getDirectionality((int) ch);
4285:            }
4286:
4287:            /**
4288:             * Returns the Unicode directionality property for the given
4289:             * character (Unicode code point).  Character directionality is
4290:             * used to calculate the visual ordering of text. The
4291:             * directionality value of an undefined character is {@link
4292:             * #DIRECTIONALITY_UNDEFINED}.
4293:             *
4294:             * @param   codePoint the character (Unicode code point) for which
4295:             *          the directionality property is requested.
4296:             * @return the directionality property of the character.
4297:             *
4298:             * @see Character#DIRECTIONALITY_UNDEFINED DIRECTIONALITY_UNDEFINED
4299:             * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT DIRECTIONALITY_LEFT_TO_RIGHT
4300:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT DIRECTIONALITY_RIGHT_TO_LEFT
4301:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
4302:             * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER DIRECTIONALITY_EUROPEAN_NUMBER
4303:             * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
4304:             * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
4305:             * @see Character#DIRECTIONALITY_ARABIC_NUMBER DIRECTIONALITY_ARABIC_NUMBER
4306:             * @see Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
4307:             * @see Character#DIRECTIONALITY_NONSPACING_MARK DIRECTIONALITY_NONSPACING_MARK
4308:             * @see Character#DIRECTIONALITY_BOUNDARY_NEUTRAL DIRECTIONALITY_BOUNDARY_NEUTRAL
4309:             * @see Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR DIRECTIONALITY_PARAGRAPH_SEPARATOR
4310:             * @see Character#DIRECTIONALITY_SEGMENT_SEPARATOR DIRECTIONALITY_SEGMENT_SEPARATOR
4311:             * @see Character#DIRECTIONALITY_WHITESPACE DIRECTIONALITY_WHITESPACE
4312:             * @see Character#DIRECTIONALITY_OTHER_NEUTRALS DIRECTIONALITY_OTHER_NEUTRALS
4313:             * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
4314:             * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
4315:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
4316:             * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
4317:             * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
4318:             * @since    1.5
4319:             */
4320:            public static byte getDirectionality(int codePoint) {
4321:                return CharacterData.of(codePoint).getDirectionality(codePoint);
4322:            }
4323:
4324:            /**
4325:             * Determines whether the character is mirrored according to the
4326:             * Unicode specification.  Mirrored characters should have their
4327:             * glyphs horizontally mirrored when displayed in text that is
4328:             * right-to-left.  For example, <code>'&#92;u0028'</code> LEFT
4329:             * PARENTHESIS is semantically defined to be an <i>opening
4330:             * parenthesis</i>.  This will appear as a "(" in text that is
4331:             * left-to-right but as a ")" in text that is right-to-left.
4332:             *
4333:             * <p><b>Note:</b> This method cannot handle <a
4334:             * href="#supplementary"> supplementary characters</a>. To support
4335:             * all Unicode characters, including supplementary characters, use
4336:             * the {@link #isMirrored(int)} method.
4337:             *
4338:             * @param  ch <code>char</code> for which the mirrored property is requested
4339:             * @return <code>true</code> if the char is mirrored, <code>false</code>
4340:             *         if the <code>char</code> is not mirrored or is not defined.
4341:             * @since 1.4
4342:             */
4343:            public static boolean isMirrored(char ch) {
4344:                return isMirrored((int) ch);
4345:            }
4346:
4347:            /**
4348:             * Determines whether the specified character (Unicode code point)
4349:             * is mirrored according to the Unicode specification.  Mirrored
4350:             * characters should have their glyphs horizontally mirrored when
4351:             * displayed in text that is right-to-left.  For example,
4352:             * <code>'&#92;u0028'</code> LEFT PARENTHESIS is semantically
4353:             * defined to be an <i>opening parenthesis</i>.  This will appear
4354:             * as a "(" in text that is left-to-right but as a ")" in text
4355:             * that is right-to-left.
4356:             *
4357:             * @param   codePoint the character (Unicode code point) to be tested.
4358:             * @return  <code>true</code> if the character is mirrored, <code>false</code>
4359:             *          if the character is not mirrored or is not defined.
4360:             * @since   1.5
4361:             */
4362:            public static boolean isMirrored(int codePoint) {
4363:                return CharacterData.of(codePoint).isMirrored(codePoint);
4364:            }
4365:
4366:            /**
4367:             * Compares two <code>Character</code> objects numerically.
4368:             *
4369:             * @param   anotherCharacter   the <code>Character</code> to be compared.
4370:             *
4371:             * @return  the value <code>0</code> if the argument <code>Character</code> 
4372:             *          is equal to this <code>Character</code>; a value less than 
4373:             *          <code>0</code> if this <code>Character</code> is numerically less 
4374:             *          than the <code>Character</code> argument; and a value greater than 
4375:             *          <code>0</code> if this <code>Character</code> is numerically greater 
4376:             *          than the <code>Character</code> argument (unsigned comparison).  
4377:             *          Note that this is strictly a numerical comparison; it is not 
4378:             *          locale-dependent.
4379:             * @since   1.2
4380:             */
4381:            public int compareTo(Character anotherCharacter) {
4382:                return this.value - anotherCharacter.value;
4383:            }
4384:
4385:            /**
4386:             * Converts the character (Unicode code point) argument to uppercase using
4387:             * information from the UnicodeData file.
4388:             * <p>
4389:             *
4390:             * @param   codePoint   the character (Unicode code point) to be converted.
4391:             * @return  either the uppercase equivalent of the character, if 
4392:             *          any, or an error flag (<code>Character.ERROR</code>) 
4393:             *          that indicates that a 1:M <code>char</code> mapping exists.
4394:             * @see     java.lang.Character#isLowerCase(char)
4395:             * @see     java.lang.Character#isUpperCase(char)
4396:             * @see     java.lang.Character#toLowerCase(char)
4397:             * @see     java.lang.Character#toTitleCase(char)
4398:             * @since 1.4
4399:             */
4400:            static int toUpperCaseEx(int codePoint) {
4401:                assert isValidCodePoint(codePoint);
4402:                return CharacterData.of(codePoint).toUpperCaseEx(codePoint);
4403:            }
4404:
4405:            /**
4406:             * Converts the character (Unicode code point) argument to uppercase using case
4407:             * mapping information from the SpecialCasing file in the Unicode
4408:             * specification. If a character has no explicit uppercase
4409:             * mapping, then the <code>char</code> itself is returned in the
4410:             * <code>char[]</code>.
4411:             *
4412:             * @param   codePoint   the character (Unicode code point) to be converted.
4413:             * @return a <code>char[]</code> with the uppercased character.
4414:             * @since 1.4
4415:             */
4416:            static char[] toUpperCaseCharArray(int codePoint) {
4417:                // As of Unicode 4.0, 1:M uppercasings only happen in the BMP.
4418:                assert isValidCodePoint(codePoint)
4419:                        && !isSupplementaryCodePoint(codePoint);
4420:                return CharacterData.of(codePoint).toUpperCaseCharArray(
4421:                        codePoint);
4422:            }
4423:
4424:            /**
4425:             * The number of bits used to represent a <tt>char</tt> value in unsigned
4426:             * binary form.
4427:             *
4428:             * @since 1.5
4429:             */
4430:            public static final int SIZE = 16;
4431:
4432:            /**
4433:             * Returns the value obtained by reversing the order of the bytes in the
4434:             * specified <tt>char</tt> value.
4435:             * @param ch the <tt>char</tt> whose bytes are to be reversed.
4436:             * @return the value obtained by reversing (or, equivalently, swapping)
4437:             *     the bytes in the specified <tt>char</tt> value.
4438:             * @since 1.5
4439:             */
4440:            public static char reverseBytes(char ch) {
4441:                return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
4442:            }
4443:        }
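
Usage sketch (not part of Character.java): the listing above documents forDigit, getDirectionality, isMirrored, compareTo and reverseBytes. The small program below exercises them through the public java.lang.Character API only. The class name CharacterUtilityDemo and the helpers toRadixString and isRightToLeft are hypothetical names introduced for illustration; the expected values in the comments follow from the method bodies shown in the listing and from the Unicode properties of the chosen characters.

```java
final class CharacterUtilityDemo {

    // Hypothetical helper: renders a non-negative int in the given radix by
    // repeatedly calling Character.forDigit on the least significant digit.
    static String toRadixString(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return sb.reverse().toString();
    }

    // Hypothetical helper: true for the two strong right-to-left categories.
    static boolean isRightToLeft(int codePoint) {
        byte d = Character.getDirectionality(codePoint);
        return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
    }

    public static void main(String[] args) {
        // Digits 10..35 map to 'a'..'z', so 255 in radix 16 is "ff".
        System.out.println(toRadixString(255, 16));

        // An out-of-range digit yields the null character, not an exception.
        System.out.println((int) Character.forDigit(16, 16)); // 0

        // reverseBytes swaps the high and low byte of the 16-bit value.
        System.out.println(Integer.toHexString(Character.reverseBytes((char) 0x1234))); // 3412

        // LEFT PARENTHESIS is a mirrored character.
        System.out.println(Character.isMirrored('(')); // true

        // U+05D0 HEBREW LETTER ALEF is right-to-left; 'A' is left-to-right.
        System.out.println(isRightToLeft(0x05D0)); // true
        System.out.println(isRightToLeft('A'));    // false

        // compareTo is a plain numeric comparison of the char values.
        System.out.println(Character.valueOf('a').compareTo('b') < 0); // true
    }
}
```

Note that forDigit signals invalid input by returning '\u0000', so callers that cannot guarantee a valid digit and radix may want to check the arguments first, as toRadixString does here.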