
I have tried a few algorithms but had no luck solving this problem.

Let me explain the desired behaviour with an example.

We have a string: @"example example"

If I call rangeOfWordAtIndex:10 on that string, the result should be the word @"example" at location 9 with a length of 7.

It should not give @"example" at index 0 with a length of 7.

Here is the code I have produced so far:

#define unicode_space 32 // this is correct printed it out from code
@implementation NSString (wordAt)

- (NSRange) rangeOfWordAtIndex:(NSInteger) index
{
    NSInteger beginIndex = index;
    while(beginIndex > 0 && [self characterAtIndex:beginIndex-1] != unicode_space)
    {
        beginIndex--;
    }
    NSInteger endIndex = index;
    NSInteger sLenght = [self length];
    while (endIndex < sLenght && [self characterAtIndex:endIndex+1] != unicode_space)
    {
        endIndex++;
    }
    return NSMakeRange(beginIndex, endIndex - beginIndex);
}

@end

But it just doesn't work. Without the +1 and -1, it keeps a space as part of the word.

With them, it drops the first character of the word.

Can someone please give me a useful suggestion?

  • If I remove the + 1 from the endIndex condition (which causes an NSRangeException), this correctly gives the answer {8, 7} for your example string. Remember that the first index is 0, not 1.
    – jscs
    Sep 14 '13 at 0:53
  • Thank you very much! I think I got confused by the numbers after modifying the code. Sep 14 '13 at 9:04
  • Do you want to write the algorithm yourself, or are you okay with using Foundation's NSLinguisticTagger?
    – Tricertops
    Aug 18 '14 at 12:48

Detecting words is a bit more complex than finding the U+0020 SPACE character. Fortunately, Foundation provides the NSLinguisticTagger class, which has full Unicode support. Here is how you find the word and its range at a given index:

Objective-C

NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:@[ NSLinguisticTagSchemeTokenType ] options:kNilOptions];
tagger.string = @"Hello, World!";
NSRange range = NSMakeRange(0, 0);
NSString *tag = [tagger tagAtIndex:10 scheme:NSLinguisticTagSchemeTokenType tokenRange:&range sentenceRange:nil];

if ([tag isEqualToString:NSLinguisticTagWord]) {
    NSString *word = [tagger.string substringWithRange:range];
    // You have the word: "World"
}
else {
    // Punctuation, whitespace or other.
}

Swift

import Foundation

let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
tagger.string = "Hello, World!"
var range = NSRange(location: 0, length: 0)
// The tag describes the token at UTF-16 index 10; `range` receives that token's range.
let tag = tagger.tag(at: 10, scheme: .tokenType, tokenRange: &range, sentenceRange: nil)

if let string = tagger.string, tag == NSLinguisticTag.word {
    let word = (string as NSString).substring(with: range)
    // You have the word: "World"
}
else {
    // Punctuation, whitespace or other.
}
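
If you would rather keep the hand-written scan from the question, here is a minimal sketch of it with the bounds fixed along the lines of the comment above. It assumes, as the question does, that a word is any run of characters other than U+0020 SPACE, and that the index is a UTF-16 offset. The free function rangeOfWord(in:at:) and the choice to return NSNotFound when the index sits on a space or is out of range are my own illustration, not something from the original code.

```swift
import Foundation

// Assumption: only U+0020 SPACE separates words, as in the question.
let space: unichar = 32

// Hypothetical helper; mirrors the question's rangeOfWordAtIndex: category method.
func rangeOfWord(in string: NSString, at index: Int) -> NSRange {
    // Out-of-range index, or an index that sits on a space: no word there.
    guard index >= 0, index < string.length,
          string.character(at: index) != space else {
        return NSRange(location: NSNotFound, length: 0)
    }

    // Walk left while the previous character still belongs to the word.
    var begin = index
    while begin > 0 && string.character(at: begin - 1) != space {
        begin -= 1
    }

    // Walk right until `end` is one past the last character of the word.
    // Testing `end` itself (not `end + 1`) keeps every read in bounds.
    var end = index
    while end < string.length && string.character(at: end) != space {
        end += 1
    }

    return NSRange(location: begin, length: end - begin)
}

let r = rangeOfWord(in: "example example" as NSString, at: 10)
print(r.location, r.length) // 8 7
```

The backward loop looks at begin - 1 because begin must stop on the first character of the word, while the forward loop looks at end because end is an exclusive bound. That asymmetry is what the +1/-1 experiments in the question were circling around.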
