I tried Apple's own example:
import NaturalLanguage

let text = "The American Red Cross was established in Washington, D.C., by Clara Barton."

let tagger = NLTagger(tagSchemes: [.nameType])
tagger.string = text

let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
let tags: [NLTag] = [.personalName, .placeName, .organizationName]

tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameType, options: options) { tag, tokenRange in
    // Get the most likely tag, and print it if it's a named entity.
    if let tag = tag, tags.contains(tag) {
        print("\(text[tokenRange]): \(tag.rawValue)")
    }

    // Get multiple possible tags with their associated confidence scores.
    let (hypotheses, _) = tagger.tagHypotheses(at: tokenRange.lowerBound, unit: .word, scheme: .nameType, maximumCount: 1)
    print(hypotheses)

    return true
}
But it returns all name tags as Other. I also tried another example, tagging the sentence by lexical class, and it also tags every word as OtherWord:
let text = "The American Red Cross was established in Washington, D.C., by Clara Barton."

let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
print("language", tagger.dominantLanguage)

tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
    // Get the most likely lexical class for each word and print it.
    if let tag = tag {
        print("\(text[tokenRange]): \(tag.rawValue)")
    }
    return true
}
I tried the answer to this question, which sets the language orthography, but it didn't help:
// tagger.setOrthography(NSOrthography(dominantScript: "Latn", languageMap: ["Latn": ["en"]]), range: text.startIndex..<text.endIndex)
tagger.setOrthography(NSOrthography.defaultOrthography(forLanguage: "en-US"), range: text.startIndex..<text.endIndex)
Does anyone have a clue why it behaves like this?
By the way, my Xcode version is the latest one as of today, 14.3.