import java.util.regex.*;
* * *
private String capitalizeAllInitials(String text)
{
    // Capitalize the first letter of every word in the passed text
    // (including little words like "a," "the," "and," "to").
    // Also capitalize words in quoted and hyphenated phrases.
    // NOTE: The pattern requires a leading space or hyphen; I have found that
    // ingested stories already capitalize the first word of the title.
    // Group 1 is the hyphen, or the space plus an optional opening quote;
    // group 4 is the lowercase letter to capitalize.
    Pattern p = Pattern.compile("(-|( (`|\"|')?))([a-z])");
    Matcher m = p.matcher(text);
    StringBuffer sb = new StringBuffer();
    while (m.find())
    {
        m.appendReplacement(sb, m.group(1) + m.group(4).toUpperCase());
    }
    m.appendTail(sb);
    return sb.toString();
}
As the comments note, this code will handle ordinary words (The quick brown fox jumps over the lazy dog becomes The Quick Brown Fox Jumps Over The Lazy Dog), hyphenated phrases (Senator proposes pay-as-you-go plan becomes Senator Proposes Pay-As-You-Go Plan), and quoted phrases (Accused confesses, "we did it" becomes Accused Confesses, "We Did It") with single quotes, double quotes, or backquotes. The code assumes that the first word is already capitalized; if that's not true of your input, you would need to add ^ to the regular expression as an alternative to the leading space or hyphen.
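If the first word may arrive in lowercase, one way to add the ^ alternative looks like the sketch below. The class name InitialsWithFirstWord and the regrouped pattern are my own assumptions for illustration, not part of the original method; the pattern accepts the start of the text, a hyphen, or a space before the letter, with an optional opening quote after the start or the space.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, used only for this sketch.
class InitialsWithFirstWord {
    // Group 1: start of text or a space (each with an optional opening quote),
    // or a hyphen. Group 2: the lowercase ASCII letter to capitalize.
    private static final Pattern INITIAL =
        Pattern.compile("(^[`\"']?|-| [`\"']?)([a-z])");

    static String capitalizeAllInitials(String text) {
        Matcher m = INITIAL.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) + m.group(2).toUpperCase(Locale.ROOT);
            // quoteReplacement keeps any '$' or '\' in the replacement literal
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeAllInitials("senator proposes pay-as-you-go plan"));
    }
}
```

Without the MULTILINE flag, ^ matches only at the very start of the input, so the rest of the text is treated exactly as before.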
This code also doesn't handle the common forms of title casing, in which articles, prepositions, and other small words are left uncapitalized. It also capitalizes proper names and trademarks indiscriminately, whether or not they are meant to start with a lowercase letter.
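For comparison, here is a rough sketch of the small-word style of title casing. It is a different and much simpler approach than the regex above: it splits on single spaces, ignores quotes and hyphens, and always capitalizes the first word. The class name and the SMALL_WORDS list are assumptions of mine; style guides disagree about which words belong on that list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class for illustration only.
class SmallWordTitleCase {
    // Assumed list of words to leave in lowercase; adjust to your style guide.
    private static final Set<String> SMALL_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "the", "and", "but", "or", "nor", "for",
        "of", "on", "in", "at", "to", "by"));

    static String titleCase(String text) {
        // The -1 limit keeps trailing empty strings, so trailing spaces survive
        String[] words = text.split(" ", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            String w = words[i];
            if (i > 0) {
                sb.append(' ');
            }
            // Leave empty tokens and non-leading small words untouched
            if (w.isEmpty() || (i > 0 && SMALL_WORDS.contains(w))) {
                sb.append(w);
            } else {
                sb.append(Character.toUpperCase(w.charAt(0))).append(w.substring(1));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(titleCase("a tale of two cities"));
    }
}
```

Many style guides also capitalize the last word regardless of the list; this sketch does not do that.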
1 comment:
It's probably also worth noting that this code only works for 7-bit ASCII letters--no accents, no Unicode.
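A possible way around the ASCII limitation is to match the Unicode lowercase-letter category \p{Ll} instead of [a-z]. The sketch below makes that one change to the pattern; the class name is hypothetical, and I have not tried it on real story titles. Note that a few letters change length when uppercased (for example, the German sharp s becomes SS).

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration only.
class UnicodeInitials {
    // Group 1: a hyphen, or a space plus an optional opening quote.
    // Group 2: any Unicode lowercase letter (category Ll), not just a-z.
    private static final Pattern INITIAL =
        Pattern.compile("(-| [`\"']?)(\\p{Ll})");

    static String capitalizeAllInitials(String text) {
        Matcher m = INITIAL.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Locale.ROOT avoids locale-specific surprises such as the Turkish dotted I
            String replacement = m.group(1) + m.group(2).toUpperCase(Locale.ROOT);
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is e with an acute accent
        System.out.println(capitalizeAllInitials("Un \u00e9t\u00e9 indien"));
    }
}
```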