From the Twitter API docs ( http://dev.twitter.com/pages/counting_characters ):
the 140 chars tweet limit doesn't really count the characters but rather the bytes of the string.
How would I be able to count the bytes in a string using Javascript or does every character in my string always use 2 bytes since I set the encoding of my page to UTF-8?
Perhaps there is already a nice counter function for me to use?
Actually, because of the t.co url shortener, just counting characters doesn't work anymore. Check out these two Twitter references to see how to handle shortened links:
https://support.twitter.com/articles/78124-how-to-shorten-links-urls
https://dev.twitter.com/docs/tco-url-wrapper/how-twitter-wrap-urls
If you're looking for help on the client-side, you'll have to make a new friend with twitter-text.js
https://github.com/twitter/twitter-text-js
I also posted a walk-through of a function I use to count the remaining characters in a tweet
http://blog.pay4tweet.com/2012/04/27/twitter-lifts-140-character-limit/
The function looks like this
function charactersleft(tweet) {
    var url, i;
    var virtualTweet = tweet;
    // 20-character placeholder standing in for each t.co-wrapped URL
    var filler = "01234567890123456789";
    var extractedUrls = twttr.txt.extractUrlsWithIndices(tweet);
    var lenUrlArr = extractedUrls.length;
    // Swap every extracted URL for the fixed-length placeholder
    for (i = 0; i < lenUrlArr; i++) {
        url = extractedUrls[i].url;
        virtualTweet = virtualTweet.replace(url, filler);
    }
    return 140 - virtualTweet.length;
}
The function returns the number of characters remaining, assuming that every URL, including those already shorter than 20 characters, is "shortened" by t.co to 19 characters plus a space.
It assumes that twitter-text.js is being included.
Thanks moluv00 for your answer, which saved me some searching and put me on the right track. I just wanted to share how I handle Twitter character counting (which is affected by shortened URLs) in my app.
A pull request merged into the GitHub repository on 2012-05-31 introduced the twttr.txt.getTweetLength(text, options) function, which takes t.co URLs into account and is defined as follows:
twttr.txt.getTweetLength = function(text, options) {
    if (!options) {
        options = {
            short_url_length: 22,
            short_url_length_https: 23
        };
    }
    var textLength = text.length;
    var urlsWithIndices = twttr.txt.extractUrlsWithIndices(text);
    for (var i = 0; i < urlsWithIndices.length; i++) {
        // Subtract the length of the original URL
        textLength += urlsWithIndices[i].indices[0] - urlsWithIndices[i].indices[1];
        // Add short_url_length_https characters for URLs starting with https://
        // Otherwise add short_url_length characters
        if (urlsWithIndices[i].url.toLowerCase().match(/^https:\/\//)) {
            textLength += options.short_url_length_https;
        } else {
            textLength += options.short_url_length;
        }
    }
    return textLength;
};
So your function simply becomes:
function charactersleft(tweet) {
    return 140 - twttr.txt.getTweetLength(tweet);
}
In addition, the t.co best practices say we should retrieve the short_url_length and short_url_length_https values from Twitter and pass them as the options parameter of twttr.txt.getTweetLength:
Request GET help/configuration once daily in your application and cache the "short_url_length" (t.co's current maximum length value) for 24 hours. Cache "short_url_length_https" (the maximum length for HTTPS-based t.co links) and use it as the length of HTTPS-based URLs.
This matters all the more because changes to the t.co URL length take effect on 2013-02-20, as described on the Twitter developer blog.
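The caching advice quoted above could be sketched roughly as follows. This is only an illustration: createShortUrlOptionsCache and fetchConfig are hypothetical names, fetchConfig is assumed to be a function you supply that returns the already-parsed JSON body of GET help/configuration, and the sketch is synchronous for brevity even though a real request would be asynchronous.

```javascript
// Hypothetical helper (not part of twitter-text.js): keeps the t.co lengths
// for 24 hours and only calls fetchConfig again once they have expired.
// fetchConfig: assumed caller-supplied function returning the parsed
//              help/configuration response.
// now:         optional clock function, defaults to Date.now.
function createShortUrlOptionsCache(fetchConfig, now) {
    var DAY_MS = 24 * 60 * 60 * 1000;
    var cached = null;
    var fetchedAt = 0;
    now = now || Date.now;

    return function getOptions() {
        var t = now();
        if (cached === null || t - fetchedAt >= DAY_MS) {
            var config = fetchConfig();
            // Keep only the two fields getTweetLength reads
            cached = {
                short_url_length: config.short_url_length,
                short_url_length_https: config.short_url_length_https
            };
            fetchedAt = t;
        }
        return cached;
    };
}
```

With something like this in place, charactersleft could call twttr.txt.getTweetLength(tweet, getOptions()) instead of relying on the built-in defaults.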
As others have mentioned, Twitter counts each link as a string of length 20. In our small project we ended up using the following piece of code:
function getTweetLength(input) {
    var tmp = "";
    for (var i = 0; i < 20; i++) { tmp += "o"; }
    return input.replace(/(http[s]?:\/\/[\S]*)/g, tmp).length;
}
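If you want the same regex-based approach to use separate lengths for http and https links, a variant might look like the sketch below. The name getTweetLengthWithOptions is made up for illustration, the default values 22 and 23 are simply the defaults from the twttr.txt.getTweetLength code shown earlier in this thread, and the regex is far less thorough than the URL extraction in twitter-text.js.

```javascript
// Hypothetical variant: counts every http(s) URL as a fixed length taken
// from options, everything else by its plain string length.
function getTweetLengthWithOptions(input, options) {
    options = options || { short_url_length: 22, short_url_length_https: 23 };
    return input.replace(/https?:\/\/\S*/gi, function (url) {
        var n = /^https:/i.test(url)
            ? options.short_url_length_https
            : options.short_url_length;
        // Build a placeholder string of exactly n characters
        return new Array(n + 1).join("o");
    }).length;
}

// 15 characters of text plus one http link counted as 22
console.log(getTweetLengthWithOptions("Check this out http://example.com/a/very/long/path")); // 37
```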
In case you are using angular.js, here is a small filter you can use in your app:
app.filter('tweetLength', function() {
    return function(input) {
        var tmp = "";
        for (var i = 0; i < 20; i++) { tmp += "o"; }
        return input.replace(/(http[s]?:\/\/[\S]*)/g, tmp).length;
    };
});
And usage is as simple as :
Tweet length is {{tweet|tweetLength}}
How would I be able to count the bytes in a string using Javascript or does every character in my string always use 2 bytes since I set the encoding of my page to UTF-8?
JavaScript counts characters and not bytes. You don't have a problem at all.
"嘰嘰喳喳".length == 4
"Twitter".length == 7
Update: The above is only correct for strings that contain nothing but characters in the Basic Multilingual Plane (BMP).
Determining string length is not quite so simple when the string contains characters from outside the BMP (like Emoji) or combining marks. The following blog post discusses the matter exhaustively, and reading it is highly recommended: https://mathiasbynens.be/notes/javascript-unicode
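To make the difference concrete, here is a small sketch comparing three ways of measuring a string. The helper name describeLength is made up for this example, and it assumes an environment where TextEncoder is available (modern browsers and recent Node.js versions).

```javascript
// Hypothetical helper: reports three different "lengths" of a string.
function describeLength(str) {
    return {
        // UTF-16 code units, which is what .length reports
        codeUnits: str.length,
        // Unicode code points (string iteration keeps surrogate pairs together)
        codePoints: Array.from(str).length,
        // Bytes when the string is encoded as UTF-8
        utf8Bytes: new TextEncoder().encode(str).length
    };
}

console.log(describeLength("Twitter"));      // { codeUnits: 7, codePoints: 7, utf8Bytes: 7 }
console.log(describeLength("嘰嘰喳喳"));      // { codeUnits: 4, codePoints: 4, utf8Bytes: 12 }
console.log(describeLength("\uD83D\uDCA9")); // { codeUnits: 2, codePoints: 1, utf8Bytes: 4 }
```

Note that none of these counts treats a base letter plus a combining mark as a single character; that case needs normalization or grapheme segmentation, as the linked post explains.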