
Read characters and words in HTML text using JavaScript

Developer https://www.devze.com 2022-12-13 23:20 Source: Internet

Suppose I receive an HTML-formatted string and want to count the words and characters in it. For example, I am getting:

var HTML =  '<p>BODY&nbsp;Article By Archie(Used By Story Tool)</p>';

Now I want to get the number of words and characters.

The above HTML will render as:

BODY Article By Archie(Used By Story Tool)

IMPORTANT

  1. I want to exclude HTML tags when counting words or characters.
  2. Entities such as **&nbsp;** should not be counted as text either.
  3. For the current example, the words and characters should be counted in:

    BODY Article By Archie(Used By Story Tool)

Please help.

Thank you.


To give an example for adamantium's suggestion:

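// A detached element is enough; it does not need to be added to the page.
var e = document.createElement("span");
e.innerHTML = '<p>BODY&nbsp;Article By Archie(Used By Story Tool)</p>';
// textContent is the standard property; innerText is the fallback for old IE.
var text = e.textContent || e.innerText;

var characterCount = text.length;
// Split on whitespace and word-stop characters, then drop the empty string
// that a leading or trailing delimiter (here the final ")") leaves behind.
var wordCount = text.split(/[\s.(),]+/).filter(Boolean).length;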

Update: Added other word-stop characters


  1. Use a hidden HTML element that can hold text, such as a span or p.

  2. Assign the string to the innerHTML of the hidden element.

  3. Count the characters using the length property of innerText/textContent.

To get the word count you can

  1. Split the innerText/textContent on whitespace.

  2. Get the length of the returned array.
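
The counting steps above could be sketched roughly as follows, once the text has been read from innerText/textContent. The helper name countPlainText is made up for illustration, and the sample string uses \u00a0 on the assumption that the browser has already decoded &nbsp; into a non-breaking space.

```javascript
// Hypothetical helper (not from the answer): counts characters and
// whitespace-separated words in text that has already been extracted.
function countPlainText(text) {
  // trim() and \s both treat the non-breaking space (U+00A0) as whitespace.
  var trimmed = text.trim();
  // Guard the empty case: ''.split(/\s+/) would return [''] (length 1).
  var words = trimmed === '' ? [] : trimmed.split(/\s+/);
  return { characters: text.length, words: words.length };
}

// Assumed result of reading textContent for the example in the question.
var counts = countPlainText('BODY\u00a0Article By Archie(Used By Story Tool)');
console.log(counts.characters, counts.words); // 42 7
```

Splitting on whitespace alone treats "Archie(Used" as a single word, which is why this gives 7 words; adding punctuation to the split pattern gives 8.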


Algorithm:

  • Sweep through the entire HTML string
  • Perform regex replaces
    • replace <[^>]*> (a regex for anything that sits within <>) with nothing; a greedy <.*> would swallow everything from the first < to the last >
    • replace /&nbsp;/ with a space, so the words on either side stay separate
  • Tip: this can be done with the replace function in JavaScript. Look it up on w3schools.com

Now you have the clutter out!

Then perform a simple word/character count.
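
A minimal sketch of this regex approach, which also runs without a DOM. The function name countTextInHtml is hypothetical. The sketch substitutes a space (rather than nothing) for each tag and each &nbsp;, an assumption made so that adjacent words do not run together. It does not decode other entities such as &amp;.

```javascript
// Hypothetical helper (not from the answer): strips markup with regular
// expressions and counts what is left.
function countTextInHtml(html) {
  var text = html
    .replace(/<[^>]*>/g, ' ')  // remove each tag; [^>]* stops at the first ">"
    .replace(/&nbsp;/g, ' ')   // treat &nbsp; as an ordinary space
    .replace(/\s+/g, ' ')      // collapse runs of whitespace
    .trim();
  // Split on whitespace and word-stop characters, dropping empty pieces.
  var words = text.split(/[\s.(),]+/).filter(Boolean);
  return { text: text, characters: text.length, words: words.length };
}

var result = countTextInHtml('<p>BODY&nbsp;Article By Archie(Used By Story Tool)</p>');
console.log(result.text);                     // BODY Article By Archie(Used By Story Tool)
console.log(result.characters, result.words); // 42 8
```

Regex stripping is only an approximation of HTML parsing: because every tag becomes a space, an inline tag in the middle of a word splits that word in two. When a browser is available, reading textContent avoids this.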
