
Stripping out whitespace from an HTML email

Developers https://www.devze.com 2022-12-20 13:23 Source: web
I have an HTML file which has a lot of whitespace. My question is, is it worth removing this whitespace in order to reduce file size before I send it? If so, what would be the quickest way to remove the whitespace?

Currently this is all in C#.

Because my comment below did not format properly, I've put the example here:

<html>
   <head>
       <title>test title</title>
   </head>
</html>

It is the spacing before the opening tags that I want to remove, if it's worth it.


If it is really quite a lot of whitespace, removing it will be good: you end up transmitting less over the wire.

Assuming this is mostly spaces, tabs and carriage returns, I would use a regular expression to replace each run of whitespace with a single space:

// requires: using System.Text.RegularExpressions;
Regex reg = new Regex(@"\s+");
string result = reg.Replace(myHTML, " ");

This also assumes you are in control of the input HTML, as you shouldn't use regular expressions for parsing HTML.


I guess you mean removing the tabs and spaces at the beginning of each line. You can use regular expressions for this. Check http://www.regular-expressions.info/examples.html for an example (under 'Trimming Whitespace').
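As a minimal sketch of that idea in C# (the class and method names here are hypothetical, not from any library), the Multiline option makes ^ match at the start of every line, so only leading spaces and tabs are removed. It assumes the HTML has no <pre> blocks or other content where leading indentation is significant:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class HtmlWhitespace
{
    // With RegexOptions.Multiline, ^ matches at the start of each line,
    // so this pattern covers the run of spaces/tabs that begins a line.
    private static readonly Regex LeadingIndent =
        new Regex(@"^[ \t]+", RegexOptions.Multiline);

    // Removes leading spaces and tabs from every line; line breaks
    // and whitespace inside a line are left untouched.
    public static string StripLeadingIndent(string html)
    {
        return LeadingIndent.Replace(html, string.Empty);
    }
}
```

Because line breaks are kept, the result is still readable line by line; only the indentation is gone.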

Before you do this, I would check if there is really a big difference in file-size.
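One rough way to check is to compare byte counts before and after stripping. The sketch below assumes the email body is sent as UTF-8 and only removes leading spaces and tabs; WhitespaceSavings and Measure are hypothetical names used for illustration:

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class WhitespaceSavings
{
    // Returns the UTF-8 byte count of the HTML before and after
    // removing leading spaces and tabs from every line.
    public static (int Before, int After) Measure(string html)
    {
        string stripped = Regex.Replace(
            html, @"^[ \t]+", string.Empty, RegexOptions.Multiline);

        return (Encoding.UTF8.GetByteCount(html),
                Encoding.UTF8.GetByteCount(stripped));
    }
}
```

Calling Measure on the real file contents and comparing the two numbers shows whether the saving is large enough to justify the extra step.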


Do you mean &nbsp;?
If so, use the string.Replace method.


It's not worth the trouble. You are basically ruining any formatting that the file may have, and that formatting may be desired.

The first time you have to debug the file, someone will have to sit and reformat it just to read it, and you'll have wasted any time you saved.

You will also have wasted the money it costs for someone to spend 30 minutes formatting the file to make it readable.

You will also be wasting your time creating a potentially buggy step that may accidentally remove valid spacing, because using regex on HTML is not reliable.

What will you gain? A few spaces and newlines removed?
