Writing a file with UTF8 encoding in PHP

Developer https://www.devze.com 2023-01-05 04:48 Source: web

I am writing a function to dynamically generate my sitemap and sitemap index.

According to the docs on sitemaps.org, the file should be encoded in UTF-8.

My function for writing the file is a rather simplistic one, something along the lines of:

function generateFile()
{
  $xml = create_xml();
  $fp = @fopen('sitemap', 'w');
  fwrite($fp, $xml);
  fclose($fp);
}

[Edit - added after comments]

The create_xml() function is simplistic, like so:

function create_xml()
{
return '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
    <url>
        <loc>http://example.com/</loc>
        <lastmod>2006-11-18</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
    </url>
</urlset>';
}

Is there anything in particular I need to do to ensure that the file is encoded in UTF-8?

Additionally, I would like to gzip the file rather than leave it uncompressed. I know how to compress the file AFTER I have saved it to disk. Is it possible to compress the data BEFORE writing it to disk, and if so, how?


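Yes, you need to make sure your content (the output of create_xml()) is encoded as UTF-8. If it is in ISO-8859-1, you can convert it with utf8_encode(); note that this function only converts from ISO-8859-1, would double-encode text that is already UTF-8, and is deprecated as of PHP 8.2. You also need to make sure the XML file specifies <?xml version="1.0" encoding="UTF-8"?>. I'd also suggest opening the file with fopen() in 'wb' mode, the b meaning binary, so that the data gets written exactly as-is.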
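
A minimal sketch of the advice above, assuming the mbstring extension is available. The names write_utf8_file and $sourceEncoding are hypothetical and not from the question. It uses mb_check_encoding() and mb_convert_encoding() rather than utf8_encode(), so that input which is already valid UTF-8 is left untouched:

```php
<?php
// Hypothetical helper: writes $content to $path as UTF-8 bytes.
// $sourceEncoding is an assumption about the input's encoding,
// used only when the input is not already valid UTF-8.
function write_utf8_file(string $path, string $content, string $sourceEncoding = 'ISO-8859-1'): bool
{
    // Convert only if the bytes are not already valid UTF-8,
    // otherwise the text would be double-encoded.
    if (!mb_check_encoding($content, 'UTF-8')) {
        $content = mb_convert_encoding($content, 'UTF-8', $sourceEncoding);
    }

    // 'wb' = write, binary: no newline translation on Windows.
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }

    $written = fwrite($fp, $content);
    fclose($fp);

    // Report success only if every byte was written.
    return $written === strlen($content);
}
```

With the question's code this would be called as write_utf8_file('sitemap', create_xml()). The default source encoding is only a guess; if the data comes from a database or form in another encoding, pass that encoding explicitly.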


Your PHP script files should themselves be saved as UTF-8, because string literals in them are output byte for byte.

Also, it's hard to say more without seeing what create_xml() does.


If you use only ASCII characters, your file will always be valid UTF-8, because ASCII is a subset of UTF-8.
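
For the second part of the question (compressing before writing to disk), one option is to compress the string in memory with gzencode() and write the result in a single step. A minimal sketch, assuming the zlib extension is enabled; write_gzipped_file and the 'sitemap.xml.gz' file name are hypothetical:

```php
<?php
// Hypothetical helper: gzip-compresses $content in memory, then writes it.
function write_gzipped_file(string $path, string $content, int $level = 9): bool
{
    // gzencode() returns the data in gzip file format, or false on failure.
    $compressed = gzencode($content, $level);
    if ($compressed === false) {
        return false;
    }

    // file_put_contents() writes the bytes as-is; LOCK_EX takes an
    // exclusive lock while writing.
    return file_put_contents($path, $compressed, LOCK_EX) === strlen($compressed);
}

// Usage, with create_xml() from the question:
// write_gzipped_file('sitemap.xml.gz', create_xml());
```

Compression is byte-oriented, so it does not change the encoding: the decompressed file is UTF-8 if the string passed in was UTF-8. An alternative that should also work is writing through PHP's compress.zlib:// stream wrapper, which compresses as the data is written.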
