UTF-8 in a Perl module name

Developer https://www.devze.com 2023-02-27 05:26 Source: Internet
How can I write a Perl module with UTF-8 in its name and filename? My current try yields "Can't locate Täst.pm in @INC", but the file does exist. I'm on Windows, and haven't tried this on Linux yet.

test.pl:

use strict;
use warnings;
use utf8;
use Täst;

Täst.pm:

package Täst;
use utf8;

Update: My current work-around is to use Tast (ASCII) and put package Täst (Unicode) in Tast.pm (ASCII). It's confusing, though.
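
A minimal, single-file sketch of that work-around follows. It is an illustration under stated assumptions, not the asker's actual code: the script writes a throwaway Tast.pm (ASCII filename) into a temporary directory, and the greeting sub is a hypothetical name added only so there is something to call.

```perl
use strict;
use warnings;
use utf8;                        # this source file is UTF-8
use File::Temp qw(tempdir);
use File::Spec;

# Write a module whose FILE name is ASCII but whose PACKAGE name is not.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'Tast.pm');

open my $fh, '>:encoding(UTF-8)', $path or die "Cannot write $path: $!";
print {$fh} <<'END_MODULE';
use utf8;                        # must come before the package statement
package Täst;

# 'greeting' is a hypothetical sub, used only for this demonstration.
sub greeting { return "Hëllö" }

1;
END_MODULE
close $fh or die "Cannot close $path: $!";

# Load by the ASCII file name, then call through the Unicode package name.
unshift @INC, $dir;
require Tast;

binmode STDOUT, ':encoding(UTF-8)';
print Täst::greeting(), "\n";
```

Perl only ever looks for Tast.pm on disk, so no non-ASCII filename is involved. The cost is the mismatch between the name you load and the name you call.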


Unfortunately, Perl, Windows, and Unicode filenames really don't go together at the moment. My advice is to save yourself a lot of hassle and stick with plain ASCII for your module names. This blog post mentions a few of the problems.


The use utf8 needs to appear before the package Täst, so that the latter can be correctly interpreted. On my Mac:

test.pl:

#!/usr/bin/perl

use strict;
use warnings;

use utf8;
use Tëst;

# 'use utf8' only declares the source code's encoding; stdout needs its own UTF-8 output layer
binmode STDOUT, ':encoding(UTF-8)';

Tëst::hëllö();

Tëst.pm:

use utf8;
package Tëst;

sub Tëst::hëllö() {
    print "Hëllö, wörld!\n";
}

1;

Output:

Macintosh:Desktop sherm$ ./test.pl 
Hëllö, wörld!

As I said though - I ran this on my Mac. As cjm said above, your mileage may vary on Windows.


Unicode support often fails at the boundaries. Package and subroutine names need to map cleanly onto filenames, which is problematic on some operating systems. Not only does the OS have to create the filename that you expect, but you also have to be able to find it later as the same name.
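
One way "the same name" can stop being the same is Unicode normalization. This is an illustrative assumption, not something the answer names as the cause: a name like Täst.pm has a precomposed form (NFC) and a decomposed form (NFD) that look identical but are different strings and different byte sequences. A rough sketch using the core Unicode::Normalize and Encode modules:

```perl
use strict;
use warnings;
use utf8;                               # this source file is UTF-8
use Unicode::Normalize qw(NFC NFD);     # core module
use Encode qw(encode);                  # core module

binmode STDOUT, ':encoding(UTF-8)';

my $name = "Täst.pm";
my $nfc  = NFC($name);   # precomposed: "ä" is the single code point U+00E4
my $nfd  = NFD($name);   # decomposed: "a" followed by combining U+0308

# Compare the two forms by character count and by encoded byte count.
for my $pair (['NFC', $nfc], ['NFD', $nfd]) {
    my ($label, $str) = @$pair;
    printf "%s: %d characters, %d UTF-8 bytes\n",
        $label, length($str), length(encode('UTF-8', $str));
}

print "Equal as strings? ",        ($nfc eq $nfd      ? "yes" : "no"), "\n";
print "Equal after normalizing? ", (NFC($nfd) eq $nfc ? "yes" : "no"), "\n";
```

The two forms compare unequal until both are normalized the same way, so a filename written in one form may not be found by a lookup that uses the other.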

We talked a little about the filename issue in Effective Perl Programming, and I summarized much more in How do I create then use long Windows paths from Perl?. Jeff Atwood also touches on it in his post Filesystem Paths: How Long is Too Long?.


To be honest, I wouldn't recommend this approach for software you plan to release. Even if you get it working on your own machine, it's likely to be fragile on machines where UTF-8 isn't configured quite right, or where the filesystem can't store non-ASCII filenames.
