Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.

Last year (2009), the Google Code Jam featured an interesting problem as the first problem in Round 1B: Decision Tree

As the problem seemed tailored for Lisp-like languages, we spontaneously had an exciting codegolf here on SO, in which a few languages managed to solve the problem in fewer characters than any Lisp variety, using quite a number of different techniques.

This year's Round 1B Problem A (File Fix-it) also seems tailored for a particular family of languages, Unix shell scripts. So continuing the "1B-A tradition" would be appropriate. :p But which language will end up with the shortest code? Let us codegolf and see!

Problem description (adapted from official page):

You are given T test cases. Each test case contains N lines that list the full path of all directories currently existing on your computer. For example:

/home
/home/awesome
/home/awesome/wheeeeeee
/home/awesome/wheeeeeee/codegolfrocks
/home/thecakeisalie

Next, you are given M lines that list the full path of directories you would like to create. They are in the same format as the previous examples. You can create a directory using the mkdir command, but you can only do so if the parent directory already exists. For example, to create the directories /pyonpyon/fumufumu/yeahyeah and /pyonpyon/fumufumu/yeahyeahyeah, you would need to use mkdir four times:

mkdir /pyonpyon
mkdir /pyonpyon/fumufumu
mkdir /pyonpyon/fumufumu/yeahyeah
mkdir /pyonpyon/fumufumu/yeahyeahyeah

For each test case, return the number of times you have to call mkdir to create all the directories you would like to create.

Input

Input consists of a text file whose first line contains the integer T, the number of test cases. The rest of the file contains the test cases.

Each test case begins with a line containing the integers N and M, separated by a space.

The next N lines contain the path of each directory currently existing on your computer (not including the root directory /). This is a concatenation of one or more non-empty lowercase alphanumeric strings, each preceded by a single /.

The following M lines contain the path of each directory you would like to create.

Output

For each case, print one line containing Case #X: Y, where X is the case number and开发者_开发百科 Y is the solution.

Limits

1 ≤ T ≤ 100.

0 ≤ N ≤ 100.

1 ≤ M ≤ 100.

Each path contains at most 100 characters.

Every path appears only once in the list of directories already on your computer, or in the list of desired directories. However, a path may appear on both lists, as in example case #3 below.

If a directory is in the list of directories already on your computer, its parent directory will also be listed, with the exception of the root directory /.

The input file is at most 100,000 bytes long.

Example

Larger sample test cases may be downloaded here.

Input:

3
0 2
/home/sparkle/pyon
/home/sparkle/cakes
1 3
/z
/z/y
/z/x
/y/y
2 1
/moo
/moo/wheeeee
/moo

Output:

Case #1: 4
Case #2: 4
Case #3: 0

Code Golf

Please post your shortest code in any language that solves this problem. Input and output may be handled via stdin and stdout or by other files of your choice. Please include a disclaimer if your code has the potential to modify or delete existing files when executed.

Winner will be the shortest solution (by byte count) in a language with an implementation existing prior to the start of Round 1B 2010. So while you are free to use a language you've just made up in order to submit a 0-byte solution, it won't count, and you'll probably get downvotes. ^_^

Standings

65 GolfScript
100 Perl
121 AWK (not including interpreter options)
128 Bash
144 Ruby
145 Python
158 PyonScript
159 J
172 Lua
182 PostScript
189 Scala
218 Haskell
284 Java
398 C#

Python, 193 175 173 171 166 165 164 156 151 149 147 146 145

r=raw_input;c=0;r()
while 1:N,M=map(int,r().split());d={};c+=1;exec"e=r()\nwhile e:d[e]=e,_=e.rsplit('/',1)\n"*(N+M);print'Case #%d:'%c,len(d)-N

This solution throws an EOFError, but since this is output to stderr it is within the rules. Since the output we're interested in all goes to stdout, we can pipe that and ignore stderr.

You might read the above (or some of the other posts) and think it shouldn't work. There's a bit of a trick to why, and I will explain this here. If you read the question carefully, it tells us that for each directory in the first list, all it's parent directories are also included in the list (e.g. if /home/marcog is present, so is /home) [1]. The question also guarantees us each list will have no duplicates [2]. This lets us throw all directories in the first list into the same set into which we throw the directories from the second list. Since the question guarantees no duplicates per list, we know that the first list will result in exactly N entries in the set. Therefore we can get the final answer by subtracting N from the size of the final set.

[1] "The next N lines each give the path of one directory that already exists on your computer. This list will include every directory already on your computer other than the root directory."

[2] "No path will appear twice in the list of directories already on your computer, or in the list of directories you wish to create."

Golfscript, 74 69 65

Now on a single line!
This solution is 63 characters, but will not run in a reasonable amount of time for test cases with thousands of paths (over 8 minutes), so I have opted to not count it.

n%(~,{'Case #'\)@': '\(~1$+@[]@{\('/':!%@\{[n!++:!]|}/}*,@-n@}/

The faster 65 character solution:

n%(~,{'Case #'\)@': '\(~1$+@[]@{\('/':!%@\{[n!++:!]+}/.&}*,@-n@}/

Kudos to marcog for the algorithm!

Perl, 111 106 100

perl -naE 'BEGIN{<>}++$i;($n,$m,%d)=@F;for(1..$n+$m){$_=<>;$d{$`}++while/\w\K\b/g}say"Case #$i: ",keys(%d)-$n'

As ever with golf programs dependent on interpreter command-line options, the options' bytecount difference with respect to the default has been added to the actual code length.

Indented, commented

#! perl -na      # activate line mode and autosplit
BEGIN { <> }     # skip test case count

# now for each test case:

++$i;            # increment test counter
($n,$m,%d) = @F; # read N and M;
                 # clear out directory hash

for (1..$n+$m) { # for each expected pathname:
  $_ = <>;          # read it
  $d{$`}++          # add to hash...
    while /\w\K\b/g # ...all branches from root
}

say "Case #$i: ", keys(%d) - $n

Most of the magic is the branch-from-root extraction. The trick is to uses a regex to locate the interesting cutting points (namely: before each slash, and the end of the string), but to extract the string using Perl's $PREMATCH (dollar-backtick; markdown won't let me include that properly) instead of the usual pattern-matching facilities.

\b locates a word boundary, which resolves to all words' (directories') start and end. We only want their end, so we prepend a \w. But that would cut the last character from the path, which is a problem if we get both /foo/bar and /foo/baz in the same dataset. So we exclude said character from the match with \K. None of this would be necessary if Perl had a \>-like (Emacs regexes) metacharacter.

As a stand-alone program (106)

for$i(1..<>){($n,$m,%d)=split$",<>;
for(1..$n+$m){$_=<>;$d{$`}++while/\w\K\b/g}
say"Case #$i: ",keys(%d)-$n}

Newlines for legibility; none of them is necessary or counted in.

It uses features found only in the latest versions of Perl, so run with perl -M5.010 or later.

Lua solution, 172 bytes:

r,n=io.read,"*n"for i=1,r(n)do a,c,s=r(n),0,{}for i=1,a+r()do k=""for x in r():gmatch"[^/]+"do k=k.."/"..x c=c+(s[k]or 1);s[k]=0 end end print("Case #"..i..": "..(c-a))end

Haskell, 218

import Data.List
main=interact$(!1).tail.lines
(l:j)!i=let{[n,m]=map read$words l;(p,k)=splitAt(n+m)j}in"Case #"++show i++": "++show(length(nub$(>>=s)p)-n)++"\n"++k!(i+1)
s(_:p)=map(flip take p)$elemIndices '/'(p++"/")

Similar (but way shorter :P) to the other Haskell solution.

Ends up in error, and that's expected. Whether or not that follows the rules is a bit more prone to debate than for other solutions. The output and error streams are indeed not mixed up. But if stdout is buffered, the results are never sent. That's ok for interactive use (then just copy-paste console output to a file), but it mostly rules out redirection. To make it short, ./filefixit <input 2>/dev/null does the trick.

The problem can be avoided by inserting line noise in line 3: []!_="" (8 more bytes, 226 in total)

For clarity, the exact same semantics with full indentation and meaningful identifiers:

import Data.List
main = interact $ testsStartingAt 1 . tail . lines
testsStartingAt _   []   = "" -- this line optional
testsStartingAt i (l:ls) = testOutput ++ testsStartingAt (i+1) ls'
    where [n,m]       = map read $ words l
          (paths,ls') = splitAt (n+m) ls
          testResult  = length (nub $ concatMap splitPath paths) - n
          testOutput  = "Case #" ++ show i ++ ": " ++ show testResult ++ "\n"
splitPath (_:p) = map (flip take p) $ elemIndices '/' (p ++ "/")

Bash, 175 169 168 135 130 128

WARNING: Be sure to run in an empty directory, as this will wipe out its contents first thing per test.

read t
for((;x++<t;));do
rm -r *
read n m
for((i=n+m;i--;));do
read d
mkdir -p .$d
done
echo Case \#$x: $[`find|wc -l`-n-1]
done

Ruby, 151 149 144

Direct translation to Ruby of marcog's Python solution:

gets.to_i.times{|i|n,m=gets.split.map &:to_i
d={}
(n+m).times{gets.strip.split('/').inject{|a,p|d[a+='/'+p]=a}}
puts"Case ##{i+1}: #{d.size-n}"}

PostScript

182 ~~212~~ ~~247~~ ~~262~~ ~~278~~ bytes

1[([){exch}/C{concatstrings}/R{(%lineedit)run}>>begin 1
R{(: )[(Case #)3{=only}repeat<<>>R 1 index add{()<<R]{99 string
cvs C<C>C dup 3 index[<>put}forall pop}repeat[length[sub =}for

Usage: $ gs -q -dNODISPLAY -dNOPROMPT thisfile.ps <input >output

C#, 489 442 400 398

Just for fun, no chance of ever being very short of course. Count is after removing insignificant white space, which I left in here for readability.

namespace System
{
    class P
    {
        static void Main(string[]a)
        {
            int c = 0, n, m, d, l = 1;
            var f = IO.File.ReadAllLines(a[0]);

            while (c < int.Parse(f[0]))
            {
                var o = f[l++].Split(' ');
                n = int.Parse(o[0]);
                m = int.Parse(o[1]);
                var p = new Collections.Generic.HashSet<string>();

                while (n-- > 0)
                    p.Add(f[l++]);

                while (m-- > 0)
                    for (o = f[l++].Split('/'), d = 0; d++ < o.Length; )
                        if (p.Add(string.Join("/", o, 0, d)))
                            n++;

                Console.Write("Case #{0}: {1}\n", ++c, n);
            }
        }
    }
}

This latest version has a quirk. It mistakenly counts the root directory as having to be created once, to compensate for variable n being -1 at the start of the loop, instead of the desired 0. It works because the root directory is guaranteed not to be in N.

Haskell: 299

import Data.List
import Text.Printf
a[]=[]
a(x:z)=(r-n):next where{;[n,m]=map read$words x;t=n+m;r=length$nub$concatMap(f.reverse)$take t z;next=a$drop t z;f""=[];f y=y:f z where(a,b:z)=span(/='/')y}
main=do{z<-getContents;putStr$unlines[printf"Case #%d: %d"x v|(x::Int,v)<-zip[1..]$a$tail$lines z]}

This requires GHC's -XScopedTypeVariables switch.

Readable version:

import Data.List
import Text.Printf

a [] = []
a (x:xs) = (r-n) : next where
    [n,m] = map read $ words x
    t = n+m
    r = length $ nub $ concatMap (f . reverse) $ take t xs
    next = a $ drop t xs
    f "" = []
    f y = y : f bs where
        (a,b:bs) = span (/= '/') y

cleanUp a = unlines [printf "Case #%d: %d" x v | (x::Int,v::Int) <- zip [1..] a]

main = do
    z<-getContents
    putStr$cleanUp$a$tail$lines z

Scala, 190 189

Yet another version of marcog's solution, this time in Scala. Runs with scala, but would need to be put into a class to allow compilation with scalac.

for(c←1 to readInt){val I=readLine split" "map(_.toInt)
var d=Set("")
for(i←1-I(0)to I(1)){var a="";for(w←readLine split"/"){a+="/"+w;d+=a}}
printf("Case #%d: %d\n",c,d.size-I(0)-2)}

AWK - 123 121 chars

Another adaptation of marcog python version. Run with awk -F'[ \]' -f fixit.awk

function p(){if(c++>1)print"Case #"c-2": "length(k)-n}
/\//{for(y=i=1;i<NF;)k[y=y"/"$++i];next}
{p();n=$1;delete k}
END{p()}

To run it without the -F option, it grows by 15 chars, since it then needs this part:

BEGIN{FS=" |/"}

PyonScript

158 ~~159~~ bytes

1({[([){exch}/R{(%lineedit)run}>>begin R{(: )[(Case #)3{=only}repeat<<>>R 1
index +{<><<R]{str cat<C>cat dup 3 index[<>put}forall pop}repeat[length[-
=}for}1)

Usage: $ pyon thisfile.pyon <input >output

Based on the PostScript solution.

Since PyonScript development is still in progress, this code works on the implementation as it existed at the start of Round 1B 2010: http://github.com/KirarinSnow/PyonScript

J - ^{205 159} 140 characters

c=:0
f=:3 :0
c=:>:c
'a b'=:({.,+/)".>{.y
('Case #',(":c),': ',":a-~3 :'#~.,/>\"1<;.1"1 y'>(>:i.b){y)1!:2<2
(>:b)}.y
)
f^:_(}.<;._2 (1!:1)<3)

run:

script.ijs < gcj-input

Still, it outputs one extra line :/

Java, 277

import java.util.*;enum A{_;{Scanner s=new Scanner(System.in);for(int
t=s.nextInt(),i=0,n,j;i++<t;){Set x=new
HashSet();n=s.nextInt();for(j=s.nextInt();j-->-n;){String b="";for(String
c:s.next().split("/"))x.add(b+=c+'/');}System.out.println("Case #"+i+": "+(x.size()-n-1));}}}

Note: it outputs an error message but still works correctly.

File Fix-it codegolf (GCJ 2010 1B-A)

Python, 193 175 173 171 166 165 164 156 151 149 147 146 145

Golfscript, 74 69 65

Perl, 111 106 100

Indented, commented

As a stand-alone program (106)

Haskell, 218

Bash, 175 169 168 135 130 128

Ruby, 151 149 144

PostScript

C#, 489 442 400 398

Haskell: 299

Scala, 190 189

AWK - 123 121 chars

PyonScript

J - ^{205 159} 140 characters

Java, 277

精彩评论

关注公众号

热门标签

图文推荐

File Fix-it codegolf (GCJ 2010 1B-A)

Python, 193 175 173 171 166 165 164 156 151 149 147 146 145

Golfscript, 74 69 65

Perl, 111 106 100

Indented, commented

As a stand-alone program (106)

Haskell, 218

Bash, 175 169 168 135 130 128

Ruby, 151 149 144

PostScript

C#, 489 442 400 398

Haskell: 299

Scala, 190 189

AWK - 123 121 chars

PyonScript

J - 205 159 140 characters

Java, 277

更多 问答 相关资讯：

精彩评论

关注公众号

热门标签

图文推荐

J - ^{205 159} 140 characters

更多问答相关资讯：