Last year (2009), the Google Code Jam featured an interesting problem as the first problem in Round 1B: Decision Tree
As the problem seemed tailored for Lisp-like languages, we spontaneously had an exciting codegolf here on SO, in which a few languages managed to solve the problem in fewer characters than any Lisp variety, using quite a number of different techniques.
This year's Round 1B Problem A (File Fix-it) also seems tailored for a particular family of languages, Unix shell scripts. So continuing the "1B-A tradition" would be appropriate. :p But which language will end up with the shortest code? Let us codegolf and see!
Problem description (adapted from official page):
You are given T test cases. Each test case contains N lines that list the full path of all directories currently existing on your computer. For example:
/home
/home/awesome
/home/awesome/wheeeeeee
/home/awesome/wheeeeeee/codegolfrocks
/home/thecakeisalie
Next, you are given M lines that list the full path of directories you would like to create. They are in the same format as the previous examples. You can create a directory using the mkdir
command, but you can only do so if the parent directory already exists. For example, to create the directories /pyonpyon/fumufumu/yeahyeah
and /pyonpyon/fumufumu/yeahyeahyeah
, you would need to use mkdir
four times:
mkdir /pyonpyon
mkdir /pyonpyon/fumufumu
mkdir /pyonpyon/fumufumu/yeahyeah
mkdir /pyonpyon/fumufumu/yeahyeahyeah
For each test case, return the number of times you have to call mkdir
to create all the directories you would like to create.
Input
Input consists of a text file whose first line contains the integer T, the number of test cases. The rest of the file contains the test cases.
Each test case begins with a line containing the integers N and M, separated by a space.
The next N lines contain the path of each directory currently existing on your computer (not including the root directory /
). This is a concatenation of one or more non-empty lowercase alphanumeric strings, each preceded by a single /
.
The following M lines contain the path of each directory you would like to create.
Output
For each case, print one line containing Case #X: Y
, where X
is the case number and开发者_开发百科 Y
is the solution.
Limits
1 ≤ T ≤ 100.
0 ≤ N ≤ 100.
1 ≤ M ≤ 100.
Each path contains at most 100 characters.
Every path appears only once in the list of directories already on your computer, or in the list of desired directories. However, a path may appear on both lists, as in example case #3 below.
If a directory is in the list of directories already on your computer, its parent directory will also be listed, with the exception of the root directory /
.
The input file is at most 100,000 bytes long.
Example
Larger sample test cases may be downloaded here.
Input:
3
0 2
/home/sparkle/pyon
/home/sparkle/cakes
1 3
/z
/z/y
/z/x
/y/y
2 1
/moo
/moo/wheeeee
/moo
Output:
Case #1: 4
Case #2: 4
Case #3: 0
Code Golf
Please post your shortest code in any language that solves this problem. Input and output may be handled via stdin and stdout or by other files of your choice. Please include a disclaimer if your code has the potential to modify or delete existing files when executed.
Winner will be the shortest solution (by byte count) in a language with an implementation existing prior to the start of Round 1B 2010. So while you are free to use a language you've just made up in order to submit a 0-byte solution, it won't count, and you'll probably get downvotes. ^_^
Standings
- 65 GolfScript
- 100 Perl
- 121 AWK (not including interpreter options)
- 128 Bash
- 144 Ruby
- 145 Python
- 158 PyonScript
- 159 J
- 172 Lua
- 182 PostScript
- 189 Scala
- 218 Haskell
- 284 Java
- 398 C#
Python, 193 175 173 171 166 165 164 156 151 149 147 146 145
r=raw_input;c=0;r()
while 1:N,M=map(int,r().split());d={};c+=1;exec"e=r()\nwhile e:d[e]=e,_=e.rsplit('/',1)\n"*(N+M);print'Case #%d:'%c,len(d)-N
This solution throws an EOFError, but since this is output to stderr it is within the rules. Since the output we're interested in all goes to stdout, we can pipe that and ignore stderr.
You might read the above (or some of the other posts) and think it shouldn't work. There's a bit of a trick to why, and I will explain this here. If you read the question carefully, it tells us that for each directory in the first list, all it's parent directories are also included in the list (e.g. if /home/marcog is present, so is /home) [1]. The question also guarantees us each list will have no duplicates [2]. This lets us throw all directories in the first list into the same set into which we throw the directories from the second list. Since the question guarantees no duplicates per list, we know that the first list will result in exactly N entries in the set. Therefore we can get the final answer by subtracting N from the size of the final set.
[1] "The next N lines each give the path of one directory that already exists on your computer. This list will include every directory already on your computer other than the root directory."
[2] "No path will appear twice in the list of directories already on your computer, or in the list of directories you wish to create."
Golfscript, 74 69 65
Now on a single line!
This solution is 63 characters, but will not run in a reasonable amount of time for test cases with thousands of paths (over 8 minutes), so I have opted to not count it.
n%(~,{'Case #'\)@': '\(~1$+@[]@{\('/':!%@\{[n!++:!]|}/}*,@-n@}/
The faster 65 character solution:
n%(~,{'Case #'\)@': '\(~1$+@[]@{\('/':!%@\{[n!++:!]+}/.&}*,@-n@}/
Kudos to marcog for the algorithm!
Perl, 111 106 100
perl -naE 'BEGIN{<>}++$i;($n,$m,%d)=@F;for(1..$n+$m){$_=<>;$d{$`}++while/\w\K\b/g}say"Case #$i: ",keys(%d)-$n'
As ever with golf programs dependent on interpreter command-line options, the options' bytecount difference with respect to the default has been added to the actual code length.
Indented, commented
#! perl -na # activate line mode and autosplit
BEGIN { <> } # skip test case count
# now for each test case:
++$i; # increment test counter
($n,$m,%d) = @F; # read N and M;
# clear out directory hash
for (1..$n+$m) { # for each expected pathname:
$_ = <>; # read it
$d{$`}++ # add to hash...
while /\w\K\b/g # ...all branches from root
}
say "Case #$i: ", keys(%d) - $n
Most of the magic is the branch-from-root extraction. The trick is to uses a regex to locate the interesting cutting points (namely: before each slash, and the end of the string), but to extract the string using Perl's $PREMATCH
(dollar-backtick; markdown won't let me include that properly) instead of the usual pattern-matching facilities.
\b
locates a word boundary, which resolves to all words' (directories') start and end. We only want their end, so we prepend a \w
. But that would cut the last character from the path, which is a problem if we get both /foo/bar
and /foo/baz
in the same dataset. So we exclude said character from the match with \K
. None of this would be necessary if Perl had a \>
-like (Emacs regexes) metacharacter.
As a stand-alone program (106)
for$i(1..<>){($n,$m,%d)=split$",<>;
for(1..$n+$m){$_=<>;$d{$`}++while/\w\K\b/g}
say"Case #$i: ",keys(%d)-$n}
Newlines for legibility; none of them is necessary or counted in.
It uses features found only in the latest versions of Perl, so run with perl -M5.010 or later.
Lua solution, 172 bytes:
r,n=io.read,"*n"for i=1,r(n)do a,c,s=r(n),0,{}for i=1,a+r()do k=""for x in r():gmatch"[^/]+"do k=k.."/"..x c=c+(s[k]or 1);s[k]=0 end end print("Case #"..i..": "..(c-a))end
Haskell, 218
import Data.List
main=interact$(!1).tail.lines
(l:j)!i=let{[n,m]=map read$words l;(p,k)=splitAt(n+m)j}in"Case #"++show i++": "++show(length(nub$(>>=s)p)-n)++"\n"++k!(i+1)
s(_:p)=map(flip take p)$elemIndices '/'(p++"/")
Similar (but way shorter :P) to the other Haskell solution.
Ends up in error, and that's expected. Whether or not that follows the rules is a bit more prone to debate than for other solutions. The output and error streams are indeed not mixed up. But if stdout is buffered, the results are never sent. That's ok for interactive use (then just copy-paste console output to a file), but it mostly rules out redirection. To make it short, ./filefixit <input 2>/dev/null
does the trick.
The problem can be avoided by inserting line noise in line 3: []!_=""
(8 more bytes, 226 in total)
For clarity, the exact same semantics with full indentation and meaningful identifiers:
import Data.List
main = interact $ testsStartingAt 1 . tail . lines
testsStartingAt _ [] = "" -- this line optional
testsStartingAt i (l:ls) = testOutput ++ testsStartingAt (i+1) ls'
where [n,m] = map read $ words l
(paths,ls') = splitAt (n+m) ls
testResult = length (nub $ concatMap splitPath paths) - n
testOutput = "Case #" ++ show i ++ ": " ++ show testResult ++ "\n"
splitPath (_:p) = map (flip take p) $ elemIndices '/' (p ++ "/")
Bash, 175 169 168 135 130 128
WARNING: Be sure to run in an empty directory, as this will wipe out its contents first thing per test.
read t
for((;x++<t;));do
rm -r *
read n m
for((i=n+m;i--;));do
read d
mkdir -p .$d
done
echo Case \#$x: $[`find|wc -l`-n-1]
done
Ruby, 151 149 144
Direct translation to Ruby of marcog's Python solution:
gets.to_i.times{|i|n,m=gets.split.map &:to_i
d={}
(n+m).times{gets.strip.split('/').inject{|a,p|d[a+='/'+p]=a}}
puts"Case ##{i+1}: #{d.size-n}"}
PostScript
182 212 247 262 278 bytes
1[([){exch}/C{concatstrings}/R{(%lineedit)run}>>begin 1
R{(: )[(Case #)3{=only}repeat<<>>R 1 index add{()<<R]{99 string
cvs C<C>C dup 3 index[<>put}forall pop}repeat[length[sub =}for
Usage: $ gs -q -dNODISPLAY -dNOPROMPT thisfile.ps <input >output
C#, 489 442 400 398
Just for fun, no chance of ever being very short of course. Count is after removing insignificant white space, which I left in here for readability.
namespace System
{
class P
{
static void Main(string[]a)
{
int c = 0, n, m, d, l = 1;
var f = IO.File.ReadAllLines(a[0]);
while (c < int.Parse(f[0]))
{
var o = f[l++].Split(' ');
n = int.Parse(o[0]);
m = int.Parse(o[1]);
var p = new Collections.Generic.HashSet<string>();
while (n-- > 0)
p.Add(f[l++]);
while (m-- > 0)
for (o = f[l++].Split('/'), d = 0; d++ < o.Length; )
if (p.Add(string.Join("/", o, 0, d)))
n++;
Console.Write("Case #{0}: {1}\n", ++c, n);
}
}
}
}
This latest version has a quirk. It mistakenly counts the root directory as having to be created once, to compensate for variable n being -1 at the start of the loop, instead of the desired 0. It works because the root directory is guaranteed not to be in N.
Haskell: 299
import Data.List
import Text.Printf
a[]=[]
a(x:z)=(r-n):next where{;[n,m]=map read$words x;t=n+m;r=length$nub$concatMap(f.reverse)$take t z;next=a$drop t z;f""=[];f y=y:f z where(a,b:z)=span(/='/')y}
main=do{z<-getContents;putStr$unlines[printf"Case #%d: %d"x v|(x::Int,v)<-zip[1..]$a$tail$lines z]}
This requires GHC's -XScopedTypeVariables
switch.
Readable version:
import Data.List
import Text.Printf
a [] = []
a (x:xs) = (r-n) : next where
[n,m] = map read $ words x
t = n+m
r = length $ nub $ concatMap (f . reverse) $ take t xs
next = a $ drop t xs
f "" = []
f y = y : f bs where
(a,b:bs) = span (/= '/') y
cleanUp a = unlines [printf "Case #%d: %d" x v | (x::Int,v::Int) <- zip [1..] a]
main = do
z<-getContents
putStr$cleanUp$a$tail$lines z
Scala, 190 189
Yet another version of marcog's solution, this time in Scala.
Runs with scala
, but would need to be put into a class to
allow compilation with scalac
.
for(c←1 to readInt){val I=readLine split" "map(_.toInt)
var d=Set("")
for(i←1-I(0)to I(1)){var a="";for(w←readLine split"/"){a+="/"+w;d+=a}}
printf("Case #%d: %d\n",c,d.size-I(0)-2)}
AWK - 123 121 chars
Another adaptation of marcog python version. Run with awk -F'[ \]' -f fixit.awk
function p(){if(c++>1)print"Case #"c-2": "length(k)-n}
/\//{for(y=i=1;i<NF;)k[y=y"/"$++i];next}
{p();n=$1;delete k}
END{p()}
To run it without the -F
option, it grows by 15 chars, since it then needs this part:
BEGIN{FS=" |/"}
PyonScript
158 159 bytes
1({[([){exch}/R{(%lineedit)run}>>begin R{(: )[(Case #)3{=only}repeat<<>>R 1
index +{<><<R]{str cat<C>cat dup 3 index[<>put}forall pop}repeat[length[-
=}for}1)
Usage: $ pyon thisfile.pyon <input >output
Based on the PostScript solution.
Since PyonScript development is still in progress, this code works on the implementation as it existed at the start of Round 1B 2010: http://github.com/KirarinSnow/PyonScript
J - 205 159 140 characters
c=:0
f=:3 :0
c=:>:c
'a b'=:({.,+/)".>{.y
('Case #',(":c),': ',":a-~3 :'#~.,/>\"1<;.1"1 y'>(>:i.b){y)1!:2<2
(>:b)}.y
)
f^:_(}.<;._2 (1!:1)<3)
run:
script.ijs < gcj-input
Still, it outputs one extra line :/
Java, 277
import java.util.*;enum A{_;{Scanner s=new Scanner(System.in);for(int
t=s.nextInt(),i=0,n,j;i++<t;){Set x=new
HashSet();n=s.nextInt();for(j=s.nextInt();j-->-n;){String b="";for(String
c:s.next().split("/"))x.add(b+=c+'/');}System.out.println("Case #"+i+": "+(x.size()-n-1));}}}
Note: it outputs an error message but still works correctly.
精彩评论