I'm looking for a shell script idea to work on for practice with shell scripting. Can you please suggest intermediate ideas to work on? I'm a developer and I prefer working on an idea that deals with files.
For shell scripting, think of a task that you do frequently - and think how you would automate that task.
You can start off with a basic script that does roughly what you need. Then you realize that there are small variations on the task, and you extend the script to handle those. And it gradually becomes more complex.
Almost all of the scripts I have (some hundreds of them) started off as "I've done that before; how can I avoid having to do it again?".
Can you give an example?
No - because I don't know what tasks you do sufficiently often to be (minor) irritants that could be salved by writing a script.
Yes - because I've got scripts that I wrote incrementally, in an attempt to work around some issue or other in my environment.
One task that I'm working on - still a work in progress - is:
Identify duplicate files
Starting at some nominated directory (default, $HOME), find all the files, and for each file, establish a checksum (MD5, SHA1, SHA256 - it is not critical which) for the file; record the file name and checksum (and maybe device number and inode number).
Establish which checksums are repeated - hence identifying identical files.
Eliminate the unique checksums.
Group the duplicate files together with appropriate identifying information.
This much is fairly easy - it requires some medium-grade shell scripting and you might have to find a command to generate the checksum (but you might be OK with sum or cksum, though neither of those reaches even the level of MD5). I've done this in both shell and Perl.
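The four steps above can be sketched roughly as follows. This is only a minimal sketch, not the script described in this answer: it assumes GNU find, sha256sum, sort and uniq are installed (the -w and --all-repeated options of uniq are GNU extensions), the function name find_dups is made up for illustration, and file names containing newlines are not handled. It also does not record device or inode numbers, so two hard links to the same file show up as duplicates.

```shell
#!/bin/bash
# Sketch: print groups of files with identical content under a directory.
# Assumes GNU coreutils/findutils; find_dups is a hypothetical name.

find_dups() {
    local dir="${1:-$HOME}"
    # Step 1: hash every regular file. Each output line is
    # "<64 hex digits>  <path>".
    find "$dir" -type f -exec sha256sum {} + |
        # Step 2: sorting brings equal checksums next to each other.
        sort |
        # Steps 3 and 4: compare only the first 64 characters (the hash),
        # drop unique hashes, and separate the groups with a blank line.
        uniq -w 64 --all-repeated=separate
}
```

Something like `find_dups ~/Documents > dups.txt` would then leave one blank-line-separated group per set of identical files.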
The hard part - where I've not yet gotten a good solution - is then dealing with the duplicates. I have some 8,500 duplicated hashes, with about 27,000 file names in total. Some of the duplicates are images like smileys used in chat transcripts - there are a lot of that particular image. Others are duplicate PDF files collected from various machines at various times; I need to organize them so I have one copy of the file on disk, with perhaps links in the other locations. But some of the other locations should go - they were convenient ways to get the material from retired machines onto my current machine.
I have not yet got a good solution to the second part.
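The sketch below does not solve that hard part (deciding which copy should survive and which locations should go). It only illustrates the mechanical step of turning a list of duplicates into a plan, under some loud assumptions: input lines look like sha256sum output ("hash, two spaces, path") sorted by hash, the first path seen for a hash is arbitrarily treated as the copy to keep, paths have no leading or trailing whitespace, and plan_links is a made-up name. It prints what it would do and changes nothing.

```shell
#!/bin/bash
# Sketch: read "<hash>  <path>" lines, sorted by hash, on stdin and print
# which files could be replaced by a hard link to the first file seen with
# the same hash. Dry run only: no file is touched.

plan_links() {
    local hash path prev_hash="" keeper=""
    while read -r hash path; do
        [ -n "$hash" ] || continue            # skip blank separator lines
        if [ "$hash" = "$prev_hash" ]; then
            printf 'would link: %s -> %s\n' "$path" "$keeper"
        else
            prev_hash=$hash                   # new group: remember its first file
            keeper=$path
        fi
    done
}
```

Acting on a reviewed plan would be a matter of running `ln -f keeper duplicate` for each line, bearing in mind that hard links only work within one file system.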
Here are two scripts from my personal library. They are simple enough not to require a full-blown programming language, but aren't trivial, particularly if you aim to get all the details right (support all flags, return the same exit code, etc.).
cvsadd
Write a script to perform a recursive cvs add so you don't have to manually add each sub-directory and its files. Make it so it detects the file types and adds the -kb flag for binary files as needed.
For bonus points: Allow the user to optionally specify a list of directories or files to restrict the search to. Handle file names with spaces correctly. If you can't figure out if a file is text or binary, ask the user.
#!/bin/bash
#
# Usage: cvsadd [FILE]...
#
# Mass `cvs add' script. Adds files and directories recursively, automatically
# figuring out if they are text or binary. If no file names are specified, looks
# for unversioned files and directories in the current directory.
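To give a feel for the core of the exercise, here is a minimal sketch of the recursive walk and the text/binary check. It is not the script whose header is shown above. Assumptions: grep supports -I (GNU and BSD grep do), at least one file or directory is given, file names contain no newlines, and nothing checks whether a path is already under version control. The names is_binary, cvsadd_tree and the CVS_CMD variable are made up for illustration; CVS_CMD exists only so the commands can be previewed with `CVS_CMD="echo cvs"`.

```shell
#!/bin/bash
# Sketch of a recursive "cvs add" with -kb for binary files.
# is_binary, cvsadd_tree and CVS_CMD are hypothetical names.

# Succeeds if the file is non-empty and grep -I classifies it as binary.
is_binary() {
    [ -s "$1" ] && ! grep -Iq '' -- "$1"
}

cvsadd_tree() {
    [ $# -gt 0 ] || { echo "usage: cvsadd_tree FILE..." >&2; return 2; }
    # find prints a directory before its contents, which is the order
    # cvs needs; CVS administrative directories are skipped.
    find "$@" -name CVS -prune -o -print | while IFS= read -r path; do
        if [ -f "$path" ] && is_binary "$path"; then
            ${CVS_CMD:-cvs} add -kb "$path" < /dev/null
        else
            ${CVS_CMD:-cvs} add "$path" < /dev/null
        fi
    done
}
```

Running `CVS_CMD="echo cvs" cvsadd_tree src` first prints the commands without executing them. Asking the user about files that cannot be classified, and detecting unversioned files when no arguments are given, are left as the exercise.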
svnfind
Write a wrapper around find which performs the same job, recursively finding files matching arbitrary criteria, but ignores .svn directories.
For bonus points: Allow other actions besides the default -print. Support the -H, -L, and -P options. Don't erroneously filter out files which simply happen to contain the substring .svn. Make usage identical to the regular find command.
#!/bin/bash
#
# Usage: svnfind [-H] [-L] [-P] [path...] [expression]
#
# Attempts to behave identically to a plain `find' command while ignoring .svn
# directories. Usage is identical to `find'.
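One possible way to build the find command line is sketched below. It is not the script whose header is shown above, and it makes simplifying assumptions: the list of actions it recognizes is a partial, GNU-oriented guess, an argument that merely looks like an action (say, a file name pattern of -print) will fool it, and global options such as -depth or -maxdepth in the user's expression are not treated specially. The key points are that -name .svn matches the whole base name rather than a substring, and that -print has to be added explicitly when the user supplies no action, because otherwise find's implicit -print would also print the pruned .svn directories.

```shell
#!/bin/bash
# Sketch: wrap find so that .svn directories are pruned.

svnfind() {
    local -a opts=() paths=() expr=()
    local arg has_action=0

    # Leading -H, -L and -P must stay in front of the starting paths.
    while [[ $# -gt 0 ]]; do
        case $1 in
            -H|-L|-P) opts+=("$1"); shift ;;
            *) break ;;
        esac
    done

    # Everything up to the first expression token is a starting path.
    while [[ $# -gt 0 ]]; do
        case $1 in
            -*|'('|'!') break ;;
            *) paths+=("$1"); shift ;;
        esac
    done
    if [[ ${#paths[@]} -eq 0 ]]; then paths=(.); fi

    # Does the user's expression already contain an action?
    for arg in "$@"; do
        case $arg in
            -print|-print0|-printf|-fprint|-fprint0|-fprintf|-ls|-fls) has_action=1 ;;
            -exec|-execdir|-ok|-okdir|-delete) has_action=1 ;;
        esac
    done

    if [[ $# -gt 0 ]]; then expr=('(' "$@" ')'); fi
    if [[ $has_action -eq 0 ]]; then expr+=(-print); fi

    find "${opts[@]}" "${paths[@]}" -type d -name .svn -prune -o "${expr[@]}"
}
```

With this, `svnfind . -name '*.c'` and `svnfind -L src -type f -exec grep -l TODO {} +` should behave like the corresponding find commands, minus anything inside .svn directories.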
You could try some simple CGI scripting. It can be done in shell and involves a lot of here documents, parsing and extracting of form values, a bit of escaping and whatever you want to do as payload. (I do not recommend exposing such a script to the hostile internet, though.)
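As a rough illustration of the pieces mentioned here (a here document for the page, extraction of a form value, and escaping), a minimal sketch might look like the following. It assumes bash, a web server that passes the query in the standard CGI variable QUERY_STRING, and a form field called name; that field and the function names urldecode, html_escape, query_param and render_page are all made up for the example. It only handles GET-style query strings, and like any shell CGI it should be treated as a toy.

```shell
#!/bin/bash
# Sketch of a tiny CGI page in bash. All function names are hypothetical.

# Decode a URL-encoded value: '+' becomes a space, %XX becomes that byte.
urldecode() {
    local s=${1//+/ }
    s=${s//\\/\\\\}            # keep literal backslashes literal for %b
    printf '%b' "${s//%/\\x}"
}

# Escape the characters that are special in HTML text and attributes.
html_escape() {
    printf '%s' "$1" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

# Print the decoded value of the first query parameter named $1.
query_param() {
    local pair
    local -a pairs=()
    IFS='&' read -r -a pairs <<< "$QUERY_STRING"
    for pair in "${pairs[@]}"; do
        [[ $pair == *=* ]] || continue
        if [[ ${pair%%=*} == "$1" ]]; then
            urldecode "${pair#*=}"
            return 0
        fi
    done
    return 1
}

# Emit the CGI header, a blank line, then the page body.
render_page() {
    local name
    name=$(query_param name) || name="world"
    cat <<EOF
Content-Type: text/html

<html><body><p>Hello, $(html_escape "$name")!</p></body></html>
EOF
}
```

A real script would end with a call to render_page; for a quick check from the command line, `QUERY_STRING='name=Ann' render_page` prints the header and the page.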