How can one delete the very last line of a file with Python?
Input File example:
hello
world
foo
bar
Output File example:
hello
world
foo
I've created the following code to find the number of lines in the file - but I do not know how to delete the specific line number.
try:
    # Count the lines inside the try block so a failed open() is handled
    with open("file") as f:
        countLines = len(f.readlines())
except IOError:
    print("Failed to read file.")
Because I routinely work with many-gigabyte files, looping through as mentioned in the answers didn't work for me. The solution I use:
import os
import sys

# Binary mode: byte offsets are exact, and b"\n" never occurs inside a
# multi-byte UTF-8 character, so arbitrary seeks are safe
with open(sys.argv[1], "rb+") as file:
    # Move the pointer (similar to a cursor in a text editor) to the end of the file
    file.seek(0, os.SEEK_END)
    # This means the following code skips the very last byte in the file -
    # i.e. if the file ends with a newline, that newline does not count as
    # the start of the line we want to delete
    pos = file.tell() - 1
    # Read the file one byte at a time from the penultimate byte going
    # backwards, searching for a newline character
    # If we find a newline, exit the search
    while pos > 0 and file.read(1) != b"\n":
        pos -= 1
        file.seek(pos, os.SEEK_SET)
    # So long as we're not at the start of the file, delete everything from
    # this position onwards (including the newline we found)
    if pos > 0:
        file.seek(pos, os.SEEK_SET)
        file.truncate()
You could open the file as in the question and then:
lines = file.readlines()
lines = lines[:-1]
This gives you a list containing every line but the last one.
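To actually remove the line on disk, the shortened list still has to be written back. A minimal sketch, assuming the file is small enough to fit in memory; the function name `drop_last_line_in_memory` is hypothetical and only used for illustration:

```python
def drop_last_line_in_memory(path):
    """Rewrite `path` without its last line (loads the whole file)."""
    with open(path, "r") as f:
        lines = f.readlines()
    # Slicing an empty list is safe, so an empty file stays empty
    with open(path, "w") as f:
        f.writelines(lines[:-1])
```

This is the simplest option, but it reads and rewrites the entire file, so it is a poor fit for very large files.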
This doesn't use Python, but Python's the wrong tool for the job if this is the only task you want. You can use the standard *nix utility head (the negative line count below requires GNU head) and run
head -n-1 filename > newfile
which will copy all but the last line of filename to newfile.
Assuming you have to do this in Python and that you have a large enough file that list slicing isn't sufficient, you can do it in a single pass over the file:
last_line = None
for line in file:
    if last_line is not None:
        # end="" because each line already carries its own newline
        print(last_line, end="")  # or write to a file, call a function, etc.
    last_line = line
Not the most elegant code in the world, but it gets the job done.
Basically it buffers each line of the file in the last_line variable; each iteration outputs the previous iteration's line, so the final line is never output.
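The same buffering idea can write to a second file instead of printing. A sketch under the assumption that a separate output file is acceptable; the name `copy_without_last_line` and both path parameters are hypothetical:

```python
def copy_without_last_line(src_path, dst_path):
    """Copy src_path to dst_path in one pass, omitting the final line."""
    previous = None
    # Binary mode copies the bytes unchanged, whatever the encoding
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            if previous is not None:
                dst.write(previous)
            previous = line
```

Only one line is held in memory at a time, so this should cope with files that are too large for readlines().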
Here is my solution for Linux users:
import subprocess
file_path = 'test.txt'
# Passing a list avoids shell quoting problems with unusual file names
subprocess.call(['sed', '-i', '$ d', file_path])
No need to read and iterate through the file in Python.
On systems where file.truncate() works, you could do something like this:
pos = next_pos = 0
with open('file.txt', 'rb') as f:
    for line in f:
        pos = next_pos         # position of beginning of this line
        next_pos += len(line)  # compute position of beginning of next line
# Reopen for writing and cut the file at the start of the last line
with open('file.txt', 'ab') as f:
    f.truncate(pos)
According to my tests, file.tell() doesn't work when reading by line, presumably due to buffering confusing it. That's why this adds up the lengths of the lines to figure out positions. Note that this only works on systems where the line delimiter ends with '\n'.
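One way to sidestep the tell() issue, offered as a sketch rather than a tested replacement, is to call readline() explicitly in binary mode and record the offset before each call. The function name `truncate_last_line` is hypothetical:

```python
def truncate_last_line(path):
    """Remove the last line of `path` by truncating at its start offset."""
    with open(path, "rb+") as f:
        last_start = 0
        while True:
            start = f.tell()      # offset where the next line begins
            line = f.readline()
            if not line:          # empty bytes object means end of file
                break
            last_start = start
        f.truncate(last_start)
```

This still reads the whole file once, but it never holds more than one line in memory and does not need a second open() call.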
Here's a more general memory-efficient solution allowing the last 'n' lines to be skipped (like the head command):
import collections, fileinput

def head(filename, lines_to_delete=1):
    queue = collections.deque()
    lines_to_delete = max(0, lines_to_delete)
    # With inplace=True, print() writes back into the file being read
    for line in fileinput.input(filename, inplace=True, backup='.bak'):
        queue.append(line)
        if lines_to_delete == 0:
            # The queue is full: emit the oldest buffered line
            print(queue.popleft(), end='')
        else:
            lines_to_delete -= 1
    # Whatever is left in the queue is the tail being dropped
    queue.clear()
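Inspired by previous posts, I propose this:
import os

# Binary mode, because Python 3 rejects relative seeks in text mode
with open('file_name', 'rb+') as f:
    f.seek(0, os.SEEK_END)
    # Skip the final byte (usually the trailing newline); never go below 0
    pos = max(f.tell() - 1, 0)
    # Walk backwards until the byte just before pos is a newline
    while pos > 0:
        f.seek(pos - 1)
        if f.read(1) == b'\n':
            break
        pos -= 1
    f.truncate(pos)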
Though I have not tested it (please, no hate for that), I believe there's a faster way of going about it. It's more of a C solution, but quite possible in Python. It's not Pythonic, either. It's a theory, I'd say.
First, you need to know the encoding of the file. Set a variable, CHARsize (why not), to the number of bytes a character in that encoding uses; that is 1 byte for an ASCII file.
Then grab the size of the file and set FILEsize to it.
Assume you have the address of the file (in memory) in FILEadd.
Add FILEsize to FILEadd.
Move backwards (increment by -1*CHARsize), testing each CHARsize bytes for a \n (or whatever newline your system uses). When you reach the first \n, you have the position of the beginning of the last line of the file. Replace \n with \x1a (26, the ASCII for EOF, or whatever that is on your system/with the encoding).
Clean up however you need to (change the filesize, touch the file).
If this works as I suspect it would, you're going to save a lot of time, as you don't need to read through the whole file from the beginning; you read from the end.
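The read-from-the-end idea can be sketched in Python roughly as follows. This is not the exact procedure described above: it swaps in file.truncate() for writing an EOF marker, reads in chunks rather than one character at a time, and assumes an ASCII-compatible encoding such as UTF-8 (it would not suit UTF-16). The name `truncate_last_line_from_end` and the `chunk_size` parameter are hypothetical:

```python
import os

def truncate_last_line_from_end(path, chunk_size=4096):
    """Remove the last line of `path` by scanning backwards from the end."""
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if end == 0:
            return  # nothing to remove from an empty file
        # Ignore a newline that merely terminates the final line
        f.seek(end - 1)
        pos = end - 1 if f.read(1) == b"\n" else end
        cut = 0  # default: the file holds a single line, so remove it all
        while pos > 0:
            start = max(0, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            idx = chunk.rfind(b"\n")
            if idx != -1:
                # Keep the newline that ends the previous line
                cut = start + idx + 1
                break
            pos = start
        f.truncate(cut)
```

Only the tail of the file is read, so the cost depends on the length of the last line rather than on the size of the file.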
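Here's another way, without slurping the whole file into memory:
p = None
with open("file") as f:
    for line in f:
        # Print the previous line; the last line is never printed
        if p is not None:
            print(p)
        # Strip only the newline, since print() adds one back
        p = line.rstrip("\n")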