inverted reindent.py (spaces to tabs)

Developer https://www.devze.com 2023-03-06 23:44 Source: web
afaik reindent.py (available in the standard python examples) has a tokenizer allowing it to do smart reindenting based on the indentation level rather than on the number of spaces used per level (which can vary in bad code)

unfortunately it enforces 4-space indentation, but i want tabs, because 1 tab == 1 indentation level is more logical than x spaces.

this question has no suitable answer:

  • i don’t care about pep-8 (i know how to write my code)
  • vim is installed, but :retab! doesn’t handle inconsistent indentation
  • all tools convert spaces used for alignment (!= indentation) to tabs, too.

one way would be to run reindent.py and afterwards do something like:

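#!/usr/bin/env python3
import re
from sys import argv

spaces = re.compile("^ +")
multistr = False  # True while inside a triple-quoted string
with open(argv[1]) as source:
    for line in source:
        # strip only the newline, so a last line without one keeps its final character
        line = line.rstrip("\n")
        num = 0
        if not multistr:
            match = spaces.match(line)
            if match:
                # reindent.py has already normalized every level to 4 spaces
                num = len(match.group(0)) // 4
        print("\t" * num + line[num * 4:])
        # an odd number of """ on a line opens or closes a multi-line string
        if line.count('"""') % 2 == 1:
            multistr = not multistr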

but that’s rather hacky. is there no non-zealot version of reindent.py?

PS: why does the highlighting suggest that // 4 is a comment instead of a truncating division?


The following script should do the trick, but either I missed something, or tokenize (or the example in the Python documentation) is buggy.

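#!/usr/bin/env python3

from sys import argv
from tokenize import DEDENT, INDENT, tokenize, untokenize

tokens = []
ilvl = 0  # current indentation level
# tokenize() expects a readline callable that returns bytes, so open in binary mode
with open(argv[1], "rb") as f:
    for token in tokenize(f.readline):
        if token.type == INDENT:
            ilvl += 1
            # replace the indentation string with one tab per level
            tokens.append((INDENT, "\t" * ilvl))
        else:
            if token.type == DEDENT:
                ilvl -= 1
            tokens.append(token)

# untokenize() only guarantees the token stream; spacing between tokens may change
print(untokenize(tokens).decode("utf-8"))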
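
A variant that avoids untokenize altogether might be easier to get right: use tokenize only to learn the indentation level of each statement, then rewrite the leading whitespace of the original lines. The following is a minimal sketch under that assumption, not a tested replacement for reindent.py. The function name `tabify` is made up for this example. Continuation lines keep their alignment spaces after the tabs, while comment-only lines, blank lines, and the contents of multi-line strings are left untouched.

```python
import io
import tokenize

# Token types that never count as the first code token of a physical line.
_IGNORED = {tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER}


def tabify(source):
    """Return source with one tab per indentation level (hypothetical helper)."""
    lines = io.StringIO(source).readlines()
    level = 0             # current indentation level, from INDENT/DEDENT tokens
    new_statement = True  # the next code token starts a logical line
    base = ""             # original leading whitespace of the current statement
    last_row = 0          # last physical row on which a code token ended
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            level += 1
        elif tok.type == tokenize.DEDENT:
            level -= 1
        elif tok.type == tokenize.NEWLINE:
            new_statement = True
        elif tok.type not in _IGNORED:
            row, col = tok.start
            prefix = lines[row - 1][:col]
            # Only the first code token of a row has pure whitespace before it.
            if row != last_row and not prefix.strip():
                if new_statement:
                    # Indentation: one tab per level, whatever width was used.
                    base = prefix
                    lines[row - 1] = "\t" * level + lines[row - 1][col:]
                elif prefix.startswith(base):
                    # Continuation line: tabs for the indentation part,
                    # the remaining spaces are alignment and stay as they are.
                    lines[row - 1] = "\t" * level + lines[row - 1][len(base):]
            new_statement = False
            last_row = tok.end[0]
    return "".join(lines)
```

It could be called as `print(tabify(open(argv[1]).read()), end="")`. Because the level comes from INDENT/DEDENT tokens, it does not matter whether one level was written as 2 spaces and the next as 6.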


Using sed in unix you could get it with one line:

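sed -r ':f; s|^(\t*) {4}|\1\t|; t f' file

edit: this will work for spaces at the beginning of the line only. The pattern matches four literal spaces rather than \s{4}, because \s also matches tabs: once a line has four tabs, \s{4} would match them and collapse them into one.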
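
For comparison, the same loop can be sketched in Python with the re module. The function name `leading_spaces_to_tabs` and the `width` parameter are assumptions made for this example. Like the sed version, it only looks at the start of the line and knows nothing about strings or continuation lines.

```python
import re


def leading_spaces_to_tabs(line, width=4):
    """Turn each full group of `width` leading spaces into a tab."""
    # Leading tabs, then exactly `width` literal spaces (not \s, which matches tabs).
    pattern = re.compile(r"^(\t*) {%d}" % width)
    while True:
        # Same idea as sed's ':f ... t f': repeat until nothing is replaced.
        line, count = pattern.subn(r"\1\t", line)
        if count == 0:
            return line
```

Applied line by line, e.g. `"".join(leading_spaces_to_tabs(l) for l in open("file"))`, it gives the converted text; leftover spaces that do not fill a whole group are kept.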
