I have written an epytext to reST markup converter, and now I want to convert all the docstrings in my entire library from epytext to reST format.

Is there a smart way to read all the docstrings in a module and write back the replacements?

ps: ast module perhaps?

Pyment is a tool that can convert Python docstrings and create skeletons for missing ones. It can manage Google, Epydoc (javadoc style), Numpydoc, and reStructuredText (reST, Sphinx default) docstring formats.

It accepts a single file or a folder (also exploring sub-folders). For each file, it will recognize each docstring format and convert it to the desired one. At the end, a patch will be generated to apply to the file.

To convert your project:

install Pyment

Type the following (you can use a virtualenv):

$ git clone https://github.com/dadadel/pyment.git
$ cd pyment
$ python setup.py install

convert from Epydoc to Sphinx

You can convert your project to Sphinx format (reST), which is the default output format, by doing:

$ pyment /my/folder/project

EDIT:

Install using pip:

$ pip install git+https://github.com/dadadel/pyment.git

It might be overkill for this simple usage, but I'd look into using the machinery of 2to3 to do the editing. You just need to write a custom fixer. It's not well documented, but Developer's Guide to Python 3.0: Python 2.6 and Migrating From 2 to 3: More about 2to3 and Implement Custom Fixers gives enough detail to get started...

Epydoc seems to contain a to_rst() method which might help you actually translate the docstrings. Don't know if it's any good...

It's probably most straightforward just to do it the old-fashioned way. Here's some initial code to get you going. It could probably be prettier, but it should give the basic idea:

def is_docstr_bound(line):
    # Exactly one triple-quote delimiter on the line opens or closes a
    # multi-line docstring. Two on one line form a complete one-line
    # docstring, which this loop passes through unchanged.
    return line.count("'''") + line.count('"""') == 1

docstr_found = False
docstr = []

# XXX: output using the same name to some other folder
with open('input.py') as f, open('output.py', 'w') as output:
    for line in f:
        if docstr_found:
            if is_docstr_bound(line):
                # XXX: do conversion now
                # ...

                # and write to output
                output.write(''.join(docstr))

                # the closing delimiter line is written as-is
                output.write(line)

                docstr = []
                docstr_found = False
            else:
                # still inside the docstring: collect the line
                docstr.append(line)
        else:
            if is_docstr_bound(line):
                docstr_found = True

            output.write(line)

To make it truly functional you need to hook it up with a file finder and output the files to some other directory. Check out the os.path module for reference.
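As a rough sketch of that file-finder step, something like the following could walk a source tree and mirror it into another directory. The names iter_python_files, convert_tree and convert_file are made up for illustration; convert_file stands for whatever function wraps the per-file loop above, and dst_root is assumed to be a non-empty path.

```python
import os

def iter_python_files(src_root):
    """Yield paths of .py files under src_root, relative to src_root."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for filename in sorted(filenames):
            if filename.endswith('.py'):
                full_path = os.path.join(dirpath, filename)
                yield os.path.relpath(full_path, src_root)

def convert_tree(src_root, dst_root, convert_file):
    """Call convert_file(in_path, out_path) for every .py file found.

    convert_file is a hypothetical callback (not part of any library);
    the directory layout under src_root is recreated under dst_root.
    """
    converted = []
    for rel_path in iter_python_files(src_root):
        in_path = os.path.join(src_root, rel_path)
        out_path = os.path.join(dst_root, rel_path)
        # create the mirrored sub-folder before writing into it
        os.makedirs(os.path.dirname(out_path), exist_ok=True)
        convert_file(in_path, out_path)
        converted.append(rel_path)
    return converted
```

Writing to a separate directory keeps the originals untouched, so the result can be diffed before anything is overwritten.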

I know the docstring bound check is potentially really weak. It's probably a good idea to beef it up a bit (strip line and check if it begins or ends with a docstring bound).
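One possible way to beef it up along those lines is sketched below. The function name docstr_bound_kind and its return values are invented for this example, and it is still only a heuristic: an ordinary triple-quoted string assigned to a variable can fool it.

```python
DELIMITERS = ('"""', "'''")

def docstr_bound_kind(line):
    """Classify a line as 'one_line', 'open_or_close', or None."""
    stripped = line.strip()
    # ignore optional string prefixes such as r or u before the quotes
    body = stripped.lstrip('rRuU')
    for delim in DELIMITERS:
        starts = body.startswith(delim)
        ends = stripped.endswith(delim)
        # both delimiters present on the same line: a complete docstring
        if starts and ends and len(body) >= 2 * len(delim):
            return 'one_line'
        # only one end of the docstring is on this line
        if starts or ends:
            return 'open_or_close'
    return None
```

With this, the main loop could toggle its state only on 'open_or_close' and handle 'one_line' docstrings separately.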

Hopefully that gives some idea of how to proceed. Perhaps there's a more elegant way to handle the problem. :)

I wonder about a combination of introspection and source processing. Here's some untested pseudocode:

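import foo  # where foo is your module

with open('foo.py', 'r') as f:
    src = f.read()  # one string, so str.replace() works on it

for name in dir(foo):  # probably better ways to do this...
    # dir() returns names, so fetch the object before asking for its docstring
    docstring = getattr(getattr(foo, name), '__doc__', None)
    if not docstring or docstring not in src:
        # no docstring here, or it does not come from this file
        continue

    # modify the docstring (my_format_changer is your own converter)
    new_docstring = my_format_changer(docstring)

    # now replace it in the source
    src = src.replace(docstring, new_docstring)

# When done, write it out
with open('new_foo.py', 'w') as fout:
    fout.write(src)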

Clearly you'd have to put some cleverness in the code that traverses the module looking for objects that have docstrings so it would recurse, but this gives you the general idea.
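Since the question mentions the ast module, here is a hedged sketch of the same locate-and-replace idea done with ast instead of importing the module (Python 3.8+ for end_lineno). The names find_docstring_spans, rewrite_docstrings and convert are hypothetical, and it assumes each docstring sits on its own lines and contains no backslashes or triple quotes, since the literal is re-emitted naively.

```python
import ast

def find_docstring_spans(source):
    """Return (start_line, end_line, col, text) per docstring; lines are 1-indexed."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.Module, ast.ClassDef,
                                 ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.body:
            continue
        first = node.body[0]
        # a docstring is a bare string expression as the first statement
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            spans.append((first.lineno, first.end_lineno,
                          first.col_offset, first.value.value))
    return spans

def rewrite_docstrings(source, convert):
    """Return source with every docstring passed through convert(text)."""
    lines = source.splitlines(keepends=True)
    # work bottom-up so earlier line numbers stay valid after each splice
    for start, end, col, text in sorted(find_docstring_spans(source), reverse=True):
        indent = lines[start - 1][:col]
        if indent.strip():
            # docstring shares its line with other code: leave it alone
            continue
        newline = '\n' if lines[end - 1].endswith('\n') else ''
        lines[start - 1:end] = [indent + '"""' + convert(text) + '"""' + newline]
    return ''.join(lines)
```

Because it recurses through nested classes and functions via ast.walk and never imports the target module, it sidesteps both the traversal problem and any import side effects.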