Is there a way to make `fill-paragraph` stop at the end of a sentence?

开发者 (https://www.devze.com), 2023-04-08 22:59, source: web
It is sometimes desirable to start each sentence in a paragraph on a separate line. For instance, this makes it easier to diff large text documents, because a change in one sentence will not affect the entire paragraph. Some markup systems (e.g. *roff) also require each sentence to start on a new line.

Is there a way, e.g. by judicious redefinition of paragraph-separate and paragraph-start, to make fill-paragraph stop between sentences?

(note: I use Emacs 23.3.1)


Update: sample mdoc (*roff) markup:

The
.Nm
utility makes a series of passes with increasing block sizes.
In each pass, it either reads or writes (or both) a number of
non-consecutive blocks at increasing offsets relative to the ideal
alignment, which is assumed to be multiples of the block size.
The results are presented in terms of time elapsed, transactions per
second and kB per second.

This is a single paragraph with three sentences, each of which starts on a separate line even though there is room for the first word(s) on the previous line. Currently, fill-paragraph will transform this into

The
.Nm
utility makes a series of passes with increasing block sizes.  In each
pass, it either reads or writes (or both) a number of non-consecutive
blocks at increasing offsets relative to the ideal alignment, which is
assumed to be multiples of the block size.  The results are presented
in terms of time elapsed, transactions per second and kB per second.

which is what I want to avoid.


Update: in re sentences and paragraphs

I see that my question is a bit unclear, because I used the term "paragraph" to refer both to what Emacs calls a paragraph and to what ends up as a continuous block of text in the output of whatever processor I use (groff, latex etc.). To clarify,

  • I need to keep the sentences together without any blank lines between them; groff doesn't like blank lines, while latex sees them as paragraph separators.
  • I need fill-paragraph to operate on individual sentences, i.e. I want to redefine a paragraph as something that starts after either a blank line or the end of the previous paragraph, and ends with a period followed by either a newline character or at least two whitespace characters.
  • I would love to have fill-paragraph break a block of text apart into individual sentences, but I don't think it can be done easily.

For instance, if I type the following:

The
.Nm
utility makes a series of passes with increasing block sizes.
In each pass, it either reads or writes (or both) a number of non-consecutive blocks at increasing offsets relative to the ideal alignment, which is assumed to be multiples of the block size.
The results are presented in terms of time elapsed, transactions per second and kB per second.

then move the point to the line that starts with "In each pass" and press M-q, I should get the following:

The
.Nm
utility makes a series of passes with increasing block sizes.
In each pass, it either reads or writes (or both) a number of
non-consecutive blocks at increasing offsets relative to the ideal
alignment, which is assumed to be multiples of the block size.
The results are presented in terms of time elapsed, transactions per second and kB per second.

Note that the last sentence is untouched.


How about telling paragraph-start to treat any line that starts with a capital letter as the start of a paragraph:

"\f\\|[ \t]*$\\|^[A-Z]"

Note that the new part is \\|^[A-Z]

That should work in most cases. You only have to watch for the rare case where a capitalized word occurs mid-sentence and the sentence is long enough that the line breaks just before that word.

EDIT: you probably want to account for indentation too:

"\f\\|[ \t]*$\\|^[ \t]*[A-Z]"

The square brackets contain a space and a tab (written \t here).

EDIT: you need to turn off case-fold-search for this to work, otherwise capitals and lower case letters are not distinguished in the match!

EDIT: if you want to turn off case-fold-search for just this function, bind the following to M-q (which you can do locally or globally, as you see fit).

(defun my-fill-paragraph ()
  (interactive)
  (let ((case-fold-search nil))
    (fill-paragraph)))
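
One way to try this out, as a rough sketch: install the regexp buffer-locally from a mode hook and check which lines it would treat as paragraph starts. The names my-sentence-paragraph-start-setup and my-line-starts-paragraph-p are made up for illustration, and hooking into nroff-mode-hook is only an assumption about which major mode the file is edited in.

```elisp
;; Sketch only; the helper names are hypothetical, not part of Emacs.
(defun my-sentence-paragraph-start-setup ()
  "Make lines that begin with a capital letter start a paragraph here."
  (set (make-local-variable 'paragraph-start)
       "\f\\|[ \t]*$\\|^[ \t]*[A-Z]"))

(defun my-line-starts-paragraph-p ()
  "Return non-nil if the current line matches `paragraph-start'.
The match is made case-sensitively, as the regexp requires."
  (save-excursion
    (beginning-of-line)
    (let ((case-fold-search nil))
      (looking-at paragraph-start))))

;; Assumption: the mdoc source is edited in `nroff-mode'.
(add-hook 'nroff-mode-hook 'my-sentence-paragraph-start-setup)
```

With point on a line, M-: (my-line-starts-paragraph-p) shows whether that line would begin a new paragraph under this setting. A continuation line that happens to start with a capitalized word will also return non-nil, which is the caveat mentioned above.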


Does this DTRT?

(defun separate-sentences (&optional beg end)
  "Ensure each sentence ends with a newline.
When no region is specified, use the current paragraph."
  (interactive (when (use-region-p)
                   (list (region-beginning) (region-end))))
  (unless (and beg end)
    (save-excursion
      (forward-paragraph -1)
      (setq beg (point))
      (forward-paragraph 1)
      (setq end (point))))
  (setq end (if (markerp end)
                end
              (set-marker (make-marker) end)))
  (save-excursion
    (goto-char beg)
    (while (re-search-forward (sentence-end) end t)
      (unless (or (looking-at-p "[ \t]*$")
                  (looking-back "^[ \t]*" (line-beginning-position)))
        (insert "\n")))))

(defun fill-paragraph-sentence-groups (justify)
  "Fill groups of sentences together.
A sentence that ends at a newline marks the end of a group."
  (save-excursion
    (save-restriction
      (narrow-to-region (progn (forward-paragraph -1) (point))
                        (progn (forward-paragraph 1) (point)))
      (goto-char (point-min))
      (skip-chars-forward " \t\n")
      (while (not (or (looking-at-p paragraph-separate)
                      (eobp)))
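        (fill-region (point)
                     (progn
                       ;; Advance sentence by sentence until one ends at
                       ;; end of line.  A plain `while' avoids needing
                       ;; the `loop' macro from the cl package.
                       (forward-sentence 1)
                       (while (not (looking-at "[ \t]*$"))
                         (forward-sentence 1))
                       (point))
                     justify)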
        (unless (looking-back "^[ \t]*" (line-beginning-position))
          (forward-line 1)))
      t)))

(defun fill-paragraph-sentence-individual (justify)
  "Put each sentence of the paragraph on its own line, then fill it."
  (save-excursion
    (separate-sentences)
    (fill-paragraph-sentence-groups justify)))

;; deployment option 1: add to major-mode hook

(add-hook 'text-mode-hook (lambda ()
                            ;; The variable name must be quoted here.
                            (set (make-local-variable 'fill-paragraph-function)
                                 'fill-paragraph-sentence-individual)))

;; deployment option 2: call my-fill-paragraph anywhere

(defun my-fill-paragraph (arg)
  (interactive "*P")
  (let ((fill-paragraph-function 'fill-paragraph-sentence-individual))
    (fill-paragraph arg)))

Two paragraph-filling functions are presented above. One fills groups of sentences together, where a sentence that ends at a newline closes a group. The other puts every sentence on its own line.

I only show how to deploy the individual version because that's what the OP wants. Follow the same model to deploy the groups version if you wish.


You can use fill-region which, unsurprisingly, only fills the current region. Based on that you could define a fill-sentence function. I guess a simplistic way to detect such sentences is to say:

  • If the line ends with a ., ?, or !, it's an end-of-sentence line.

  • A line starts a sentence if its predecessor line is either empty or an end-of-sentence line.

It's rather tricky to get it to work correctly in all cases, though.
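
A minimal sketch of that idea, using only the two rules above. The names my-blank-line-p, my-sentence-end-line-p and my-fill-sentence are invented for this example, and it makes no attempt to handle the tricky cases: abbreviations such as "e.g." at the end of a line, or roff macro lines such as .Nm inside a sentence, will confuse it.

```elisp
;; Sketch only; all three function names are hypothetical.
(defun my-blank-line-p ()
  "Return non-nil if the current line is empty or whitespace only."
  (save-excursion
    (beginning-of-line)
    (looking-at "[ \t]*$")))

(defun my-sentence-end-line-p ()
  "Return non-nil if the current line ends with ., ? or !."
  (save-excursion
    (end-of-line)
    (skip-chars-backward " \t" (line-beginning-position))
    (memq (char-before) '(?. ?\? ?!))))

(defun my-fill-sentence (&optional justify)
  "Fill only the run of lines that forms the sentence around point."
  (interactive "*P")
  (unless (my-blank-line-p)
    (save-excursion
      (let (beg end)
        ;; Move up while the previous line neither is blank nor ends
        ;; a sentence; the line reached starts the sentence.
        (beginning-of-line)
        (while (and (not (bobp))
                    (save-excursion
                      (forward-line -1)
                      (not (or (my-blank-line-p)
                               (my-sentence-end-line-p)))))
          (forward-line -1))
        (setq beg (point))
        ;; Move down until a line ends the sentence, or until the
        ;; next line is blank or the buffer ends.
        (while (and (not (my-sentence-end-line-p))
                    (save-excursion
                      (and (zerop (forward-line 1))
                           (not (eobp))
                           (not (my-blank-line-p)))))
          (forward-line 1))
        (end-of-line)
        (setq end (point))
        (fill-region beg end justify)))))
```

Bound to M-q (locally or globally), this should rewrap the "In each pass" sentence from the question while leaving the lines before and after it alone.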
