Unescape all nested quotes

Developer https://www.devze.com 2023-02-14 23:42 Source: web
I want to unescape all nested quotes within a string. The following examples are given as literal (C# or F#) style .NET strings, not surrounded by quotes:

  • [(\"hello world\", 2); (\"goodbye\", 3)] doesn't change
  • [(\"hello\"world\", 2); (\"go\"o\"d\"bye\", 3)] becomes [(\"hello\\\"world\", 2); (\"go\\\"o\\\"d\\\"bye\", 3)]

I'm not sure whether this can be done with Regex(pattern, "\\\"").Replace but I am still too much of a regex novice for the solution to come easily to me. Any solution, regex if possible, would be appreciated.

Edit

Thanks for the feedback so far, everyone. I see now that since there is no way to distinguish opening quotes from closing quotes, the syntax is ambiguous and the direction I was going can't work. So I will give the big picture in hopes of some new direction.

I am working on a project which converts F# Quotations into F# source code strings. So I have a function source: Expr -> string which should produce a string which, when printed to a typical console like FSI, is valid F# code. For this problem, I am looking to improve the way Value quotation expressions are sprinted. Currently I do something like the following (see starting at line 312 of http://code.google.com/p/unquote/source/browse/trunk/Unquote/Sprint.fs for real code):

match expr with
| Value(o, _) ->
  match o with
  | null -> "null"
  | _ -> sprintf "%A" o

But then, for example, I get the following:

> <@ "\r\"\n" @> |> source |> stdout.WriteLine;;
"
"
"
val it : unit = ()

instead of the desired

> <@ "\r\"\n" @> |> source |> stdout.WriteLine;;
"\r\"\n"
val it : unit = ()

If I only needed to consider Values encapsulating strings, that would be easy with something like

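// Despite its name, this escapes: special characters become escape sequences.
let unescape s =
    // The backslash rule must come first, or backslashes added by later rules would be doubled.
    ["\\","\\\\"
     "\b","\\b"
     "\n","\\n"
     "\r","\\r"
     "\t","\\t"
     "\"", "\\\""]
    |> List.fold (fun (s:string) (before, after) -> s.Replace(before, after)) s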
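
As an aside on the string-only case: a single pass over the characters avoids depending on the order of the replacements (in the fold above, the backslash rule has to run first). A minimal sketch, where escapeChar and sprintString are hypothetical names and not part of Unquote:

```fsharp
// Map one character to its F# source representation.
let escapeChar (c: char) : string =
    match c with
    | '\\' -> "\\\\"
    | '\b' -> "\\b"
    | '\n' -> "\\n"
    | '\r' -> "\\r"
    | '\t' -> "\\t"
    | '"'  -> "\\\""
    | _    -> string c

// Escape every character once, then wrap the result in quotes.
let sprintString (s: string) : string =
    "\"" + String.collect escapeChar s + "\""

printfn "%s" (sprintString "\r\"\n")   // prints "\r\"\n"
```

Because each character is visited exactly once, a backslash produced by one rule is never seen by another rule.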

The problem is that any object may be a Value, including those with structured formats used by sprintf "%A", which I'd like to leverage as much as possible. (I could go through and handle a finite set of cases like lists, arrays, tuples, and so forth, but that isn't as general as I'd like it to be.) Sprinting a list<string*int> Value, for example, needs special care, since we need to distinguish between quotes that should be displayed literally as string delimiters and those that should be displayed as escape sequences.

Any ideas welcome, thanks!


I don't think this can be done as the question stands currently, because the syntax you want to process is ambiguous. For example, it is not possible to tell whether:

[ (\"hello\"world\", 2); (\"good\"bye\", 3)]

should be turned into a list with two elements:

[ (\"hello\\\"world\", 2); (\"good\\\"bye\", 3)]

... or a list with just a single element (with text containing some funky symbols):

[ (\"hello\\\"world\\\", 2); (\\\"good\\\"bye\", 3)]
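
To make the ambiguity concrete, here is a small sketch. naive is a hypothetical printer that wraps strings in quotes without escaping anything; the two different values below come out as exactly the same text, so no post-processing of that text can recover which one was meant:

```fsharp
// Hypothetical printer: quotes strings but does not escape their contents.
let naive (items: (string * int) list) : string =
    items
    |> List.map (fun (s, n) -> sprintf "(\"%s\", %d)" s n)
    |> String.concat "; "
    |> sprintf "[%s]"

// Two elements, each string containing one quote.
let twoElements = [ ("hello\"world", 2); ("good\"bye", 3) ]

// One element whose string happens to contain the text `", 2); ("`.
let oneElement = [ ("hello\"world\", 2); (\"good\"bye", 3) ]

printfn "%b" (naive twoElements = naive oneElement)   // prints true
```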

It seems that you're trying to do something with the output printed by F# Interactive. Maybe there is a better way to print what you need so that you can avoid the ambiguity. Could you describe the big picture?

If you need to process any list/tuple data structure, then it will probably be easier to write it using the F# reflection API (see the Microsoft.FSharp.Reflection namespace) than by parsing F# output. (Or you can use the API to write your own unambiguous printer.)
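
A rough sketch of what such a printer might look like, assuming only strings, tuples and lists need special handling and everything else can fall back to sprintf "%A". The names sprintValue and escapeChar are hypothetical; FSharpType.IsTuple and FSharpValue.GetTupleFields come from Microsoft.FSharp.Reflection:

```fsharp
open Microsoft.FSharp.Reflection

// Map one character to its F# source representation.
let escapeChar (c: char) : string =
    match c with
    | '\\' -> "\\\\"
    | '\b' -> "\\b"
    | '\n' -> "\\n"
    | '\r' -> "\\r"
    | '\t' -> "\\t"
    | '"'  -> "\\\""
    | _    -> string c

// Walk the value itself, so string contents are escaped before any
// delimiters are added and the output stays unambiguous.
let rec sprintValue (o: obj) : string =
    match o with
    | null -> "null"
    | :? string as s -> "\"" + String.collect escapeChar s + "\""
    | _ ->
        let t = o.GetType()
        if FSharpType.IsTuple t then
            FSharpValue.GetTupleFields o
            |> Array.map sprintValue
            |> String.concat ", "
            |> sprintf "(%s)"
        elif t.IsGenericType && t.GetGenericTypeDefinition() = typedefof<list<_>> then
            (o :?> System.Collections.IEnumerable)
            |> Seq.cast<obj>
            |> Seq.map sprintValue
            |> String.concat "; "
            |> sprintf "[%s]"
        else
            // Assumption: anything else is printed acceptably by %A.
            sprintf "%A" o

printfn "%s" (sprintValue (box [ ("hello\"world", 2); ("goodbye", 3) ]))
// prints [("hello\"world", 2); ("goodbye", 3)]
```

Records, unions and arrays are not covered here; they could probably be handled in the same style with FSharpType.IsRecord, FSharpType.IsUnion and a type test for arrays, at the cost of reimplementing more of what "%A" already does.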
