What is the best way to parse this configuration file?

Developer https://www.devze.com 2023-01-12 08:09 Source: Internet
I am working on a personal project that uses a custom config file. The basic format of the file looks like this:

[users]
name: bob
attributes:
    hat: brown
    shirt: black
another_section:
    key: value
    key2: value2

name: sally
sex: female
attributes:
    pants: yellow
    shirt: red

There can be an arbitrary number of users, each with different key/value pairs, and keys/values can be nested under a section using tab stops. I know that I could use JSON, YAML, or even XML for this config file; however, I'd like to keep it custom for now.

Parsing shouldn't be difficult at all, as I have already written code to parse it. My question is: what is the best way to parse this using clean, structured code, written in a way that won't make future changes difficult (there might be multiple levels of nesting in the future)? Right now, my code looks utterly disgusting. For example,

private void parseDocument() throws IOException {
    String current;
    while((current = reader.readLine()) != null) {
        if(current.equals("") || current.startsWith("#")) {
            continue; //comment
        } 
        else if(current.startsWith("[users]")) {
            parseUsers();
        }
        else if(current.startsWith("[backgrounds]")) {
            parseBackgrounds();
        }
    }
}

private void parseUsers() throws IOException {
    String current;
    while((current = reader.readLine()) != null) {
        if(current.startsWith("attributes:")) {
            while((current = reader.readLine()) != null) {
                if(current.startsWith("\t")) {
                    //add user key/values to User object
                }
                else if(current.startsWith("another_section:")) {
                    while((current = reader.readLine()) != null) {
                        if(current.startsWith("\t")) {
                            //add user key/values to new User object
                        } 
                        else if (current.equals("")) {
                            //newline means that a new user is up to parse next
                        }
                    }
                }
            }
        }
        else if(!current.isEmpty()) {
            //
        }


    }
}

As you can see, the code is pretty messy, and I have cut it short for presentation here. I feel there are better ways to do this, perhaps without using BufferedReader at all. Can someone suggest a better approach that is not as convoluted as mine?


I would suggest not creating custom code for config files. What you're proposing isn't too far removed from YAML (getting started). Use that instead.

See "Which Java YAML library should I use?"


Everyone will recommend using XML because it's simply better.

However, in case you're on a quest to prove your programmer's worth to yourself...

...there is nothing really fundamentally wrong with the code you posted in the sense that it's clear and it's obvious to potential readers what's going on, and unless I'm totally out of the loop on file operations, it should perform pretty much as well as it could.

The one criticism I could offer is that it's not recursive: every level of nesting requires a new level of code to support it. I would probably write a recursive function (one that calls itself with the sub-content as its parameter, and again if there's sub-sub-content, and so on) that reads all of this into a hashtable of hashtables or something similar, and then I'd use that hashtable as the configuration object.
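The recursive idea could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive implementation: the names `NestedConfigParser`, `parseBlock`, and `depthOf` are hypothetical, one nesting level is assumed to be one tab or four spaces, and a key with an empty value is assumed to open a nested section. It handles a single user's block; splitting the file into users on blank lines is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedConfigParser {

    private final List<String> lines;
    private int pos = 0; // current line, shared across recursive calls

    private NestedConfigParser(List<String> lines) {
        this.lines = lines;
    }

    /** Parses one block of "key: value" lines into nested maps. */
    public static Map<String, Object> parse(String text) {
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            // Drop blank lines and '#' comments before parsing.
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                lines.add(line);
            }
        }
        return new NestedConfigParser(lines).parseBlock(0);
    }

    // Assumption: one nesting level is one tab or four spaces.
    private static int depthOf(String line) {
        int width = 0;
        for (char c : line.toCharArray()) {
            if (c == '\t') width += 4;
            else if (c == ' ') width += 1;
            else break;
        }
        return width / 4;
    }

    // Consumes consecutive lines at exactly this depth and recurses
    // whenever a key has no value on its own line.
    private Map<String, Object> parseBlock(int depth) {
        Map<String, Object> result = new LinkedHashMap<>();
        while (pos < lines.size() && depthOf(lines.get(pos)) == depth) {
            String line = lines.get(pos++).trim();
            int colon = line.indexOf(':');
            if (colon < 0) continue; // assumption: skip malformed lines
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (value.isEmpty()) {
                result.put(key, parseBlock(depth + 1)); // nested section
            } else {
                result.put(key, value);
            }
        }
        return result;
    }
}
```

Because the depth is a parameter rather than a hard-coded loop, adding a further level of nesting to the file needs no new code. A line indented deeper than its parent allows simply ends the current block in this sketch; a real parser would probably want to report that as an error.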

Then again, at that point I would probably stop seeing the point and use XML. ;)


I'd recommend changing the configuration file's format to JSON and using an existing library, such as FlexJSON, to parse it.

{
    "users": [
        {
            "name": "bob",
            "attributes": {
                "hat": "brown",
                "shirt": "black"
            },
            "another_section": {
                "key": "value",
                "key2": "value2"
            }
        },
        {
            "name": "sally",
            "sex": "female",
            "attributes": {
                "pants": "yellow",
                "shirt": "red"
            }
        }
    ]
}


It looks simple enough for a state machine.

while((current = reader.readLine()) != null) {
  if(current.startsWith("[users]"))
    state = PARSE_USER;
  else if(current.startsWith("[backgrounds]"))
    state = PARSE_BACKGROUND;
  else if (current.equals("")) {
    // Store the user or background that you've been building up if you have one.
    switch(state) {
      case PARSE_USER:
      case USER_ATTRIBUTES:
      case USER_OTHER_ATTRIBUTES:
        state = PARSE_USER;
        break;
      case PARSE_BACKGROUND:
      case BACKGROUND_ATTRIBUTES:
      case BACKGROUND_OTHER_ATTRIBUTES:
        state = PARSE_BACKGROUND;
        break;
    }
  } else switch(state) {
    case PARSE_USER:
    case USER_ATTRIBUTES:
    case USER_OTHER_ATTRIBUTES:
      if(current.startsWith("attributes:"))
        state = USER_ATTRIBUTES;
      else if(current.startsWith("another_section:"))
        state = USER_OTHER_ATTRIBUTES;
      else {
        // Split the line into key/value and store into user
        // object being built up as appropriate based on state.
      }
      break;
    case PARSE_BACKGROUND:
    case BACKGROUND_ATTRIBUTES:
    case BACKGROUND_OTHER_ATTRIBUTES:
      if(current.startsWith("attributes:"))
        state = BACKGROUND_ATTRIBUTES;
      else if(current.startsWith("another_section:"))
        state = BACKGROUND_OTHER_ATTRIBUTES;
      else {
        // Split the line into key/value and store into background
        // object being built up as appropriate based on state.
      }
      break;
  }
}
// If you have an unstored object, store it.


If you could use XML, JSON, or another well-known data encoding as the format, it would be a lot easier to parse/deserialize the text and extract the values. For example, take this fragment:

name: bob
attributes:
    hat: brown
    shirt: black
another_section:
    key: value
    key2: value2

This can be expressed as the following XML (there are other ways to express it in XML as well):

<config>
  <User name="bob" hat="brown" shirt="black">
    <another_section>
      <key>value</key>
      <key2>value2</key2>
    </another_section>
  </User>
</config>

Custom (extremely simple): as I mentioned in the comment below, you can just make them all flat name/value pairs, e.g.

name                 :bob
attributes_hat       :brown
attributes_shirt     :black
another_section_key  :value
another_section_key2 :value2

and then split the text on '\n' (newline) and each line on ':' to extract the key and value, or to build a dictionary/map object.
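A minimal sketch of that split-based approach might look like the following. The class name `FlatConfigParser` is hypothetical, and the sketch assumes that keys never contain ':' (each line is split on its first colon only) and that lines without a colon can be ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FlatConfigParser {

    /** Splits the text on newlines, then each line on its first ':'. */
    public static Map<String, String> parse(String text) {
        Map<String, String> config = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // assumption: ignore blank or malformed lines
            }
            // trim() removes the column padding around keys and values
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            config.put(key, value);
        }
        return config;
    }
}
```

With the sample above, `FlatConfigParser.parse(text).get("attributes_hat")` would return "brown". Splitting on the first colon only means a value may itself contain ':' without breaking the parse.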


A nice way to clean it up would be to use a table, i.e. replace your conditionals with a Map from section header to parsing routine. You can then invoke your parsing methods through reflection (simple) or create a few more classes implementing a common interface (more work, but more robust).
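The interface-based variant could be sketched along these lines. Everything named here (`SectionDispatcher`, `SectionHandler`, `register`) is hypothetical and only illustrates the table-driven idea; the sketch takes the whole file as a string rather than a BufferedReader to stay short.

```java
import java.util.HashMap;
import java.util.Map;

public class SectionDispatcher {

    /** Hypothetical common interface: one handler per section type. */
    public interface SectionHandler {
        void handleLine(String line);
    }

    // The table that replaces the if/else chain: header -> handler.
    private final Map<String, SectionHandler> handlers = new HashMap<>();

    public void register(String header, SectionHandler handler) {
        handlers.put(header, handler);
    }

    public void parse(String text) {
        SectionHandler current = null;
        for (String line : text.split("\n")) {
            if (line.startsWith("#")) {
                continue; // comment line
            }
            SectionHandler next = handlers.get(line.trim());
            if (next != null) {
                current = next; // section header: switch handlers
            } else if (current != null) {
                // Blank lines are passed through too, so a handler
                // can treat them as "end of the current record".
                current.handleLine(line);
            }
        }
    }
}
```

Supporting a new section then means registering one more handler, for example `dispatcher.register("[backgrounds]", backgroundHandler)`, instead of adding another `else if` branch.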
