Design pattern to progressively "fill out" an object (beginner question)

I need to process a bunch of data - each element (datum?) is essentially just a dictionary of textual attributes. So say the class is Book - it might have an author, title, genre, reading difficulty level, and recommended price. Now, I start off only knowing the first two, and for each book, need to infer or estimate the next three (in my problem it is more than that).

So the approach that is natural to me is to do this iteratively for each book. My design would look something along the lines of (this is in Java)

public class Book
{
    public String author;
    public String title;
    /* ... */
    public double price;
    public Book(String author,String title)
    {
        this.author = author;
        this.title = title;
    }

    public void setGenre(DataProvider dp,...)
    {
        /* some sort of guess, resulting in genreGuess */
        this.genre = genreGuess;
    }

    /* .. and so on for price, etc */
}

And then, my code would look like:

for (Book book : bookList)
{ 
    setGenre(book);
    setPrice(book);
    /* and so on */
}

However, I am trying to learn how to design programs better, in a less iterative fashion, using less mutable state. Does anyone have any recommendations on how I might go about this?

I'm NOT an OO-design guru... Here's one way which I personally think is better.

Inject an implementation of the GenreGuesser interface into Book... this is best done via a BookFactory. The factory is configured ONCE, and then used to create "like" books. I'm thinking of using dependency injection here (like Spring's DI framework, or Google's Guice), which dramatically cuts down the overhead of "wiring" the factories into the things which depend on them ;-)

Then we could retrieve AND CACHE the calculated attribute on the fly. Note that caching the result implies that a Book object's IDENTITY (e.g. author & title) is final, or at least fixed once set.

public String getGenre()
{
  if (this.genre==null) 
    this.genre = genreGuesser.getGuess();
  return this.genre;
}

So basically you're doing your own "late binding" for each calculated field. There's also nothing stopping you (or the user) from setting each field manually if the default "guess" is off-base.
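To make the lazy getter above concrete, here is a minimal sketch of how it might be wired together. The GenreGuesser signature (taking author and title), the constructor injection, and the demo guesser are assumptions made for illustration, not something specified in the question.

```java
import java.util.Objects;

// Hypothetical strategy interface: infers a genre from what is already known.
interface GenreGuesser {
    String guess(String author, String title);
}

class Book {
    private final String author;            // identity fields: fixed once set
    private final String title;
    private final GenreGuesser genreGuesser; // injected, e.g. by a BookFactory
    private String genre;                    // null until requested or set by hand

    Book(String author, String title, GenreGuesser genreGuesser) {
        this.author = Objects.requireNonNull(author);
        this.title = Objects.requireNonNull(title);
        this.genreGuesser = Objects.requireNonNull(genreGuesser);
    }

    // Computes the guess on first access and caches it (not thread-safe).
    String getGenre() {
        if (genre == null) {
            genre = genreGuesser.guess(author, title);
        }
        return genre;
    }

    // Lets the caller override a guess that is off-base.
    void setGenre(String genre) {
        this.genre = genre;
    }
}

class LazyBookDemo {
    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the guesser actually runs
        GenreGuesser guesser = (author, title) -> {
            calls[0]++;
            return title.toLowerCase().contains("murder") ? "mystery" : "unknown";
        };
        Book book = new Book("A. Christie", "Murder on the Orient Express", guesser);
        System.out.println(book.getGenre()); // prints "mystery"
        System.out.println(book.getGenre()); // prints "mystery" (cached)
        System.out.println(calls[0]);        // prints 1
    }
}
```

Because the cached value depends only on author and title, keeping those two fields final is what makes the cache safe to trust.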

This achieves a simple "rich" interface on the Book class, at the cost of making Book aware of the concept of "guesses"... and I'm not a fan of "intelligent" transfer objects per se, which brings to mind another approach.

If we're going to accept all the overhead of having a BookFactory class, and we CAN limit ourselves to ONLY EVER creating books through the factory, then why not just let the BookFactory (which by definition knows all about Book and its attributes) populate all the calculated fields with (guessed) default values? Then the Book class is back to being a simple, dumb, transfer object, which does exactly what it needs to, AND NOTHING ELSE.
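A rough sketch of that second approach might look like the following. The two guesser interfaces, their signatures, and the choice to let the price guess depend on the genre are all assumptions for illustration. Making Book immutable is also my own addition; it is one way to get the "less mutable state" the question asks about.

```java
import java.util.Objects;

// Hypothetical guesser interfaces, one per inferred attribute.
interface GenreGuesser {
    String guess(String author, String title);
}

interface PriceGuesser {
    double guess(String author, String title, String genre);
}

// A dumb, immutable transfer object: it knows nothing about guessing.
final class Book {
    final String author;
    final String title;
    final String genre;
    final double price;

    Book(String author, String title, String genre, double price) {
        this.author = author;
        this.title = title;
        this.genre = genre;
        this.price = price;
    }
}

// Configured once with the guessers, then used to create every Book.
final class BookFactory {
    private final GenreGuesser genreGuesser;
    private final PriceGuesser priceGuesser;

    BookFactory(GenreGuesser genreGuesser, PriceGuesser priceGuesser) {
        this.genreGuesser = Objects.requireNonNull(genreGuesser);
        this.priceGuesser = Objects.requireNonNull(priceGuesser);
    }

    Book create(String author, String title) {
        // Later guesses can build on earlier ones without mutating a Book.
        String genre = genreGuesser.guess(author, title);
        double price = priceGuesser.guess(author, title, genre);
        return new Book(author, title, genre, price);
    }
}

class BookFactoryDemo {
    public static void main(String[] args) {
        BookFactory factory = new BookFactory(
                (author, title) -> title.contains("Murder") ? "mystery" : "unknown",
                (author, title, genre) -> genre.equals("mystery") ? 9.99 : 5.00);
        Book book = factory.create("A. Christie", "Murder on the Orient Express");
        System.out.println(book.genre + " @ " + book.price);
    }
}
```

The loop over bookList then reduces to calling factory.create(author, title) for each entry, and no Book is ever observed half-filled.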

I'll be interested to read others' suggestions.

Cheers. Keith.

The key thing here is that the class you're describing is a very simple one, so it's hard to see how it could be improved.

What happens in real systems, however, is that your Author class would, for example, be a connection to a Person and a Contract, or the Book would have a Publisher. In a library, it might have a history of when it was purchased, when it was loaned out and returned, and something like ISBN and Library of Congress records.

Behind the objects would be some kind of persistent store -- from something as simple as Python's "pickling" to a relational database or a "NoSQL" table store.

That's where the complexity starts to show up.

So here are some things to think about:

How many objects do you mean to store? Decisions for 10 Books are very different from what you need to store 10 million.
If you have a complicated tree of objects -- with Publisher, Author, Person, Contract, LC records, inventory and so on -- then creating (or "rehydrating") the object from the persistent store can take a long time. Back when OO was first catching on, this was a traditional issue in first systems: the object model was wonderful, but it took a half-hour to load an object and all its connected objects.

At that point, you need to start thinking about lazy evaluation. Another useful pattern is Flyweight -- instead of making many copies, you cache one copy and simply refer to it.
What are the use cases? You can't just say "I want to model a Book" because you don't know what the book is for. Start with use cases, and work down to the goal of having the methods of your class make it easy to write code.

The best way to handle that is, basically, to write code. Write out, sketch, actual examples of code using your objects and see if they are easy to use.
As Fred Brooks says, "plan to throw one away; you will anyway." As in writing prose, writing code is rewriting.
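
The Flyweight idea mentioned above can be sketched roughly as follows. Publisher and PublisherPool are hypothetical names chosen for illustration. The point is only that many Books can share one cached Publisher instance instead of each holding its own copy.

```java
import java.util.HashMap;
import java.util.Map;

// Shared, immutable object that many Books may refer to.
final class Publisher {
    final String name;

    Publisher(String name) {
        this.name = name;
    }
}

// Hands out one Publisher per name, creating it only on first request.
final class PublisherPool {
    private final Map<String, Publisher> cache = new HashMap<>();

    Publisher get(String name) {
        return cache.computeIfAbsent(name, Publisher::new);
    }

    int size() {
        return cache.size();
    }
}

class FlyweightDemo {
    public static void main(String[] args) {
        PublisherPool pool = new PublisherPool();
        Publisher first = pool.get("Penguin");
        Publisher second = pool.get("Penguin");
        System.out.println(first == second); // prints true: same instance reused
        System.out.println(pool.size());     // prints 1
    }
}
```

Whether this is worth doing depends on the first question in the list: with 10 Books it hardly matters, with 10 million it can.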

First thing I notice is that setGenre and setPrice are member methods on the Book object. In that case, you shouldn't be passing in a book, but rather calling

book.setGenre();
book.setPrice();

But I'm not sure you should even be doing that. If you're trying to infer Genre and Difficulty and ultimately Price from the author and title, you shouldn't be explicitly calling setGenre().

Instead, you could call