[openrecord-dev] Data model ideas
Brian Douglas Skinner
brian.skinner at gumption.org
Wed Apr 13 01:58:00 CEST 2005
Data model ideas
Caveat:
Chao and I have been talking lately about Extreme Programming (XP), and
how to apply XP guidelines to the OpenRecord project. These ideas look more
like "old school" design than XP, so we might want to take them with a grain
of salt. But I figured I'd post this document anyway, since I'd written up
most of it already. Even if we don't act on it now, we'll have it for future
reference.
Background:
A lot of the OpenRecord design was inspired by existing web-based
collaboration projects, like Wikipedia and del.icio.us. I've never actually
used del.icio.us, and I'm only a little more familiar with Wikipedia, but I
still want to draw on them as collaboration examples.
Wikipedia and del.icio.us are both "outsider-oriented" collaboration
projects, rather than "workgroup-scale" collaboration projects. By
"outsider-oriented", I mean that there's a single repository that's
potentially viewed and edited by thousands or millions of people, and those
people generally don't know each other. By "workgroup-scale", I mean that
each workgroup has its own little repository, with any given workgroup only
having a few dozen people. An example of a workgroup-scale collaboration
tool might be the bug-tracking tool used by a corporate software development
team.
In workgroup-scale collaboration, you can safely assume that everyone is
pretty much on the same page, and you don't need a lot of formal process and
structure. Or, even if people aren't on the same page, you can assume that
they're able to get together in a room and figure out what they want to do,
so they don't need the collaboration tool itself to have features for
dealing with differences of opinion. In practice, everyone can have full
read/edit permission to everything.
With outsider-oriented collaboration, you begin to have more substantial
problems with differences of opinion. For example, some Wikipedia pages are
prone to "edit wars", where different users keep pushing a page in different
directions. I think the Wikipedia project has evolved some mechanisms to
deal with edit wars. I think those mechanisms involve having different
classes of users (editors vs. authors), and maybe some structure for how
things happen (propose, discuss, vote, action).
In contrast, I believe that del.icio.us does not have any mechanisms for
dealing with differences of opinion. There are no editors, no procedures, no
discussion, and no edit wars. And I think that's because Wikipedia is a
Cathedral, whereas del.icio.us is a Bazaar. They're both outsider-oriented
collaboration projects, but with del.icio.us the collaboration is a
lightweight, emergent thing.
With del.icio.us, each person adds their own content, and each person can
only edit the content that they created. The collaboration value simply
comes from showing aggregate views of the content that lots of people have
contributed. Collaboration is easy, because each user can make edits
entirely independently from everybody else.
Wikipedia has some Bazaar-like qualities too. Each Wikipedia page is
almost completely independent of all the other pages. In some sense, you can
think of Wikipedia as a collection of 500,000 independent collaboration
projects, one for each page. So if I'm editing one page, and you're editing
another, then we're both collaborating to create Wikipedia, but the
collaboration is really just Bazaar-style aggregation, just like
del.icio.us.
However, any single Wikipedia page is like a little Cathedral. All of the
contributors to a page work together to create a coherent unified whole. One
person adds some content, and a second person edits that content. The
changes that people make build on previous changes. Each contribution is
highly dependent on the previous contributions, not independent like in the
del.icio.us case.
Goal #1, outsider-oriented:
My goal with OpenRecord is to create an outsider-oriented collaboration
tool, not a workgroup-scale collaboration tool. A workgroup-scale tool would
be easier to build, so it's tempting to start with that goal, but I think
that would lead us astray. I believe both CoolChaser and OpenAgenda are
inherently outsider-oriented projects.
Goal #2, Bazaar-style:
If possible, I'd like to create more of a Bazaar-style tool instead of a
Cathedral-style tool. As much as possible, I'd like users to be able to
create and edit content more-or-less independently of each other, and then
have the tool aggregate that content. That may not be a feasible goal, but
I'd at least like to try. So, that said, here's my proposal for the next
iteration of the core data structures and the data model API.
Data structures:
* The server keeps a database of items
* Items have values associated with them
+ an item has values "The Hobbit", "J.R.R. Tolkien", "1937"
* Values can be assigned to attributes
+ "The Hobbit" is assigned to the "Title" attribute
+ "J.R.R. Tolkien" is assigned to the "Author" attribute
+ "1937" is assigned to the "Publication Date" attribute
* An item can have values that are *not* assigned to an attribute
+ an item can have value "Fiction"
* An attribute is itself just another item
* An item can have many values assigned to an attribute
+ "Star Wars" can have a "People" attribute with values "Luke
Skywalker", "Princess Leia", "Han Solo", "Darth Vader"
* A value can be a literal, like a string or a number
* A value can be a reference to another item
* A value is not itself an item
* Values are immutable
* We start out with an initial set of common attributes:
+ name, summary, category, start date, end date, short name
* We have a variety of types of literals
+ string
+ number
+ date
+ URL
* String literals can have an associated "language", which
is a reference to an item representing a language, like
"English" or "German"
* Number literals can have an associated "unit", which is a
reference to an item representing a measurement unit, like
"miles" or "dollars".
* Each item has a unique id -- unique across all servers
* Each value has a unique id -- unique within a single item
* Each item has a creation stamp, with a timestamp marking when
it was created and a userstamp with a reference to an item
representing the user who created it.
* Each value has a creation stamp with a timestamp and a userstamp.
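To make the list above concrete, here is a minimal sketch of how items and values might be represented in plain JavaScript. Every name in it (makeItem, addValue, valuesForAttribute, the "item-N" ids) is a hypothetical stand-in chosen for illustration, not part of any existing OpenRecord code. Literal typing (language, unit) is left out to keep it short.

```javascript
// Hypothetical sketch of the item/value model described above.
// All function and field names are illustrative assumptions.
var nextItemNumber = 1;

function makeStamp(user) {
  // A creation stamp pairs a timestamp with a userstamp.
  return { timestamp: new Date(), user: user };
}

function makeItem(user) {
  return {
    // Stand-in for an id that is unique across all servers;
    // a real system would presumably use a UUID here.
    uuid: "item-" + (nextItemNumber++),
    stamp: makeStamp(user),
    values: [],
    nextValueId: 1
  };
}

// Add a value to an item. `attribute` is itself an item, or omitted when
// the value is not assigned to any attribute. `data` may be a literal
// (string, number, Date) or a reference to another item.
function addValue(item, user, data, attribute) {
  var value = Object.freeze({        // shallow freeze: values are immutable
    id: item.nextValueId++,          // unique only within this item
    attribute: attribute || null,
    data: data,
    stamp: makeStamp(user)
  });
  item.values.push(value);
  return value;
}

function valuesForAttribute(item, attribute) {
  return item.values.filter(function (v) {
    return v.attribute === attribute;
  });
}

var lisa = makeItem(null);           // a user is represented by an item
var title = makeItem(lisa);          // an attribute is just another item
var author = makeItem(lisa);
var hobbit = makeItem(lisa);
addValue(hobbit, lisa, "The Hobbit", title);
addValue(hobbit, lisa, "J.R.R. Tolkien", author);
addValue(hobbit, lisa, "Fiction");   // a value with no attribute
```

Because a value carries only an id that is local to its item, a value is never addressed on its own; it is always reached through the item that owns it.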
Operations -- single user:
* create a new item
* vote to delete an item
* subsequent to deletion, vote to retain an item, thereby un-deleting it
* create a new value in an item, optionally assigned to an attribute
* add a new value to an attribute of an item
+ add "C3PO" as one of the "People" in "Star Wars"
* vote to delete a value
+ mark "Spock" as deleted, so that it no longer appears in "Star Wars"
* vote to replace a value with a corrected value
+ replace "Luck Skywalker" with "Luke Skywalker"
+ this creates a new value "Luke Skywalker", votes to delete
the old value "Luck Skywalker", and marks the old value with
a pointer to the new replacement value
* change the ordering of the values in an attribute of an item
* change the ordering of the items in a query result set
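The "replace" operation is the least obvious one in this list, since values are immutable. A rough sketch of the three steps it performs follows. The helper names (newValue, voteToDelete, replaceValue) and the use of plain strings for users and attributes are assumptions made only to keep the example short.

```javascript
// Hypothetical sketch: names and object shapes are assumptions.
function newValue(item, user, data, attribute) {
  var value = {
    id: item.values.length + 1,
    data: data,
    attribute: attribute,
    stamp: { timestamp: Date.now(), user: user },
    votes: []                        // stays empty until someone votes
  };
  item.values.push(value);
  return value;
}

function voteToDelete(value, user, replacement) {
  value.votes.push({
    stamp: { timestamp: Date.now(), user: user },
    retain: false,                   // delete/retain flag
    replacement: replacement || null // optional pointer to the new value
  });
}

// Replacing never changes the old value's data. It creates a new value,
// votes to delete the old one, and records a pointer from old to new.
function replaceValue(item, user, oldValue, newData) {
  var corrected = newValue(item, user, newData, oldValue.attribute);
  voteToDelete(oldValue, user, corrected);
  return corrected;
}

var starWars = { values: [] };
var luck = newValue(starWars, "lisa", "Luck Skywalker", "People");
var luke = replaceValue(starWars, "lisa", luck, "Luke Skywalker");
```

After this runs, both values are still stored in the item; the misspelled one simply carries a delete vote whose replacement pointer leads to the corrected one.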
Operations -- second user, editing the work of the first user:
* vote to retain -- mark a value as affirmed
+ affirm that "C3PO" is one of the "People" in "Star Wars"
* add a new value to an attribute of an item
+ add "R2D2" as one of the "People" in "Star Wars"
* vote to delete a value
+ mark "Spock" as deleted, or affirm someone else's deletion
* replace a value with a corrected value
+ creates a new value and votes to delete the old value
* vote to delete an item
+ mark "Star Wars" as deleted, or affirm someone else's deletion
* vote to retain an item
* change the ordering of the values in an attribute of an item
* change the ordering of the items in a query result set
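The lists above say what votes exist but not how they are combined. As one possible reading, the sketch below assumes that only each user's most recent vote counts, and that a value is hidden when delete votes outnumber retain votes. That rule, the isDeleted name, and the vote shape are all assumptions for illustration, not something this proposal specifies.

```javascript
// Assumed rule (not specified in the proposal): only each user's most
// recent vote counts, and a value is treated as deleted when delete
// votes outnumber retain votes.
function isDeleted(votes) {
  var latestByUser = {};
  votes
    .slice()
    .sort(function (a, b) { return a.timestamp - b.timestamp; })
    .forEach(function (vote) { latestByUser[vote.user] = vote.retain; });

  var retains = 0;
  var deletes = 0;
  Object.keys(latestByUser).forEach(function (user) {
    if (latestByUser[user]) { retains++; } else { deletes++; }
  });
  return deletes > retains;
}

// Lisa deletes "Spock" and Bart affirms the deletion.
var spockVotes = [
  { user: "lisa", timestamp: 1, retain: false },
  { user: "bart", timestamp: 2, retain: false }
];
// Lisa deletes "C3PO", Bart votes to retain it, Lisa changes her mind.
var c3poVotes = [
  { user: "lisa", timestamp: 1, retain: false },
  { user: "bart", timestamp: 2, retain: true },
  { user: "lisa", timestamp: 3, retain: true }
];
console.log(isDeleted(spockVotes)); // true
console.log(isDeleted(c3poVotes));  // false
console.log(isDeleted([]));         // false
```

Other rules (weighting by user, requiring a margin, letting the creator's vote win ties) would fit the same data; the stored votes do not commit the design to any one of them.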
Data structures redux:
* Item
+ uuid
+ set of values
+ creation stamp (timestamp and userstamp)
+ creation ordinal (a default initial value)
- (T minus creation time) in milliseconds,
where T is (January 1, 2000)
- optimization: can be derived from the creation stamp,
no need to store it
+ optional list of votes to delete and votes to retain
- each vote has a stamp (timestamp and userstamp)
- each vote has a delete/retain flag
- optimization: only stored if users have made retain/delete calls
+ optional list of additional ordinals
- each ordinal has stamp (timestamp and userstamp)
- each ordinal has an ordinal floating point number
- optimization: only stored if users have re-ordered lists
* Value
+ id
+ owning item
+ attribute assignment -- the attribute this value is assigned to
+ creation stamp (timestamp and userstamp)
+ creation ordinal (a default initial value)
- (T minus creation time) in milliseconds,
where T is (January 1, 2000)
- optimization: can be derived from the creation stamp,
no need to store it
+ optional list of votes to delete and votes to retain
- each vote has a stamp (timestamp and userstamp)
- each vote has a delete/retain flag
- replacement pointer -- a deletion vote can have a pointer
to a new value that replaces the deleted value
- optimization: only stored if users have made retain/delete calls
+ optional list of additional ordinals
- each ordinal has stamp (timestamp and userstamp)
- each ordinal has an ordinal floating point number
- optimization: only stored if users have re-ordered lists
+ maybe an optional source (a reference to another item)
* Literal Value
+ data type (string, number, date, etc.)
+ data ("C3PO", "482", "March 1, 1973", etc.)
+ string literals can have a language ("English", "German")
+ number literals can have a unit ("miles", "dollars")
+ date literals can have a timezone
* Reference Value
+ related item
+ related attribute assignment
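The ordinals are what make re-ordering possible without renumbering a whole list. A small sketch of how they might be used follows. Several things in it are assumptions rather than part of the proposal: that T is taken in UTC, that the most recently stamped additional ordinal overrides the creation ordinal, that larger ordinals sort first, and that a moved entry gets the midpoint of its new neighbors' ordinals. The function names are hypothetical.

```javascript
// T is the fixed reference date from the proposal: January 1, 2000
// (taken as UTC here, which is an assumption).
var T = Date.UTC(2000, 0, 1);

// Default ordinal, derived from the creation stamp rather than stored.
function creationOrdinal(creationTimeMs) {
  return T - creationTimeMs;
}

// Assumption: the most recently stamped additional ordinal wins;
// otherwise fall back to the creation ordinal.
function effectiveOrdinal(entry) {
  if (entry.ordinals && entry.ordinals.length > 0) {
    var latest = entry.ordinals.reduce(function (a, b) {
      return b.timestamp > a.timestamp ? b : a;
    });
    return latest.ordinal;
  }
  return creationOrdinal(entry.createdMs);
}

// Assumption: larger ordinals sort first, so older entries lead by default.
function sortEntries(entries) {
  return entries.slice().sort(function (a, b) {
    return effectiveOrdinal(b) - effectiveOrdinal(a);
  });
}

// Move `entry` between two neighbors by giving it the midpoint ordinal.
// The new ordinal is appended, so earlier orderings remain on record.
function moveBetween(entry, before, after, user, nowMs) {
  var midpoint = (effectiveOrdinal(before) + effectiveOrdinal(after)) / 2;
  entry.ordinals = (entry.ordinals || []).concat([
    { ordinal: midpoint, user: user, timestamp: nowMs }
  ]);
}

var base = Date.UTC(2005, 3, 13);
var luke = { name: "Luke Skywalker", createdMs: base };
var leia = { name: "Princess Leia", createdMs: base + 1000 };
var han = { name: "Han Solo", createdMs: base + 2000 };

// Bart moves Han between Luke and Leia.
moveBetween(han, luke, leia, "bart", base + 5000);
var names = sortEntries([luke, leia, han]).map(function (e) {
  return e.name;
});
console.log(names.join(", ")); // Luke Skywalker, Han Solo, Princess Leia
```

Using a floating point ordinal means a single move touches only the entry being moved; the neighbors keep their existing ordinals.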
JavaScript API:
// Login as Lisa and create some items and values
datastore.login(userLisa, authenticationForLisa);
var starWars = datastore.newItem("Star Wars");
var peopleAttribute = datastore.newAttribute("People");
var luck = starWars.addAttributeValue(peopleAttribute, "Luck Skywalker");
var c3po = starWars.addAttributeValue(peopleAttribute, "C3PO");
var luke = starWars.replaceValueWithAttributeValue(luck, peopleAttribute,
    "Luke Skywalker");
var fiction = datastore.newItem("Fiction");
starWars.addValue(fiction);
var starWarsPeopleValues =
    starWars.getValuesForAttribute(peopleAttribute);
var allStarWarsValues = starWars.getValues();
var creator = luke.getCreationStamp().getUser();
Util.assert(creator == userLisa);
var creationDate = luke.getCreationStamp().getDate();
starWars.deleteValue(c3po);
datastore.logout();
// Login as Bart, and change some existing items and values
datastore.login(userBart, authenticationForBart);
var categoryMovie = datastore.getItemFromUuid(Movies.MOVIE_CATEGORY_UUID);
var movieQuery = datastore.newQuery(Query.CATEGORY_QUERY, categoryMovie);
var movies = movieQuery.getResultSet();
var starWars = null;
for (var uuid in movies) {
  var movie = movies[uuid];
  if (movie.getName() == "Star Wars") {
    starWars = movie;
  }
}
if (starWars) {
  // find the "People" attribute among the attributes used by this item
  var attributes = starWars.getAttributes();
  var peopleAttribute = null;
  for (var uuid in attributes) {
    var attribute = attributes[uuid];
    if (attribute.getName() == "People") {
      peopleAttribute = attribute;
    }
  }
  if (peopleAttribute) {
    var allValues = starWars.getValues();
    var starWarsPeople = starWars.getValuesForAttribute(peopleAttribute);
    for (var uuid in starWarsPeople) {
      var person = starWarsPeople[uuid];
      var msg = "Was " + person.getDisplayString() + " in Star Wars?";
      // show panel and get user input
      var yesNo = window.confirm(msg);
      // Bart's answer becomes a vote to retain or delete this value
      if (yesNo) {
        starWars.retainValue(person);
      } else {
        starWars.deleteValue(person);
      }
    }
  }
}
datastore.logout();