Sunday, September 23, 2007

Deciding Astoria's Final URI Addressing Scheme

Mike Flasko and Pablo Castro of the Astoria Project team posted URI Format - Part 1 - Addressing resources using URI path segments on September 21, 2007. This post, which was converted from a subitem in the LINQ and Entity Framework Posts for 9/21/2007+ post, covers the first of three goals:

Provide a mechanism to point to every resource or member of a resource in the system. That is, every piece of data is addressable, and the URI used to address it needs to be derivable from the service metadata which describes the conceptual model of the system.

Future Astoria Team posts will tackle new syntax for simple queries and DML.

The Original Addressing Scheme

Julie Lerman explained the current URI scheme for retrieving entity sets, entities, and members in her What the heck is Astoria? post as similar to the following:

  • http://localhost:1544/Northwind.svc retrieves the list of the service's entities
  • http://localhost:1544/Northwind.svc/Customers retrieves a list of the members of the Customer entity with their key value(s) enclosed by square brackets
  • http://localhost:1544/Northwind.svc/Customers[ALFKI] retrieves the member whose key value is 'ALFKI'
  • http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders retrieves the Orders entity set for 'ALFKI'
  • http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643] retrieves Order number 10643
  • http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643]/Shippers retrieves the Shipper for Order 10643, Shippers[3] (Federal Shipping)
  • http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643]/Shippers[3]/Orders retrieves all Orders sent by Federal Shipping (equivalent to http://localhost:1544/Northwind.svc/Shippers[3]/Orders)

and so on. I attribute the failure of http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643]/Order_Details to retrieve records to an early lack of a defined syntax for composite keys, which might logically be [10643][52] or the like for OrderID/ProductID. (Early Entity Framework CTPs couldn't handle composite primary keys.)
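To make the structure of the square-bracket scheme concrete, here is a minimal Python sketch that splits such a path into entity-set names and optional key values. The `parse_path` function and the `SEGMENT` pattern are my own illustration, not part of Astoria, and they assume a single, unquoted key per segment as in the May 2007 CTP examples above.

```python
import re

# Hypothetical helper, not an Astoria API: recognizes "Name" or "Name[key]".
SEGMENT = re.compile(r"^(?P<name>[A-Za-z_]\w*)(?:\[(?P<key>[^\[\]]+)\])?$")

def parse_path(path):
    """Return a list of (name, key) tuples; key is None when no bracket is present."""
    segments = []
    for raw in path.strip("/").split("/"):
        match = SEGMENT.match(raw)
        if match is None:
            raise ValueError(f"unrecognized segment: {raw!r}")
        segments.append((match.group("name"), match.group("key")))
    return segments

print(parse_path("Customers[ALFKI]/Orders[10643]/Shippers"))
# [('Customers', 'ALFKI'), ('Orders', '10643'), ('Shippers', None)]
```

A composite-key guess such as Order_Details[10643][52] is rejected by this pattern, which mirrors the missing composite-key syntax described above.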

Proposed Changes to the Original Syntax

Square brackets: I'm not even close to being an expert in designing URI addressing schemes, but the original URI path segments looked good to me because of their simplicity and consistency. Square brackets correspond to array indices in SOAP encoding (and C#, of course).

However, Mike and Pablo said:

The May 2007 CTP used values in square-brackets (e.g. “…/Customers[ALFKI]”). We got “generous” feedback saying that square-brackets were a bad choice.

My Google search on 'Microsoft Astoria "square brackets"' didn't turn up obvious complaints and Mike/Pablo didn't explain why "square-brackets were a bad choice."

Web3S vs. POX: Replacing POX with Web3S, which doesn't ring my bell, requires syntax that supports heterogeneous sets that Astoria doesn't deliver. Thus Mike and Pablo propose two approaches to retrieve the same entity:

  • Full-form (for Web3S compatibility): ...Customers/Customer('ALFKI'), which specifies a Customer entity within the Customers entity set.
  • Short-form (Astoria standard syntax): ...Customers!'ALFKI', which infers the Customer entity.
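The difference between the two proposed forms is easy to see when both are generated from the same inputs. The following Python sketch is only an illustration: the `full_form` and `short_form` function names are mine, and the output format is copied from the two examples above rather than from any published Astoria grammar.

```python
def full_form(entity_set, entity_type, key):
    """Web3S-compatible form: names both the set and the entity type."""
    return f"{entity_set}/{entity_type}('{key}')"

def short_form(entity_set, key):
    """Short form: the entity type is inferred from the entity set."""
    return f"{entity_set}!'{key}'"

print(full_form("Customers", "Customer", "ALFKI"))   # Customers/Customer('ALFKI')
print(short_form("Customers", "ALFKI"))              # Customers!'ALFKI'
```

Both strings address the same Customer entity; the full form adds only the entity type name, which is redundant unless the set is heterogeneous.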

Comments:

  1. The syntax provides two ways to address a particular entity, so one of them is "a redundant method for getting to it." That same redundancy was the reason given for rejecting the default container in the URI.
  2. Unless there are plans for Astoria to somehow deliver heterogeneous sets, the long form should be abandoned as unnecessarily verbose.
  3. Quotes around key strings are an SQL artifact that doesn't belong in the URI. If meaningful characters (such as /) are to be escaped in literals, there's no reason to quote alphanumeric key values.
  4. Parentheses imply an array index (to VB coders), while the bang (!) operator traditionally separates the names of a collection and its members. If there's a valid reason not to use square brackets to set off key values, parentheses are better than the bang operator.

If it's absolutely necessary to support heterogeneous set syntax, reserve the bang operator for its traditional use, as in ...Customers!Customer(ALFKI), or substitute a virgule (/) and allow qualifying the default container name.

Literal forms of composite keys: If you require escaping meaningful characters within literals, commas (, = %2C) are meaningful within URIs and composite keys, so escape them as you would whacks (/ = %2F) or spaces ( = %20). (Who would include a comma in a key value anyway?) If you prohibit spaces in key values, why not prohibit commas or other URI-reserved characters?
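Here is a minimal sketch of that escaping rule in Python, using the standard library's urllib.parse. It assumes that composite key values are joined with unescaped commas and that any comma, slash, or space inside a value is percent-encoded; the `encode_composite_key` and `decode_composite_key` names are hypothetical, and the surrounding key syntax is not taken from the Astoria proposal.

```python
from urllib.parse import quote, unquote

def encode_composite_key(values):
    """Percent-encode each value (including ',' and '/'), then join with commas."""
    return ",".join(quote(str(v), safe="") for v in values)

def decode_composite_key(literal):
    """Split on the unescaped commas, then decode each value back to a string."""
    return [unquote(part) for part in literal.split(",")]

print(encode_composite_key([10643, 52]))            # 10643,52
print(encode_composite_key(["Smith, J", "a/b"]))    # Smith%2C%20J,a%2Fb
print(decode_composite_key("Smith%2C%20J,a%2Fb"))   # ['Smith, J', 'a/b']
```

Because every comma inside a value becomes %2C, the only literal commas left in the key are the separators, so splitting on them is unambiguous.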

Names for key values: The names might be significant if the key values represent natural keys, but if they're surrogate keys I see no reason to name them. Is there a reason key names can't be made optional if position or name is significant?

Comment: The appearance of composite key names and values indicates to me that current EF bits must support surfacing foreign key values, which I requested some time ago. The need to eager- or lazy-load an entity to retrieve a natural key value only is an unnecessary waste of resources.

Persistence Ignorance is Good but Syntax Ignorance is Bad

"Should we assume that consumers of Astoria URIs understand their syntax?" Yes. Consumers are required to know the container name and the .svc extension. If so, they should have some inkling of the remainder of the URI. How about $syntax as a substitute for /? for text-encoded help?

Syntax for Self-Joins

Currently, the link syntax to a many:one association resulting from a self-join, such as the Employees table's Employees:ReportsTo relationship, is:

<Employees1 href="Employees[5]/Employees1" />
<Employees2 href="Employees[5]/Employees2" />

which doesn't express the relationship in a meaningful way.

Web3S Redux

I've just reread Sam Ruby's and Tim Bray's June 2007 posts on Web3S. Both Sam and Tim point out that Web3S is intended to be "the central data store in Windows Live for address book information. All Hotmail contacts, Messenger buddies and Spaces' friends are recorded in Live Contacts." Astoria has no association whatsoever with Live Contacts and should not be saddled with its redefinition of the XML Infoset and lack of a schema.

Web3S is not an IETF or W3C standard and probably never will be. Astoria lends no credence to Web3S as an "industry standard." Web3S requires support for a new HTTP verb, UPDATE. The chance of UPDATE being added to the HTTP protocol and of any network admin allowing it through a corporate firewall is infinitesimal. How many business networks permit PUT and DELETE from the outside (or inside)? Not many.

Tim suggests LDAP as a more appropriate protocol. Having had some experience with LDAP in the DSMLv2 and DSfW Beta 1 era, I don't see how anyone could call LDAP RESTful.
