Programmers Stack Exchange is a question and answer site for professional programmers interested in conceptual questions about software development. It's 100% free.

Let's say I have three resources that are related like so:

Grandparent (collection) -> Parent (collection) -> Child (collection)

The above depicts the relationship among these resources like so: Each grandparent can map to one or several parents. Each parent can map to one or several children. I want the ability to support searching against the child resource but with the filter criteria:

If my clients pass me an id reference to a grandparent, I want to only search against children who are direct descendants of that grandparent.

If my clients pass me an id reference to a parent, I want to only search against children who are direct descendants of my parent.

I have thought of something like so:

GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

and

GET /myservice/api/v1/parents/{parentID}/children?search={text}

for the above requirements, respectively.

But I could also do something like this:

GET /myservice/api/v1/children?search={text}&grandparentID={id}&parentID={id}

In this design, I could allow my client to pass me one or the other in the query string: either grandparentID or parentID, but not both.
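
To make the "one or the other, but not both" rule concrete, here is a minimal Python sketch of how the server side might validate that query string. The function name parse_child_search and the shape of the returned dictionary are made up for illustration; only the parameter names search, grandparentID and parentID come from the URI above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_child_search(url: str) -> dict:
    """Extract the search text and exactly one ancestor filter from a URL."""
    query = parse_qs(urlsplit(url).query)
    grandparent_id = query.get("grandparentID", [None])[0]
    parent_id = query.get("parentID", [None])[0]

    # The design allows one ancestor filter or the other, never both or neither.
    if (grandparent_id is None) == (parent_id is None):
        raise ValueError("pass exactly one of grandparentID or parentID")

    if grandparent_id is not None:
        scope, ancestor_id = "grandparent", grandparent_id
    else:
        scope, ancestor_id = "parent", parent_id

    return {
        "search": query.get("search", [""])[0],
        "scope": scope,
        "ancestor_id": ancestor_id,
    }
```

Note that this rule lives entirely in server code: nothing in the URI itself tells a client that the two parameters are mutually exclusive.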

My questions are:

1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.

2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.

3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.

Thanks!

The title of your question should be extremely confusing to anyone viewing this. The valid segments of a URI are defined as <scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<fragment> (although <password> is deprecated). A "query string" is a valid component of a URI, so your "vs" in the title is crazy talk. – K. Alan Bates Jun 3 '15 at 14:26
    
Do you mean "I want to only search against children who are INdirect descendants of that grandparent"? According to your structure, Grandparent has no direct children. – null Jun 3 '15 at 15:25
    
What is the difference between a child and a parent? Is a parent a parent if he doesn't have children? Smells of a design fault. – Pinoniq Jun 4 '15 at 11:10

First

As per RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax), §3 Syntax Components, §3.4 Query:

3.4 Query

The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource within the scope of the URI's scheme and naming authority (if any).

Query components are for retrieval of non-hierarchical data; there are few things more hierarchical in nature than a family tree! Ergo, regardless of whether you think it is "REST-y" or not, to conform to the formats, protocols, and frameworks of developing on the internet, you must not use the query string to identify this information.

REST has nothing to do with this.

Before addressing your questions, your query parameter of "search" is very poorly named. Treat your query segment as a dictionary of key-value pairs. Your query string could be more appropriately defined as

?first_name={firstName}&last_name={lastName}&birth_date={birthDate} etc.
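
As a small illustration of the "dictionary of key-value pairs" view, here is a Python sketch that turns such a query string into a dictionary. The helper name query_to_criteria and the sample values are my own; parse_qsl is the standard library call.

```python
from urllib.parse import parse_qsl


def query_to_criteria(query: str) -> dict:
    """Read a query string as a flat dictionary of named search criteria."""
    # parse_qsl percent-decodes values and returns (key, value) pairs in order.
    return dict(parse_qsl(query.lstrip("?")))


criteria = query_to_criteria("?first_name=Joe&last_name=Smith&birth_date=1980-01-31")
```

Each key names the attribute it constrains, which a single catch-all "search" key does not.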

To answer your specific questions

1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.

None of these resource interfaces are RESTful. The major precondition for the RESTful architectural style is that Application State transitions must be communicated from the server as hypertext. People have labored over the structure of URIs to make them somehow "RESTful URIs" but REST doesn't have anything at all to say about how you should structure them except insofar as REST says that if you are to use HTTP for distributing messages between the tiers of your system, then you should conform to HTTP's requirements and the additional requirements of the standards that HTTP subsumes. Your question is addressed directly by RFC3986 §3.4, which I have linked above. The bottom line is that even though a conforming URI is insufficient to consider an API "RESTful", if you want your system to actually be "RESTful" and you are using HTTP and URIs, then you cannot identify hierarchical data through the query string.

3.4 Query

The query component contains non-hierarchical data

...it's as simple as that.

2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.

The "pros" of the first two are that they are on the right path. The "con" of the third one is that it is flat out wrong.

As far as your understandability and maintainability concerns, those definitely depend on the comprehension level of the client developer and the design chops of the designer. The URI specification is the definitive answer as to how URIs are supposed to be formatted. Hierarchical data is supposed to be represented on the path and with path parameters. Non-hierarchical data is supposed to be represented in the query. The fragment is more complicated, because its semantics depend specifically upon the media type of the representation being requested. So to address the "understandability" component of your question, I will attempt to translate exactly what your first two URIs are actually saying. Then, I will attempt to represent what you say you are trying to accomplish with valid URIs.

Translation of your URIs to their semantic meaning

/myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

This says: for the parents of grandparents, find their child having search={text}. What you said with your URI is only coherent if searching for a grandparent's siblings. With your "grandparents, parents, children" you found a "grandparent", went up a generation to their parents, and then came back down to the "grandparent" generation by looking at the parents' children.

/myservice/api/v1/parents/{parentID}/children?search={text}

This says: for the parent identified by {parentID}, find their child having ?search={text}. This is closer to what you are wanting, and represents a parent->child relationship that can likely be used to model your entire API. If the client has a "grandparentId" then they should realize there is a layer of indirection between the ID they have and the portion of the family graph they wish to see. To find a "child" by "grandparentId", you can call your /parents/{parentID}/children service and then, for each child that is returned, search their children for your person identifier.

Implementation of your requirements as URIs

If you want to model a more extensible resource identifier that can walk the tree, I can think of several ways you can accomplish that.

1) The first one, I've already alluded to. Represent the graph of "People" as a composite structure. Each person has a reference to the generation above it through its Parents path and to a generation below it through its Children path.

/Persons/Joe/Parents/Mother/Parents would be a way to grab Joe's maternal grandparents.

/Persons/Joe/Parents/Parents would be a way to grab all of Joe's grandparents.

/Persons/Joe/Parents/Parents?id={Joe.GrandparentID} would grab Joe's grandparent having the identifier you have in hand.

and these would all make sense. You also benefit from having the ability to support any arbitrary number of generations. If, for some reason, you desire to look up 8 generations, you could represent this as

/Persons/Joe/Parents/Parents/Parents/Parents/Parents/Parents/Parents/Parents?id={Joe.NotableAncestor}

but this leads into the second dominant option for representing this data: through a path parameter.
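
For what it's worth, here is a rough Python sketch of how a server might resolve the composite paths from option 1. The PARENTS table, its names, and the resolve function are invented for illustration; a real service would sit on top of an actual data store.

```python
# Hypothetical in-memory data: person -> {role: parent name}.
PARENTS = {
    "Joe": {"Mother": "Ann", "Father": "Bob"},
    "Ann": {"Mother": "Cat", "Father": "Dan"},
    "Bob": {"Mother": "Eve", "Father": "Fred"},
}


def resolve(path: str) -> list[str]:
    """Resolve a path such as /Persons/Joe/Parents/Mother/Parents to names."""
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] != "Persons":
        raise ValueError("path must start with /Persons/{name}")

    # Track (role, name) pairs so a later Mother/Father segment can narrow them.
    current = [(None, segments[1])]
    for segment in segments[2:]:
        if segment == "Parents":
            # Step up one generation from every person currently selected.
            current = [
                (role, name)
                for _, person in current
                for role, name in PARENTS.get(person, {}).items()
            ]
        elif segment in ("Mother", "Father"):
            # Keep only the people reached through that role.
            current = [(role, name) for role, name in current if role == segment]
        else:
            raise ValueError(f"unknown path segment: {segment}")
    return [name for _, name in current]
```

With this sample data, /Persons/Joe/Parents/Mother/Parents resolves to Joe's maternal grandparents, and /Persons/Joe/Parents/Parents to all four grandparents.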


2) Use path parameters to "query the hierarchy"

You could develop the following structure to help ease the burden on consumers and still have an API that makes sense.

To look back 147 generations, representing this resource identifier with path parameters allows you to do

/Persons/Joe/Parents;generations=147?id={Joe.NotableAncestor}

To locate Joe from his Great Grandparent, you could look down the graph a known number of generations for Joe's Id.

/Persons/JoesGreatGrandparent/Children;generations=3?id={Joe.Id}
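
Here is a minimal Python sketch of how a server could pick such a URI apart into person, relation, path parameter, and query. The function name parse_generation_uri and the default of one generation are my assumptions; urlsplit leaves the ";generations=..." part attached to the path, so it is split off by hand.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_generation_uri(uri: str) -> dict:
    """Split /Persons/{name}/{Relation};generations=N?id=... into its parts."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]

    # Path parameters ride on the last segment after ';'.
    relation, _, raw_params = segments[-1].partition(";")
    params = dict(p.split("=", 1) for p in raw_params.split(";") if "=" in p)

    return {
        "person": segments[1],
        "relation": relation,
        # Assumption: a missing parameter means "one generation".
        "generations": int(params.get("generations", 1)),
        "query": dict(parse_qsl(parts.query)),
    }
```

The hierarchical part (who, which direction, how far) stays in the path, and the non-hierarchical identifier stays in the query.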

The major thing of note with these approaches is that without further information in the identifier and request, you should expect that the first URI is retrieving a Person 147 generations up from Joe with the identifier of Joe.NotableAncestor. You should expect the second one to retrieve Joe.

Assume that what you actually want is for your calling client to be able to retrieve the entire set of nodes and their relationships between the root Person and the final context of your URI. You could do that with the same URI (with some additional decoration) and setting an Accept of text/vnd.graphviz on your request, which is the IANA registered media type for the .dot graph representation. With that, change the URI to

/Persons/Joe/Parents;generations=147?id={Joe.NotableAncestor}#directed

with an HTTP Request Header Accept: text/vnd.graphviz and you can have clients fairly clearly communicate that they want the directed graph of the generational hierarchy between Joe and 147 generations prior where that 147th ancestral generation contains a person identified as Joe's "Notable Ancestor."

I'm unsure if text/vnd.graphviz has any pre-defined semantics for its fragment; I could find none in a search for instruction. If that media type actually does have pre-defined fragment information, then its semantics should be followed to create a conforming URI. But, if those semantics are not pre-defined, the URI specification states that the semantics of the fragment identifier are unconstrained and instead defined by the server, making this usage valid.


3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.

I believe I have already thoroughly beaten this to death, but query strings are not for "filtering" resources. They are for identifying your resource from non-hierarchical data. If you have drilled down your hierarchy with your path by going /person/{id}/children/ and you are wishing to identify a specific child or a specific set of children, you would use some attribute that applies to the set you are identifying and include it inside the query.

    
The RFC is only concerned with hierarchy insofar as it defines a syntax and algorithm for resolving relative URI references. Could you elaborate or cite some sources explaining why the examples in the original post are not conforming? – user2313838 Jun 30 '15 at 13:58
    
@user2313838 (comment probably not best medium for elaboration) No. The global restriction you seem to have placed upon RFC3986 for specifying only the resolution of relative URI references is not present. The mentioning in the RFC of "relative URIs" is permissive, not exclusive of other approaches. The hierarchical nature of the path described in the general syntax imports no context of defining references relative to a separately defined context. The hierarchical nature of the path is to identify hierarchically organized data. If your data is hierarchical, thou shalt use the path. – K. Alan Bates Jun 30 '15 at 14:25
    
Isn't a family tree really a graph, not a tree, and not at all hierarchical, considering multiple parents, divorce and re-marriage etc.? – Myster Feb 18 at 1:49
    
@Myster Yes and no. A "tree" is technically known as an arborescent graph (directed and acyclic), but this is a hierarchical structure. I did make a mistake in my final point where I used the "undirected" fragment; I meant to say "directed" and I hadn't noticed that I had reversed it. I probably wouldn't have caught it if you had not made your comment. – K. Alan Bates Feb 18 at 14:31

This is where you get it wrong:

If my clients pass me an id reference

In a REST system, clients should never be bothered with IDs. The only resource identifiers that the client should know about are URIs. This is the principle of "uniform interface".

Think about how clients would interact with your system. Say the user starts browsing through a list of grandparents and picks one of them, which brings him to /grandparent/123. If the client should be able to search the children of /grandparent/123, then according to "HATEOAS", whatever is returned when you query /grandparent/123 should include a URL to the search interface. This URL should already have whatever data is needed to filter by the current grandparent embedded in it.

Whether the link looks like /grandparent/123?search={term} or /parent?grandparent=123&search={term} or /parent?grandparentTerm=someterm&someothergplocator=blah&search={term} is inconsequential according to REST. Notice how all of those URLs have the same number of parameters, namely one, {term}, even though they use different criteria. You can switch between any of those URLs, or you can mix them up depending on the specific grandparents, and the client wouldn't break, because the logical relationship between the resources is the same even though the underlying implementation might differ significantly.

If you had instead created the service such that it requires /grandparent/{grandparentID}?search={term} when you go one way but /children?parent={parentID}&search={term} when you go another way, that is too much coupling, because the client would have to know to interpolate different things on different relations that are conceptually similar.

Whether you actually go with /grandparent/123?search={term} or /parent?grandparent=123&search={term} is a matter of taste and whichever implementation is easier for you right now. The important thing is to not require the client to be modified if you change your URL strategy or if you use different strategies on different parents-children relations.
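
A rough Python sketch of that idea: the server hands out a search link with everything but {term} already filled in, and the client only ever substitutes {term}. The representation layout, the "search-children" link name, and both function names are hypothetical; they are not part of any standard.

```python
from urllib.parse import quote


def grandparent_representation(grandparent_id: str) -> dict:
    """Hypothetical server response: data plus a ready-made search link template."""
    return {
        "id": grandparent_id,
        "links": {
            # The server decides the URL shape; the client only sees {term}.
            "search-children": f"/parent?grandparent={grandparent_id}&search={{term}}",
        },
    }


def search_url(representation: dict, term: str) -> str:
    """Client side: fill in the single {term} slot and nothing else."""
    template = representation["links"]["search-children"]
    return template.replace("{term}", quote(term, safe=""))
```

If the server later switches the template to /grandparent/123?search={term}, search_url keeps working unchanged, which is the whole point.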


I'm not sure why people think putting the ID values in the URL means it's somehow a REST API. REST is about handling verbs and passing resources.

So if you want to put a new user into the system, you have to send a fair chunk of data, and an HTTP POST request is ideal: although you might send the key (e.g. user ID), you'll send the user data (e.g. name, address) as POST data.

Now it is a common idiom to put the resource identifier in the URI, but this is more convention than any form of canonical "it's not REST if it's not in the URI". Remember that the original thesis of REST doesn't really mention HTTP at all; it's an idiom for handling data in a client-server system, not something that is an extension to HTTP (though, obviously, HTTP is our primary form of implementing REST).

For example, Fielding uses the request of an academic paper as an example. You want to retrieve the resource "Dr John's Paper on the drinking of beers", but you might also want the initial version, or the latest version, so the resource identifier might not be something that is easily referenced as a single ID that can be placed in the URI. REST allows for this and a stated intention behind it is:

REST relies instead on the author choosing a resource identifier that best fits the nature of the concept being identified

So there's nothing stopping you from using a static URI to retrieve your parents, passing in a search term in the query string to identify the exact user you're after. In this case, the 'resource' you're identifying is the set of grandparents (and so the URI contains 'grandparents' as part of the URI). REST includes the concept of 'control data' that is designed for determining which representation of your resource is to be retrieved - the example given in the thesis is cache control, but also version control - so my request for Dr John's excellent paper can be refined by passing the version as control data, not part of the URI.

I think an example of REST interface that is not usually mentioned is SMTP. When constructing a mail message, you send verbs (FROM, TO etc) with the resource data for each part of the mail message. This is RESTful even though it doesn't use the http verbs, it uses its own set.

So... whilst you do need to have some resource identification in your URI, it doesn't have to be your id reference. This can happily be sent as control data, in the query string or even in POST data. What you're really identifying in your REST API is that you're after a child, which you already have in your URI.

So to my mind, reading the definition of REST, you're requesting a child resource, passing in control data (in the form of querystring) to determine which one you want returned. As a result, you cannot make a URI request for a grandparent or parent. You want children returned so the term parent/grandparent or id should definitely not be in such a URI. Children should be.

    
@K.AlanBates I'll delete this too shortly, and even forgive you if you give me a point!!!! :-) – gbjbaanb Jun 3 '15 at 15:34

A lot of people have already talked about what REST means, etc. But none seem to address the real issue: your design.

Why is a grandparent different from a father? They both have children that can possibly have children that can ...

Eventually, they are all 'human'. You probably have that in your code as well. So use that:

GET /<api-endpoint>/humans/<ID>

will return some useful info about the human (name and so on).

GET /<api-endpoint>/humans/<ID>/children

will obviously return an array of children. If no children exist, an empty array is probably the way to go.

To make it easy, you could for instance add a flag such as 'hasChildren' or 'hasGrandChildren'.
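
A minimal Python sketch of what those two endpoints could return, using made-up in-memory data. The CHILDREN and NAMES tables and the function names are placeholders; only the 'hasChildren' and 'hasGrandChildren' flags come from the suggestion above.

```python
# Hypothetical data: human ID -> IDs of that human's children.
CHILDREN = {"1": ["2", "3"], "2": ["4"], "3": [], "4": []}
NAMES = {"1": "Alice", "2": "Bob", "3": "Carol", "4": "Dave"}


def get_human(human_id: str) -> dict:
    """Body for GET /<api-endpoint>/humans/<ID>."""
    children = CHILDREN.get(human_id, [])
    return {
        "id": human_id,
        "name": NAMES[human_id],
        "hasChildren": bool(children),
        # True if at least one child has children of their own.
        "hasGrandChildren": any(CHILDREN.get(child) for child in children),
    }


def get_children(human_id: str) -> list[dict]:
    """Body for GET /<api-endpoint>/humans/<ID>/children; empty list if none."""
    return [get_human(child) for child in CHILDREN.get(human_id, [])]
```

The same two calls work for any generation, so no separate grandparent or parent resource is needed.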

Think smarter not harder


The following is more RESTful because every grandparentID gets its own URL. This way the resource is identified uniquely.

GET /myservice/api/v1/grandparents/{grandparentID}
GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

The query parameter search is a good way to execute a search from the context of that resource.

When a family is getting very large you can use start/limit as query options for example:

GET /myservice/api/v1/grandparents/{grandparentID}/children?start=1&limit=50
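
For illustration, a small Python sketch of how start/limit could be applied to a result list. I am assuming start is 1-based (as start=1 in the example suggests) and that out-of-range values are rejected; the paginate helper is hypothetical.

```python
def paginate(items: list, start: int = 1, limit: int = 50) -> list:
    """Return up to `limit` items beginning at the 1-based position `start`."""
    if start < 1 or limit < 1:
        raise ValueError("start and limit must be positive")
    first = start - 1  # convert the 1-based start to a 0-based index
    return items[first:first + limit]


children = [f"child-{n}" for n in range(1, 121)]
second_page = paginate(children, start=51, limit=50)  # child-51 .. child-100
```

Because both parameters have sensible defaults, they can be left out of the URL entirely.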

As a developer it is good to have different resources with a unique URL/URI. I think you should use query parameters only when they could also be left out.

Maybe this is a good read: http://www.thoughtworks.com/insights/blog/rest-api-design-resource-modeling. Otherwise there is the original PhD thesis of Roy T Fielding, https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf, which explains the concepts well and completely.

