REST Messages And Data Transfer Objects

2014-12-08

In Patterns of Enterprise Application Architecture, Martin Fowler defines a Data Transfer Object (DTO) as

An object that carries data between processes in order to reduce the number of method calls.

Note that a Data Transfer Object is not the same as a Data Access Object (DAO), although they have some similarities. A Data Access Object is used to hide details from the underlying persistence layer.

REST Messages Are Serialized DTOs

message-transferIn a RESTful architecture, the messages sent across the wire are serializations of DTOs.

This means all the best practices around DTOs are important to follow when building RESTful systems.

For instance, Fowler writes

…encapsulate the serialization mechanism for transferring data over the wire. By encapsulating the serialization like this, the DTOs keep this logic out of the rest of the code and also provide a clear point to change serialization should you wish.

In other words, you should follow the DRY principle and have exactly one place where you convert your internal DTO to a message that is sent over the wire.

In JAX-RS, that one place should be in an entity provider. In Spring, the mechanism to use is the message converter. Note that both frameworks have support for several often-used serialization formats.

Following this advice not only makes it easier to change media types (e.g. from plain JSON or HAL to a more mature media type like Siren, Mason, or UBER). It also makes it easy to support multiple media types.

mediaThis in turn enables you to switch media types without breaking clients.

You can continue to serve old clients with the old media type, while new clients can take advantage of the new media type.

Introducing new media types is one way to evolve your REST API when you must make backwards incompatible changes.

DTOs Are Not Domain Objects

Domain objects implement the ubiquitous language used by subject matter experts and thus are discovered. DTOs, on the other hand, are designed to meet certain non-functional characteristics, like performance, and are subject to trade-offs.

This means the two have very different reasons to change and, following the Single Responsibility Principle, should be separate objects. Blindly serializing domain objects should thus be considered an anti-pattern.

That doesn’t mean you must blindly add DTOs, either. It’s perfectly fine to start with exposing domain objects, e.g. using Spring Data REST, and introducing DTOs as needed. As always, premature optimization is the root of all evil, and you should decide based on measurements.

The point is to keep the difference in mind. Don’t change your domain objects to get better performance, but rather introduce DTOs.

DTOs Have No Behavior

big-dataA DTO should not have any behavior; it’s purpose in life is to transfer data between remote systems.

This is very different from domain objects.

There are two basic approaches for dealing with the data in a DTO.

The first is to make them immutable objects, where all the input is provided in the constructor and the data can only be read.

This doesn’t work well for large objects, and doesn’t play nice with serialization frameworks.

The better approach is to make all the properties writable. Since a DTO must not have logic, this is one of the few occasions where you can safely make the fields public and omit the getters and setters.

Of course, that means some other part of the code is responsible for filling the DTO with combinations of properties that together make sense.

Conversely, you should validate DTOs that come back in from the client.


DevOps Is The New Agile

2014-12-01

In The Structure of Scientific Revolutions, Thomas Kuhn argues that science is not a steady accumulation of facts and theories, but rather an sequence of stable periods, interrupted by revolutions.

3rd-platformDuring such revolutions, the dominant paradigm breaks down under the accumulated weight of anomalies it can’t explain until a new paradigm emerges that can.

We’ve seen similar paradigm shifts in the field of information technology. For hardware, we’re now at the Third Platform.

For software, we’ve had several generations of programming languages and we’ve seen different programming paradigms, with reactive programming gaining popularity lately.

The Rise of Agile

We’ve seen a revolution in software development methodology as well, where the old Waterfall paradigm was replaced by Agile. The anomalies in this case were summarized as the software crisis, as documented by the Chaos Report.

The Agile Manifesto was clearly a revolutionary pamphlet:

We are uncovering better ways of developing software by doing it and helping others do it.

It was written in 2001 and originally signed by 17 people. It’s interesting to see what software development methods they were involved with:

  1. Adaptive Software Development: Jim Highsmith
  2. Crystal: Alistair Cockburn
  3. Dynamic Systems Development Method: Arie van Bennekum
  4. eXtreme Programming: Kent Beck, Ward Cunningham, Martin Fowler, James Grenning, Ron Jeffries, Robert Martin
  5. Feature-Driven Development: Jon Kern
  6. Object-Oriented Analysis: Stephen Mellor
  7. Scrum: Mike Beedle, Ken Schwaber, Jeff Sutherland
  8. Andrew Hunt, Brian Marick, and Dave Thomas were not associated with a specific method

Only two of the seven methods were represented by more than one person: eXtreme Programming (XP) and Scrum. Coincidentally, these are the only ones we still hear about today.

Agile Becomes Synonymous with Scrum

ScrumScrum is the clear winner in terms of market share, to the point where many people don’t know the difference between Agile and Scrum.

I think there are at least two reasons for that: naming and ease of adoption.

Decision makers in environments where nobody ever gets fired for buying IBM are usually not looking for something that is “extreme”. And “programming” is for, well, other people. On the other hand, Scrum is a term borrowed from sports, and we all know how executives love using sport metaphors.

[BTW, the term “extreme” in XP comes from the idea of turning the dials of some useful practices up to 10. For example, if code reviews are good, let’s do it all the time (pair programming). But Continuous Integration is not nearly as extreme as Continuous Delivery and iterations (time-boxed pushes) are mild compared to pull systems like Kanban. XP isn’t so extreme after all.]

Scrum is easy to get started with: you can certifiably master it in two days. Part of this is that Scrum has fewer mandated practices than XP.

That’s also a danger: Scrum doesn’t prescribe any technical practices, even though technical practices are important. The technical practices support the management practices and are the foundation for a culture of technical excellence.

The software craftsmanship movement can be seen as a reaction to the lack of attention for the technical side. For me, paying attention to obviously important technical practices is simply being a good software professional.

The (Water)Fall Of Scrum

The jury is still out on whether management-only Scrum is going to win completely, or whether the software craftsmanship movement can bring technical excellence back into the picture. This may be more important than it seems at first.

ScrumFallSince Scrum focuses only on management issues, developers may largely keep doing what they were doing in their Waterfall days. This ScrumFall seems to have become the norm in enterprises.

No wonder that many Scrum projects don’t produce the expected benefits. The late majority and laggards may take that opportunity to completely revert back to the old ways and the Agile Revolution may fail.

In fact, several people have already proclaimed that Agile is dead and are talking about a post-Agile world.

Some think that software craftsmanship should be the new paradigm, but I’m not buying that.

Software craftsmanship is all about the code and too many people simply don’t care enough about code. Beautiful code that never gets deployed, for example, is worthless.

Beyond Agile with DevOps

Speaking of deploying software, the DevOps movement may be a more likely candidate to take over the baton from Agile. It’s based on Lean principles, just like Agile was. Actually, DevOps is a continuation of Agile outside the development team. I’ve even seen the term agile DevOps.

So what makes me think DevOps won’t share the same fate as Agile?

First, DevOps looks at the whole software delivery value stream, whereas Agile confined itself to software development. This means DevOps can’t remain in the developer’s corner; for DevOps to work, it has to have support from way higher up the corporate food chain. And executive support is a prerequisite for real, lasting change.

Second, the DevOps movement from the beginning has placed a great deal of emphasis on culture, which is where I think Agile failed most. We’ll have to see whether DevOps can really do better, but at least the topic is on the agenda.

metricsThird, DevOps puts a lot of emphasis on metrics, which makes it easier to track its success and helps to break down silos.

Fourth, the Third Platform virtually requires DevOps, because Systems of Engagement call for much more rapid software delivery than Systems of Record.

Fifth, with the number of security breaches spiraling out of control, the ability to quickly deploy fixes becomes the number one security measure. The combination of DevOps and Security is referred to as Rugged DevOps or DevOpsSec.

What do you think? Will DevOps succeed where Agile failed? Please leave a comment.


Three Ways To Become a Better Software Professional

2014-11-26

The other day InfoQ posted an article on software craftsmanship.

In my view, software craftsmanship is no more or less than being a good professional. Here are three main ways to become one.

1. See the Big Picture

Let’s start with why. Software rules the world and thus we rule the world. And we all know that with great power comes great responsibility.

Now, what is responsible behavior in this context?

It’s many things. It’s delivering software that solves real needs, that works reliably, is secure, is a pleasure to use, etc. etc.

There is one constant in all these aspects: they change. Business needs evolve. New security threats emerge. New usability patterns come into fashion. New technology is introduced at breakneck speed.

The number one thing a software professional must do is to form an attitude of embracing change. We cope with change by writing programs that are easy to change.

Adaptability is not something we’ll explicitly see in the requirements; it’s silently assumed. We must nevertheless take our responsibility to bake it in.

Unfortunately, adaptability doesn’t find its way into our programs by accident. Writing programs that are easy to change is not easy but requires a considerable amount of effort and skill. The skill of a craftsman.

2. Hone Your Skills

How do we acquire the required skills to keep our programs adaptable?

We need to learn. And the more we learn, the more we’ll find that there’s always more to learn. That should make us humble.

How do we learn?

By reading/watching/listening, by practicing, and by doing. We need to read a lot and go to conferences to infuse our minds with fresh ideas. We need to practice to put such new ideas to the test in a safe environment. Finally, we need to incorporate those ideas into our daily practices to actually profit from them.

BTW, I don’t agree with the statement in the article that

Programmers cannot improve their skills by doing the same exercise repeatedly.

One part of mastering a skill is building muscle memory, and that’s what katas like Roman Numerals are for. Athletes and musicians understand that all too well.

But we must go even further. There is so much to learn that we’ll have to continuously improve our ability to do so to keep up. Learning to learn is a big part of software craftsmanship.

3. Work Well With Others

Nowadays software development is mostly a team sport, because we’ve pushed our programs to the point where they’re too big to fail build alone. We are part of a larger community and the craftsmanship model emphasizes that.

There are both pros and cons to being part of a community. On the bright side, there are many people around us who share our interests and are willing to help us out, for instance in code retreats. The flip side is that we need to learn soft skills, like how to influence others or how to work in a team.

Being effective in a community also means our individually honed skills must work well with those of others. Test-Driven Development (TDD), for example, can’t successfully be practiced in isolation. An important aspect of a community is its culture, as the DevOps movement clearly shows.

To make matters even more interesting, we’re actually simultaneously part of multiple communities: our immediate team, our industry (e.g. healthcare), and our community of interest (e.g. software security or REST), to name a few. We should participate in each, understanding that each of those communities will have their own culture.

It’s All About the Journey

Software craftsmanship is not about becoming a master and then resting on your laurels.

While we should aspire to master all aspects of software development, we can’t hope to actually achieve it. It’s more about the journey than the destination. And about the fun we can have along the way.


The State of REST

2014-11-10

rest-easyThe S in REST stands for State. Unfortunately, state is an overloaded word.

In this post I’ll discuss the two different kinds of state that apply to REST APIs.

Applications

The first type of state is application state, as in Hypermedia As The Engine Of Application State (HATEOAS), the distinguishing feature of the REST architectural style.

We must first understand what exactly an application in a RESTful architecture is and isn’t. A REST API is not an interface to an application, but an interface for an application.

A RESTful service is just a bunch of interrelated and interconnected resources. In and of themselves they don’t make up an application. It’s a particular usage pattern of those resources that turn them into an application.

The term application pertains more to the client than to the server. It’s possible to build different applications on top of the same resources. It’s also possible to build an application out of resources hosted on different servers or even by different organizations (mashups).

Application State vs. Resource State

The stateless constraint of REST doesn’t say that a server can’t maintain any state, it only says that a server shouldn’t maintain application (or session) state.

A server is allowed to maintain other state. In fact, most servers would be completely useless if they didn’t. Think of the Amazon web store without any books! We call this resource state to distinguish it from application state.

So where do we draw the line between resource state and application state?

Resource state is information we want to be available between multiple sessions of the same user, and between sessions of different users. Resource state can initially be supplied by either servers (e.g. books) or clients (e.g. book reviews).

Application state is the information that pertains to one particular session of the application. The contents of my shopping cart could be application state, for instance.

Note that this is not how Amazon implemented it; they keep this state on the server. That doesn’t mean that the people at Amazon don’t understand REST. The web browser that I use to shop isn’t sophisticated enough to maintain the application state. Also, they want me to be able to close my browser and return to my shopping cart tomorrow.

This example shows that what is application and what resource state is a design decision.

Application state pertains to the goal the user is trying to achieve while driving the client. It is this state that we’re referring to when we talk about state diagrams for REST APIs, not the resource state.

State Transfer

Application state is all the information maintained on the client side while the user is trying to accomplish a goal. This information is built up piece by piece based on the resource state that is transferred between client and server.

The resource state is transferred as a representation, a particular serialization of the resource state suitable for inclusion in an HTTP message body.

Serialization is governed by the rules laid out in a media type. There are many different media types, some more mature than others.

Since clients and servers transfer representations of resource state, we speak of Representational State Transfer (REST).


How To Return Error Details From REST APIs

2014-11-03

errorThe HTTP protocol uses status codes to return error information. This facility, while extremely useful, is too limited for many use cases. So how do we return more detailed information?

There are basically two approaches we can take:

  1. Use a dedicated media type that contains the error details
  2. Include the error details in the used media type

Dedicated Error Media Types

There are at least two candidates in this space:

  1. Problem Details for HTTP APIs is an IETF draft that we’ve used in some of our APIs to good effect. It treats both problem types and instances as URIs and is extensible. This media type is available in both JSON and XML flavors.
  2. application/vnd.error+json is for JSON media types only. It also feels less complete and mature to me.

A media type dedicated to error reporting has the advantage of being reusable in many REST APIs. Any existing libraries for handling them could save us some effort. We also don’t have to think about how to structure the error details, as someone else has done all that hard work for us.

However, putting error details in a dedicated media type also increases the burden on clients, since they now have to handle an additional media type.

Another disadvantage has to do with the Accept header. It’s highly unlikely that clients will specify the error media type in Accept. In general, we should either return 406 or ignore the Accept header when we can’t honor it. The first option is not acceptable (pun intended), the second is not very elegant.

Including Error Details In Regular Media Type

We could also design our media types such that they allow specifying error details. In this post, I’m going to stick with three mature media types: Mason, Siren, and UBER.

Mason uses the @error property:

{
  "@error": {
    "@id": "b2613385-a3b2-47b7-b336-a85ac405bc66",
    "@message": "There was a problem with one or more input values.",
    "@code": "INVALIDINPUT"
  }
}

The existing error properties are compatible with Problem Details for HTTP APIs, and they can be extended.

Siren doesn’t have a specific structure for errors, but we can easily model errors with the existing structures:

{
  "class": [
    "error"
  ],
  "properties": {
    "id": "b2613385-a3b2-47b7-b336-a85ac405bc66",
    "message": "There was a problem with one or more input values.",
    "code": "INVALIDINPUT"
  }
}

We can even go a step further and use entities to add detailed sub-errors. This would be very useful for validation errors, for instance, where you can report all errors to the client at once. We could also use the existing actions property to include a retry action. And we could use the error properties from Problem Details for HTTP APIs.

UBER has an explicit error structure:

{
  "uber": {
    "version": "1.0",
    "error": {
      "data": [
        { "name": "id", "value": "b2613385-a3b2-47b7-b336-a85ac405bc66" },
        { "name": "message", "value": "There was a problem with one or more input values." },
        { "name": "code", "value": "INVALIDINPUT" }
      ]
    }
  }
}

Again, we could reuse the error properties from Problem Details for HTTP APIs.

Conclusion

My advice would be to use the error structure of your existing media type and use the extensibility features to steal all the good ideas from Problem Details for HTTP APIs.


How To Design a REST API

2014-10-27

rest-easyThere is a lot of interest in REST APIs these days. Unfortunately, most APIs I see are not very mature.

In this post I’d like to share my approach to designing REST APIs:

  1. Understand the problem domain and application requirements and document them as a state diagram
  2. Discover the resources from the transitions
  3. Name the resources with URIs
  4. Select one or more media types to serialize the various representations identified in the resource model
  5. Assign link relations to each of the transitions
  6. Add documentation as required

Note that this is a variation of the design process described in RESTful Web Services. Now, let’s look at each of the steps in detail.

Step 1: Understand the problem domain and application requirements and document them as a state diagram

Without understanding the domain, it’s impossible to come up with a good design for anything.

We want to capture the domain in a way that makes sense for REST APIs. Since the purpose in life for a REST API is to be consumed by REST clients, it makes sense to document the application domain from the point of view of a REST client.

A REST client starts at some well-known URI (the billboard URI, or the URI of the home resource) and then follows links until its goal is met:
restClientFlow
In other words, a REST client starts at some initial state, and then transitions to other states by following links (i.e. executing HTTP methods against URIs) that are discovered from the previously returned representation.

A natural way to capture this information is in the form of a state diagram. For bonus points, we can turn the requirements into executable specifications using BDD techniques and derive the state diagram from the BDD scenarios.

For each scenario, we should specify the happy path, any applicable alternative paths and also the sad paths (edge, error, and abuse cases). We can do this iteratively, starting with only the happy path and then adding progressively more detail based on alternative and sad paths.

We can first collect all scenarios and build the entire state diagram from there. Alternatively, we can start with a select few scenarios, work through the design process, and then repeat everything with new scenarios.

In other words, we can do waterfall-like Big Analysis/Design Up Front, or work through feature by feature in a more Agile manner.

Either way, we document the requirements using a state diagram and work from there.

Step 2: Discover the resources from the transitions

You can build up the resource model piece-by-piece:
transitions

  1. Start with the initial state
  2. Create (or re-use) a resource with a representation that corresponds to this state
  3. For each transition starting from the current state, make sure there is a corresponding method in some resource that implements the transition
  4. Repeat for all transitions in each of the remaining states

Step 3: Name the resources with URIs

Every resource should be identified by a URI. From the client’s perspective, this is an implementation detail, but we still need to do this before we can implement the server.

We should follow best practices for URIs, like keeping them cool.

Step 4: Select one or more media types to serialize the various representations identified in the resource model

mediaWhen extending an existing design, you should stick with the already selected media types.

For new APIs, we should choose a mature format, like Siren or Mason.

There could be specific circumstances where these are not good choices. In that case, carefully select an alternative.

Step 5: Assign link relations to each of the transitions

A REST client follows transitions in the state diagram by discovering links in representations. This discovery process is made possible by link relations.

Link relations decouple the client from the URIs that the server uses, giving the server the freedom to change its URI structure at will without breaking any clients. Link relations are therefore an important part of any REST API.

We should try to use existing link relations as much as possible. They don’t cover every case, however, so sometimes you need to invent your own.

Step 6: Add documentation as required

learn moreIn order to help developers build clients that work against your API, you will most likely want to add some documentation that explains certain more subtle points.

Examples are very helpful to illustrate those points.

You may also add instructions for server developers that will implement the API, like what caching to use.

Conclusion

I’ve successfully used this approach on a number of APIs. Next time, I’ll show you with an example how the above process is actually very easy with the right support.

In the meantime, I’d love to hear how you approach REST API design. Please leave a comment below.


How To Control Access To REST APIs

2014-10-20

hackerExposing your data or application through a REST API is a wonderful way to reach a wide audience.

The downside of a wide audience, however, is that it’s not just the good guys who come looking.

Securing REST APIs

Security consists of three factors:

  1. Confidentiality
  2. Integrity
  3. Availability

In terms of Microsoft’s STRIDE approach, the security compromises we want to avoid with each of these are Information Disclosure, Tampering, and Denial of Service. The remainder of this post will only focus on Confidentiality and Integrity.

In the context of an HTTP-based API, Information Disclosure is applicable for GET methods and any other methods that return information. Tampering is applicable for PUT, POST, and DELETE.

Threat Modeling REST APIs

A good way to think about security is by looking at all the data flows. That’s why threat modeling usually starts with a Data Flow Diagram (DFD). In the context of a REST API, a close approximation to the DFD is the state diagram. For proper access control, we need to secure all the transitions.

The traditional way to do that, is to specify restrictions at the level of URI and HTTP method. For instance, this is the approach that Spring Security takes. The problem with this approach, however, is that both the method and the URI are implementation choices.

link-relationURIs shouldn’t be known to anybody but the API designer/developer; the client will discover them through link relations.

Even the HTTP methods can be hidden until runtime with mature media types like Mason or Siren. This is great for decoupling the client and server, but now we have to specify our security constraints in terms of implementation details! This means only the developers can specify the access control policy.

That, of course, flies in the face of best security practices, where the access control policy is externalized from the code (so it can be reused across applications) and specified by a security officer rather than a developer. So how do we satisfy both requirements?

Authorizing REST APIs

I think the answer lies in the state diagram underlying the REST API. Remember, we want to authorize all transitions. Yes, a transition in an HTTP-based API is implemented using an HTTP method on a URI. But in REST, we shield the URI using a link relation. The link relation is very closely related to the type of action you want to perform.

The same link relation can be used from different states, so the link relation can’t be the whole answer. We also need the state, which is based on the representation returned by the REST server. This representation usually contains a set of properties and a set of links. We’ve got the links covered with the link relations, but we also need the properties.

PolicyIn XACML terms, the link relation indicates the action to be performed, while the properties correspond to resource attributes.

Add to that the subject attributes obtained through the authentication process, and you have all the ingredients for making an XACML request!

There are two places where such access control checks comes into play. The first is obviously when receiving a request.

You should also check permissions on any links you want to put in the response. The links that the requester is not allowed to follow, should be omitted from the response, so that the client can faithfully present the next choices to the user.

Using XACML For Authorizing REST APIs

I think the above shows that REST and XACML are a natural fit.

All the more reason to check out XACML if you haven’t already, especially XACML’s REST Profile and the forthcoming JSON Profile.


Follow

Get every new post delivered to your Inbox.

Join 311 other followers