How To Develop Software Using Only SaaS

The world is fast moving to Software-as-a-Service (SaaS) and we developers are busy learning how to build SaaS applications.

We can now finally do that using nothing but SaaS applications ourselves.

The Developer’s Toolbox

As developers, we don’t ask for much.

An Integrated Development Environment (IDE) lets us do our main task: writing code. A Source Code Management (SCM) system stores our Heartbreaking Work of Staggering Genius. A Continuous Integration (CI) server pulls our code through hoops that prove it is ready for use. And finally a Platform-as-a-Service (PaaS) or other deployment environment runs our applications.

We are used to running all of these on premises. IDEs like Eclipse or IntelliJ run on our local machines. SCMs like Git or Subversion run on some company server, as does our Jenkins/Hudson or TeamCity CI server. Finally, we deploy to a PaaS like CloudFoundry, or to a custom server.

Most of those tools already run in the cloud. For those that don’t, we can easily find good alternatives. Let’s take a look at some of the candidates.

Integrated Development Environments

I’ve written about Cloud9 before. It’s mainly focused on web languages like JavaScript. For Java, Codenvy seems a better choice. For both, you can run the hosted offering, or deploy it in your own data center.

Neither can match a local IDE experience yet, but the gap is closing. On the other hand, they offer some functionality you won’t easily find in locally installed IDEs, like remote pair programming.

Source Code Management

Git has taken over the world, and the SaaS version of it, GitHub, is following suit.

Some people even think that your GitHub profile is your resume.

Again, you can use the hosted version (with public or private repositories), or install GitHub in your data center.

Both Cloud9 and Codenvy work seamlessly with GitHub repositories.

Continuous Integration

Jenkins/Hudson is the leader in this space, and CloudBees offers a SaaS version. Other products include Bamboo, Travis CI and CodeShip. Some of these are free for open source projects. Again, there are hosted and on-premises versions.

The CI tools support GitHub through public SSH keys for access and commit hooks for starting jobs.

Platform-as-a-Service

After GitHub, these are probably the most familiar to you: Pivotal CloudFoundry, Heroku, Google App Engine, and Azure. CloudFoundry is backed by many big organizations (including the company I work for, EMC) and seems to be emerging as the leader.

Some cloud IDEs let you push to a PaaS directly, but I don’t think that’s the right way to do it.

You should commit to your SCM and let CI pick up your changes.

Your CI jobs should be responsible for pushing to the PaaS. Your CI may have a custom integration to your PaaS, or you may have to use something like the CloudFoundry command-line interface to push your changes.

Conclusion

It seems that our entire tool chain is now available as a service, although the IDEs still leave us wanting a bit. Most of these tools are available as open source and can be deployed in your own data center.

Looks like we’re making some progress towards a Frictionless Development Environment!

What SaaS applications are you using for software development? Please leave a comment below.

How To Process Java Annotations

One of the cool new features of Java 8 is the support for lambda expressions. Lambda expressions lean heavily on the @FunctionalInterface annotation.

In this post, we’ll look at annotations and how to process them so you can implement your own cool features.

Annotations

Annotations were added in Java 5. The Java language comes with some predefined annotations, but you can also define custom annotations.

Many frameworks and libraries make good use of custom annotations. JAX-RS, for instance, uses them to turn POJOs into REST resources.

Annotations can be processed at compile time or at runtime (or even both).

At runtime, you can use the reflection API. Each element of the Java language that can be annotated, like a class or method, implements the AnnotatedElement interface. Note that an annotation is only available at runtime if it has the RUNTIME RetentionPolicy.
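
As a minimal sketch, here is a made-up custom annotation with RUNTIME retention and a reflective lookup through the AnnotatedElement interface (the Audited annotation and the Payments class are hypothetical names, used only for illustration):

package com.example;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical custom annotation; RUNTIME retention makes it visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Audited {
  String value() default "";
}

// An annotated class...
@Audited("payments")
class Payments {
}

// ...and a reflective lookup. Class<?> implements AnnotatedElement.
public class AuditedDemo {
  public static void main(String[] args) {
    Audited audited = Payments.class.getAnnotation(Audited.class);
    if (audited != null) {
      System.out.println("Audit category: " + audited.value());
    }
  }
}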

Compile-Time Annotation Processing

Java 5 came with the separate apt tool to process annotations; since Java 6, this functionality has been integrated into the compiler.

You can either call the compiler directly, e.g. from the command line, or indirectly, from your program.

In the former case, you specify the -processor option to javac, or you use the ServiceLoader framework by adding the file META-INF/services/javax.annotation.processing.Processor to your jar. The contents of this file should be a single line containing the fully qualified name of your processor class.
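
For example, assuming a hypothetical processor class com.example.SomeProcessor, the file META-INF/services/javax.annotation.processing.Processor would contain just:

com.example.SomeProcessor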

The ServiceLoader approach is especially convenient in an automated build, since all you have to do is put the annotation processor on the classpath during compilation, which build tools like Maven or Gradle will do for you.

Compile-Time Annotation Processing From Within Your Application

You can also use the compile-time tools to process annotations from within your running application.

Rather than calling javac directly, use the more convenient JavaCompiler interface. Either way, you’ll need to run your application with a JDK rather than just a JRE.

The JavaCompiler interface gives you programmatic access to the Java compiler. You can obtain an implementation of this interface using ToolProvider.getSystemJavaCompiler(). This method is sensitive to the JAVA_HOME environment variable.

The getTask() method of JavaCompiler allows you to add your annotation processor instances. This is the only way to control the construction of annotation processors; all other methods of invoking annotation processors require the processor to have a public no-arg constructor.
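
A minimal sketch of what that might look like, assuming a hypothetical MyProcessor class (one is sketched in the next section) and a made-up source file path:

import java.util.Arrays;
import javax.tools.JavaCompiler;
import javax.tools.JavaCompiler.CompilationTask;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

public class RunProcessor {
  public static void main(String[] args) {
    // Returns null when running on a plain JRE; a JDK is required.
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    StandardJavaFileManager fileManager = compiler.getStandardFileManager(null, null, null);

    CompilationTask task = compiler.getTask(
        null,                         // default output writer (System.err)
        fileManager,
        null,                         // default diagnostic listener
        Arrays.asList("-proc:only"),  // run annotation processing only
        null,                         // no class names for explicit processing
        fileManager.getJavaFileObjects("src/com/example/Payments.java"));

    // getTask() is the only way to pass in processor instances we constructed ourselves.
    task.setProcessors(Arrays.asList(new MyProcessor("audit")));
    System.out.println(task.call() ? "Processing succeeded" : "Processing failed");
  }
}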

Annotation Processors

A processor must implement the Processor interface. Usually you will want to extend the AbstractProcessor base class rather than implement the interface from scratch.

Each annotation processor must indicate the types of annotations it is interested in through the getSupportedAnnotationTypes() method. You may return * to process all annotations.

The other important thing is to indicate which Java language version you support. Override the getSupportedSourceVersion() method and return one of the RELEASE_x constants.

With these methods implemented, your annotation processor is ready to get to work. The meat of the processor is in the process() method.

When process() returns true, the annotations processed are claimed by this processor, and will not be offered to other processors. Normally, you should play nice with other processors and return false.
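
Putting those pieces together, a bare-bones processor might look like this (a sketch: the com.example.Audited annotation is the kind of custom annotation sketched earlier, and the constructor argument is only there to show that getTask() lets you inject dependencies):

import java.util.Collections;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

public class MyProcessor extends AbstractProcessor {

  private final String prefix;

  // No public no-arg constructor, so this processor can only be used via getTask().
  public MyProcessor(String prefix) {
    this.prefix = prefix;
  }

  @Override
  public Set<String> getSupportedAnnotationTypes() {
    // Fully qualified annotation names; returning "*" would mean "all annotations".
    return Collections.singleton("com.example.Audited");
  }

  @Override
  public SourceVersion getSupportedSourceVersion() {
    return SourceVersion.RELEASE_8;
  }

  @Override
  public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
    if (roundEnv.processingOver()) {
      // Last round: no more input, a good place to release any acquired resources.
      return false;
    }
    for (TypeElement annotation : annotations) {
      for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
        processingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE,
            "[" + prefix + "] found annotated element " + element.getSimpleName());
      }
    }
    // Play nice with other processors by not claiming the annotations.
    return false;
  }
}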

Elements and TypeMirrors

The annotations and the Java elements they are present on are provided to your process() method as Element objects. You may want to process them using the Visitor pattern.

The most interesting types of elements are TypeElement for classes and interfaces (including annotations), ExecutableElement for methods, and VariableElement for fields.

Each Element points to a TypeMirror, which represents a type in the Java programming language. You can use the TypeMirror to walk the class relationships of the annotated code you’re processing, much like you would using reflection on the code running in the JVM.
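
As an illustration, here is a fragment that could live inside a processor like the one sketched above (inspect is a made-up helper name): it branches on the element kind and uses the Types utility from the processing environment to look at supertypes.

// Inside a processor: branch on the element kind and inspect types.
private void inspect(Element element) {
  if (element.getKind() == ElementKind.METHOD) {
    ExecutableElement method = (ExecutableElement) element;
    processingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE,
        "Method " + method.getSimpleName() + " returns " + method.getReturnType());
  } else if (element.getKind().isClass() || element.getKind().isInterface()) {
    TypeElement type = (TypeElement) element;
    // The TypeMirror lets us walk class relationships of the code being compiled.
    TypeMirror mirror = type.asType();
    for (TypeMirror superType : processingEnv.getTypeUtils().directSupertypes(mirror)) {
      processingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE,
          type.getQualifiedName() + " extends or implements " + superType);
    }
  }
}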

Processing Rounds

Annotation processing happens in separate stages, called rounds. During each round, a processor gets a chance to process the annotations it is interested in.

The annotations to process and the elements they are present on are available via the RoundEnvironment parameter passed into the process() method.

If annotation processors generate new source or class files during a round, then the compiler will make those available for processing in the next round. This continues until no more new files are generated.

The last round contains no input, and is thus a good opportunity to release any resources the processor may have acquired.

Initializing and Configuring Processors

Annotation processors are initialized with a ProcessingEnvironment. This processing environment allows you to create new source or class files.

It also provides access to configuration in the form of options. Options are key-value pairs that you can supply on the command line to javac using the -A option. For this to work, you must return the options’ keys in the processor’s getSupportedOptions() method.
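
For instance, with a made-up option key verbose.output, a client would run javac with -Averbose.output=true and the processor would declare and read it roughly like this (a fragment inside a processor such as the one above):

// Declare the option key so javac accepts it as -Akey=value...
@Override
public Set<String> getSupportedOptions() {
  return Collections.singleton("verbose.output");
}

// ...and read its value, e.g. in init() or process().
boolean verbose = Boolean.parseBoolean(processingEnv.getOptions().get("verbose.output"));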

Finally, the processing environment provides some support routines (e.g. to get the JavaDoc for an element, or to get the direct super types of a type) that come in handy during processing.

Classpath Issues

To get the most accurate information during annotation processing, you must make sure that all imported classes are on the classpath, because classes that refer to types that are not available may have incomplete or altogether missing information.

When processing large numbers of annotated classes, this may lead to a problem on Windows systems where the command line becomes too large (> 8K). Even when you use the JavaCompiler interface, it still calls javac behind the scenes.

The Java compiler has a nice solution to this problem: you can use argument files that contain the arguments to javac. The name of the argument file is then supplied on the command line, preceded by @.

Unfortunately, the JavaCompiler.getTask() method doesn’t support argument files, so you’ll have to use the underlying run() method.
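
A sketch of that workaround, with made-up file paths and processor name; run() takes the same arguments javac would:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class ArgFileCompile {
  public static void main(String[] args) throws IOException {
    // Put the (potentially very long) javac argument list in a file...
    Path argFile = Files.createTempFile("javac-args", ".txt");
    Files.write(argFile, Arrays.asList(
        "-proc:only",
        "-processor", "com.example.SomeProcessor",  // must have a public no-arg constructor
        "src/com/example/First.java",
        "src/com/example/Second.java"));

    // ...and pass only the @file reference on the command line.
    // run() accepts argument files, getTask() does not.
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    int result = compiler.run(null, null, null, "@" + argFile);
    System.out.println(result == 0 ? "OK" : "Failed");
  }
}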

Remember that the getTask() approach is the only one that allows you to construct your annotation processors. If you must use argument files, then you have to use a public no-arg constructor.

If you’re in that situation, and you have multiple annotation processors that need to share a single instance of a class, you can’t pass that instance into the constructor, so you’ll be forced to use something like the Singleton pattern.

Conclusion

Annotations are an exciting technology with lots of interesting applications. For example, I used them to extract the resources from a REST API into a resource model for further processing, like generating documentation.

I’m very interested to learn what you have used them for. Please leave a comment below.

REST Messages And Data Transfer Objects

In Patterns of Enterprise Application Architecture, Martin Fowler defines a Data Transfer Object (DTO) as

An object that carries data between processes in order to reduce the number of method calls.

Note that a Data Transfer Object is not the same as a Data Access Object (DAO), although they have some similarities. A Data Access Object is used to hide the details of the underlying persistence layer.

REST Messages Are Serialized DTOs

In a RESTful architecture, the messages sent across the wire are serializations of DTOs.

This means all the best practices around DTOs are important to follow when building RESTful systems.

For instance, Fowler writes

…encapsulate the serialization mechanism for transferring data over the wire. By encapsulating the serialization like this, the DTOs keep this logic out of the rest of the code and also provide a clear point to change serialization should you wish.

In other words, you should follow the DRY principle and have exactly one place where you convert your internal DTO to a message that is sent over the wire.

In JAX-RS, that one place should be in an entity provider. In Spring, the mechanism to use is the message converter. Note that both frameworks have support for several often-used serialization formats.
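
In JAX-RS, for example, that single place could be a MessageBodyWriter. The following is only a sketch: OrderDto is a hypothetical DTO with public id and total fields (one is sketched later in this post), the vendor media type is made up, and the hand-rolled serialization stands in for whatever library you would really use.

import java.io.IOException;
import java.io.OutputStream;
import java.lang.annotation.Annotation;
import java.lang.reflect.Type;
import java.nio.charset.StandardCharsets;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.MultivaluedMap;
import javax.ws.rs.ext.MessageBodyWriter;
import javax.ws.rs.ext.Provider;

// The one place where an OrderDto is turned into bytes on the wire.
@Provider
@Produces("application/vnd.example.order+json")
public class OrderDtoWriter implements MessageBodyWriter<OrderDto> {

  @Override
  public boolean isWriteable(Class<?> type, Type genericType, Annotation[] annotations,
      MediaType mediaType) {
    return OrderDto.class.isAssignableFrom(type);
  }

  @Override
  public long getSize(OrderDto dto, Class<?> type, Type genericType, Annotation[] annotations,
      MediaType mediaType) {
    return -1; // Unknown; the JAX-RS runtime will buffer or chunk as needed.
  }

  @Override
  public void writeTo(OrderDto dto, Class<?> type, Type genericType, Annotation[] annotations,
      MediaType mediaType, MultivaluedMap<String, Object> httpHeaders, OutputStream entityStream)
      throws IOException {
    // Serialization lives here and nowhere else (DRY).
    String json = "{\"id\":\"" + dto.id + "\",\"total\":" + dto.total + "}";
    entityStream.write(json.getBytes(StandardCharsets.UTF_8));
  }
}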

Following this advice not only makes it easier to change media types (e.g. from plain JSON or HAL to a more mature media type like Siren, Mason, or UBER). It also makes it easy to support multiple media types.

This in turn enables you to switch media types without breaking clients.

You can continue to serve old clients with the old media type, while new clients can take advantage of the new media type.

Introducing new media types is one way to evolve your REST API when you must make backwards incompatible changes.

DTOs Are Not Domain Objects

Domain objects implement the ubiquitous language used by subject matter experts and thus are discovered. DTOs, on the other hand, are designed to meet certain non-functional characteristics, like performance, and are subject to trade-offs.

This means the two have very different reasons to change and, following the Single Responsibility Principle, should be separate objects. Blindly serializing domain objects should thus be considered an anti-pattern.

That doesn’t mean you must blindly add DTOs, either. It’s perfectly fine to start with exposing domain objects, e.g. using Spring Data REST, and introducing DTOs as needed. As always, premature optimization is the root of all evil, and you should decide based on measurements.

The point is to keep the difference in mind. Don’t change your domain objects to get better performance, but rather introduce DTOs.

DTOs Have No Behavior

A DTO should not have any behavior; its purpose in life is to transfer data between remote systems.

This is very different from domain objects.

There are two basic approaches for dealing with the data in a DTO.

The first is to make them immutable objects, where all the input is provided in the constructor and the data can only be read.

This doesn’t work well for large objects, and doesn’t play nice with serialization frameworks.

The better approach is to make all the properties writable. Since a DTO must not have logic, this is one of the few occasions where you can safely make the fields public and omit the getters and setters.
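
A minimal sketch of such a DTO (the names are made up):

// A DTO is all data, no behavior: public fields, no getters/setters, no logic.
public class OrderDto {
  public String id;
  public double total;
  public String[] itemIds;
}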

Of course, that means some other part of the code is responsible for filling the DTO with combinations of properties that together make sense.

Conversely, you should validate DTOs that come back in from the client.

DevOps Is The New Agile

In The Structure of Scientific Revolutions, Thomas Kuhn argues that science is not a steady accumulation of facts and theories, but rather a sequence of stable periods, interrupted by revolutions.

During such revolutions, the dominant paradigm breaks down under the accumulated weight of anomalies it can’t explain until a new paradigm emerges that can.

We’ve seen similar paradigm shifts in the field of information technology. For hardware, we’re now at the Third Platform.

For software, we’ve had several generations of programming languages and we’ve seen different programming paradigms, with reactive programming gaining popularity lately.

The Rise of Agile

We’ve seen a revolution in software development methodology as well, where the old Waterfall paradigm was replaced by Agile. The anomalies in this case were summarized as the software crisis, as documented by the Chaos Report.

The Agile Manifesto was clearly a revolutionary pamphlet:

We are uncovering better ways of developing software by doing it and helping others do it.

It was written in 2001 and originally signed by 17 people. It’s interesting to see what software development methods they were involved with:

  1. Adaptive Software Development: Jim Highsmith
  2. Crystal: Alistair Cockburn
  3. Dynamic Systems Development Method: Arie van Bennekum
  4. eXtreme Programming: Kent Beck, Ward Cunningham, Martin Fowler, James Grenning, Ron Jeffries, Robert Martin
  5. Feature-Driven Development: Jon Kern
  6. Object-Oriented Analysis: Stephen Mellor
  7. Scrum: Mike Beedle, Ken Schwaber, Jeff Sutherland
  8. Andrew Hunt, Brian Marick, and Dave Thomas were not associated with a specific method

Only two of the seven methods were represented by more than one person: eXtreme Programming (XP) and Scrum. Coincidentally, these are the only ones we still hear about today.

Agile Becomes Synonymous with Scrum

Scrum is the clear winner in terms of market share, to the point where many people don’t know the difference between Agile and Scrum.

I think there are at least two reasons for that: naming and ease of adoption.

Decision makers in environments where nobody ever gets fired for buying IBM are usually not looking for something that is “extreme”. And “programming” is for, well, other people. On the other hand, Scrum is a term borrowed from sports, and we all know how executives love using sport metaphors.

[BTW, the term “extreme” in XP comes from the idea of turning the dials of some useful practices up to 10. For example, if code reviews are good, let’s do it all the time (pair programming). But Continuous Integration is not nearly as extreme as Continuous Delivery and iterations (time-boxed pushes) are mild compared to pull systems like Kanban. XP isn’t so extreme after all.]

Scrum is easy to get started with: you can certifiably master it in two days. Part of this is that Scrum has fewer mandated practices than XP.

That’s also a danger: Scrum doesn’t prescribe any technical practices, even though technical practices are important. The technical practices support the management practices and are the foundation for a culture of technical excellence.

The software craftsmanship movement can be seen as a reaction to the lack of attention to the technical side. For me, paying attention to obviously important technical practices is simply being a good software professional.

The (Water)Fall Of Scrum

The jury is still out on whether management-only Scrum is going to win completely, or whether the software craftsmanship movement can bring technical excellence back into the picture. This may be more important than it seems at first.

Since Scrum focuses only on management issues, developers may largely keep doing what they were doing in their Waterfall days. This ScrumFall seems to have become the norm in enterprises.

No wonder that many Scrum projects don’t produce the expected benefits. The late majority and laggards may take that opportunity to completely revert back to the old ways and the Agile Revolution may fail.

In fact, several people have already proclaimed that Agile is dead and are talking about a post-Agile world.

Some think that software craftsmanship should be the new paradigm, but I’m not buying that.

Software craftsmanship is all about the code and too many people simply don’t care enough about code. Beautiful code that never gets deployed, for example, is worthless.

Beyond Agile with DevOps

Speaking of deploying software, the DevOps movement may be a more likely candidate to take over the baton from Agile. It’s based on Lean principles, just like Agile was. Actually, DevOps is a continuation of Agile outside the development team. I’ve even seen the term agile DevOps.

So what makes me think DevOps won’t share the same fate as Agile?

First, DevOps looks at the whole software delivery value stream, whereas Agile confined itself to software development. This means DevOps can’t remain in the developer’s corner; for DevOps to work, it has to have support from way higher up the corporate food chain. And executive support is a prerequisite for real, lasting change.

Second, the DevOps movement from the beginning has placed a great deal of emphasis on culture, which is where I think Agile failed most. We’ll have to see whether DevOps can really do better, but at least the topic is on the agenda.

Third, DevOps puts a lot of emphasis on metrics, which makes it easier to track its success and helps to break down silos.

Fourth, the Third Platform virtually requires DevOps, because Systems of Engagement call for much more rapid software delivery than Systems of Record.

Fifth, with the number of security breaches spiraling out of control, the ability to quickly deploy fixes becomes the number one security measure. The combination of DevOps and Security is referred to as Rugged DevOps or DevOpsSec.

What do you think? Will DevOps succeed where Agile failed? Please leave a comment.

Three Ways To Become a Better Software Professional

The other day InfoQ posted an article on software craftsmanship.

In my view, software craftsmanship is no more or less than being a good professional. Here are three main ways to become one.

1. See the Big Picture

Let’s start with why. Software rules the world and thus we rule the world. And we all know that with great power comes great responsibility.

Now, what is responsible behavior in this context?

It’s many things. It’s delivering software that solves real needs, that works reliably, is secure, is a pleasure to use, etc. etc.

There is one constant in all these aspects: they change. Business needs evolve. New security threats emerge. New usability patterns come into fashion. New technology is introduced at breakneck speed.

The number one thing a software professional must do is to form an attitude of embracing change. We cope with change by writing programs that are easy to change.

Adaptability is not something we’ll explicitly see in the requirements; it’s silently assumed. We must nevertheless take our responsibility to bake it in.

Unfortunately, adaptability doesn’t find its way into our programs by accident. Writing programs that are easy to change is not easy but requires a considerable amount of effort and skill. The skill of a craftsman.

2. Hone Your Skills

How do we acquire the required skills to keep our programs adaptable?

We need to learn. And the more we learn, the more we’ll find that there’s always more to learn. That should make us humble.

How do we learn?

By reading/watching/listening, by practicing, and by doing. We need to read a lot and go to conferences to infuse our minds with fresh ideas. We need to practice to put such new ideas to the test in a safe environment. Finally, we need to incorporate those ideas into our daily practices to actually profit from them.

BTW, I don’t agree with the statement in the article that

Programmers cannot improve their skills by doing the same exercise repeatedly.

One part of mastering a skill is building muscle memory, and that’s what katas like Roman Numerals are for. Athletes and musicians understand that all too well.

But we must go even further. There is so much to learn that we’ll have to continuously improve our ability to do so to keep up. Learning to learn is a big part of software craftsmanship.

3. Work Well With Others

Nowadays software development is mostly a team sport, because we’ve pushed our programs to the point where they’re too big to build alone. We are part of a larger community and the craftsmanship model emphasizes that.

There are both pros and cons to being part of a community. On the bright side, there are many people around us who share our interests and are willing to help us out, for instance in code retreats. The flip side is that we need to learn soft skills, like how to influence others or how to work in a team.

Being effective in a community also means our individually honed skills must work well with those of others. Test-Driven Development (TDD), for example, can’t successfully be practiced in isolation. An important aspect of a community is its culture, as the DevOps movement clearly shows.

To make matters even more interesting, we’re actually simultaneously part of multiple communities: our immediate team, our industry (e.g. healthcare), and our community of interest (e.g. software security or REST), to name a few. We should participate in each, understanding that each of those communities will have its own culture.

It’s All About the Journey

Software craftsmanship is not about becoming a master and then resting on your laurels.

While we should aspire to master all aspects of software development, we can’t hope to actually achieve it. It’s more about the journey than the destination. And about the fun we can have along the way.

The State of REST

The S in REST stands for State. Unfortunately, state is an overloaded word.

In this post I’ll discuss the two different kinds of state that apply to REST APIs.

Applications

The first type of state is application state, as in Hypermedia As The Engine Of Application State (HATEOAS), the distinguishing feature of the REST architectural style.

We must first understand what exactly an application in a RESTful architecture is and isn’t. A REST API is not an interface to an application, but an interface for an application.

A RESTful service is just a bunch of interrelated and interconnected resources. In and of themselves they don’t make up an application. It’s a particular usage pattern of those resources that turns them into an application.

The term application pertains more to the client than to the server. It’s possible to build different applications on top of the same resources. It’s also possible to build an application out of resources hosted on different servers or even by different organizations (mashups).

Application State vs. Resource State

The stateless constraint of REST doesn’t say that a server can’t maintain any state, it only says that a server shouldn’t maintain application (or session) state.

A server is allowed to maintain other state. In fact, most servers would be completely useless if they didn’t. Think of the Amazon web store without any books! We call this resource state to distinguish it from application state.

So where do we draw the line between resource state and application state?

Resource state is information we want to be available between multiple sessions of the same user, and between sessions of different users. Resource state can initially be supplied by either servers (e.g. books) or clients (e.g. book reviews).

Application state is the information that pertains to one particular session of the application. The contents of my shopping cart could be application state, for instance.

Note that this is not how Amazon implemented it; they keep this state on the server. That doesn’t mean that the people at Amazon don’t understand REST. The web browser that I use to shop isn’t sophisticated enough to maintain the application state. Also, they want me to be able to close my browser and return to my shopping cart tomorrow.

This example shows that what is application state and what is resource state is a design decision.

Application state pertains to the goal the user is trying to achieve while driving the client. It is this state that we’re referring to when we talk about state diagrams for REST APIs, not the resource state.

State Transfer

Application state is all the information maintained on the client side while the user is trying to accomplish a goal. This information is built up piece by piece based on the resource state that is transferred between client and server.

The resource state is transferred as a representation, a particular serialization of the resource state suitable for inclusion in an HTTP message body.

Serialization is governed by the rules laid out in a media type. There are many different media types, some more mature than others.

Since clients and servers transfer representations of resource state, we speak of Representational State Transfer (REST).

How To Return Error Details From REST APIs

The HTTP protocol uses status codes to return error information. This facility, while extremely useful, is too limited for many use cases. So how do we return more detailed information?

There are basically two approaches we can take:

  1. Use a dedicated media type that contains the error details
  2. Include the error details in the media type you’re already using

Dedicated Error Media Types

There are at least two candidates in this space:

  1. Problem Details for HTTP APIs is an IETF draft that we’ve used in some of our APIs to good effect. It treats both problem types and instances as URIs and is extensible. This media type is available in both JSON and XML flavors (see the example after this list).
  2. application/vnd.error+json is for JSON media types only. It also feels less complete and mature to me.
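
For illustration, a Problem Details response body (media type application/problem+json) might look something like this, with a made-up problem type URI:

{
  "type": "http://example.com/problems/invalid-input",
  "title": "There was a problem with one or more input values.",
  "status": 422,
  "detail": "The price must be a positive number.",
  "instance": "/orders/b2613385-a3b2-47b7-b336-a85ac405bc66"
}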

A media type dedicated to error reporting has the advantage of being reusable in many REST APIs. Any existing libraries for handling them could save us some effort. We also don’t have to think about how to structure the error details, as someone else has done all that hard work for us.

However, putting error details in a dedicated media type also increases the burden on clients, since they now have to handle an additional media type.

Another disadvantage has to do with the Accept header. It’s highly unlikely that clients will specify the error media type in Accept. In general, we should either return 406 or ignore the Accept header when we can’t honor it. The first option is not acceptable (pun intended), the second is not very elegant.

Including Error Details In Regular Media Type

We could also design our media types such that they allow specifying error details. In this post, I’m going to stick with three mature media types: Mason, Siren, and UBER.

Mason uses the @error property:

{
  "@error": {
    "@id": "b2613385-a3b2-47b7-b336-a85ac405bc66",
    "@message": "There was a problem with one or more input values.",
    "@code": "INVALIDINPUT"
  }
}

The existing error properties are compatible with Problem Details for HTTP APIs, and they can be extended.

Siren doesn’t have a specific structure for errors, but we can easily model errors with the existing structures:

{
  "class": [
    "error"
  ],
  "properties": {
    "id": "b2613385-a3b2-47b7-b336-a85ac405bc66",
    "message": "There was a problem with one or more input values.",
    "code": "INVALIDINPUT"
  }
}

We can even go a step further and use entities to add detailed sub-errors. This would be very useful for validation errors, for instance, where you can report all errors to the client at once. We could also use the existing actions property to include a retry action. And we could use the error properties from Problem Details for HTTP APIs.

UBER has an explicit error structure:

{
  "uber": {
    "version": "1.0",
    "error": {
      "data": [
        { "name": "id", "value": "b2613385-a3b2-47b7-b336-a85ac405bc66" },
        { "name": "message", "value": "There was a problem with one or more input values." },
        { "name": "code", "value": "INVALIDINPUT" }
      ]
    }
  }
}

Again, we could reuse the error properties from Problem Details for HTTP APIs.

Conclusion

My advice would be to use the error structure of your existing media type and use the extensibility features to steal all the good ideas from Problem Details for HTTP APIs.