Saturday, December 24, 2011

Service Bus or Service Governance

What would you do first when you start an IT journey to SOA? Buy a service bus? Or introduce service governance?

Buying a service bus is easy and quick. And you can probably boast about the fact that you are "doing SOA." But without governance, your bus will eventually cost a lot of money in maintenance, and is at risk of quickly becoming a legacy system.

Introducing SOA governance is generally more difficult, because it touches the organization and its business processes themselves. But if you govern services properly, you don't need a bus per se to arrive at SOA. The bus is then a tool that facilitates the governance.

So, SOA governance should be taken seriously from day one of your journey to SOA, and it is probably more important than having a service bus. So maybe using a Service Repository is more important than using a Service Bus?

Thursday, December 15, 2011

Denial of Service attack in the cloud

IT companies bring their services into the cloud, such as the Amazon Web Services cloud or RackspaceCloud. Here, you pay per use. You can configure your service such that its resource usage grows or shrinks with demand. This can be set up automatically. And this is what you want, right? You want to serve a bigger audience.

You pay for resource usage, so upon high demand, you pay more to the cloud providers. This is very logical.

But a rotten neighbour (not mine!) could set up some scripts that fire a heavy load at your cloud-based service. Okay, he needs to be a bit clever to do this in such a way that Amazon and Rackspace don't recognize it as a "Denial of Service" attack.

The owner of the service may receive a large bill.

Can malicious individuals use this to bomb their competitors out of business, making them go broke on heavy bills from the cloud providers?

You probably should not forget to set upper limits on the resource usage at the cloud providers, but then you are again susceptible to real Denial of Service attacks, just like when you run your own data centre.
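To make the trade-off concrete, here is a small sketch of the arithmetic behind such an upper limit. The price and instance counts are made-up numbers, not real provider rates, and the function names are my own.

```python
def instances_to_run(demanded: int, max_instances: int) -> int:
    """Scale with demand, but never beyond the configured cap."""
    return min(demanded, max_instances)


def worst_case_daily_bill(max_instances: int, price_per_instance_hour: float) -> float:
    """Upper bound on one day's bill when scaling is capped at max_instances."""
    return max_instances * price_per_instance_hour * 24


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    cap = 20
    price = 0.10  # currency units per instance-hour

    # A load script "demands" 500 instances; the cap keeps us at 20.
    print(instances_to_run(500, cap))
    print(worst_case_daily_bill(cap, price))
```

With the cap in place the bill is bounded, but every request beyond the capacity of those 20 instances goes unanswered, which is exactly the classic Denial of Service situation.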

Sunday, December 11, 2011

Logging service

I came across it again, the LoggingService! Built into a framework for monitoring what actually happens in a web service setting, the LoggingService is intended to log incoming and outgoing data.

Why would you want to code cross-cutting concerns - aspects - in a "home-made service?"

In Java, you can use bytecode injection techniques to deal with aspects. In web services, use the application server's frameworks, or use web service agents, as the AmberPoint product does.

But don't do it yourself: you will tie your code to a non-standard framework. If you build your own LoggingService, the code that uses it must be hard-coded to call it. This tightly couples service implementations to the cross-cutting concern of logging and monitoring. This is not a good idea.
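To illustrate what keeping the aspect out of the service code looks like, here is a minimal sketch in Python, where a decorator plays roughly the role that bytecode injection or a container interceptor plays in Java. The `logged` decorator, the `get_quote` operation and the logger name are all hypothetical.

```python
import functools
import logging

logger = logging.getLogger("service.audit")  # hypothetical logger name


def logged(func):
    """Log the incoming arguments and the outgoing result of an operation."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        logger.info("<- %s result=%r", func.__name__, result)
        return result
    return wrapper


@logged
def get_quote(symbol: str) -> dict:
    # Hypothetical business operation: it contains no logging calls at all.
    return {"symbol": symbol, "price": 42.0}
```

The body of `get_quote` knows nothing about logging, so the logging can be changed or removed without touching the service implementation. Ideally even the annotation is applied by the platform rather than by hand.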

Friday, December 02, 2011

SOA antipattern example

I recently received a web service definition file (WSDL) with a single operation defined:


    <wsdl:operation name="sendRequest" parameterOrder="request requestHeader">
      <wsdl:input message="GenericRequest" />
      <wsdl:output message="GenericResponse" />
      <wsdl:fault message="SystemError" name="SystemError" />
    </wsdl:operation>

This web service is actually the (one and only) entry point to a "service platform." 

We see the operation sendRequest, with a GenericRequest message as input, and a GenericResponse message as output. Here are the definitions:

  <xsd:element name="GenericRequest" type="xsd:base64Binary" />

  <xsd:element name="GenericResponse">
    <xsd:complexType>
      <xsd:choice>
        <xsd:element name="response" type="xsd:base64Binary" />
        <xsd:element ref="Errors" />
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>


Both input and output messages are base64 encoded binaries. 

Well, in reality, I also received a large bundle of XML schemas defining what these binaries actually could be: a whole bunch of XML messages, all base64 encoded.

This is a beautiful example of how web service contracts should not be made.

From the WSDL, the consumer cannot infer at all what the contract is really about. He will need to check additional documentation to figure this out. And indeed, many (>30!) documents accompany the WSDL. Good luck! Hope you don't miss a subtle point somewhere.

And why base64-encoded XML? Base64 is good for real binaries, like images, PDF documents, etc., if you can't send them as attachments, but it is not particularly good for plain text or XML. It inflates the payload by about a third, and it makes the XML harder to unmarshal, because the consumer has to decode it first.

Basically, the base64 encoding is an example of a do-it-yourself transport: a protocol that needs to be specifically implemented by both service and consumer. It's embedded within the standard SOAP transport, which may make people think it is standard, but it isn't.
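A small sketch of what this means for the consumer, using only the Python standard library. The `OrderRequest` payload is invented for illustration; the real message types are defined in the separate schemas, not in the WSDL.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical inner message, standing in for one of the real XML messages.
inner_xml = b"<OrderRequest><orderId>123</orderId></OrderRequest>"

# What actually travels in the SOAP body: an opaque blob.
encoded = base64.b64encode(inner_xml)

# The extra hand-written step that both service and consumer must implement
# before any XML tooling can be applied.
decoded = base64.b64decode(encoded)
root = ET.fromstring(decoded)

print(root.tag, root.findtext("orderId"))
print(len(inner_xml), "bytes of XML travel as", len(encoded), "bytes of base64")
```

Nothing in the WSDL tells a generated client stub that the blob contains XML, let alone which schema applies, so this decoding and parsing has to be written and maintained by hand.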

This type of web service violates every principle of good service-oriented design. It leads to unnecessarily high costs for all parties involved. The governance of future change is going to be a nightmare, and the operations support will be very costly.

This is a legacy system from day one!


Tuesday, November 29, 2011

Ping!


Regularly, I see web services offer a ping(), or equivalent, operation. I hear several reasons:
1. “If I depend on the availability of another service, I’ll ask if it is available, just before I use it.”
2. “My service provider says he is available at least 90% of the time, so I will check him out regularly if he keeps his promise!”
3. “My service depends on another service, so I will not make my service available if my service provider is not available.”

And thus, we see a cascade of pings arising.


Let's have a closer look at what this actually means.

1. “If I depend on the availability of another service, I’ll ask if it is available, just before I use it.”

This is brilliant. The network is known to have availability issues every now and then, especially with synchronous protocols like HTTP. So before calling a SOAP web service, let’s do a ping first. If the service doesn’t answer, we don’t call it with the real request.

Flaws:
a. There is no guarantee that after a successful ping, the real request will pass through.
b. An unsuccessful ping does not guarantee that the real request would not have passed through.

The conclusion drawn from the result of the ping can thus very well be wrong!

Solution:
Just call the service with the real request and make sure that you handle a connection error properly. This way, you never take the wrong decision. Compared with the ping, there will be more successful hits, which means more business.
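A minimal sketch of this approach with Python's standard `urllib`. The function name, the content type and the timeout are my own assumptions; a real consumer would use whatever SOAP stack it already has.

```python
import urllib.error
import urllib.request


def call_service(url: str, payload: bytes, timeout: float = 5.0):
    """Send the real request; return the response body, or None on failure."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, OSError) as exc:
        # No ping beforehand: the outcome of the real call is the only
        # signal we act on. HTTP error statuses also end up here, because
        # HTTPError is a subclass of URLError.
        print(f"service call failed: {exc}")
        return None
```

The caller checks for `None` and decides what to show the user. There is exactly one network call, and its result is always the truth about that request.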

2. “My service provider says he is available at least 90% of the time, so I will check him out regularly if he keeps his promise!”

Let’s do a ping every n minutes, and calculate the availability percentage of the provider. This is a measure of the uptime of the service provider.

Flaws:
a. There is no guarantee that between two successful pings, the service was really available.
b. An unsuccessful ping can be an intermittent network issue of very short duration, but it will be counted as n minutes of downtime.

The conclusion drawn from monitoring by pinging has nothing to do with the real uptime statistics of the provider.

Solution:
Just call the service with the real request and make sure that you handle a connection error properly. It doesn’t matter if the service provider is off-air in between requests. All that matters is that at least x percent of your requests are served correctly. That’s what needs to be stated in a service level agreement. And that is what is measurable.
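Measuring this takes little more than two counters. Here is a sketch, assuming every real request reports its outcome to a small bookkeeping object; the class and method names are hypothetical.

```python
class RequestStats:
    """Counts real requests so the served percentage can be compared to the SLA."""

    def __init__(self) -> None:
        self.total = 0
        self.succeeded = 0

    def record(self, ok: bool) -> None:
        """Register the outcome of one real request."""
        self.total += 1
        if ok:
            self.succeeded += 1

    def success_percentage(self) -> float:
        # With no requests yet there is nothing to measure; report 100 by convention.
        if self.total == 0:
            return 100.0
        return 100.0 * self.succeeded / self.total

    def meets_sla(self, required_percentage: float) -> bool:
        return self.success_percentage() >= required_percentage


if __name__ == "__main__":
    stats = RequestStats()
    for ok in [True] * 9 + [False]:
        stats.record(ok)
    print(stats.success_percentage(), stats.meets_sla(90.0))
```

Only real requests are counted, so the number says something about the service the consumer actually received.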

3. “My service depends on another service, so I will not make my service available if my service provider is not available.”

Let’s do a ping of the service provider. If he doesn’t answer I will close down my web application so consumers can’t use this functionality. That avoids frustration of the people receiving connection errors. I’ll open the web application again after a successful ping.

Flaws:
a. A failed ping is no guarantee that the service is down. The web application will sometimes be shut down erroneously. This can result in missed business.

Solution:
Just call the service with the real request and make sure that you handle a connection error properly. Even in this way you can shut down the web application, for example after p unsuccessful requests. Compared with the ping “solution,” there will be more successful hits, which means more business.
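The "p unsuccessful requests" rule could be sketched like this. I assume here that the failures must be consecutive, which the rule above does not say, and the class name is made up.

```python
class AvailabilityGuard:
    """Closes the front end after p consecutive failed real requests."""

    def __init__(self, p: int) -> None:
        self.p = p
        self.consecutive_failures = 0

    def record(self, ok: bool) -> None:
        """Register the outcome of one real request."""
        if ok:
            # A single success proves the provider is reachable again.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def is_open(self) -> bool:
        """True while the web application should keep accepting users."""
        return self.consecutive_failures < self.p


if __name__ == "__main__":
    guard = AvailabilityGuard(p=3)
    for outcome in (False, False, True, False, False, False):
        guard.record(outcome)
        print(outcome, guard.is_open)
```

How to reopen is a separate design decision not covered by this sketch, for example letting an occasional real request through and reopening on the first success.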

Okay, so there can be a reason to do "proactive monitoring." After all, if the service goes down in the evening, it would be nice to notice before the first user does in the morning. But do you need a ping for that? The ping may check the connection to the service, but it doesn't check the systems behind it. It would be far better to do a real request. The service provider has to agree that this request is recognized as a monitoring request only, and not as a real consumer request.

So don't use ping operations! And especially, don't draw any statistical conclusions from them.