A hint to all companies offering access to their platforms via an API: look at your API documentation sites as an opportunity to build a community around your API. Don't require users of the documentation to log in to your service to read it, and don't require them to click through arcane paths to "find" documentation. Make the API documentation as easy to find as your blog or support site (e.g. http://api.yourcompany.com).
Two can't-miss events are coming up May 12-15 at the Computer History Museum in Mountain View, California:
This is where things happen, so you don't want to miss it. Both events will be mostly unconference-style, so you get to make the agenda. Whether these are new areas to you or you consider yourself a seasoned expert, these are the best opportunities of the year to share and learn with others.
Each requires independent registration.
See you there!
P.S. If you need any more reason to come, the evening fun should be worth it alone...
I think most people are missing the point about Google App Engine.
It's the Commoditization of Software Architecture and Scaling Skills
Google App Engine is Google's attempt to democratize the scaling of web applications. Put another way, they're trying to commoditize the hard-to-find skills and experience needed for building massively scalable web apps. Any startup can find any number of web developers who've built a web 2.0 app. But how many developers or architects have the experience and the mindset to build applications that support hundreds of thousands of concurrent users with 5 9's uptime?
Scaling Big is Really Hard
The web industry has mostly moved to a horizontal scaling model - but this is not a model that most web application developers have experience with. Horizontal scaling means, basically, that instead of using more capable hardware, you use more instances of less-capable hardware, each handling a slice of the work, but doing the same function (e.g. sliced between groups of users). The intent, as much as is possible, is to reduce centralization of resources - the ultimate goal is to simply be able to add more instances of the hardware without limit to meet increased scale requirements.
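The "slice of users" idea can be sketched in a few lines (Python here purely for illustration; nothing platform-specific is assumed): hash each user id to pick one of many identical instances, so every instance runs the same code and serves only its slice.

```python
import hashlib

def pick_instance(user_id: str, instances: list) -> str:
    """Route a user to one of many identical instances by hashing the user id.

    Every instance runs the same function; each handles a slice of users.
    Scaling out is (ideally) just a matter of adding entries to `instances`.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return instances[int(digest, 16) % len(instances)]

instances = ["app-server-1", "app-server-2", "app-server-3"]
# The same user always lands on the same instance.
assert pick_instance("alice", instances) == pick_instance("alice", instances)
```

One caveat worth noting: naive modulo routing reshuffles most users whenever the instance list changes, which is why real systems usually reach for consistent hashing instead.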
Without getting into too much detail, this stuff is hard. It's not how we're used to thinking about applications - instead of application flow, we have to think about asynchronous interactions with application components. Instead of thinking about centralized data stores, we have to think about data clouds. For the vast majority of web application developers, it's black magic.
Google AppEngine Makes It Easy by Imposing a New Architecture
Here's why Google App Engine is important, at least in intent. If you build your app on the Google App Engine architecture, it will scale to unlimited levels without any extra effort. Full stop.
You won't have to worry about things like replicating state, scaling datastores, building caches, etc. You won't have to hire as many really smart systems architects to bend your applications around the constraints imposed by a requirement of unbounded scaling.
As a developer, you'll have to learn a new way of thinking about building your apps. Most unnerving, you'll have to unlearn some deeply held principles about "efficiency" and scalability.
Here's a great example: your instincts as a developer are to keep as much state (e.g. web sessions) in memory between requests as possible. With App Engine, however, you'll learn to accept a certain fixed (i.e. invariant with respect to scale) latency of accessing BigTable (Google's 'data cloud') in exchange for never having to worry about any added latency in handling 100,000 (or a million) concurrent users. As a software architect, I'll take the fixed hit for the benefit of infinite scaling.
In other words, with one concurrent user, I could probably make each request "faster" with a traditional web app architecture. However, at a large scale (say 10,000 concurrent users), requests under the Google App Engine architecture are going to run just as quickly, while a traditional application may run 100x slower, unless I were to invest major effort in building a scalable architecture for it (using expensive software systems architects, of course).
Why BigTable is so Important
In this section, I'm going to make a few points about why the data architecture Google has picked forces the developer to build a scalable app:
Dragging Developers into Event Think
One of the most painful things developers are going to experience with AppEngine is the concept that their app must be built entirely from event handlers. In some senses, an AppEngine app looks just like a traditional web app, answering HTTP requests and rendering HTTP responses. But the subtle difference (and what makes it all scalable) is that each request handler must be entirely stateless. This means no web session state on the server side - no data specific to this user can be stored in memory between requests. This is good because it allows requests to be routed to any server in the world that has the code loaded (and can access BigTable) - this is what enables infinite scalability. But because of this subtle little change, a developer may have to rethink what they are doing in each request.
When scaling big, you've got to think minimal. Do your processing, modify your data store, fire off other behavior (other events), and return. You don't worry about threading (mostly), you don't worry about hanging the server (there is no "the server"), you don't worry about resources beyond those you consume in handling the immediate request. It's not that different from the PHP model, actually...
(In fact, if an application framework has a way of automatically storing web session data to BigTable, then the app can probably be deployed as before - the "event-ness" of the app is hidden from the developer by simulating state in BigTable. I haven't looked in detail into whether this is practical in App Engine, though.)
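To make the statelessness concrete, here is a minimal sketch in plain Python (a dict stands in for BigTable or any shared datastore; the names are illustrative, not an actual App Engine API): the handler loads session state at the start of each request and writes it back before returning, so no per-user data ever lives in server memory between requests.

```python
# `store` stands in for a shared datastore such as BigTable.
# The handler itself keeps no state between requests.
store = {}

def handle_request(session_id: str, message: str) -> str:
    # Load this user's session from the shared store, not from local memory...
    session = store.get(session_id, {"history": []})
    session["history"].append(message)
    # ...and write it back before returning, so ANY server can take
    # this user's next request.
    store[session_id] = session
    return "%d messages seen" % len(session["history"])
```

Because every request round-trips through the shared store, each one pays the fixed datastore latency described above - but any number of identical servers can handle any request.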
And there's a deep structure here - underneath all the abstractions, Internet interactions are really just a series of events. An HTTP request comes in. Event. A datastore request is made. Event. An XMPP message comes in. Event. In fact, from this point of view, there's no reason why other protocols couldn't be supported in the App Engine architecture - while Google has deployed only HTTP connectors and routers, there's no reason why it couldn't deploy connectors for other protocols like XMPP.
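The "everything is an event" view can be sketched as a simple dispatcher. The event names and handlers below are purely hypothetical, not any actual App Engine API; the point is only that an XMPP connector could feed the same dispatch mechanism the HTTP connector does.

```python
# Hypothetical sketch: every inbound protocol message is just an event
# routed to a registered handler.
handlers = {}

def on(event_type):
    """Decorator registering a handler for one kind of event."""
    def register(fn):
        handlers[event_type] = fn
        return fn
    return register

def dispatch(event_type, payload):
    """A connector (HTTP, XMPP, ...) calls this with whatever arrived."""
    return handlers[event_type](payload)

@on("http.request")
def handle_http(payload):
    return "200 OK: " + payload

@on("xmpp.message")
def handle_xmpp(payload):
    return "reply to " + payload
```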
And the Moral Is
The moral of the story is that Google App Engine will be remembered primarily for ushering in the era of uber-scalability for the masses. It's not about free hosting or lock-in (though it arguably is), it's not about competing with AWS (though it will), and it's not even about Python (though it will certainly bring Python more attention). It's about giving out the secrets to the Google kingdom. I predict that we'll see an open source suite that allows any individual or organization to deploy a system architecture that roughly approximates the scalability features of AppEngine. In fact, the pieces are probably already out there.
Google is pushing the community of web developers to rethink how they build web applications. Others will follow and innovate, but Google will get credit for the first big push. That's what AppEngine's role in history will be.
This is a followup and expansion to a blog entry I wrote in October 2007, titled "Memo to OAuth Service Providers".
Recently, I've been working with a host of companies with open APIs built specifically for 3rd parties to build on. Call them mashup companies, platform companies, or just plain old smart companies. But if you are contemplating creating an API for your new online service, I have some suggestions on making your API offerings a success.
Use Building Blocks
If at all possible, don't invent a new protocol, don't invent new formats, and don't even invent a new API, if you can avoid it. There are any number of building blocks out there, such as OpenID, OAuth, HTTP (REST especially), and Atom, upon which you can compose a set of interactions to fulfill a wide variety of needs. This isn't to say that your API will never demand unique components, but the thing to remember is that the service offering behind your API, not the mechanics of the API itself, should distinguish your API offering from others.
Think of the API as a Separate Product You Are Delivering
It really is a separate product. It's your company's service packaged as a product for delivery to a new set of customers (other developers) with a new intent (letting third parties make your business successful and creating an ecosystem in which you are cemented as a core participant). Along these lines, there are several key things to keep in mind:
Your API Users are Your Customers and Partners.
As I mentioned in "Memo to OAuth Service Providers", the mindset you should have is that the users of your API are your customers as well as your partners. This is especially true if you are entirely or largely a platform company with little or no direct contact with end users. Your API customers are likely going to bring you business development leads. They are your best sales team - if you're lucky, maybe they are your only sales team! Think of them and care for them like you would any other service user: understand their needs, maintain bidirectional communication channels, and respect their sensitivity to quality (uptime, performance, etc).
Make Life for Your Developer Users EASY
This is probably the most important message I have to deliver. To these nice folks (many of whom are just like you and may even themselves be providing services behind APIs), you are often one of several options and you need to make their lives easier than it is today, not harder. Here are a number of concrete steps you can take, some of which I've seen implemented, some of which I think may be novel:
Eat Your Own Dogfood
This is more than just building a proof of concept. I actually advocate building a live application that accesses your API as would any other client. This is so that you can feel the pain (and experience the joy) of your API client customers. If you are good, you'll be able to anticipate what your customers want and need before they even know it.
Monitor Usage of Your API
Like any other online service, you really need to know, at a technical level, what's going on with your API service as it is getting used. Know your usage patterns - they are almost certainly going to be different than what you first guess. Knowing your service helps you maintain uptime, predict scalability issues, and be able to avoid issues that will keep your customers from being happy. You will be able to predict future growth and plan for investments in the API offering. Finally, monitoring usage will also help you in the area of security, because monitoring helps establish baseline usage volumes and patterns that can be used to detect security or other anomalies.
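As a rough illustration of the baseline idea (a toy sketch only, not a substitute for real monitoring or time-series tooling; the class and thresholds are invented for this example), you might track a rolling average of per-window request counts and flag windows that stray far from it:

```python
from collections import deque

class UsageMonitor:
    """Toy baseline monitor: flag a window whose request count strays
    far from the recent rolling average."""

    def __init__(self, window_count=24, threshold=3.0):
        self.history = deque(maxlen=window_count)  # recent per-window counts
        self.threshold = threshold                 # how far is "anomalous"

    def observe(self, requests_in_window: int) -> bool:
        """Record one window's request count; return True if anomalous."""
        anomalous = False
        if len(self.history) >= 3:  # need a little history for a baseline
            avg = sum(self.history) / len(self.history)
            anomalous = (requests_in_window > self.threshold * avg
                         or requests_in_window < avg / self.threshold)
        self.history.append(requests_in_window)
        return anomalous
```

A sudden spike flagged by even something this crude is exactly the kind of signal that distinguishes organic growth from abuse or a misbehaving client.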
Build a Community of Developer Customers
If you launch an API and have developers building to it, you already have a community. Recognize it, support it, and realize that they are likely investing a lot of time and exposing themselves to a certain amount of business risk by relying on your API. Give them reasons to continue the partnership and feel that the risk is well understood (if not completely managed). Let them know who you are as a company, and the sorts of investments and commitments you are making in this API, and acknowledge the fact that this is a symbiosis. There is no such thing as a service provider if there isn't a service consumer.
And here's one more thing about the community. Call it the Wachob corollary to Selfridge's law: "The API Customer is Always Right". That's not literally true, of course, but it is the mentality you need to have, especially when you are just launching. This is a make-or-break time - either your API users like what you have, or they will quickly walk away. You are serving the API users, not the other way around (unless you are special or have some abnormal market power). You aren't doing them a favor by providing the API - they are doing you a favor by embedding your service in their offering. Recognize.
Don't Forget the Legal Stuff
Your lawyers will surely tell you this, but let me save you 15 minutes of your lawyers' time (spent lecturing you). Protect yourself. Think about disclaimers (especially with respect to warranties and service level agreements). If you want to offer those things, think about how you are exposing yourself and how you can cover the risk cost. Think about other legal issues such as copyright (and the DMCA), privacy laws, or other oddball legal issues that your customers may drag you into (online gambling?).
But also look at your legal agreements as an opportunity to express, in a positive way, commitments to your API customers that signal your appreciation of them as part of the ecosystem. One of my favorite signaling mechanisms, built into legal agreements for user-created content service offerings, is explicit language preserving intellectual property ownership on the part of the user for all content they upload/contribute/share through the service. Strongly worded privacy commitments, though they may raise the hackles of many lawyers, are also a great way to indicate your intentions towards your API users.
Make it Fun, Make it Personal
Lastly, I think you need to make developing against your API fun and personal. Try to enable "The Thrill of the Hack". Be real, be available, be a company of people, not a company behind an API call. Do this and everything above and you'll have the best odds of developing the fanatical userbase your service API so richly deserves...
Want Some Help on This?
Glad you asked. I'm available to consult to any company that would like help on the issues described in this blog post. I can be reached at firstname.lastname@example.org
If you are a developer, you know what the thrill of the hack is - when you're building something, and you sit down and implement a new feature, and all of a sudden your stuff plugs into a bunch of other people's stuff, and what was once a cool standalone thing is now part of an ecosystem of interoperating cool stuff. The whole becomes greater than the sum of the parts. And you, the developer, are part of it.
I think the "Thrill of the Hack" (TOTH) is a key factor in the success of technologies like RSS, tagging, and XMPP, and I see it making OpenID successful even as I write this blog. I don't think I'm overstating the value of TOTH to say that the web wouldn't have happened without it.
But TOTH doesn't just happen by itself. It's enabled by "busy developer guides", robust open source development efforts, community support, hangouts for developers and curious users, and friendly, easy-to-understand IPR policies (see sec 10.2.3). All of these things take deliberate effort, and yet in isolation may not seem to have any direct value for those investing the time and effort. However, I think the evidence is clear that one of the best ways to establish a new open network technology is to enable TOTH and open source development around that technology.
TOTH also helps the technology move forward - developers who become hooked through TOTH go on to innovate on top of the technology to build things that were completely off the radar of the original promoters of the technology (again, think about the web here).
I'm concerned that the INames community has failed to enable TOTH. We have efforts in most of these directions (hangouts, open source, a good IPR policy, a busy developer guide, community support), but they all need more work. Much of the effort on INames has focused on communicating how INames are useful to end users. But we haven't enabled developers to make INames (and even XRI, which doesn't necessarily rely on the global root directories) ubiquitous, and we haven't enabled developers to go beyond what we've envisioned and come up with the really killer apps.
I've heard a lot of frustration from folks who are interested in playing with INames, and I want you to know that we hear you, and we understand. In fact, I share that very frustration with you.
Things are definitely happening, especially in response to IIW 2006B, and I hope to highlight them here. Stay tuned...
Wes Felter points me to a paper being submitted for Sigcomm 2006: "Routing on Flat Labels". This paper presents a concept of using hierarchical Distributed Hash Tables (DHTs) to allow flat namespaces to be "routable" at Internet scale. That is, "pure" names (with no location or other hierarchy information) could be practically routable across a network of networks on the scale of the Internet.
In 'digital identity' terms, imagine a community deploying such a system, which allows routing of messages to 'digital identities' without centralized infrastructure (e.g., a large "directory in the sky")... Generating a new identity that can participate in communications across the Internet (or other networks) could be as easy as generating a long random number as an identifier and publishing metadata about it to this distributed "ROFL"-based system. If that long random number were a public key, you could have self-authenticated, globally routable, totally decentralized identifiers. Not human-friendly, of course, as Zooko's triangle predicts...
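The self-authenticating part can be sketched in a few lines. This is only an illustration of the idea, with a random byte string standing in for a real public key (an actual system would use a genuine keypair and signatures): the flat name is simply a hash of the key, so the name itself commits to the key that controls it.

```python
import hashlib
import secrets

def make_identity():
    """Create a flat, location-free identifier that commits to a key.

    `secrets.token_bytes` stands in for real public-key generation here;
    this is a sketch of the concept, not a cryptosystem.
    """
    pubkey = secrets.token_bytes(32)
    identity = hashlib.sha256(pubkey).hexdigest()  # the routable flat name
    return identity, pubkey

def verify(identity: str, pubkey: bytes) -> bool:
    # Self-authentication: anyone holding the name can check that a
    # presented key is the one the name was derived from.
    return hashlib.sha256(pubkey).hexdigest() == identity
```

No directory in the sky is consulted: the binding between name and key is checkable by anyone, which is exactly what makes such identifiers decentralized (and, per Zooko's triangle, not human-friendly).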