"The Five Habits of Highly Successful Community Managers" (by Roland Legrand) is a great article that discusses tactics for successful online community managers. These include "Speak up", "Focus on Concrete Issues", "Be Honest", "Be Firm", "Be Grateful". These "people skills" are as equally applicable to a social community (online or offline) as they are to a developer community.
This got me thinking: should API service providers have a visible "community manager" (and not just a "maintainer" or "evangelist") for their users and outside developers? I'd say yes. The issues and tactics highlighted in the article seem to me to be just as applicable to a technical user community as to any other. But it's not something I see very often - maybe it's because developers don't see themselves as a community? Are there some types of API services that more naturally grow an ecosystem of users/developers around them?
What do you think? Do you know of API service providers that have someone with this responsibility and title?
[Thanks to Chia Hwu for pointing out this article.]
A hint to all companies offering access to their platforms via an API: look at your API documentation sites as an opportunity to build a community around your API. Do not require users to log in to your service to read the documentation, and do not require them to click through arcane paths to "find" it. Make it as easy to find the API documentation as it would be to find your blog or support site (e.g. http://api.yourcompany.com).
Why make it hard for folks to evaluate your product for the purpose of investing their time and effort into building your community of users?
I'm proud to say that not only does Socialtext make their API documentation public, it's on a wiki that allows for feedback inline. In fact, Facebook and Twitter (and many others) use wikis for API documentation - a practice I highly recommend. What better way to create dialogue and community with your developer/users than giving them a direct means to give feedback on the APIs they are being asked to use?
This is where things happen, so you don't want to miss it. Both events will be mostly unconference-style, so you get to make the agenda. Whether these are new areas to you or you consider yourself a seasoned expert, these are the best opportunities of the year to share and learn with others.
Each requires independent registration.
See you there!
P.S. If you need any more reason to come, the evening fun should be worth it alone...
It's the Commoditization of Software Architecture and Scaling Skills
Google App Engine is Google's attempt to democratize the scaling of web applications. Put another way, they're trying to commoditize the hard-to-find skills and experience needed for building massively scalable web apps. Ask any startup and they can find any number of web developers who've built a web 2.0 app. But how many developers or architects have the experience and the mindset to build applications that support hundreds of thousands of concurrent users with five-nines uptime?
Scaling Big is Really Hard
The web industry has mostly moved to a horizontal scaling model - but this is not a model that most web application developers have experience with. Horizontal scaling means, basically, that instead of using more capable hardware, you use more instances of less-capable hardware, each handling a slice of the work, but doing the same function (e.g. sliced between groups of users). The intent, as much as is possible, is to reduce centralization of resources - the ultimate goal is to simply be able to add more instances of the hardware without limit to meet increased scale requirements.
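As a rough illustration of "slicing between groups of users", here is a minimal sketch in Python. The instance names and the `route` helper are hypothetical, and the simple modulo scheme is only one of several ways to assign users to instances.

```python
import hashlib

# Hypothetical pool of identical, modest application instances.
instances = ["app-0", "app-1", "app-2", "app-3"]


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard number using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(user_id: str) -> str:
    """Pick the instance that handles every request for this user."""
    return instances[shard_for_user(user_id, len(instances))]


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, "->", route(user))
```

Every instance runs the same code; only the slice of users differs. Note that with plain modulo assignment, adding an instance moves many users to a different slice, which is one reason real systems tend to use something more careful.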
Without getting into too much detail, this stuff is hard. It's not how we're used to thinking about applications - instead of application flow, we have to think about asynchronous interactions with application components. Instead of thinking about centralized data stores, we have to think about data clouds. For the vast majority of web application developers, it's black magic.
Google AppEngine Makes It Easy by Imposing a New Architecture
Here's why Google App Engine is important, at least in intent. If you build your app on the Google App Engine architecture, it will scale to unlimited levels without any extra effort. Full stop.
You won't have to worry about things like replicating state, scaling datastores, building caches, etc. You won't have to hire as many really smart systems architects to bend your applications around the constraints imposed by a requirement of unbounded scaling.
As a developer, you'll have to learn a new way of thinking about building your apps. Most unnerving, you'll have to unlearn some deeply held principles about "efficiency" and scalability.
Here's a great example: your instincts as a developer are to keep as much state (e.g. web sessions) in memory between requests as possible. With App Engine, however, you'll learn to accept a certain fixed (i.e. invariant with respect to scale) latency of accessing BigTable (Google's 'data cloud') in exchange for never having to worry about any added latency in handling 100,000 (or a million) concurrent users. As a software architect, I'll take the fixed hit for the benefit of infinite scaling.
In other words, I could probably make each request "faster" with one concurrent user with traditional web app architectures. However, at a large scale (say 10,000 concurrent users), the Google App Engine architecture requests are going to run just as quickly as they did with one user, while a traditional application may run 100x slower, unless I were to invest major effort to build a scalable architecture for it (using expensive software systems architects, of course).
Why the BigTable is so Important
In this section, I'm going to make a few points about why the data architecture Google has picked forces the developer to make a scalable app:
BigTable, the "data objects in the cloud" technology which undergirds Google's massive applications, has the magic property of being essentially infinitely scalable with respect to the amount of data, and the amount of transaction activity. It is essentially the horizontal "partitioning" or "sharding" of data taken to the extreme.
Most people using frameworks like Ruby on Rails (or O/R systems like Hibernate) are using relational databases (like MySQL) as object databases, and not leveraging most of the relational features. They are paying a big cost for relational functionality without real need. Google App Engine acknowledges this fact, and provides the true object interface that most application developers are using anyway.
Google App Engine forces you to be explicit about data indexes, but this is something you have to do anyway when horizontally scaling traditional databases. In traditional web application architectures, scalability almost always involves partitioning data among several database instances. The moment you partition data among multiple SQL stores, you have to think about indexes, because searches across those stores require you to perform some sort of scatter/gather (in functional programming & Google parlance "MapReduce") - and if you aren't careful about how you build indexes, you'll end up with incredible inefficiencies (like having to merge large data sets from multiple stores in memory).
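To make the scatter/gather point concrete, here is a small sketch in Python. The partitions and their rows are made up, and the key assumption is that each partition already keeps its rows ordered by the indexed field (here, score, highest first). With that index in place, each store only has to return its own top few rows, and the merge never has to hold the full data sets in memory.

```python
import heapq
from itertools import islice

# Hypothetical partitions: each holds the rows for a slice of users,
# already sorted by the indexed field (score, descending).
partitions = [
    [("alice", 97), ("carol", 80), ("erin", 41)],
    [("bob", 92), ("dave", 77)],
    [("frank", 99), ("grace", 12)],
]


def top_n_from_partition(rows, n):
    """Scatter step: a partition returns only its own top n rows."""
    return rows[:n]


def top_n(all_partitions, n):
    """Gather step: lazily merge the partial results and keep n rows."""
    partials = [top_n_from_partition(rows, n) for rows in all_partitions]
    merged = heapq.merge(*partials, key=lambda row: row[1], reverse=True)
    return list(islice(merged, n))


if __name__ == "__main__":
    print(top_n(partitions, 3))
```

Without the per-partition ordering, the gather step would have to pull every row from every store and sort the lot, which is exactly the inefficiency described above.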
Dragging Developers into Event Think
One of the most painful things developers are going to experience with AppEngine is the concept that their app must be built entirely from event handlers. In some senses, an AppEngine app looks just like a traditional web app, answering HTTP requests and rendering HTTP responses. But the subtle difference (and what makes it all scalable) is that each request handler must be entirely stateless. This means no web session state on the server side - no data specific to this user can be stored in memory between requests. This is good because it allows requests to be routed to any server in the world that has the code loaded (and can access BigTable) - this is what enables infinite scalability. But because of this subtle little change, a developer may have to rethink what they are doing in each request.
When scaling big, you've got to think minimal. Do your processing, modify your data store, fire off other behavior (other events), and return. You don't worry about threading (mostly), you don't worry about hanging the server (there is no "the server"), you don't worry about resources beyond those which you consume in handling the immediate request. It's not actually different from the PHP model...
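Here is a minimal sketch of what "do your processing, modify your data store, and return" can look like. It is plain Python rather than actual App Engine code: the `DataStore` class is a hypothetical in-memory stand-in for a shared datastore such as BigTable, and the request is just a dictionary.

```python
class DataStore:
    """Hypothetical stand-in for a datastore every server can reach."""

    def __init__(self):
        self._rows = {}

    def get(self, key, default=None):
        return self._rows.get(key, default)

    def put(self, key, value):
        self._rows[key] = value


def handle_add_to_cart(store, request):
    """Handle one request: load state, do the work, save state, return.

    Nothing about the user survives in this process after the call,
    so the next request can be handled by any other server.
    """
    key = "cart:" + request["user_id"]
    cart = store.get(key, [])
    cart = cart + [request["item"]]  # build a new list, keep no local state
    store.put(key, cart)
    return {"status": 200, "items": cart}


if __name__ == "__main__":
    shared = DataStore()
    # Two calls standing in for two different servers sharing one store.
    print(handle_add_to_cart(shared, {"user_id": "u1", "item": "book"}))
    print(handle_add_to_cart(shared, {"user_id": "u1", "item": "pen"}))
```

The handler keeps nothing between calls; everything it needs arrives in the request or comes from the shared store.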
(In fact, if an application framework has a way of automatically storing web session data to BigTable, then the app can probably be deployed as before - the "event-ness" of the app is hidden from the developer by simulating state in BigTable. I haven't looked in detail at whether this is practical in App Engine or not, though.)
And there's a deep structure here - underneath all the abstractions, Internet interactions are really just a series of events. An HTTP request comes in. Event. A datastore request is made. Event. An XMPP message comes in. Event. In fact, from this point of view, there's no reason why other protocols couldn't be supported in the App Engine architecture - while they have deployed only HTTP connectors and routers, there's no reason why Google couldn't deploy other connectors for other protocols like XMPP.
And the Moral Is
The moral of the story is that Google App Engine will be remembered primarily for ushering in the era of uber-scalability for the masses. It's not about free hosting or lock-in (though it arguably is), it's not about competing with AWS (though it will), and it's not even about Python (though it will certainly bring Python more attention). It's about giving out the secrets to the Google kingdom. I predict that we'll see an open source suite that allows any individual or organization to deploy a system architecture that roughly approximates the scalability features of AppEngine. In fact, the pieces are probably already out there.
Google is pushing the community of web developers to rethink how they build web applications. Others will follow and innovate, but Google will get credit for the first big push. That's what AppEngine's role in history will be.
Recently, I've been working with a host of companies with open APIs built specifically for 3rd parties to build on. Call them mashups companies, platform companies, or just plain old smart companies. But if you are contemplating creating an API for your new online service, I have some suggestions on making your API offerings a success.
Use Building Blocks If at all possible, don't invent a new protocol, don't invent new formats, and don't even invent a new API, if you can avoid it. There are any number of building blocks out there, such as OpenID, OAuth, HTTP (REST especially) and Atom, upon which you can compose a set of interactions to fulfill a wide variety of needs. This isn't to say that your API will never demand unique components, but the thing to remember is that the service offering behind your API, not the mechanics of the API itself, should distinguish your API offering from others.
Think of the API as a Separate Product You Are Delivering It really is a separate product. It's your company's service packaged as a product for delivery to a new set of customers (other developers) with a new intent (letting third parties make your business successful and creating an ecosystem in which you are cemented as a core participant). Along these lines, there are several key things to keep in mind:
The API isn't delivered until your side (the provider) conforms to a publicly published specification. This seems so obvious stated this way, but as with other product launches, it's all too easy to make "vaporware" claims about APIs - declaring you have delivered more than what is actually there.
You may even want to go so far as to build tests for your API first, before you build a server-side implementation. Build a proof of concept client application that exercises all the functionality in the API. When that POC works, your API is complete!
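One way to picture "tests first" is a small contract check that is written against the published operations before any real server exists. Everything in this sketch is hypothetical: the notes service, its `create`/`read`/`delete` operations, and the in-memory stand-in that lets the check run today. Later, the same check would be pointed at a client for the real service.

```python
class InMemoryNotes:
    """Throwaway stand-in so the contract check can run before a server exists."""

    def __init__(self):
        self._notes = {}
        self._next_id = 1

    def create(self, text):
        note_id = self._next_id
        self._next_id += 1
        self._notes[note_id] = text
        return note_id

    def read(self, note_id):
        return self._notes[note_id]  # raises KeyError for unknown IDs

    def delete(self, note_id):
        del self._notes[note_id]


def check_contract(client):
    """Exercise every operation the (hypothetical) spec promises."""
    note_id = client.create("hello")
    assert client.read(note_id) == "hello"
    client.delete(note_id)
    try:
        client.read(note_id)
    except KeyError:
        return True
    raise AssertionError("reading a deleted note should fail")


if __name__ == "__main__":
    print("contract holds:", check_contract(InMemoryNotes()))
```

When the check passes against the real implementation for every operation in the spec, you have a reasonable signal that what you shipped matches what you published.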
Your API Users Are Your Customers and Partners. As I mentioned in "Memo to OAuth Service Providers", the mindset you should have is that the users of your API are your customers as well as your partners. This is especially true if you are entirely or largely a platform company with little or no direct contact with end users. Your API customers are likely going to bring you business development leads. They are your best sales team - if you're lucky, maybe they are your only sales team! Think of them and care for them like you would any other service user: understand their needs, maintain bidirectional communication channels and respect their sensitivity to quality (uptime, performance, etc).
Make Life for Your Developer Users EASY This is probably the most important message I have to deliver. To these nice folks (many of whom are just like you and may even themselves be providing services behind APIs), you are often one of several options and you need to make their lives easier than it is today, not harder. Here are a number of concrete steps you can take, some of which I've seen implemented, some of which I think may be novel:
Document the service and not just the API. Start at 50,000 feet and work your way down. Nothing frustrates me more as an API user than getting API documentation that starts with API methods and fields, without bothering to tell me what they do. So many times, API providers forget to describe what the service behind the API does - and that is often the most critical thing for an API client to understand. After all, an API customer is typically not writing to the network level specification, but rather orchestrating interactions with your service via calls to an API client library.
Provide open source client API libraries in all the major languages your users are likely to implement in. This one is obvious, but it's a prerequisite for success these days.
Provide a reference client application to demonstrate usage of the API. Again, another obvious one. Make sure it's well documented, and reflects any updates in the API over time. Consider the reference client part of the deliverable package when you update the API.
Make an instance of your service available for developers in a sandbox, if at all possible. Developers will do stupid things and need to hit your API to learn. Don't make them feel guilty for doing it, or punish them for messing up. Encourage them. Make it easy. Allow them to make mistakes in the privacy of a development sandbox, if at all possible.
Don't neglect error reporting, and think of it as an educational tool for the API client developer. People developing server sides of APIs often think that the appropriate solution for reporting errors is to be cryptic and expect folks to refer to specifications to understand what has happened. There's generally no reason for this sort of torture. Especially when a client developer has no insight into the server environment, it's nice to give them some extra information about what's going wrong. Unless you have a buggy server implementation, it's usually a problem on the client side - and more often than not it's a bug in comprehension of how your API is supposed to be used. Use failures and exceptions as learning moments.
Build mechanisms to give client developers extra debugging information during development. Perhaps you can provide an internal log of how an API call is being processed and the steps inside your implementation, at some general level of detail. This could be controversial (some API client developers may claim your system is not doing the right thing if they have too much insight). However, it can help them understand how the system is actually working when things are going right, so they can see why things are going wrong when they are going wrong. Also, we know that abstractions are leaky (an API is one sort of abstraction) so giving client developers a peek under the covers of the API implementation on the server side may occasionally be helpful in discovering when those leaky abstractions are causing them problems.
Build a way for developers to reset your service to a known state for development and debugging. There's really nothing more frustrating than trying to create tests or track down bugs that only occur when the API being accessed is in a certain (sometimes unknown) state. In many cases, it could be very helpful for a client developer to "push a button" to set the server side to some known state (e.g. empty out a database, populate a database with a known set of data, etc). This also helps greatly with the next bullet point.
Think about your client developers' testing issues. Think about their testability problems and how you might be able to help them, or at least not create extra problems. This is especially true with a test-first development approach that is advocated in some agile development methodologies. This sort of testability may mean providing a client API with a test harness built in (e.g. Mock Objects to stub out the actual service API). Or, it could mean the sort of "known-state setting" that I mentioned in the last bullet point, so that even unit testing of the client software can happen against the live (development) API you are providing.
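A couple of the items above (educational errors, resetting to a known state, and a test harness that stubs out the real service) can be sketched together. This is only an illustration: `SandboxService`, `ApiUsageError`, the fixture data and the method names are all invented here, and a real sandbox would sit behind your actual API rather than in memory.

```python
class ApiUsageError(Exception):
    """An error that says what went wrong and how to fix it."""

    def __init__(self, code, message, hint):
        super().__init__(f"{code}: {message} Hint: {hint}")
        self.code = code
        self.hint = hint


class SandboxService:
    """Hypothetical in-memory fake that client developers can test against."""

    # Known data set that every reset restores.
    FIXTURE = {"u1": {"name": "Test User One"}, "u2": {"name": "Test User Two"}}

    def __init__(self):
        self.reset()

    def reset(self):
        """The 'push a button' operation: return to the known state."""
        self._users = {uid: dict(rec) for uid, rec in self.FIXTURE.items()}

    def get_user(self, user_id):
        if user_id not in self._users:
            raise ApiUsageError(
                "user_not_found",
                f"No user with id {user_id!r}.",
                "The sandbox fixture contains 'u1' and 'u2'; "
                "call reset() if you deleted them.",
            )
        return self._users[user_id]

    def delete_user(self, user_id):
        self.get_user(user_id)  # reuse the descriptive error for unknown IDs
        del self._users[user_id]


if __name__ == "__main__":
    sandbox = SandboxService()
    sandbox.delete_user("u1")
    try:
        sandbox.get_user("u1")
    except ApiUsageError as err:
        print(err)
    sandbox.reset()
    print(sandbox.get_user("u1"))
```

A client developer's unit tests can call `reset()` in their setup step, so each test starts from the same data no matter what the previous test did.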
Eat Your Own Dogfood This is more than just building a proof of concept. I actually advocate building a live application that accesses your API as would any other client. This is so that you can feel the pain (and experience the joy) of your API client customers. If you are good, you'll be able to anticipate what your customers want and need before they even know it.
Monitor Usage of Your API Like any other online service, you really need to know, at a technical level, what's going on with your API service as it is getting used. Know your usage patterns - they are almost certainly going to be different than what you first guess. Knowing your service helps you maintain uptime, predict scalability issues, and be able to avoid issues that will keep your customers from being happy. You will be able to predict future growth and plan for investments in the API offering. Finally, monitoring usage will also help you in the area of security, because monitoring helps establish baseline usage volumes and patterns that can be used to detect security or other anomalies.
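As a very rough sketch of what "establishing a baseline" might mean in code: count calls per client from a log, then flag a count that sits far above the historical average. The log format, the function names and the three-standard-deviation threshold are all assumptions made for illustration, not a recommendation for a production detector.

```python
from collections import Counter
from statistics import mean, pstdev


def calls_per_client(log):
    """Count API calls per client from a list of log entries (dicts)."""
    return Counter(entry["client"] for entry in log)


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it is more than `threshold` standard deviations
    above the mean of the historical counts."""
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        return current > baseline
    return (current - baseline) / spread > threshold


if __name__ == "__main__":
    log = [{"client": "app-a"}, {"client": "app-a"}, {"client": "app-b"}]
    print(calls_per_client(log))

    hourly_history = [100, 110, 90, 105, 95]  # made-up hourly call counts
    print(is_anomalous(hourly_history, 104))
    print(is_anomalous(hourly_history, 400))
```

Even something this crude makes the point: you can only call a spike a spike if you have been recording what normal looks like.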
Build a Community of Developer Customers If you launch an API and have developers building to it, you already have a community. Recognize it, support it, and realize that they are likely investing a lot of time and exposing themselves to a certain amount of business risk by relying on your API. Give them reasons to continue the partnership and feel that the risk is well understood (if not completely managed). Let them know who you are as a company, and the sorts of investments and commitments you are making in this API, and acknowledge the fact that this is a symbiosis. There is no such thing as a service provider if there isn't a service consumer.
And here's one more thing about the community. Call it the Wachob corollary to Selfridge's law: "The API Customer is Always Right". That's not literally true, of course, but it is the mentality you need to have, especially when you are just launching. This is a make or break time - either your API users like what you have, or they will quickly walk away. You are serving the API users, not the other way around (unless you are special or have some abnormal market power). You aren't doing them a favor by providing the API - they are doing you a favor by embedding your service in their offering. Recognize.
Don't Forget the Legal Stuff Your lawyers will surely tell you this, but let me save you 15 minutes of your lawyers' time (spent lecturing you). Protect yourself. Think about disclaimers (especially with respect to warranties and service level agreements). If you want to offer those things, think about how you are exposing yourself and how you can cover the risk cost. Think about other legal issues such as copyright (and the DMCA), privacy laws, or other oddball legal issues that your customers may drag you into (online gambling?).
But also look at your legal agreements as an opportunity to express, in a positive way, commitments to your API customers that signal your appreciation of them as part of the ecosystem. One of my favorite signaling mechanisms, built into legal agreements for user-created content service offerings, is explicit language preserving intellectual property ownership on the part of the user for all content they upload/contribute/share through the service. Also, strongly worded privacy commitments, though they may raise the hackles of many lawyers, are also a great way to indicate your intentions towards your API users.
Make it Fun, Make it Personal Lastly, I think you need to make developing against your API fun and personal. Try to enable "The Thrill of the Hack". Be real, be available, be a company of people, not a company behind an API call. Do this and everything above and you'll have the best odds for developing the fanatical userbase your service API so richly deserves...
Want Some Help on This? Glad you asked. I'm available to consult to any company that would like help on the issues described in this blog post. I can be reached at firstname.lastname@example.org
Understand that many of the consumer applications of your service are driving users to your site, and in the world of composable services, your consumer application developers will often have choice. Choice means power. Recognize.
Keep the channels of communication open with developers of consumers of your service. Use blogs, email lists, IRC channels, and show up at community events that are relevant to your service (barcamps, superhappydevhouse, etc).
Be transparent about API changes - try to schedule changes ahead of time, try to keep changes limited to a regular scheduled time, be prepared to rollback changes, and be especially available immediately after API changes.
Think like a service provider and be concerned about security, availability and uptime, even if nobody is directly complaining - remember, your service availability may impact the perceived quality of the services offered by consumer applications.
Be transparent about outages - you will have more credibility with your consumers (see above about choice).
Be proactive about supporting consumer applications - know which consumer applications are accessing your service - you never know who your friends might be. I suspect OAuth will be a great channel for business development!
If you are a developer, you know what the thrill of the hack is - when you're building something, and you sit down and implement a new feature and all of a sudden, your stuff plugs into a bunch of other people's stuff and what was once a cool standalone thing is now part of an ecosystem of interoperating cool stuff. The whole becomes greater than the sum of the parts. And you, the developer, are part of it.
I think the "Thrill of the Hack" (TOTH) is a key factor in the success of technologies like RSS, tagging, and XMPP and I see it making OpenID successful even as I write this blog. I don't think I'm overstating the value of TOTH to say that the web wouldn't have happened without it.
But TOTH doesn't just happen by itself. It's enabled by "busy developer guides", robust open source development efforts, community support, hangouts for developers and curious users, and friendly easy-to-understand IPR policies (see sec 10.2.3). All of these things take deliberate effort, and yet in isolation may not seem to have any direct value for those investing time and effort. However, I think the evidence is clear that one of the best ways to enable a new open network technology is to enable TOTH and open source development around that technology.
TOTH also helps the technology move forward - developers who become hooked through TOTH go on to innovate on top of the technology to build things that were completely off the radar of the original promoters of the technology (again, think about the web here).
I'm concerned that the INames community has failed to enable TOTH. We have efforts in most of the directions (hangout, open source, a good IPR policy, a busy developer guide, community support), but they all need more work. Much of the effort on inames has focused on communicating how inames are usable to end users. But we haven't enabled developers to make INames (and even XRI, which doesn't necessarily rely on the global root directories) ubiquitous and we haven't enabled developers to go beyond what we've envisioned and come up with the really killer apps.
I've heard a lot of the frustration from folks who are interested in playing with INames, and I want you to know that we hear you, and we understand. In fact, I share that very frustration with you.
Things are definitely happening, especially in response to IIW 2006B, and I hope to highlight them here. Stay tuned...
Wes Felter points me to a paper being submitted for Sigcomm 2006: "Routing on Flat Labels". This paper presents a concept of using hierarchical Distributed Hash Tables (DHTs) for the purpose of allowing flat namespaces to be "routable" on Internet scales. That is, "pure" names (with no location or other hierarchy information) could be practically routable across a network of networks on the scale of the Internet.
In 'digital identity' terms, imagine a community deploying such a system which allows routing of messages to 'digital identities' without centralized infrastructure (e.g., a large "directory in the sky")... Generating a new identity that can participate in communications across the Internet (or other networks) could be as easy as generating a long random number as an identity and publishing metadata about it to this distributed "ROFL"-based system. If that long random number were a public key, you could have self-authenticated, globally routable, totally decentralized identifiers. Not human friendly of course, as Zooko's triangle will say...
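To give a feel for how cheap such identities could be, here is a small Python sketch. It is not the algorithm from the paper: the lookup below is the generic "first node at or after the label on a ring" rule used by ring-style DHTs, the 256-bit label size is an arbitrary choice, and hashing a public key into a label is a common variant of (not the same thing as) using the key itself as the identifier.

```python
import bisect
import hashlib
import secrets

ID_BITS = 256  # arbitrary label size chosen for this sketch


def new_identity():
    """Generate a flat identity: just a long random number."""
    return secrets.randbits(ID_BITS)


def identity_from_public_key(public_key_bytes):
    """Derive a label from public key bytes, so that whoever holds the
    matching private key can prove the label is theirs."""
    return int.from_bytes(hashlib.sha256(public_key_bytes).digest(), "big")


def responsible_node(node_ids, label):
    """Return the first node ID at or after `label` on the ring,
    wrapping around to the smallest ID at the end."""
    ring = sorted(node_ids)
    index = bisect.bisect_left(ring, label)
    return ring[index % len(ring)]


if __name__ == "__main__":
    nodes = [new_identity() for _ in range(5)]
    me = new_identity()
    print("my label:", hex(me))
    print("stored at node:", hex(responsible_node(nodes, me)))
```

Nothing in the label says where the identity lives; the ring lookup alone decides which node holds its metadata.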