I think most people are missing the point about Google App Engine.
It's the Commoditization of Software Architecture and Scaling Skills
Google App Engine is Google's attempt to democratize the scaling of web applications. Put another way, they're trying to commoditize the hard-to-find skills and experience needed for building massively scalable web apps. Ask any startup and they can find any number of web developers who've built a web 2.0 app. But how many developers or architects have the experience and the mindset to build applications that support hundreds of thousands of concurrent users with five-nines uptime?
Scaling Big is Really Hard
The web industry has mostly moved to a horizontal scaling model - but this is not a model that most web application developers have experience with. Horizontal scaling means, basically, that instead of using more capable hardware, you use more instances of less-capable hardware, each handling a slice of the work, but doing the same function (e.g. sliced between groups of users). The intent, as much as is possible, is to reduce centralization of resources - the ultimate goal is to simply be able to add more instances of the hardware without limit to meet increased scale requirements.
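To make the "slice of the work" idea concrete, here is a minimal sketch of slicing users across instances. Everything in it is my own illustration, not anything from App Engine: `shard_for_user` and `Cluster` are hypothetical names, and plain dictionaries stand in for the separate machines.

```python
import hashlib


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user to one shard using a stable hash of the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Cluster:
    """Hypothetical cluster: each dict stands in for one less-capable machine."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, user_id: str, profile: dict) -> None:
        shard = self.shards[shard_for_user(user_id, len(self.shards))]
        shard[user_id] = profile

    def get(self, user_id: str):
        shard = self.shards[shard_for_user(user_id, len(self.shards))]
        return shard.get(user_id)


cluster = Cluster(num_shards=4)
cluster.put("alice", {"plan": "free"})
cluster.put("bob", {"plan": "pro"})
print(cluster.get("alice"))
```

Every instance runs the same code and only the data is divided. One caveat with this simple modulo scheme: changing the number of shards remaps most users, which is why real systems tend to use something like consistent hashing instead.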
Without getting into too much detail, this stuff is hard. It's not how we're used to thinking about applications - instead of application flow, we have to think about asynchronous interactions with application components. Instead of thinking about centralized data stores, we have to think about data clouds. For the vast majority of web application developers, it's black magic.
Google AppEngine Makes It Easy by Imposing a New Architecture
Here's why Google App Engine is important, at least in intent. If you build your app on the Google App Engine architecture, it will scale to unlimited levels without any extra effort. Full stop.
You won't have to worry about things like replicating state, scaling datastores, building caches, etc. You won't have to hire as many really smart systems architects to bend your applications around the constraints imposed by a requirement of unbounded scaling.
As a developer, you'll have to learn a new way of thinking about building your apps. Most unnerving, you'll have to unlearn some deeply held principles about "efficiency" and scalability.
Here's a great example: your instincts as a developer are to keep as much state (e.g. web sessions) in memory between requests as possible. With App Engine, however, you'll learn to accept a certain fixed (i.e. invariant with respect to scale) latency of accessing BigTable (Google's 'data cloud') in exchange for never having to worry about any added latency in handling 100,000 (or a million) concurrent users. As a software architect, I'll take the fixed hit for the benefit of infinite scaling.
In other words, I could probably make each request "faster" with one concurrent user with traditional web app architectures. However, at a large scale (say 10,000 concurrent users), requests on the Google App Engine architecture are going to run just as quickly, while a traditional application may run 100x slower, unless I were to invest major effort to build a scalable architecture for it (using expensive software systems architects, of course).
Why BigTable Is So Important
In this section, I'm going to make a few points about why the data architecture Google has picked forces the developer to make a scalable app:
- BigTable, the "data objects in the cloud" technology which undergirds Google's massive applications, has the magic property of being essentially infinitely scalable with respect to the amount of data, and the amount of transaction activity. It is essentially the horizontal "partitioning" or "sharding" of data taken to the extreme.
- Most people using frameworks like Ruby on Rails (or O/R systems like Hibernate) are using relational databases (like MySQL) as object databases, and not leveraging most of the relational features. They are paying a big cost for relational functionality without real need. Google App Engine acknowledges this fact, and provides the true object interface that most application developers are using anyway.
- Google App Engine forces you to be explicit about data indexes, but this is something you have to do anyway when horizontally scaling traditional databases. In traditional web application architectures, scalability almost always involves partitioning data among several database instances. The moment you partition data among multiple SQL stores, you have to think about indexes, because searches across those stores require you to perform some sort of scatter/gather (in functional programming and Google parlance, "MapReduce") - and if you aren't careful about how you build indexes, you'll end up with incredible inefficiencies (like having to merge large data sets from multiple stores in memory).
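The scatter/gather point in the last bullet is easier to see in code. This is a rough sketch under my own assumptions: each shard already keeps an index sorted by score, represented here as a sorted list of `(score, user_id)` tuples, and the function names are hypothetical. Because every shard can answer "give me your first N" straight from its index, the gather step only ever merges N rows per shard, not whole data sets.

```python
import heapq
from itertools import islice


def shard_top(shard_index, limit):
    """Return the first `limit` rows from one shard's pre-sorted index."""
    return shard_index[:limit]


def scatter_gather_top(shards, limit):
    """Ask every shard for its best rows (scatter), then merge them (gather)."""
    partials = [shard_top(index, limit) for index in shards]
    # heapq.merge lazily merges already-sorted inputs, so we stop after `limit` rows.
    return list(islice(heapq.merge(*partials), limit))


shards = [
    [(1, "a"), (5, "b"), (9, "c")],
    [(2, "d"), (3, "e")],
    [(4, "f"), (8, "g")],
]
print(scatter_gather_top(shards, 4))
```

Without the per-shard index, each shard would have to ship back (or sort) everything it holds, which is exactly the kind of in-memory merge of large data sets the bullet warns about.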
Dragging Developers into Event Think
One of the most painful things developers are going to experience with AppEngine is the concept that their app must be built entirely from event handlers. In some senses, an AppEngine app looks just like a traditional web app, answering HTTP requests, and rendering HTTP responses. But the subtle difference (and what makes it all scalable) is that each request handler must be entirely stateless. This means no web session state on the server side - no data specific to this user can be stored in memory between requests. This is good because it allows requests to be routed to any server in the world that has the code loaded (and can access BigTable) - this is what enables infinite scalability. But because of this subtle little change, a developer may have to rethink what they are doing in each request.
When scaling big, you've got to think minimal. Do your processing, modify your data store, fire off other behavior (other events), and return. You don't worry about threading (mostly), you don't worry about hanging the server (there is no "the server"), you don't worry about resources beyond those which you consume in handling the immediate request. It's really not that different from the PHP model...
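Here is roughly what that "process, store, fire, return" shape looks like. This is a sketch, not App Engine's actual API: the handler name is made up, a plain dict stands in for the datastore, and a plain list stands in for whatever queues follow-up events.

```python
def handle_post_comment(request, datastore, event_queue):
    """Stateless handler: everything it needs arrives as arguments."""
    key = "comments:" + request["post_id"]

    # Do the processing and modify the data store.
    comments = list(datastore.get(key, []))
    comments.append({"author": request["user_id"], "text": request["text"]})
    datastore[key] = comments

    # Fire off other behavior as a separate event; don't do it inline.
    event_queue.append({"type": "notify_author", "post_id": request["post_id"]})

    # Return. Nothing about this user stays in memory afterwards.
    return {"status": 200, "comment_count": len(comments)}


datastore = {}    # stand-in for the shared data store
event_queue = []  # stand-in for an event queue

handle_post_comment(
    {"post_id": "42", "user_id": "alice", "text": "Nice post"}, datastore, event_queue
)
response = handle_post_comment(
    {"post_id": "42", "user_id": "bob", "text": "Agreed"}, datastore, event_queue
)
print(response)
```

Since the function keeps nothing between calls, the two requests above could just as well have been handled by two different machines, as long as both can reach the same data store.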
(In fact, if an application framework has a way of automatically storing web session data to BigTable, then the app can probably be deployed as before - the "event-ness" of the app is hidden from the developer by simulating state in BigTable. I haven't looked in detail at whether this is practical in App Engine or not, though.)
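As a thought experiment, simulating sessions on top of a data store might look something like the sketch below. To be clear about the assumptions: `with_session` is a hypothetical helper of my own, a plain dict stands in for BigTable, and I'm not claiming any framework on App Engine actually works this way.

```python
import uuid


def with_session(store):
    """Wrap a handler so its session is loaded from and saved to `store`."""

    def decorator(handler):
        def wrapped(request):
            # Reuse the caller's session ID, or mint a new one.
            session_id = request.get("session_id") or uuid.uuid4().hex
            session = dict(store.get(session_id, {}))
            response = handler(request, session)
            # Persist the session so any server can pick it up next time.
            store[session_id] = session
            response["session_id"] = session_id
            return response

        return wrapped

    return decorator


store = {}  # stand-in for the shared data store


@with_session(store)
def count_visits(request, session):
    session["visits"] = session.get("visits", 0) + 1
    return {"visits": session["visits"]}


first = count_visits({})
second = count_visits({"session_id": first["session_id"]})
print(first["visits"], second["visits"])
```

The handler body reads as if the session lived in memory, but each request pays one read and one write against the store, which is the fixed latency trade-off described earlier.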
And there's a deep structure here - underneath all the abstractions, Internet interactions are really just a series of events. An HTTP request comes in. Event. A datastore request is made. Event. An XMPP message comes in. Event. In fact, from this point of view, there's no reason why other protocols couldn't be supported in the App Engine architecture - while they have deployed only HTTP connectors and routers, there's no reason why Google couldn't deploy other connectors for other protocols like XMPP.
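If you squint, the "everything is an event" view reduces to a dispatch table. The sketch below is purely illustrative: the event shapes, the `on` decorator, and both handlers are hypothetical, and this says nothing about how Google actually routes requests.

```python
HANDLERS = {}


def on(event_type):
    """Register a function as the handler for one type of event."""

    def register(func):
        HANDLERS[event_type] = func
        return func

    return register


@on("http_request")
def handle_http(event):
    return "HTTP 200 for " + event["path"]


@on("xmpp_message")
def handle_xmpp(event):
    return "reply to " + event["from"]


def dispatch(event):
    """A protocol connector only needs to turn traffic into events like these."""
    return HANDLERS[event["type"]](event)


print(dispatch({"type": "http_request", "path": "/"}))
print(dispatch({"type": "xmpp_message", "from": "alice@example.com"}))
```

Supporting a new protocol then means adding a connector that produces events and a handler that consumes them. The application model itself doesn't change.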
And the Moral Is
The moral of the story is that Google App Engine will be remembered primarily for ushering in the era of uber-scalability for the masses. It's not about free hosting or lock-in (though it arguably is), it's not about competing with AWS (though it will), and it's not even about Python (though it will certainly bring Python more attention). It's about giving out the secrets to the Google kingdom. I predict that we'll see an open source suite that allows any individual or organization to deploy a system architecture that roughly approximates the scalability features of AppEngine. In fact, the pieces are probably already out there.
Google is pushing the community of web developers to rethink how they build web applications. Others will follow and innovate, but Google will get credit for the first big push. That's what AppEngine's role in history will be.