The reality is that your site is nearly always going to be bottlenecked on the database anyway, though. Network latency generally dwarfs the cost of running your code on the database server, so executing everything that requires transactional consistency in a single round trip to the database is usually the most scalable approach. Centralization also tends to yield the best performance for both latency and throughput, assuming you can find reasonable partitions by datacenter. The exceptions are when you can find a synchronization-free execution or are willing to tolerate inconsistency (and even then you're usually doing it for availability, not performance, though if you're Facebook or an ad server, sometimes for latency too).
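A back-of-the-envelope model of why round trips dominate. All the numbers here are made-up assumptions for illustration, as are the function names; the point is just that N client-driven statements pay N network round trips, while the same work pushed server-side (a stored procedure or multi-statement batch) pays one:

```python
RTT_MS = 0.5            # assumed intra-datacenter network round-trip time
PER_STMT_CPU_MS = 0.05  # assumed server-side CPU cost per statement

def client_side(statements: int) -> float:
    """Each statement issued from the app pays a full network round trip."""
    return statements * (RTT_MS + PER_STMT_CPU_MS)

def server_side(statements: int) -> float:
    """One round trip; all statements execute on the database server."""
    return RTT_MS + statements * PER_STMT_CPU_MS

for n in (1, 5, 20):
    print(f"{n:>2} stmts: client {client_side(n):.2f} ms, server {server_side(n):.2f} ms")
```

With these (invented) constants, twenty statements cost 11 ms driven from the client but barely 1.5 ms server-side; the CPU term only matters once the per-statement work is comparable to a round trip, which for most web workloads it isn't.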
Moreover, while contemporary databases and schedulers don't usually take advantage of this, research databases show that knowing details about the transactions that are going to execute allows dramatic throughput improvements over ad-hoc queries, for a variety of reasons: finding a synchronization-free execution that preserves certain invariants, optimizing for data locality, trading latency for throughput with group commit... the list goes on.
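To make the latency-for-throughput trade concrete, here's a toy group-commit sketch (illustrative only, not any real database's implementation; the class and counter names are invented). Committing transactions enqueue their log records and a single writer thread makes a whole batch durable with one simulated fsync, so each transaction waits slightly longer but the expensive sync is amortized across the batch:

```python
import queue
import threading
import time

class GroupCommitLog:
    def __init__(self, batch_window_s=0.005):
        self.q = queue.Queue()
        self.window = batch_window_s
        self.fsyncs = 0  # how many (simulated) fsyncs we actually issued
        threading.Thread(target=self._writer, daemon=True).start()

    def commit(self, record):
        """Block until the batch containing `record` is durable."""
        done = threading.Event()
        self.q.put((record, done))
        done.wait()

    def _writer(self):
        while True:
            batch = [self.q.get()]  # block until at least one record arrives
            deadline = time.monotonic() + self.window
            while True:  # gather everything that arrives within the window
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.q.get(timeout=remaining))
                except queue.Empty:
                    break
            self.fsyncs += 1  # one sync covers every record in the batch
            for _, done in batch:
                done.set()
```

With ten threads committing concurrently and a 5 ms window, you'd typically see far fewer than ten fsyncs; each committer pays up to one window of extra latency in exchange.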
My personal experience has been that for the vast majority of websites, there simply isn't enough computational work going on over the data to outweigh these benefits.
All that being said: I still hate stored procedures with a fiery passion for other reasons (mostly that they often fail to integrate nicely into the rest of the application's deployment mechanisms, especially if you're using a hosted database solution which prevents custom extensions) and try not to let them anywhere near a database unless there are performance reasons to do so.
Food for thought. But what you are describing sounds more performance-oriented than scalability-oriented. Are the speed gains from extra optimization enough to make up for putting more eggs in one basket?
Scalability is just a means to an end. Most often, that end is better performance. So it's pretty relevant if a non-scalable solution can deliver vastly better performance in practice than a more scalable one (and this is relatively common, especially when you start taking into account things like developer cost, hardware cost, and available network bandwidth).
Historically, that was true, but recent versions also allow ad-hoc queries. That said, VoltDB is no longer the state of the art in serializable OLTP performance (it may still be in the commercial space, but I think HyPer, for example, outperforms it on a lot of benchmarks).
(BTW, just to show I'm not talking out of my ass, here's the latest and greatest in serializable performance within a datacenter (aka centralized): http://research.microsoft.com/pubs/255848/SOSP15-final227.pd.... To quote:
"FaRM achieves a peak throughput of 140 million TATP transactions per second on 90 machines with a 4.9 TB database, and it recovers from a failure in less than 50 ms." It also performs 4.5 million TPC-C New Order transactions per second with a 1.9 ms 99% latency [this includes 10% remote partition transactions]).