IDEA: it could be really interesting to build an intelligent DNS server that answers based on the location of the request. From the IP address, we can easily find this information through Whois or other services on the web. Doing this adds a layer of complexity to DNS, but it lets us easily control where a user is sent. To keep things fast, we probably also need a way to talk to IP-geolocation services to know exactly where a request comes from. Is it a good idea to add more complexity to DNS, an already complex service? Can it solve the problem of people who can't manage BGP? It is a kind of big quick-and-dirty fix. BGP is complex and requires a lot of knowledge, so this could be a good way to avoid BGP in "simple" setups. To be clear: DNS is used by everybody around the world, while BGP is only used by a few people (and it is actually a kind of hidden service).
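A minimal sketch of the idea in Python, assuming a hand-made prefix-to-region table (a real setup would query a GeoIP database or Whois instead) and made-up server addresses from the documentation ranges:

```python
import ipaddress

# Illustrative prefix-to-region table; a real deployment would look the
# client IP up in a GeoIP database or Whois, not a hard-coded mapping.
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "eu",
    ipaddress.ip_network("198.51.100.0/24"): "us",
}

# One answer per region: the address of the closest web server (made up).
ANSWER_BY_REGION = {
    "eu": "203.0.113.10",
    "us": "203.0.113.20",
}
DEFAULT_ANSWER = "203.0.113.30"

def resolve(client_ip: str) -> str:
    """Return the A record to serve, based on where the request comes from."""
    addr = ipaddress.ip_address(client_ip)
    for prefix, region in REGION_BY_PREFIX.items():
        if addr in prefix:
            return ANSWER_BY_REGION[region]
    return DEFAULT_ANSWER
```

The whole trick is that the lookup table, not routing, decides which server a user reaches; everything else is ordinary DNS.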

Dynamic content: spotting the premature optimization is hard, and in practice few people (arguably no one) can spot it easily. Don't do work you don't have to. Don't pay to generate the same content twice; generate only what changed. Break the system into two parts, one that changes rapidly and one that changes infrequently; this lets you isolate the costs.
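A toy sketch of that split in Python: the expensive, rarely-changing page shell is built once and reused, while only the cheap dynamic fragment is generated per request (function names and the page layout are invented for illustration):

```python
import time

# The shell (layout, navigation, footer) changes infrequently, so we
# build it once and cache it; only the fragment is generated per request.
_shell_cache = None

def render_shell():
    """Expensive part, paid once instead of on every request."""
    global _shell_cache
    if _shell_cache is None:
        _shell_cache = "<html><body>{body}</body></html>"
    return _shell_cache

def render_page(user):
    # Cheap part: regenerate only what actually changed.
    fragment = f"Hello {user}, it is {time.strftime('%H:%M')}"
    return render_shell().format(body=fragment)
```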

Caching (see memcached) is a really interesting thing: you can use it to optimize database usage but also the volume of data served to users. This moves the bottleneck to the web server (Apache in our case), but it is easy to deploy many web servers. [NOTE: all examples are written for Apache/PHP]

Databases scale vertically; if you fragment the data, you throw away the relational constraints. If you don't need relationships between data, you can use alternatives like flat files, CouchDB, or cookies. These stores are easier to scale. Do only what is absolutely necessary in the database. You build the thing, you see the pain points, and you correct them afterward. This is the best way to create something that works.
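For data with no relationships, even a directory of JSON files works as a store, and it shards trivially by key. A minimal sketch (the temp-directory location and the save/load helpers are invented for illustration):

```python
import json
import os
import tempfile

# Illustrative: a throwaway directory; a real setup would use a fixed,
# possibly sharded, path per key range.
DATA_DIR = tempfile.mkdtemp()

def save(key, value):
    """One record per file: no schema, no relations, no locking across keys."""
    with open(os.path.join(DATA_DIR, f"{key}.json"), "w") as f:
        json.dump(value, f)

def load(key):
    with open(os.path.join(DATA_DIR, f"{key}.json")) as f:
        return json.load(f)
```

Because each key lives in its own file, spreading the data over several machines is just a matter of deciding which machine owns which keys.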

The network is part of the architecture; people tend to forget it. Firewall state and load-balancing algorithms are important. You must understand the bottleneck. To manage it you can use routing, with algorithms like OSPF (source-based/hash routing). That adds fault tolerance, distributes the network load, and it's free. To be clear, the same technique protects against DDoS attacks.
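The core of source-hash routing can be shown in a few lines: hash the source address to pick a path, so the same client always takes the same path while load spreads across all of them. A Python sketch with made-up backend addresses:

```python
import hashlib

# Made-up backend addresses; in a router this would be a set of
# equal-cost next hops.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def pick_backend(source_ip: str) -> str:
    """Deterministically map a source IP to one backend (source-hash routing)."""
    digest = hashlib.sha1(source_ip.encode()).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]
```

Determinism is the point: no shared state is needed between routers, and if one backend dies, only the sources hashed to it need to be remapped.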

Service decoupling is the most overlooked technique for building scalable systems. Break down the user transaction into parts, isolate the asynchronous ones, queue the information needed to complete the task, and process the queues (see Erlang). It's called messaging; you can use JMS (https://en.wikipedia.org/wiki/Java_Message_Service), Spread (http://www.spread.org/) or AMQP (https://www.amqp.org/). The typical use-case requires a message queue and a job dispatcher.
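A small in-process sketch of the pattern, using Python's standard `queue` and `threading` modules in place of a real broker like AMQP (the signup/mail scenario is invented for illustration):

```python
import queue
import threading

jobs = queue.Queue()  # the message queue
done = []             # record of completed work, for demonstration

def worker():
    """The job dispatcher: drains the queue asynchronously."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut the worker down
            break
        done.append(f"sent mail to {job}")  # the slow part, off the request path
        jobs.task_done()

def handle_signup(email):
    # The user-facing part of the transaction: just enqueue and return.
    jobs.put(email)

t = threading.Thread(target=worker)
t.start()
handle_signup("a@example.com")
handle_signup("b@example.com")
jobs.join()      # wait for the queue to drain
jobs.put(None)   # stop the worker
t.join()
```

The user transaction finishes as soon as the message is enqueued; the slow work happens behind the queue, which is exactly what the broker-based setups (JMS, Spread, AMQP) give you across machines.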

Scaling is hard, performance is easier. Extremely high performance systems tend to be easier to scale because they don't have to scale as much!