Why do websites that look simple still need top experts to build them?
This comes from a familiar question: "Why do many websites that do not look very complex, such as Facebook and Taobao, need a lot of top experts to develop?"
Zi Liu, odd-jobs coder at Taobao
Let me take Taobao as an example and give newcomers a popular-science explanation.
Here are the most important parts of the pages you see:
[Search for goods] – If you have a few thousand products, a plain SELECT query is all you need for this feature. But when you have 10,000,000,000 (ten billion) products, no single database can hold them, so how do you search? You need a distributed data storage scheme, and the search cannot pull data straight from the database; it has to go through a search engine (put simply, a search engine is faster). So once the products can be found, is the job done? Far from it. Whose products appear on the first page? That takes an enormously complex ranking algorithm. And if you add personalized recommendations based on your buying behavior, that is enough to keep a team of algorithm engineers busy for a lifetime.
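To make the gap between scanning a table and asking a search engine more concrete, here is a minimal Python sketch. It is not Taobao's implementation: the names `linear_search` and `InvertedIndex` and the three sample products are invented for illustration, and a real engine would also shard the index across machines and rank the results.

```python
from collections import defaultdict


def linear_search(products, term):
    """Check every title, the way a table scan does: cost grows with catalog size."""
    term = term.lower()
    return [pid for pid, title in products.items() if term in title.lower().split()]


class InvertedIndex:
    """Maps each word to the set of product ids whose title contains it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, pid, title):
        for word in title.lower().split():
            self.postings[word].add(pid)

    def search(self, query):
        """Return ids of products whose title contains every word of the query."""
        words = query.lower().split()
        if not words:
            return set()
        # Start from the first word's postings and intersect with the rest.
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


# Hypothetical sample catalog.
products = {1: "Red cotton shirt", 2: "Blue cotton socks", 3: "Red wool hat"}

index = InvertedIndex()
for pid, title in products.items():
    index.add(pid, title)

print(linear_search(products, "cotton"))
print(index.search("red cotton"))
```

The scan touches every product on every query, while the index only touches the products listed under the query words. That is the sense in which a search engine is "faster", at the price of building and updating the index whenever products change.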
[Product details] – Once the search is done and you see something you like, you click through to the product page. It shows the product's attributes, detailed description, reviews, seller information and so on. This page is displayed more than 3 billion times a day. If you run a website with 10 visitors a day you will not feel the slightest pressure on the server, but at 3 billion there are far more problems to solve. First of all, the requests cannot hit the database directly: any database, single-machine or distributed, would collapse completely under 3 billion requests a day. The technique to use here is a large-scale distributed cache; all seller information, review information and product descriptions are served from the cache. To take it to the extreme, consider the product's view count, which has to be refreshed every time the page is opened. Can you guess whether even that can come from the cache? Taobao did it: the entire product detail page is served from the cache.
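One common way to keep both reads and a constantly changing counter away from the database is to read through a cache and write the counter back in batches. The Python sketch below illustrates that idea only; the article does not say how Taobao actually does it. The class `ProductDetailService`, its fields and the sample product are hypothetical, and plain dictionaries stand in for the database and the distributed cache.

```python
class ProductDetailService:
    """Read-through cache with view counts buffered and flushed in batches."""

    def __init__(self, database):
        self.database = database   # stand-in for the slow backing store
        self.cache = {}            # stand-in for a distributed cache
        self.pending_views = {}    # views counted but not yet written back
        self.db_reads = 0          # how often the database was actually read

    def get_detail(self, pid):
        # Only a cache miss reaches the database.
        if pid not in self.cache:
            self.db_reads += 1
            self.cache[pid] = dict(self.database[pid])
        # Count the view in memory instead of issuing a database write.
        self.pending_views[pid] = self.pending_views.get(pid, 0) + 1
        detail = dict(self.cache[pid])
        detail["views"] += self.pending_views[pid]
        return detail

    def flush_views(self):
        """Write buffered view counts back in one batch (e.g. on a timer)."""
        for pid, count in self.pending_views.items():
            self.database[pid]["views"] += count
            if pid in self.cache:
                self.cache[pid]["views"] += count
        self.pending_views.clear()


# Hypothetical sample data.
database = {42: {"title": "Red cotton shirt", "views": 100}}
service = ProductDetailService(database)

for _ in range(3):
    latest = service.get_detail(42)

print(latest["views"], service.db_reads)
service.flush_views()
print(database[42]["views"])
```

Three page views cost one database read and, after the flush, one database write. The trade-off in this sketch is that views counted since the last flush live only in memory, so a crash would lose them.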
[Product images] – A product has 5 pictures, and the product description contains more. Guess how many pictures Taobao has to store? More than 10 billion. If that many pictures sat on your hard drive, how would you find one of them? If a classmate wanted to copy your pictures, how many hard drives would he need to bring? How much bandwidth would you need? Could your network cards take the load? How long would the copy take? Unfortunately, at this scale the market offers no commercial solution, so in the end we had to develop our own storage system. If you have heard of Google's GFS, ours is similar and is called TFS. By the way, Tencent has a system like this too
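The questions above can be turned into rough arithmetic. Only the image count comes from the text; the average image size, the drive capacity and the link speed in this Python sketch are assumptions picked for illustration, so treat the results as an order-of-magnitude estimate rather than Taobao's real figures.

```python
IMAGE_COUNT = 10_000_000_000    # from the text: more than 10 billion images
AVG_IMAGE_BYTES = 100_000       # assumption: about 100 KB per image
DRIVE_BYTES = 4 * 10**12        # assumption: 4 TB per hard drive
LINK_BITS_PER_SECOND = 10**9    # assumption: a fully used 1 Gbit/s link


def storage_estimate(image_count, avg_bytes, drive_bytes, link_bps):
    """Return (total bytes, drives needed, days to copy over one link)."""
    total_bytes = image_count * avg_bytes
    drives = -(-total_bytes // drive_bytes)   # ceiling division
    seconds = total_bytes * 8 / link_bps
    return total_bytes, drives, seconds / 86_400


total, drives, days = storage_estimate(
    IMAGE_COUNT, AVG_IMAGE_BYTES, DRIVE_BYTES, LINK_BITS_PER_SECOND
)
print(f"{total / 10**15:.1f} PB, {drives} drives, {days:.0f} days to copy")
```

Under these assumptions the images take about a petabyte, fill a few hundred drives, and need roughly three months to copy over a single gigabit link, before counting replicas or backups.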