People you can follow
You can use calculations based on factors:
    factorA = getFactorA(); // say double(0.3)
    factorB = getFactorB(); // say double(0.6)
    factorC = getFactorC(); // say double(0.8)
    result = (factorA + factorB + factorC) / 3; // double(0.5666666666666667)
    // if result is more than 0.5, you show this person
So in the case of Twitter, “People you can follow” could be based on the following factors (user A is the user viewing the “People you can follow” feature; there may be more or fewer factors):
- Similarity between the frequent keywords found in user A's and user B's tweets
- Similarity between the profile descriptions of both users
- Proximity between the locations of users A and B
- Whether the people user A follows are also following user B
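To make the factor list concrete, here is a minimal Python sketch of how the four factors could be averaged into one score. Everything in it is my assumption, not Twitter's actual method: the `jaccard` overlap measure, the `follow_score` function, the dictionary layout of a user, and the equal weighting are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two collections: 0.0 (nothing shared) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def follow_score(viewer, candidate, following):
    """Average four hypothetical factors into one score between 0 and 1.

    viewer / candidate: dicts with "name", "keywords", "bio", "location".
    following: dict mapping a user name to the set of names that user follows.
    """
    keyword_factor = jaccard(viewer["keywords"], candidate["keywords"])
    bio_factor = jaccard(viewer["bio"].lower().split(),
                         candidate["bio"].lower().split())
    # Crude location factor: 1.0 for the same place, otherwise 0.0.
    location_factor = 1.0 if viewer["location"] == candidate["location"] else 0.0
    # Share of the people the viewer follows who already follow the candidate.
    followed = following.get(viewer["name"], set())
    if followed:
        hits = sum(candidate["name"] in following.get(u, set()) for u in followed)
        social_factor = hits / len(followed)
    else:
        social_factor = 0.0
    return (keyword_factor + bio_factor + location_factor + social_factor) / 4
```

You would then show user B to user A only when the score is above a threshold such as 0.5. In practice you would probably weight the factors differently rather than taking a plain average.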
So where do the candidates for "People you can follow" come from? The list probably comes from a combination of people with a large number of followers (most likely celebrities, alpha geeks, famous products / services, etc.) and the people followed by [the people that user A follows].
Basically, a certain level of data mining has to be done here: reading tweets and bios, then running the calculations. This can be done in a daily or weekly cron job, at the time of day when server load is lowest (or maybe 24/7 on a separate server).
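As a rough illustration of where the candidate list could come from, here is a small Python sketch. The function name `candidates`, the `following` and `follower_counts` dictionaries, and the `popular_threshold` value are all made up for the example; I don't know how Twitter actually builds the list.

```python
def candidates(viewer, following, follower_counts, popular_threshold=1000):
    """Collect people the viewer might want to follow.

    following: dict mapping a user to the set of users they follow.
    follower_counts: dict mapping a user to their number of followers.
    """
    already_followed = following.get(viewer, set())

    # People followed by the people the viewer follows.
    second_degree = set()
    for user in already_followed:
        second_degree |= following.get(user, set())

    # People with a large number of followers.
    popular = {u for u, n in follower_counts.items() if n >= popular_threshold}

    # Drop the viewer and anyone they already follow.
    return (second_degree | popular) - already_followed - {viewer}
```

Each candidate returned here would then be run through the factor calculation to decide whether it is actually shown.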
How you are connected
This is probably some clever work that makes you feel as if a lot of brute force went into determining the path. However, after some superficial research, I found it to be simple:
Say you are user A; User B is your connection; and user C is the connection of user B.
In order for you to visit user C, you first need to visit user B's profile. When you visit user B's profile, the website records that user A is currently on user B's profile. Therefore, when you go on to user C from user B, the website can immediately tell you “User A → User B → User C”, ignoring all other possible paths.
User C is the maximum depth: user A cannot go on to browse user C's connections until user C connects with user A.
Source: my own observation of LinkedIn
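The idea of remembering the trail instead of searching the whole graph can be sketched like this in Python. This is only my guess at the mechanism based on the observation above; the `PathTracker` class and its fields are hypothetical and not anything LinkedIn has published.

```python
class PathTracker:
    """Remembers the path each viewer walked, instead of searching the graph."""

    def __init__(self, connections):
        self.connections = connections  # user -> set of direct connections
        self.last_path = {}             # viewer -> path of the previous visit

    def visit(self, viewer, profile):
        previous = self.last_path.get(viewer)
        if profile in self.connections.get(viewer, set()):
            # Direct connection: the path is trivially viewer -> profile.
            path = [viewer, profile]
        elif (previous and len(previous) == 2
              and profile in self.connections.get(previous[-1], set())):
            # Reached through the connection visited just before.
            path = previous + [profile]
        else:
            # Deeper than two hops, or no recorded trail: no path is shown.
            path = None
        self.last_path[viewer] = path
        return path
```

With connections A-B, B-C and C-D, visiting B and then C as user A yields the path A, B, C, while trying to continue to D yields no path, which matches the depth limit described above.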
People like you
This is the same as #1 (People you can follow), except that the algorithm reads from a different list of people. The list it reads is the people you follow.
Did you mean
You have it right there, except that Google probably uses more than just Soundex. Language translation, word substitution and many other algorithms are involved in Google's case. I can't comment much on this because it is likely to be very complicated and I am not a specialist in language processing.
If we look a bit further into Google's infrastructure, we may find that Google has servers dedicated to spelling and translation services. You can learn more about the Google platform at http://en.wikipedia.org/wiki/Google_platform.
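Since Soundex came up, here is a small Python sketch of the classic American Soundex code, just to show the kind of building block involved. It is certainly not what Google runs; the function name `soundex` and the `CODES` table are my own naming.

```python
# Map each consonant group to its Soundex digit.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word):
    """Return the 4-character American Soundex code of a word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        # Emit a digit only when it differs from the previous code.
        if code and code != previous:
            result += code
        # H and W do not separate equal codes; vowels do.
        if c not in "HW":
            previous = code
    # Pad with zeros and cut to one letter plus three digits.
    return (result + "000")[:4]
```

Two words that sound alike get the same code, for example `soundex("Robert")` and `soundex("Rupert")` both give `R163`, so a misspelled query can be matched against known words with the same code.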
Conclusion
The key to computation-heavy algorithms is caching. Once the result is cached, you do not need to recompute it on every page load. Google does it, Stack Overflow does it (on most pages with a list of questions), and unsurprisingly Twitter does too!
Algorithms are ultimately defined by developers. You can use other people's algorithms, but in the end, you can also create your own.
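A minimal sketch of that caching idea in Python could look like the following. The `TTLCache` class, its `get_or_compute` method and the expiry time are hypothetical names of mine; a real site would more likely use something like memcached, but the principle is the same.

```python
import time


class TTLCache:
    """Keeps computed results for a limited time so pages can reuse them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the expiry can be tested
        self.store = {}      # key -> (time stored, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]          # still fresh: skip the heavy work
        value = compute()            # missing or expired: recompute
        self.store[key] = (now, value)
        return value
```

For example, `cache.get_or_compute("suggestions:userA", lambda: expensive_suggestions("userA"))` would run the expensive calculation (here a made-up `expensive_suggestions` function) at most once per expiry period, no matter how many times the page is loaded.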