Firebase data structure - is the Firefeed approach viable?

Firefeed is a very good example of what can be achieved with Firebase: a fully client-side Twitter clone. There is a page, https://firefeed.io/about.html , that explains the reasoning behind the chosen data structure, which helps in understanding the Firebase security rules.

At the end of the demo there is this piece of code:

var userid = info.id; // info is from the login() call earlier.
var sparkRef = firebase.child("sparks").push();
var sparkRefId = sparkRef.name();

// Add spark to global list.
sparkRef.set(spark);

// Add spark ID to user's list of posted sparks.
var currentUser = firebase.child("users").child(userid);
currentUser.child("sparks").child(sparkRefId).set(true);

// Add spark ID to the feed of everyone following this user.
currentUser.child("followers").once("value", function(list) {
  list.forEach(function(follower) {
    var childRef = firebase.child("users").child(follower.name());
    childRef.child("feed").child(sparkRefId).set(true);
  });
});

It shows how the writes are done so that reading back stays simple. As the page states:

When we need to display the feed for a specific user, we only need to look in one place
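To make that concrete, here is a minimal in-memory sketch of why the denormalized layout makes the feed read trivial. Plain JavaScript objects stand in for the Firebase tree, and all names and data below are hypothetical, not from Firefeed itself:

```javascript
// A tiny in-memory stand-in for the Firebase tree used by Firefeed.
var db = {
  sparks: {
    "spark1": { author: "katy", content: "Hello!" },
    "spark2": { author: "bob",  content: "Hi there" }
  },
  users: {
    "alice": {
      // Alice follows Katy and Bob, so both spark IDs were fanned
      // out into her feed at write time.
      feed: { "spark1": true, "spark2": true }
    }
  }
};

// Reading Alice's feed is one lookup plus one fetch per spark ID;
// no scanning of followed users, no joins.
function readFeed(userId) {
  var feedIds = db.users[userId].feed;
  return Object.keys(feedIds).map(function (sparkId) {
    return db.sparks[sparkId].content;
  });
}

console.log(readFeed("alice")); // [ 'Hello!', 'Hi there' ]
```

The whole cost of assembling the feed was paid at write time; the read is proportional only to the feed's own length.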

I understand that. But if we look at Twitter, some accounts have several million followers (Katy Perry tops the list with over 61 million!). What happens to this structure and this approach then? Whenever Katy posts a new tweet, 61 million write operations would be performed. Wouldn't that just kill the application? And on top of that, doesn't it consume a lot of extra space?

1 answer

With denormalized data, the only way to connect data is to write it to every place it is read from. So yes, to deliver a tweet to 61 million followers, you need 61 million writes.

You would not do this in a browser. A server would listen for child_added events for new tweets, and then a pool of workers would split the load, each processing a subset of the followers at a time. You could potentially prioritize online users so their feeds receive the entries first.
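A sketch of that fan-out in plain JavaScript, assuming a worker simply receives one batch of follower IDs at a time (the batch size, helper names, and in-memory feeds object are made up for illustration):

```javascript
// Fan a new spark ID out to followers' feeds in fixed-size batches,
// the way a pool of workers might split a 61-million-follower list.
function chunk(list, size) {
  var batches = [];
  for (var i = 0; i < list.length; i += size) {
    batches.push(list.slice(i, i + size));
  }
  return batches;
}

function fanOut(feeds, followers, sparkId, batchSize) {
  var batches = chunk(followers, batchSize);
  // In a real system each batch would go to a separate worker;
  // here we just process them sequentially.
  batches.forEach(function (batch) {
    batch.forEach(function (follower) {
      feeds[follower] = feeds[follower] || {};
      feeds[follower][sparkId] = true; // one write per follower
    });
  });
  return batches.length;
}

var feeds = {};
var followers = ["u1", "u2", "u3", "u4", "u5"];
var batchCount = fanOut(feeds, followers, "spark42", 2);
console.log(batchCount);       // 3 batches
console.log(feeds.u5.spark42); // true
```

The total number of writes is unchanged; batching only controls how the load is spread across workers and over time.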

With normalized data, you write the tweet once but pay for the join on every read. If you cache feeds to avoid hitting the database on each request, you are back to 61 million cache writes for each Katy Perry tweet. And to push a tweet out in real time, you still have to write the tweet to a socket for each online follower.
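For contrast, here is a sketch of the normalized alternative, where the join cost is paid on every read. Again, plain objects stand in for the database and all names and data are hypothetical:

```javascript
// Normalized layout: one write per tweet, but reading a feed means
// scanning every followed account's list of sparks -- a read-time join.
var db = {
  sparks: {
    "s1": { author: "katy", content: "New single!" },
    "s2": { author: "bob",  content: "Lunch" }
  },
  userSparks: {  // who wrote what
    "katy": { "s1": true },
    "bob":  { "s2": true }
  },
  following: {   // who follows whom
    "alice": { "katy": true, "bob": true }
  }
};

function readFeedNormalized(userId) {
  var followed = Object.keys(db.following[userId]);
  // One lookup per followed account, then one per spark: the cost
  // grows with the number of accounts followed, on every read.
  return followed.reduce(function (acc, author) {
    Object.keys(db.userSparks[author]).forEach(function (id) {
      acc.push(db.sparks[id].content);
    });
    return acc;
  }, []);
}

console.log(readFeedNormalized("alice")); // [ 'New single!', 'Lunch' ]
```

The trade-off is symmetric: fan-out-on-write makes tweeting expensive and reading cheap, while this layout makes tweeting cheap and every read expensive.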


Source: https://habr.com/ru/post/1209436/

