How to design a SQL Server database for better scalability in a specific scenario?

Imagine a movie application that picks the next movie to suggest to a user based on this very simple algorithm:

  • The film must be new to the user
  • The user did not mark the film as "not interested"

This is a simple SQL Server database design:

Movies:
    Id bigint
    Name nvarchar(100)

SeenMovies:
    Id bigint
    UserId bigint
    MovieId bigint

NotInterestedFlags:
    Id bigint
    UserId bigint
    MovieId bigint
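
For concreteness, the design above could be declared roughly as follows. This is only a sketch: the constraint names, the NOT NULL settings, and the foreign keys to Movies are assumptions, not part of the original design.

```sql
-- Hypothetical DDL for the three tables listed above.
-- Constraint names, nullability, and foreign keys are assumptions.
CREATE TABLE Movies
(
    Id   bigint        NOT NULL CONSTRAINT PK_Movies PRIMARY KEY,
    Name nvarchar(100) NOT NULL
);

CREATE TABLE SeenMovies
(
    Id      bigint NOT NULL CONSTRAINT PK_SeenMovies PRIMARY KEY,
    UserId  bigint NOT NULL,
    MovieId bigint NOT NULL
        CONSTRAINT FK_SeenMovies_Movies REFERENCES Movies (Id)
);

CREATE TABLE NotInterestedFlags
(
    Id      bigint NOT NULL CONSTRAINT PK_NotInterestedFlags PRIMARY KEY,
    UserId  bigint NOT NULL,
    MovieId bigint NOT NULL
        CONSTRAINT FK_NotInterestedFlags_Movies REFERENCES Movies (Id)
);
```

Declaring MovieId as NOT NULL matters for any query that filters with NOT IN: if a subquery returns even one NULL, NOT IN matches no rows at all.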

To get the next movie, we run this query:

select top 1 *
from Movies 
where Id not in 
(
    select MovieId 
    from SeenMovies 
    where UserId = 89283
)
and Id not in 
(
    select MovieId 
    from NotInterestedFlags
    where UserId = 89283
)

This design gets slower and slower as the application is used more and the data grows. So, with a hypothetical database of 100 thousand films and more than 10 million customers, how should this design change so that it scales horizontally?

+4


Make sure the SeenMovies and NotInterestedFlags tables are indexed, with UserId as the leading column and MovieId as the second.
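
As a rough illustration of indexing SeenMovies and NotInterestedFlags by UserId and MovieId, something like the following could be tried. The index names are hypothetical, and whether the indexes should be unique or clustered depends on details the question does not give.

```sql
-- Hypothetical composite indexes; names are illustrative only.
-- UserId leads so that all rows for one user can be found with a single seek;
-- MovieId follows so the lookup is covered without touching the base table.
CREATE NONCLUSTERED INDEX IX_SeenMovies_UserId_MovieId
    ON SeenMovies (UserId, MovieId);

CREATE NONCLUSTERED INDEX IX_NotInterestedFlags_UserId_MovieId
    ON NotInterestedFlags (UserId, MovieId);
```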

As for the query itself, one option is to filter against SeenMovies and NotInterestedFlags with NOT EXISTS:

SELECT TOP 1
    Movies.*

FROM
    Users

CROSS JOIN
    Movies

WHERE 
    NOT EXISTS
    (
        SELECT NULL
        FROM SeenMovies
        WHERE 
            SeenMovies.UserId = Users.Id
            AND
            SeenMovies.MovieId = Movies.Id 
    )
    AND 
    NOT EXISTS
    (
        SELECT NULL
        FROM NotInterestedFlags
        WHERE 
            NotInterestedFlags.UserId = Users.Id
            AND
            NotInterestedFlags.MovieId = Movies.Id 
    )
    AND
    Users.Id = 89283

Alternatively, UNION the MovieId values from SeenMovies and NotInterestedFlags for the given UserId, then use EXCEPT to remove them from the full set of movies.
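
A minimal sketch of the UNION / EXCEPT variant, assuming the table and column names from the question. The @UserId variable and the ORDER BY are additions made here for illustration; without an ORDER BY, TOP returns an arbitrary qualifying row.

```sql
-- Sketch only: @UserId and the ORDER BY are assumptions added for illustration.
DECLARE @UserId bigint = 89283;

SELECT TOP (1) m.*
FROM Movies AS m
WHERE m.Id IN
(
    -- All movie ids ...
    SELECT Id FROM Movies
    EXCEPT
    (
        -- ... minus everything this user has seen or rejected.
        SELECT MovieId FROM SeenMovies WHERE UserId = @UserId
        UNION
        SELECT MovieId FROM NotInterestedFlags WHERE UserId = @UserId
    )
)
ORDER BY m.Id;
```

Whether this performs better than the NOT EXISTS form would have to be checked against the actual execution plans.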



Source: https://habr.com/ru/post/1679249/

