Stack Overflow Demo Database Column Store Edition: Exploring Relationships

Special Relativity


To get you started exploring the Stack Overflow column store database, here are some queries that show how tables are related.

The two main relationships are User Id, and Post Id.

User Id

/*User Id*/
SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Badges AS b
    ON b.UserId = u.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Comments AS c
    ON c.UserId = u.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Votes AS v
    ON v.UserId = u.Id;

Post Id

/*Post Id*/
SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
JOIN dbo.Comments AS c
    ON c.PostId = p.Id;

SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
JOIN dbo.Votes AS v
    ON v.PostId = p.Id;

Note-ry


A couple things to note, here:

  • Joining Users to Votes is unreliable, because most of the voting is anonymized in the dump
  • Things get much more interesting when you start building queries within relationships

For example, using the User Id columns in tables that aren’t users to join larger tables together, or joining Comments to Votes on Post Id.

You can really start to feel your CPU fans.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.