This is my first try at creating a map of lemmy. I based it on the overlap of commentors that visited certain communities.

I only used communities that were on the top 35 active instances for the past month and limited the comments to go back to a maximum of August 1 2024 (sometimes shorter if I got an invalid response.)

I scaled it so it was based on percentage of comments made by a commentor in that community.

Here is the code for the crawler and data that was used to make the map:

https://codeberg.org/danterious/Lemmy_map

  • threelonmusketeers@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    0
    ·
    2 months ago

    I used dimensionality reduction to make it 2D

    Huh, interesting. So is the idea to spread the data out as much an possible, while keeping “similar” communities near each other? What was the dimensionality of the original set?

      • keepthepace@slrpnk.net
        link
        fedilink
        English
        arrow-up
        0
        ·
        2 months ago

        That’s really interesting! It shows which communities share users. I am part of jlai.lu, a french-speaking community that is relatively isolated by also slrpnk.net that seems very spread out!

        Would it make sense to compute the standard deviation of each instance’s communities? It would give an idea of which are islands and which are more extended. Not sure if it makes sense to compute it more on 2 dimensions or on the original 21934 though.