This post is stating the obvious. Facebook just scared me with some data points it has about me. Full story, right after the break.

Today’s abundance of knowledge as presented by Facebook comes from Facebook’s stranger familiarization feature. Fancy made-up words aside, it’s the notifications of the form:

Jaime Lannister (friends with Cersei Lannister) liked your photo.

Now, I hadn’t gotten to see this sort of notice too often. The first time I encountered one, Facebook chose a bad example. A human, given my friends list (complete with some basic data and descriptions), could have easily devised a better one.

But that was long ago. Facebook’s knowledge of me has improved vastly. You should have seen my face when I received another such notification recently. Facebook chose one mutual friend out of ten possibilities. And boy, did they nail it. Still, remembering the previous occurrence, I thought it was just a lucky guess: I believed it was good ol’ /dev/urandom, with a 1-in-10 (10%) chance of producing this serendipitous output.

Now, this happened yet again yesterday. Two distinct people, one identical suggestion. A perfect one at that. This would be too good to be a coincidence. The probability of a lucky random pick was even lower this time. And yet, it came up with a perfect choice.
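To put a rough number on "too good", here is a back-of-the-envelope sketch of how likely a streak of perfect picks would be if Facebook really chose uniformly at random. The function name is mine, and the second count (20) is a made-up placeholder: I only know the first pick was one out of ten and that the odds were lower the second time.

```python
from fractions import Fraction


def chance_of_lucky_streak(mutual_friend_counts):
    """Probability that a uniformly random pick is the 'perfect' friend every time.

    mutual_friend_counts: number of mutual friends available for each notification.
    Assumes exactly one perfect choice per notification and independent picks.
    """
    probability = Fraction(1)
    for count in mutual_friend_counts:
        probability *= Fraction(1, count)
    return probability


# 10 is from the story above; 20 is a hypothetical value for the second case.
streak = chance_of_lucky_streak([10, 20])
print(streak, float(streak))
```

Even with generous assumptions, two perfect picks in a row from pure randomness is a long shot, which is what makes the random explanation hard to believe.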

How this happened is beyond me. It looks like this is not random. It’s an algorithm. Facebook probably has some sort of lookup table, matching my friends with some fancy likability coefficient — probably calculated from chat history, or likes, or the vast archives Facebook has.
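Purely as a guess at what such a lookup table could look like, here is a minimal sketch. Everything in it is hypothetical: the signals (chat messages, likes), the weights, the names `Interaction`, `likability` and `pick_mutual_friend`, and the sample numbers. I have no idea what Facebook actually does.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    chats: int  # messages exchanged with this friend (hypothetical signal)
    likes: int  # likes exchanged with this friend (hypothetical signal)


def likability(interaction, chat_weight=1.0, like_weight=0.5):
    """Made-up 'likability coefficient': a weighted sum of the signals."""
    return chat_weight * interaction.chats + like_weight * interaction.likes


def pick_mutual_friend(mutual_friends, table):
    """Return the mutual friend with the highest coefficient.

    Friends missing from the table are scored as having no interactions.
    """
    return max(
        mutual_friends,
        key=lambda name: likability(table.get(name, Interaction(0, 0))),
    )


# Invented sample data, only for illustration.
table = {
    "Cersei Lannister": Interaction(chats=120, likes=40),
    "Tyrion Lannister": Interaction(chats=15, likes=5),
}

print(pick_mutual_friend(["Cersei Lannister", "Tyrion Lannister"], table))
```

With a table like this, the "perfect" suggestion stops being serendipity: it is simply the friend I interact with the most.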

I’d love to see what Facebook thinks about me. I might even contribute better numbers, as they might be wrong — then again, I have absolutely no idea what they are.

If anyone from Facebook is listening: I’d love to see the data you use for this algorithm — and possibly many others.

Bonus: downloading my data

I actually just requested my Facebook data archive. One of the pieces of information included was Mobile Network Connection Quality. It shows the average bandwidth and round-trip time of my HSPA+ mobile networks and Wi-Fi networks (collectively). Here’s a question: WHY does Facebook care?! What is the purpose of collecting this information? They can’t do anything about it, unless they were to send complaints to the service providers. Which they cannot.

Digging through the archive even further, Facebook seems to fail at ads (which I don’t see anyway…). How can I fix it? The tags are completely wrong and not valid for yours truly… They should at least try to show me valid advertisements. Here is a sample of what was chosen:

  • #Censorship. Not really. I don’t have anything to do with censorship at all. It actually comes from a page I liked, that doesn’t really do any anti-censorship activism.

  • #[my city here]. #[a video game]. Both make sense. I actually liked pages related to both. (Though I liked the second game of the series and the tag is for the first, or — at worst — all games collectively.)

  • #[a movie I HAVE NEVER SEEN]. Facebook failed at getting the reference yet again. The title of the movie appears in the name of a page — but they are not related at all.

  • #[a Southern Hemisphere company I never heard of]. I actually see the reference after looking at my likes and their Wikipedia page. It’s still completely invalid.

  • …and many more incorrect things I do not care about.

You would think Facebook had developed algorithms to pick correct ads, the thing their money comes from…

I still do not understand all this.