CaelNCSU All American 7132 Posts |
I parsed the last 11 pages of Soapbox, removed the quoted comments, and generated vector embeddings from each user's comments (not all of them yet; it's super slow, but chugging along). This is a plot of each user's average embedding across their posts, for the users I have so far. Using PCA you can reduce the dimensionality to 2D so you can plot it on a graph:
What can you do with the embeddings:
- Find clusters of users that have similar posting styles and beliefs and dox aliases
- See the trend of posting style/belief over time
- Generate a mathematical "Political Compass"
- Ask questions like "is this poster a shitlib"?
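For anyone who wants to try the averaging and PCA step, here is a minimal sketch. It is not the code used for the plot. It assumes you already have one embedding vector per post, grouped by user; the user names and the random 8-dimensional vectors are stand-ins, and the function names `mean_embedding_per_user` and `pca_2d` are made up for illustration.

```python
import numpy as np

def mean_embedding_per_user(post_embeddings):
    """Average each user's post embeddings into one vector per user."""
    users = sorted(post_embeddings)
    matrix = np.array([np.mean(post_embeddings[u], axis=0) for u in users])
    return users, matrix

def pca_2d(matrix):
    """Project rows onto the top two principal components via SVD."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Toy stand-in data: 3 hypothetical users, 5 random 8-dimensional "embeddings" each.
rng = np.random.default_rng(0)
toy = {name: rng.normal(size=(5, 8)) for name in ("user_a", "user_b", "user_c")}
users, matrix = mean_embedding_per_user(toy)
coords = pca_2d(matrix)  # shape (3, 2): one (x, y) point per user
```

Each row of `coords` is the point you would scatter-plot and label with the matching entry of `users`.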
If this is against ToS let me know @qntmfred.
[Edited on November 13, 2024 at 10:35 AM. Reason : damnnnnn] 11/13/2024 10:29:58 AM |
qntmfred retired 40806 Posts |
spelling my name incorrectly is against ToS
[Edited on November 13, 2024 at 10:39 AM. Reason : which model did you use] 11/13/2024 10:33:29 AM |
CaelNCSU All American 7132 Posts |
OpenAI 'text-embedding-3-small'. I may try it with Llama 3.1 too if I get frustrated with how long this takes. My guess was that it would be way faster than burning my own cycles, but that does not appear to be the case. 11/13/2024 11:08:34 AM |
CaelNCSU All American 7132 Posts |
utowncha is tgl or dtral going by cosine similarity:
The big image got scaled down in the gallery, but it's pretty neat.
Added stars for the regulars and infamous posters.
Raige is the most unusual in terms of the distance from others.
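One way to put a number on "most unusual" is each user's average cosine distance from everyone else. This is only a sketch of that idea and may not be the measure used for the plot; the three 2D vectors and the names `mean_distance_to_others` and `most_unusual` are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_distance_to_others(vectors):
    """Map each name to its average cosine distance (1 - similarity) from all other names."""
    result = {}
    for name, vec in vectors.items():
        others = [v for n, v in vectors.items() if n != name]
        result[name] = sum(1 - cosine_similarity(vec, o) for o in others) / len(others)
    return result

# Toy stand-in vectors: user_a and user_b point roughly the same way, user_c does not.
toy = {"user_a": [1.0, 0.1], "user_b": [0.9, 0.2], "user_c": [-0.2, 1.0]}
distances = mean_distance_to_others(toy)
most_unusual = max(distances, key=distances.get)
```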
[Edited on November 13, 2024 at 1:20 PM. Reason : a] 11/13/2024 12:55:36 PM |
utowncha All American 918 Posts |
Shockingly insulting. 11/13/2024 5:46:39 PM |
qntmfred retired 40806 Posts |
hey what do you expect when you get flattened down to two dimensions
i played around with some twwerbots a while back but the results were uninspiring. might be time to try again. i've been playing with the llama 3.2 vision model. just wait until we build profiles out of 20 years of livestreaming rather than 20 years of message boarding and tweeting.
[Edited on November 13, 2024 at 6:01 PM. Reason : my how was your 2024 thread response will be generated by AI based on my livestreams] 11/13/2024 5:58:40 PM |
CaelNCSU All American 7132 Posts |
^ yeah I built an esgarg one twice. It started using the n word and saying fag a lot so deleted the fine tune so my account didn't get cancelled.
Think I mentioned the Llama one in the other thread. I've been super curious how they would respond to current events. Would be cool to have one trained on only subsets, like ubertsb or uberchitchat.
[Edited on November 13, 2024 at 6:23 PM. Reason : A] 11/13/2024 6:23:29 PM |
StTexan King Dingaling 7385 Posts |
^^^^possible to add me? Wonder if I am close to a couple bros on here
[Edited on November 13, 2024 at 6:26 PM. Reason : Too few]
[Edited on November 13, 2024 at 6:26 PM. Reason : Or two few aint that something] 11/13/2024 6:25:58 PM |
CaelNCSU All American 7132 Posts |
I added you, you just may not be in the view there. Will take another screenshot when I find you. StTexan top similar:
[(0.9573154205531155, 'rwoody'), (0.945110925776677, 'HaLo'), (0.9390791610460517, 'bbehe'), (0.9348659692783031, 'synapse'), (0.9337087257886083, 'Money_Jones'), (0.93322072709863, 'skywalkr'), (0.9314596584573368, 'The Coz'), (0.9304043675756769, 'qntmfred'), (0.9302928550201987, 'utowncha'), (0.9299006964490467, 'A Tanzarian')]
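A list like the one above can be produced roughly as follows. Cosine similarity is the cosine of the angle between two embedding vectors: 1.0 means they point in the same direction, 0 means unrelated directions. This is a sketch under assumptions, not the actual script; `top_similar`, the toy names, and the 2D vectors are hypothetical.

```python
import numpy as np

def top_similar(target, names, matrix, k=10):
    """Return the k (similarity, name) pairs closest to `target` by cosine similarity."""
    # Scale every row to unit length so a dot product equals cosine similarity.
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = unit @ unit[names.index(target)]
    ranked = sorted(
        ((float(s), n) for s, n in zip(scores, names) if n != target),
        reverse=True,
    )
    return ranked[:k]

# Toy stand-in for per-user average embeddings.
names = ["user_a", "user_b", "user_c"]
matrix = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
result = top_similar("user_a", names, matrix, k=2)
```

Passing a larger `k` gives a longer list, e.g. a top 50.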
[Edited on November 13, 2024 at 6:52 PM. Reason : a] 11/13/2024 6:38:02 PM |
StTexan King Dingaling 7385 Posts |
Cool thats a pretty good list to be on I think. What exactly is the cosine similarity thing?
Dang can you put top 50? Top 10 all real close at like .92
I bet my top 10 could rule the country better than yours!!! Lol
[Edited on November 13, 2024 at 6:59 PM. Reason : Add another note] 11/13/2024 6:55:45 PM |
The Coz Tempus Fugitive 26267 Posts |
Interesting.
I'm flattered that I got a star!
And I think this proves I wasn't Rem Lezar. 11/13/2024 8:14:31 PM |
aaronburro Sup, B 53136 Posts |
I don't see myself on there... am I that far gone? 11/13/2024 10:13:54 PM |
StTexan King Dingaling 7385 Posts |
Of course ddd isn't on here...the weirdo is off the charts! 11/13/2024 10:47:12 PM |
CaelNCSU All American 7132 Posts |
Near nighthawk in the upper left. Aaronburro is around I think in this view. 11/13/2024 10:57:35 PM |
The Coz Tempus Fugitive 26267 Posts |
^You got got! 11/14/2024 1:33:34 AM |
emnsk All American 2861 Posts |
pretty cool
^^^ I'm in the upper middle, towards the left of The Coz
So this just looks for similarity in the words used, or is there something else?
Also, aren't BubbleBobble and ReceiveDeath the same person? Yet, they're so far apart. Yet utowncha and thegoodlife3 are right next to one another.
Maybe BB posted a lot more on one account when he was younger and had a different diction. Or tgl3 is just boring
I should generate a bot based off of my current profile. I wonder what it would talk about. this should be an inbuilt feature on social media sites. talk to your clone
[Edited on November 16, 2024 at 11:45 AM. Reason : it be interesting to do chit chat and the soapbox for a controlled set of active users and compare] 11/16/2024 11:40:40 AM |
qntmfred retired 40806 Posts |
Quote : | "this should be an inbuilt feature on social media sites. talk to your clone" |
it's coming. I built a qntmbot a while back and though it wasn't quite good enough yet, the goal is to put it on https://kenwarner.ai. i mostly just want to prank my mom by calling her on the phone and let her talk to my bot and see how long it takes her to figure out it's not me
[Edited on November 16, 2024 at 12:14 PM. Reason : AGI is when] 11/16/2024 12:11:19 PM |
CaelNCSU All American 7132 Posts |
^^ If TGL is posting about the plight of the trans and utowncha was like, THIS FUCKING GUY, and posts in response they would be closer together since the meaning of the comments would be similar (about trans stuff).
It also maps structure and word (token) frequency, but there are better techniques to determine similarities there. I bet a standard naive Bayes classifier would work pretty well for that. My original interest and question was whether you could map people into some kind of belief space based on their post comments.
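A rough sketch of the naive Bayes idea: guess who wrote a comment purely from word frequencies. Nothing here was run on the real posts. The class name `NaiveBayes`, the four toy comments, and the user labels are made up, and splitting on whitespace is a simplifying assumption.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of comments
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods for each word.
            score = math.log(doc_count / total_docs)
            score += sum(math.log((counts[w] + 1) / denom) for w in words)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy stand-in comments from two hypothetical users.
texts = ["tsne beats pca here", "pca is fine for plots", "lol nice star", "lol cool list"]
labels = ["user_a", "user_a", "user_b", "user_b"]
model = NaiveBayes().fit(texts, labels)
guess = model.predict("lol nice")
```

Unlike the embeddings, this only sees which words a poster uses, so it would pick up diction rather than meaning.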
Makes a pretty good AI test bed. Would be cool to have it broken into something you could load with Pandas.
[Edited on November 16, 2024 at 8:05 PM. Reason : A] 11/16/2024 8:04:06 PM |
emnsk All American 2861 Posts |
I want more TWW analytics 11/22/2024 10:03:35 AM |
The Coz Tempus Fugitive 26267 Posts |
You would have liked the charts qntmfred used to make. 11/22/2024 10:54:27 AM |
moron All American 34183 Posts |
Should try tsne instead of pca
Something is off. Salisbury hasn’t posted in forever. This isn’t the last 11 pages
[Edited on November 26, 2024 at 1:31 AM. Reason : C] 11/26/2024 1:29:26 AM |
CaelNCSU All American 7132 Posts |
^ In my previous TWW hack I fine-tuned on esgarg's posts, and it looks like this run combined in one of the clean JSON files from that, which contained pages esgarg had posted in. Those pages include Salisbury. 11/26/2024 8:14:27 AM |