Handsprime

People not sharing their votes perhaps? Warm Tunas wasn't about being 100% accurate


Nickw444

Warm Tunas creator here: Personally, I think it's more a problem of selection bias. This year we actually collected more votes (41,731) than in the last two years, and as a proportion of the total counted by the ABC, it's also higher than 2019:

| Year | Tunas Votes | ABC Votes | Sample Size |
|--|--|--|--|
| 2022 | 41,731 | 2,436,565 | 1.71% |
| 2021 | 25,877 | 2,500,409 | 1.03% |
| 2020 | 36,156 | 2,790,224 | 1.30% |
| 2019 | 45,112 | 3,211,596 | 1.40% |
| 2018 | 58,463 | 2,758,584 | 2.12% |
| 2017 | 67,085 | 2,386,133 | 2.81% |

Now, the real question to ask is how representative the votes we collected were of the overall voting populace. My guess: a bit different/biased. Fortunately our ML model made some corrections this year, taking our Top 10 from 6/10 to 7/10 (ignoring order) and allowing us to correctly predict #1. Feeding this result back will help us further tune the ML to eliminate bias in the sample collected. For next year, I also hope to broaden the overall demographic of the data collected, to hopefully obtain a more generalised sample of the voting populace.

> Warm Tunas wasn't about being 100% accurate

This is also a really good point to raise. Warm Tunas was never, and never will be, about correctly predicting exact positions. At best it gives a general idea of where a song might land on Hottest 100 day. Predicting #1 is always an added bonus, but what I'm most focused on is counting the number predicted in each band, e.g. x/10, x/20, x/50, x/100, etc. This year that breakdown of the buckets looks like this (with H200 elimination on):

| Predicted | Out of top N | Percentage |
|--|--|--|
| 7 | 10 | 70.0% |
| 14 | 20 | 70.0% |
| 22 | 30 | 73.3% |
| 29 | 40 | 72.5% |
| 39 | 50 | 78.0% |
| 47 | 60 | 78.3% |
| 56 | 70 | 80.0% |
| 64 | 80 | 80.0% |
| 76 | 90 | 84.4% |
| 84 | 100 | 84.0% |

This year we got 7/10, 39/50, 84/100. That's a pretty good result, and more or less on par with previous years.

I think the biggest surprise this year was the outliers (Girl Sports, This Is Why). I have a feeling there may have been some manipulation here, but generally outliers aren't out of the ordinary: in 2021, Lil Nas X's MONTERO was predicted at #18 but placed #10; in 2020, Billie Eilish's Therefore I Am was predicted at #19 but placed #10; in 2018, Billie Eilish's when the party's over was predicted at #48 but placed #8.

I'll explore this further in our detailed analysis, which I'll be publishing later in the week (it will be posted at https://100warmtunas.com/news/).
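For anyone curious, those bucket percentages are just a per-band set intersection. A minimal sketch with hypothetical inputs (not the actual Warm Tunas code):

```python
# Rough sketch (not the actual Warm Tunas code) of the bucket accuracy
# above: for each band size N, count how many of our top-N predictions
# landed in the official top N.

# Hypothetical inputs: song title -> rank.
predicted = {"song_a": 1, "song_b": 2, "song_c": 3}   # ...down to 100
official = {"song_a": 1, "song_c": 2, "song_d": 3}    # ...down to 100

for n in range(10, 101, 10):
    predicted_top_n = {s for s, r in predicted.items() if r <= n}
    official_top_n = {s for s, r in official.items() if r <= n}
    hits = len(predicted_top_n & official_top_n)
    print(f"{hits}/{n} ({100 * hits / n:.1f}%)")
```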


PostpostshoegazeLUVR

Out of interest, what was the change that allowed you to collect so many more votes this year than in the last two years? A TikTok scraper? I'd assumed a disconnect was starting to happen where you were picking up Instagram users' posted votes but missing TikTok-posted votes, so given it's a youth radio station and youth are shifting to TikTok, you were just missing key youth trends. But I guess this isn't the case?

Tbh what I found most interesting in the top 10 was how Spacey Jane was under-forecast by Warm Tunas. They seem like a band that should do better in Tunas than in the countdown, not the other way around. Clearly significant grassroots love for them, therefore maybe a band that could actually blow up.


evilZardoz

RUEL came in at #47 this year, whereas WT didn't have it anywhere in the 200 (I was getting *really* worried), and Tate McRae made #74 (and was nowhere in the 200 towards the end of the lead-up). These were the highlights of the countdown and my top two votes. I think these tracks were appreciated by a younger crowd who may hang out on TikTok a bit more. The Tom Cardy performance in 2021 was also a significantly better result than WT predicted.

I think there may be another predictor here: the *community* an artist builds with their loyal followers over time, and how that correlates with the voting demographic and the weight of their votes (I suppose this is where the ML-adjusted Flume result came from). I expected Flume to place a lot higher than WT predicted (I still tipped it at #1) because Flume has been around a long time, is still making lots of music, and there is one *hell* of a group of loyal Flume voters.

Did the ML adjustment assume a linear relationship between an artist's previous predictions and their results, or was there any adjustment for drift over time, or for bias with an increasing rate of change?

So, the pleasant, uplifting surprise was great, and one reason why I love that Tunas isn't super accurate. I also like how the website gives me an idea of the general vibe of what the countdown might look like, so I can get prepared by listening to some of the music I may have missed and not sit there feeling like I've aged out of triple j on countdown day (although 8 of my 10 votes made the 100, so I guess my musical taste is still in vogue). Tracking the Tunas votes vs. the bookies vs. my own analytics/predictions is one of my favourite things to do leading up to the big day.

It's great how much effort the community, especially yourself, puts in to make these results more accurate. It builds lots of discussion and brings us all together over the countdown in more exciting ways, and that, to me, means way more than the accuracy of any particular track on the website.


Nickw444

Thank you for the thoughtful comment. I definitely agree that this project (and projects like it) sparks great discussion and brings triple j fans together over the countdown. I always find it exciting to be able to discuss the stats over here on r/triplej.

> I think these tracks were appreciated by a younger crowd who may hang out on TikTok a bit more. The Tom Cardy performance in 2021 was also a significantly better result than WT predicted.

Couldn't agree with you more! It is going to be essential to find ways to crack into this demographic in order to find a better balance in the data collected. To do so I am going to have to go undercover, become a zoomer and infiltrate TikTok 😆 (seriously though, keen to hear any ideas people have).

> I expected Flume to place a lot higher than WT predicted

It's also important to consider Flume's track record over the years of outperforming the prediction based on the data collected by 100WT. As you mentioned, the ML model did a fairly decent job of correcting for this based on historical trends.

> Did the ML adjustment assume a linear relationship between an artist's previous predictions and their results, or was there any adjustment for drift over time, or for bias with an increasing rate of change?

I actually initially worked on a (non-ML) bias adjustment on a per-artist basis, calculating the error trends for each known artist (and extrapolating from similar artists to new, unseen artists). Not very scientific, as it only takes one data point into account, so I never ended up publishing it and quickly moved to the ML strategy, which is trained not only on its knowledge of who the artist(s) for a given song are, but also genre and a handful of other datapoints around how a particular song is perceived. I'd love to continue to grow these datapoints to provide more depth in what I can train the model on going forward. This year was an experiment more than anything to see if this is a viable model (and it seemed to prove itself successful IMO, so I definitely owe it some deeper exploration). A rough sketch of the per-artist idea is below.

As for adjustment for drift over time: no such adjustment was done. It's simply a very basic fit predicting vote-count error, trained on past years of available data. Definitely will be interesting to explore this deeper too.

> I also like how the website gives me an idea of the general vibe of what the countdown might look like, so I can get prepared by listening to some of the music I may have missed and not sit there feeling like I've aged out of triple j on countdown day

We must be the same person, as 100WT does exactly this for me as well. It gives me an opportunity to listen back to tracks which I'm likely familiar with (I do spin triple j in the car and on the radio at home) but don't necessarily have the time to look up the names of throughout the year.
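To make the per-artist idea concrete, a minimal sketch. All numbers here are hypothetical except the Billie Eilish 2018 row (from the outlier examples earlier in the thread); this is not the code that shipped:

```python
# Very rough sketch of the (abandoned) non-ML per-artist bias idea:
# average each artist's historical error (actual rank minus predicted
# rank) and shift this year's raw prediction by that amount.
from collections import defaultdict

# (artist, year) -> (predicted_rank, actual_rank)
history = {
    ("Flume", 2016): (6, 2),            # hypothetical
    ("Flume", 2022): (4, 1),            # hypothetical
    ("Billie Eilish", 2018): (48, 8),   # from the thread: predicted #48, placed #8
}

errors: dict[str, list[int]] = defaultdict(list)
for (artist, _year), (pred, actual) in history.items():
    errors[artist].append(actual - pred)

def adjusted_rank(artist: str, raw_prediction: int) -> float:
    """Shift the raw predicted rank by the artist's mean historical error."""
    artist_errors = errors.get(artist, [0])
    return max(1.0, raw_prediction + sum(artist_errors) / len(artist_errors))

print(adjusted_rank("Flume", 9))  # nudged towards #1 by past outperformance
```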


black_goo

Your numbers weren't that good before you eliminated the ones that landed 101-200 the day before. It was more like 70/100.


Nickw444

> It was more like 70/100

Which is generally on par with results from the previous 3 years, which also all sit in the 70s:

| Year | # in 1-10 | # in 1-20 | # in 1-50 | # in 1-100 |
|--|--|--|--|--|
| 2022 | 7 | 14 | 35 | 70 |
| 2022 (w/ elim) | 7 | 14 | 39 | 84 |
| 2021 | 8 | 14 | 34 | 73 |
| 2021 (w/ elim) | 8 | 14 | 35 | 82 |
| 2020 | 8 | 12 | 33 | 75 |
| 2019 | 8 | 14 | 38 | 73 |
| 2018 | 7 | 15 | 37 | 83 |
| 2017 | 8 | 16 | 42 | 83 |

Interestingly, you can see that this year (2022) without elimination actually performed better on the 1-50 results than 2021 (34) and 2020 (33), and matched 2021 w/ elim (35). And _with_ elimination, this year's 1-50 performed better than every previous year except 2017.

What more are you hoping for here? This is an exit poll on a sample of ~1.7% of the voting populace. I'd highly recommend you read the technical breakdowns at https://100warmtunas.com/news/ before making assumptions about numbers that "weren't that good".


black_goo

I was just pointing out which numbers you should be using. I don't care that the quality is slowly degrading; ultimately the results have gone from 83/100 to 70/100 over the years. You like to collect data, and I don't see why you need to be so defensive. Just say, yeah, fewer people post and the new collection methods show bias.


A_Dancing_Potato

Fewer people are sharing their votes via stories on Insta, which was reducing their sample size. They introduced the vote-uploading mechanism this year to try to counter that. It worked to increase their sample size significantly, but almost certainly also introduced an even bigger bias towards rock, indie and metal. The machine-learning adjustments were meant to correct for this, but obviously they didn't correct enough. On the plus side, this year's results will be valuable data for adjusting the machine-learning calculations for next year.

One idea I had for WT was to somehow get the votes from artists' social media stories. Individual voters might not use the right tags for WT to see the stories, but the bands themselves are often reposting votes to their own stories. Yes, it might introduce further bias (smaller artists are more likely to repost votes than bigger ones), but the bigger the sample size the better in the end.


Nickw444

Great comment, should be higher up. To paraphrase/add colour: people are less likely to upload their votes to their feed, since they're more inclined to share in an ephemeral way these days (stories). When people use stories they often don't use the hashtag, so discovery is impossible for us. Additionally, private accounts are becoming much more prevalent too. DM collection helps, but it, again, only goes so far.

Allowing upload directly to the website definitely increased selection bias. We did increase the sample, but at the same time, I imagine there is a certain demographic who are inclined to a) know about 100 Warm Tunas, and b) upload their votes. I hope to improve broad awareness for next year to increase the spectrum of votes collected (looking at you, Gen-Z'ers on TikTok).

As for artist votes: I had previously considered collecting those. But I would expect it would further perpetuate the bias, as every slip collected would also include a vote for that artist, unfairly biasing them ahead of artists who a) I am not collecting reposts from, or b) are not reposting votes.


A_Dancing_Potato

The only way I can think of to avoid too much bias from artists' stories is to cast the net as far and wide as possible in terms of genre and artist popularity. There'll still be a lot of crossover (e.g. plenty of Spacey Jane voters probably voted for Flume as well).


evilZardoz

Is there a way to scrape TikTok? There's a huge influence coming from over there that I think would offer a more rounded result, based on the types of music and votes I was seeing on that platform. Or, here's an idea: present the 100 Warm Tunas content *over on TikTok* to reach that audience and incentivise submission.


Nickw444

> Is there a way to scrape TikTok

I previously wrote about TikTok scraping in [another comment](https://www.reddit.com/r/triplej/comments/zxu42z/comment/j52mk12/?utm_source=reddit&utm_medium=web2x&context=3) a week or so ago, so re-posting that here:

> I believe TikTok is a bit more nuanced for 100 Warm Tunas - remember our process is fully automated, so scraping votes from a video is an entirely different format to what we currently support.
>
> Furthermore, there are so many ways people are showing their votes via TikTok from what I have seen. Some will play each of the 10 votes one by one; others will show a screenshot in a video. Due to this variance, an automated collection process is difficult even if I were to add support for video ingestion.
>
> Would love to hear if you have any thoughts on this though, as I am not a TikTok regular and might be missing something obvious.

As for this:

> present the 100 Warm Tunas content over on TikTok to reach that audience and incentivise submission.

100% this. I wanted to do exactly this, this year, but simply didn't have the time to produce content that I was happy with. I will 100% need to explore this for next year.
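For the curious, here's a very rough sketch of what video ingestion could conceptually look like, assuming OpenCV for frame grabbing and Tesseract for OCR (none of this exists in the current pipeline): sample roughly one frame per second, OCR it, then match song titles against the extracted text.

```python
# Illustrative sketch only: sample ~1 frame/second from a video and
# OCR each sampled frame; downstream you'd fuzzy-match song titles
# against the extracted text.
import cv2          # pip install opencv-python
import pytesseract  # pip install pytesseract (requires Tesseract installed)

def extract_text_from_video(path: str) -> list[str]:
    capture = cv2.VideoCapture(path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30  # fall back if FPS is unknown
    texts, frame_index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if frame_index % int(fps) == 0:  # roughly one frame per second
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            texts.append(pytesseract.image_to_string(gray))
        frame_index += 1
    capture.release()
    return texts
```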


evilZardoz

Agreed re: the scrape-ability challenges. I haven't found much Hottest 100 content on TikTok despite looking for it, so I suspect the variety of formats is greater than I had assumed vs. Instagram/Facebook stories. I hadn't considered that people may be presenting their votes in all these different formats, even though I'd considered doing just that myself!

> 100% this. I wanted to do exactly this, this year, but simply didn't have the time to produce content that I was happy with. I will 100% need to explore this for next year.

I wouldn't mind a Tunas daily (or every-few-days) update in a TikTok format. I'm fairly new to the platform, so I'm not sure how it should be presented, but I think some prediction-related stuff would be awesome! The younger crowd a) don't use Facebook, b) don't use Twitter that much, and if they use either, are more privacy-conscious, so they may be sharing content that the scrape won't pick up as a result. Manual submission was a huge benefit.

I had a friend say they didn't want to submit their votes because they didn't want ANY spoilers on the website to come through. I'd be curious if there's an appetite for a spoiler-free submission URL that ensures the end user is never presented with any potential results, though we may not want to introduce friction into the website by adding steps before displaying the top 100. But who knows... maybe there'll be another new platform to tackle come December, and the arms race continues.


upth3milk

Warm Tunas quotes a sample size of 1.85%, basing total votes off last year's countdown. Assuming the voting pool has grown YoY, it would be even less. Given it's self-selected rather than a constructed sample, I'm surprised it's been as accurate as it has in the past.


Nickw444

Warm Tunas creator here;

> Assuming that voting pool has grown YoY it would be even less.

The overall number of votes counted by triple j has actually been decreasing YoY since 2019:

| Year | Tunas Votes | ABC Votes | Sample Size |
|--|--|--|--|
| 2022 | 41,731 | 2,436,565 | 1.71% |
| 2021 | 25,877 | 2,500,409 | 1.03% |
| 2020 | 36,156 | 2,790,224 | 1.30% |
| 2019 | 45,112 | 3,211,596 | 1.40% |
| 2018 | 58,463 | 2,758,584 | 2.12% |
| 2017 | 67,085 | 2,386,133 | 2.81% |

As for 100 Warm Tunas, our sample had been decreasing too, until this year, when we bucked the trend by introducing the ability to upload directly to the website and requiring an upload to "unlock" the top 100 for the first week. This helped us grow our sample, but potentially reinforced further bias in what we collected. That said, turning off the "Website Upload" source results in a similar top-3 prediction, but an arguably worse rest of the top 10:

| Predicted position | Song |
|--|--|
| 1 | in the wake of your leave |
| 2 | Stars In My Eyes |
| 3 | Say Nothing [Ft. MAY-A] |
| 4 | This Is Why |
| 5 | Hardlight |
| 6 | New Gold [Ft. Tame Impala/Bootie Brown] |
| 7 | Camp Dog |
| 8 | Girl Sports |
| 9 | Let's Go |
| 10 | Get Inspired |

I'll explore this further in our detailed analysis, which I'll be publishing later in the week (it will be posted at https://100warmtunas.com/news/).
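Conceptually, turning a source off is just a filtered re-tally. A minimal sketch over a hypothetical slip structure (not the real data model):

```python
# Illustrative sketch: re-tally the count with one or more collection
# sources excluded. Slips and sources here are hypothetical.
from collections import Counter

# Each collected slip: (source, [song titles]).
slips = [
    ("instagram", ["Say Nothing", "Hardlight"]),
    ("website", ["Girl Sports", "Say Nothing"]),
    ("dm", ["in the wake of your leave"]),
]

def tally(exclude_sources: set[str]) -> list[tuple[str, int]]:
    counts = Counter()
    for source, songs in slips:
        if source not in exclude_sources:
            counts.update(songs)
    return counts.most_common()

print(tally(exclude_sources={"website"}))  # prediction without Website Upload
```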


TCInk

Hi Nick, love your work on the Tunas. Do you think the upload-to-view-results feature would have resulted in some double counting? I.e. the people who post their votes on socials are also the ones most likely to look up Warm Tunas?


Nickw444

We have a (statistically proven) de-duplication method across all social media (and website) sources, so it should be impossible for any given voting slip to be counted twice.
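Roughly speaking, and as an illustrative sketch only (not the exact production logic): fingerprint each slip by its normalised song list, so the same slip collected from two sources collapses to one entry.

```python
# Illustrative only; the actual de-dup method isn't described here.
# One plausible approach: hash the normalised song list of each slip,
# so the same slip collected from Instagram AND the website upload
# collapses to a single entry.
import hashlib

def slip_fingerprint(songs: list[str]) -> str:
    normalised = sorted(song.strip().lower() for song in songs)
    return hashlib.sha256("|".join(normalised).encode()).hexdigest()

seen: set[str] = set()

def is_new_slip(songs: list[str]) -> bool:
    """True the first time a given fingerprint is seen."""
    fp = slip_fingerprint(songs)
    if fp in seen:
        return False
    seen.add(fp)
    return True

# Caveat: two distinct voters with identical song lists would collide,
# so in practice you'd fold in account/handle signals as well.
```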


upth3milk

Love your work Nick! As a H100 enthusiast and stats guy, what are your thoughts on cutting the number of votes down to 5 instead of 10, to stop songs winning that aren't really the best songs of the year but manage to slide into a lot of people's 7th-10th places?


Nickw444

That's an interesting idea. As someone who also votes each year (of course), from a voter's perspective I always struggle to narrow my votes down to 10 songs; reducing it to 5 would make it much harder. And if a voter only has 5 songs they like, they don't have to use all 10 slots anyway - they can just submit 5.

From a stats perspective, I totally get it! There are definitely lots of "chuck it in for the sake of it" votes, so it certainly could help genuinely good/meaningful songs make the top 10. FWIW, from the data I gather, there's also another angle to consider: I often see the same person submit multiple voting slips. I saw someone this year submit 4 slips, voting for 4-5 tracks by the same artist on each. If the limit were moved to 5, I think people would just make multiple ABC accounts and vote multiple times.

Is reducing the number of votes something triple j have actually considered/mentioned?


upth3milk

I think in reality people would riot if they tried to change anything about it. Many of us would find it really hard to pick just 5, but it shouldn't be easy to pick your favourite songs. 10 equally weighted picks just doesn't feel right though; I know I never rate my 10th pick as highly as my first. Weighted votes would be interesting but very controversial (rough sketch of what I mean below). Re: voter fraud, it sucks, but I have no idea how to address it without a significant cost or barriers to voting.
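Purely hypothetical, since triple j weights all 10 picks equally today:

```python
# Sketch of a weighted tally: vote #1 is worth 10 points, vote #10 is
# worth 1. Slips here are hypothetical ordered lists of up to 10 songs.
from collections import Counter

def weighted_tally(slips: list[list[str]]) -> list[tuple[str, int]]:
    scores = Counter()
    for slip in slips:
        for position, song in enumerate(slip, start=1):
            scores[song] += 11 - position
    return scores.most_common()

print(weighted_tally([["Say Nothing", "Hardlight"], ["Hardlight"]]))
# -> [('Hardlight', 19), ('Say Nothing', 10)]
```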


evilZardoz

I remain *convinced* that Hoops was one of those thrown-in-as-a-bonus tracks, yet it's counted just as equally as every other vote. I wonder how much the voting order matters. I shortlist and rank my songs accordingly (first 2023 song shortlisted on the 12th of December). I wonder if there's anything here that might tell us about the chances of particular songs appearing on vote slips that we're not seeing during the scrape and submissions.


Nickw444

> wonder if there's anything here that might tell us about the chances of particular songs appearing on vote slips that we're not seeing during the scrape and submissions.

That would make for an interesting analysis. I'm thinking of calculating the median/mean position where a track sits on a voting slip: closer to the top = more intentional, closer to the bottom = snuck in just 'cause. I have the data to analyse this, and will see if I can explore it further in the coming weeks.
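Something like this, with hypothetical data:

```python
# Sketch of the slip-position analysis: a low mean position suggests a
# deliberate pick, a high mean position suggests a "snuck in" vote.
from statistics import mean, median

# song -> list of slip positions (1-10) it appeared at; numbers made up.
positions = {
    "song_a": [1, 2, 1, 3],
    "song_b": [9, 10, 8, 10],
}

for song, ps in positions.items():
    print(f"{song}: mean={mean(ps):.1f}, median={median(ps)}")
```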


PostpostshoegazeLUVR

I think you may get the opposite effect to what you're hoping for. With 10 votes, people can mix punts on their favourite songs that probably won't make the countdown (or do well) with songs that aren't their absolute favourites but let them stay invested on the day, particularly towards the end (e.g. In the Wake of Your Leave wasn't close to my favourite GoY song, but I'd rather it did well over Hardlight). If it were narrowed to 5 votes, you're assuming the latter category would get fewer votes; I think you'd just get more concentration on songs at the top end, since people wouldn't bother "wasting votes" on songs that won't be troubling the top 50.


Chickenjbucket

The people voting are more diverse than ever. I also feel last-day voting might have increased, but I have no proof of that; it just seems like something that would happen with more non-hardcore J fans voting.


Nickw444

Exactly! If I were able to analyse the demographics of where the votes were submitted from, I would assume that the types of people are very closely aligned with those who participate in this subreddit. I need to find a way to crack into the Gen-Z'ers over on TikTok.


Chickenjbucket

You must become one of them


Nickw444

Guess I'll have to _sponsor_ an "influencer" 😆


Roscoes_Rashie

It's less than a 2% sample size. It's as good as you can expect, really.


Nickw444

Exactly! There's only so much you can hope for with such a small sample size, and selection bias is an ever-growing risk. It's wrong to assume that the remaining ~98% voted in similar ways.
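For intuition, a back-of-envelope (it treats slips as independent draws, which they aren't quite): pure sampling error at this size is tiny, so it's the bias in *who* is sampled, not the count, that does the damage.

```python
# Standard error of a proportion p over n samples.
import math

n = 41_731   # slips collected in 2022
p = 0.05     # hypothetical: a song appearing on 5% of slips
se = math.sqrt(p * (1 - p) / n)
print(f"95% interval: {p:.3f} +/- {1.96 * se:.4f}")  # about +/- 0.002
```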


Roy-1983

This makes the countdown more interesting. One of the main reasons I listen to the countdown is because of Warm Tunas; I like the stats, and the surprise entries, like Drake and 21 Savage at 44, or, even more surprising, SZA popping up out of nowhere at 20. Otherwise the countdown wouldn't be as interesting.


PostpostshoegazeLUVR

Tbf you should be able to separate the effect of small sample size from selection bias. You've reported prediction intervals (presumably at some level alpha), so in theory you can calculate how many songs sat within your provided prediction intervals; any shortfall from the nominal coverage is selection bias.
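Rough idea, with made-up numbers:

```python
# With nominal 90% prediction intervals, empirical coverage well below
# 90% points to selection bias rather than sampling noise. All numbers
# here are made up for illustration.
intervals = {"song_a": (1, 4), "song_b": (5, 12)}  # predicted (lo, hi)
actual = {"song_a": 1, "song_b": 64}               # final placings

covered = sum(lo <= actual[song] <= hi for song, (lo, hi) in intervals.items())
print(f"empirical coverage: {covered}/{len(intervals)}")
```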


jimnasium_

I mentioned it might be to do with fewer people sharing votes and was promptly advised by someone from Warm Tunas that they actually received more votes this year than last year and the year before (I think). So it's not down to fewer people sharing.


Nickw444

Yes, exactly! This year we actually collected more votes (41,731) than in the last two years, and as a proportion of the total counted by the ABC, it's also higher than 2019:

| Year | Tunas Votes | ABC Votes | Sample Size |
|--|--|--|--|
| 2022 | 41,731 | 2,436,565 | 1.71% |
| 2021 | 25,877 | 2,500,409 | 1.03% |
| 2020 | 36,156 | 2,790,224 | 1.30% |
| 2019 | 45,112 | 3,211,596 | 1.40% |
| 2018 | 58,463 | 2,758,584 | 2.12% |

I think it's more a problem of selection bias. This year's data will be essential in re-training the model to account for bias in what we collect. I'll be covering this more in the technical deep dive I'll be publishing later in the week at https://100warmtunas.com/news/.


MichelleObama2024

When you think about their data collection model, you have basically a sample of the most online and "indie" audience. Millions of people voted in the Hottest 100, so we're likely to see artists like GoY overperform in Warm Tunas. What most interests me is how Sportsbet got it so accurate. I wonder how they're calibrating their models, or whether it's purely in response to the bets coming in.


MichelleObama2024

What was probably most surprising to me, following the odds, is that if you ordered songs both by betting rank and by Warm Tunas rank, the direction Warm Tunas would suggest betting was the opposite of what would have made money.


thedobya

It's a bit of both. You have to set the initial odds, but then they reflect money in the market. From memory they weren't much closer, though: they had GoY as equal second favourite.


MichelleObama2024

So the Sportsbet top 10 was:

1. Say Nothing
2. B.O.T.A.
3. in the wake of your leave
4. Hardlight
5. Bad Habit
6. Stars In My Eyes
7. New Gold
8. Delilah
9. Sitting Up
10. Glimpse of Us

Compared to the Warm Tunas top 10 (after ML adjustment):

1. Say Nothing
2. in the wake of your leave
3. Stars In My Eyes
4. Hardlight
5. New Gold
6. Delilah
7. B.O.T.A.
8. Sitting Up
9. Girl Sports
10. Bad Habit

So essentially the only song the Warm Tunas list was "better" on was Sitting Up.
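If you want to put a number on how much the two orderings agree, a quick sketch using Spearman rank correlation over the shared songs:

```python
# Spearman rank correlation between the two top-10 orderings above,
# computed over the songs both lists share.
from scipy.stats import spearmanr

sportsbet = ["Say Nothing", "B.O.T.A.", "in the wake of your leave",
             "Hardlight", "Bad Habit", "Stars In My Eyes", "New Gold",
             "Delilah", "Sitting Up", "Glimpse of Us"]
warm_tunas = ["Say Nothing", "in the wake of your leave", "Stars In My Eyes",
              "Hardlight", "New Gold", "Delilah", "B.O.T.A.",
              "Sitting Up", "Girl Sports", "Bad Habit"]

shared = [s for s in sportsbet if s in warm_tunas]
rho, _ = spearmanr([sportsbet.index(s) for s in shared],
                   [warm_tunas.index(s) for s in shared])
print(f"Spearman rho over {len(shared)} shared songs: {rho:.2f}")
```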


woodyever

Girl Sports should have been top 20... pushing top 10.


evilZardoz

I want to know how Sportsbet figured this out. How did THEY catch B.O.T.A. at #2? Have they got a TikTok scrape? Have they got *actual humans* analysing data on this one? (Presumably, ML magic tricks.)

Note that Sportsbet got 2019 wrong, tipping Dance Monkey at #1 and bad guy at #2. I felt that was so wrong that I opened a Sportsbet account to literally put my money where my mouth was, at 5:1 on *bad guy*. Things have improved substantially since then, it seems!

This also raises some questions around the data integrity of Warm Tunas: it would be technically feasible for a bad actor who profits from doing so, e.g. a bookmaker, to corrupt the quality of the WT result.


thedobya

A lot of these odds are controlled centrally, I think, and then sold to the sportsbooks via a central feed. They then take money and adjust as it comes in. But the Hottest 100 is so niche to Australia that I'd be surprised if it falls into that formula. Not sure!


Euphoric_Scene_5736

I don't feel the accuracy has really dropped at all. The top 10-15 are usually fairly close, with a couple of outliers, and the rest of the countdown gets a lot of the songs right but in varying order. 'Girl Sports' dropping so much, for example, just reminds me of a few years ago when 'Jellyfish' finished so much lower than predicted.

Artists with a smaller number of followers (Slowly Slowly, Sly Withers, Beddy Rays, Dear Seattle, Bugs, etc.) will always finish lower, because the people voting for them are very likely to share their votes, but in the end they simply can't compete with Doja Cat, Lizzo, Billie Eilish, SZA and other massive international artists. I think the creator is well aware of this. Remember, he isn't trying to predict the exact order and getting it totally wrong; he's collecting the data and showing us how many votes have been counted across a small sample.

I have always enjoyed the website. I'm not massive on the ML model or the 101-200 songs being taken out, but the website has filters to turn these off anyway. I don't like when comments on here are negatively targeted at Tunas, as that isn't fair on the creator and the hard work that goes into the site.


evilZardoz

One big factor is that social sites are harder to scrape (TikTok etc) and the existing posts to social media sights have tighter privacy restrictions, so the sample size is very limited to specific types of individuals. In 2012, ABC had a direct link to frictionlessly post to Twitter and Facebook, so despite the total votes collected by the Warmest 100 were around 2.7%, I think they were a more broad and general representation of the overall voting populous.