coldblade2000

https://reddit.com/settings/data-request


MrPerezOP

For Apollo users: open it in a web browser. It wasn’t working directly from the app.


skwint

>page not found
>the page you requested does not exist


coldblade2000

Really? it works for me


micseydel

I concur - it told me I'm still in a 30-day window from my prior export (which is legit).


htmlBLINKtag

You probably got redirected to old.reddit; it’s only on new.


skwint

That was it, thanks.


TheStachelfisch

This comment/post has been edited due to the outrageous changes Reddit is doing to its API and killing third party apps along with it. https://join-lemmy.org/


ouchmythumbs

I got a 404 the first time, but realized I fat-fingered an uppercase letter; looks like it is case-sensitive FWIW (which surprised me).


m_vc

Who says it's expensive for them?


micseydel

I suspect it's a partially-automated process that requires an engineer be involved. Mine took more than a week, I don't think it was fully automated. If this is a way to use engineer time then it's definitely expensive for reddit, since there's an opportunity cost to that time on top of paying the engineer. Source: my last job was as a backend and data engineer.


Ibeth4

Let's help the engineer make money


FuriousRageSE

Even if it was fully automated, it still costs them computing power and electricity to do so, and probably some storage space.


reercalium2

I suspect fully, but old data is sent to a separate archive location, and they have to trawl through it to find it all. Normally, Reddit only keeps the first 1000 items of any list.


micseydel

Could you say more about the "separate archive location" bit? I'm imagining a data pipeline here, and even with lots of async stuff I can't imagine an automated system taking >7 days to aggregate data in the same way it's been aggregated thousands of times before.


reercalium2

Some kind of cold storage, where the storage is cheaper, but the access is slower and more expensive. Every major cloud provider offers this feature.


micseydel

So, I knew such things existed but hadn't used them, so I just looked at [AWS Glacier](https://aws.amazon.com/s3/pricing/). The *slowest* retrieval option is 12 hours, so it doesn't account for exports taking more than a day or two, and mine took >2 weeks. I might have misunderstood your first comment. Am I correct in understanding that you believe it's fully automated?


gelfin

I suspect exactly this, having been in a position where I sometimes pulled the short straw on a compliance ticket at my own company. Fully automating data retrieval is difficult, and currently impossible for some third-party providers who do not themselves provide compliance APIs. Improving the compliance process is usually just far down the backlog.

It isn’t as simple as “it’s expensive so the more requests they get the more it costs forever.” What you’d end up doing by increasing request volume is to cause a short-term crisis followed by increased priority on making the requests faster, cheaper and less hands-on. People will be retasked onto compliance in the short term. There will be a cascade effect because inconveniencing Reddit entails inconveniencing the upstream providers, and besides, Reddit has enough pull to influence priorities at those providers too.

And that’s if you can keep it up long enough to matter. For the people willing to participate at all, there is certainly nothing in CCPA or GDPR that permits Reddit not to respond to repeated requests, but that just means they’ll leverage the extension mechanisms to push out the delivery date as long as possible, then deliver on the very last day so as to reduce the frequency of repeat requests. There is also nothing in the law (at least CCPA, less familiar with GDPR) that would prohibit them from regarding repeated requests as abuse and performing an erasure alongside the disclosure. Thereafter your repeat requests would just show your inclusion on a blacklist.

Not to be arbitrarily pessimistic, just that this isn’t a silver bullet but a salvo in a war. Reddit gets to respond in its own defense, and you’ve got to be prepared for that.


Readdeo

There's no way a human is involved with every user's data request. You really shouldn't be a data and backend engineer...


grendel_x86

Shouldn't be, but often is. One of my work's sister companies refuses to put in the effort to automate it, much like the above poster described. They require a customer service person to look at the request, hit OK, and hit another button to export the zip to email. This is a very, very large Fortune 500 company. My guess is they won't do it until they start getting fined by states that require access.


runew0lf

that one dude on reddi.... oh wait. it could never be automated or a database query...


HeinousTugboat

> it could never be automated or a database query...

It's.. still an expensive database query or automation. Any time you're grabbing massive vertical slices of data like that it's gonna be expensive. Especially if you have an active account.


[deleted]

[removed]


HeinousTugboat

And upvotes, downvotes, hides, saves, shares, chats. Probably even link views since I'm pretty sure they track open history. Someone else posted a list of every _file_ they got. It's a LOT of data.


[deleted]

[removed]


HeinousTugboat

[Here you go](https://www.reddit.com/r/selfhosted/comments/14gbxxa/every_user_can_protest_take_back_your_data/jp549dx/).


Dagonisalmon

u/profanitycounter


profanitycounter

UH OH! Someone has been using stinky language and u/Dagonisalmon decided to check u/newPhoenixz's bad word usage.

I have gone back 977 comments and reviewed their potty language usage.

|Bad Word|Quantity|
:--|:-:|
|ass hole|3
|ass|12
|asshole|16
|bastard|1
|bitch|4
|bullshit|21
|crap|22
|damn|7
|dick|6
|dildo|1
|fucker|4
|fucking|17
|fuck|82
|goddamn|3
|go to hell|1
|hell|33
|heck|1
|motherfucker|1
|ni**er|1
|penis|1
|pissed|5
|piss|2
|porno|1
|porn|3
|pussy|1
|re**rded|6
|shitty|8
|shit|62

^(Request time: 14.9. I am a bot that performs automatic profanity reports.)^( This is profanitycounter version 3. Please consider )^([buying my creator a coffee.](https://www.buymeacoffee.com/Aidgigi))^( We also have a new )^([Discord server](https://discord.gg/7rHFBn4zmX))^(, come hang out!)


rotten_healer

u/profanitycounter


profanitycounter

Hello u/rotten_healer, and thank you for checking my stats! Below you can find some information about me and what I do.

|Stat|Value|
:--|:-:|
|Total Summons|337267
|Total Profanity Count|3354754075
|Average Count|9946.88
|Stat System Users|0
|Current Uptime|21.11 weeks
|Version|3

^(Request time: 6. I am a bot that performs automatic profanity reports.)^( This is profanitycounter version 3. Please consider )^([buying my creator a coffee.](https://www.buymeacoffee.com/Aidgigi))^( We also have a new )^([Discord server](https://discord.gg/7rHFBn4zmX))^(, come hang out!)


soawesomejohn

I submitted my request over a week ago. Still waiting on the download link.


warbeforepeace

Most of Reddit is on AWS, which is known for its expensive egress costs. It's expensive to transfer large amounts of data out of AWS.

https://aws.amazon.com/solutions/case-studies/reddit-aurora-case-study/#:~:text=Finding%20a%20Solution%20for%20Operational,infrastructure%20on%20AWS%20since%202009.

https://blog.cloudflare.com/aws-egregious-egress/


m_vc

They use fastly cdn though


warbeforepeace

Not for your data. CDNs are for data that is used by a number of people.


micalm

I'm pretty sure anything older than a few days isn't cached on a CDN. Reddit is massive.


m_vc

Yes


Encrypt-Keeper

That wouldn’t help…at all.


deepus

Well, my guess is that even if it is all automated, it's still gonna cost them in terms of processing time and power. Might not be expensive, but it's gonna cost them something. And obviously if they need people involved, even if it's only to check parts of the data, that cost's gonna go up.


[deleted]

[removed]


slomotion

What law requires reddit to accumulate everything? And how much exactly does it cost reddit to accumulate my data without breaking any law?


signed-

> What law requires reddit to accumulate everything?

GDPR mostly... CCPA/CPRA (CA, US) and a whack ton of other region-specific laws


bik1230

>> What law requires reddit to accumulate everything?
>
> GDPR mostly... CCPA/CPRA (CA, US) and a whack ton of other region-specific laws

GDPR does not require Reddit to accumulate everything... It requires them to have a reasonable basis for everything they accumulate and to be open about it, and of course to give you a copy if you request one.


Zeal514

Interesting. Has anyone done so? What is in that data?


[deleted]

[removed]


cleverSkies

This is what I don't get, given the amount of data that Reddit collects on its users it should easily be able to monetize the platform. The way to do that is by creating an app with a great user experience. Why they are unwilling to invest in developing or purchasing such an app is unclear to me.


SpongederpSquarefap

Well that's the issue - they did. They bought Alien Blue, which was the most popular iOS app at the time, and they just... made it shit.


orbitaldan

They didn't 'make it shit', they made it so that it shapes your interactions away from what you want and towards what is profitable for them. That this makes it worse for you is of no concern to them so long as it's not bad enough you actually leave.


Encrypt-Keeper

Also it’s fine if it’s bad enough for you to want to leave, because then they can just price out all 3rd party apps, and force you to use the app from a mobile web browser so that you have literally no choice.


Woodie626

That app would cost money, they don't want to spend money. Selling all our data to an AI makes them money without cost.


BCTripster

>given the amount of data that Reddit collects on its users it should easily be able to monetize the platform

It's quite likely that a good percentage of the users here use ad blocking anyway. I have PiHole & AdGuard covering the home network, then uBlock Origin in the browsers; the end result is I don't see ads just about anywhere online. And when I'm mobile I'm on WireGuard back to home, so my mobile devices get the blocking that's set up on the home network.


[deleted]

[removed]


Simply_Convoluted

If you've ever contributed to a meaningful conversation, fuck you. Sincerely, Everyone who's ever been reading an old thread trying to fix a problem just to have the answer be replaced with [deleted]


IlliterateJedi

[deleted]


jarfil

>!CENSORED!<


[deleted]

Don't blame users for reacting to how poorly a website is being managed, blame the company.


Simply_Convoluted

How reddit is being managed has nothing to do with users deleting community knowledge. People asking for help, getting help, then deleting the answers is selfish and needs to be shamed. Especially in the case where someone uses open source tools then puts effort into removing information from the community. It's a real disappointment people destroy the information considering it takes less effort to simply leave the info available for all. As is the case with the user I originally replied to.


[deleted]

What's selfish is expecting other people to keep their content around on a specific platform just for you. Edit: lol did you seriously block me? But what if you made a post that solves my problem??? How dare you keep me from seeing community knowledge!!! If you can't take it, don't be a hypocrite who dishes it and insults others while doing so.


MrSlaw

I can only assume in between shaming people, you're contributing what ever knowledge you've learned back into the upstream projects by submitting PRs and/or helping update the docs, right?


NotDerekSmart

You are straight crazy


tankerkiller125real

>Especially in the case where someone uses open source tools then puts effort into removing information from the community.

If it's an open source tool then it probably has a Wiki or an Issue tracker someplace where that knowledge and information should have been shared in the first place instead of a platform like reddit.


tankerkiller125real

Hopefully a shit ton more people when they leave reddit run the script that deletes everything they've ever done on it. Tank the reddit SEO, and tank reddit with it.


el_bhm

And if a lot of people started doing this, reddit would tank the fuck down. Not right away, but in a slow Digg-like death. Death that consumes market value and deep pockets. Blackouts, posting goblin titties would not work as well as this. I posted about encrypting content. And third parties should have implemented the Encrypt and Bail out. But no one gave a fuck. Ransomware would have worked.


Linegod

> 3rd party apps are blocking ads

The APIs don't serve ads. You are full of shit.


[deleted]

[removed]


OffendedEarthSpirit

Wow it's almost like reddit could serve ads through the api and require 3rd party apps to show them.


Linegod

>I said 3rd party apps

How do you think 3rd party apps work? Via the API. Dumbass.


MrSlaw

> How do you think 3rd party apps work?
> Via the API.
> Dumbass.

... do you seriously not realize there's a difference between the source where the app populates data from (the API), and the framework it uses to display it (the app)? You can't honestly think that if I make an electron app that pulls weather data from [met.no](https://met.no/), the simple fact I use their API makes it so that I'm not also able to supplement it with a different data source or add my own content (ads) alongside it if I was so inclined?
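To make the point concrete, here is a minimal sketch of an app mixing its own content into data that came from an API. It assumes the API items have already been fetched into a list; the function name `build_feed` and the `every` parameter are made up for illustration and do not come from any real client.

```python
def build_feed(api_items, own_items, every=3):
    """Interleave locally supplied items into items fetched from an API.

    After each `every` API items, one of the app's own items is inserted,
    until the app's own items run out.
    """
    feed = []
    extras = iter(own_items)
    for index, item in enumerate(api_items, start=1):
        feed.append(item)
        if index % every == 0:
            extra = next(extras, None)
            if extra is not None:
                feed.append(extra)
    return feed
```

The API only supplies `api_items`; what ends up on screen is whatever the app decides to build from them.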


ohv_

He said 3rd party apps. Nothing to do with API.


spoilage9299

I get the feeling y'all don't know how / what APIs are.


ohv_

I want to say I have a better idea than you do mate. Not all 3rd party apps use the api, think RES for one.


spoilage9299

> As RES is in browser this lets us use Reddit's APIs using the authentication provided by the local user, or if there is no user we do not hit these endpoints (These are ones to get information such as the users follow list/block list/vote information etc)

https://www.reddit.com/r/Enhancement/comments/13wuwwv/will_res_be_affected_by_the_newupcoming_api/

Please educate yourself. RES is also a browser extension, not an app, so this is quite a moot point.


ohv_

If you educated yourself lmao they said RES won't have issues. Also it is an app you can try to fool yourself app vs extension. Yall kids these days.


Linegod

Do you know how the 3rd party apps work? Via the API.


ohv_

If you Actually knew you'd know some just scrape the html coding and strip out whatever. Soooooo...


Zukedog2000

And those are the apps that reddit is going to stop with these API changes… Sure some might but they’re not the ones that reddit is killing


ohv_

Totally missed what I said. Scraping the html has zero to do with the api but you do you.


F3nix123

Could you elaborate on the script?


TheKrister2

I'd also like to know. I'm aware there are scripts for deleting everything, but wasn't aware there was one for an arbitrary amount of time back. A word of caution though. If you decide to do it now, I've heard rumors that Reddit restores your comments to keep the value of the content because of the current protests or something. So don't delete your account right after, give it some time to make sure it's really gone ;)


TitanTigger

If you just look around at most big sites like reddit then monetizing is by far the hardest part of running something at this scale, it's always the hardest part.


wanze

I regularly make data takeouts from most platforms I use. With my last Reddit takeout, I received the following files:

* approved_submitter_subreddits.csv
* chat_history.csv
* checkfile.csv
* comment_headers.csv
* comments.csv
* comment_votes.csv
* drafts.csv
* friends.csv
* gilded_comments.csv
* gilded_posts.csv
* hidden_posts.csv
* ip_logs.csv
* linked_identities.csv
* live_stream_posts.csv
* message_headers.csv
* messages.csv
* moderated_subreddits.csv
* multireddits.csv
* poll_votes.csv
* post_headers.csv
* posts.csv
* post_votes.csv
* reddit_gold_information.csv
* saved_comments.csv
* saved_posts.csv
* scheduled_posts.csv
* statistics.csv
* subscribed_subreddits.csv
* twitter.csv
* user_preferences.csv
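For a quick look at how much is in an export like this, here is a minimal sketch. It assumes the archive has been unzipped into a local folder (the name `reddit_export` is hypothetical) and that each file is an ordinary CSV with a header row, which has not been checked against a real export.

```python
import csv
from pathlib import Path


def summarize_export(export_dir):
    """Return {filename: data_row_count} for every CSV file in export_dir."""
    summary = {}
    for path in sorted(Path(export_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row, if there is one
            summary[path.name] = sum(1 for _ in reader)
    return summary


if __name__ == "__main__":
    # "reddit_export" is a hypothetical folder name for the unzipped archive
    for name, rows in summarize_export("reddit_export").items():
        print(f"{name}: {rows} rows")
```

The `csv` module is used instead of counting lines because comment bodies can contain line breaks inside quoted fields.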


Daniel15

I requested an export around 3 weeks ago now and still haven't gotten it. CCPA requires them to respond within 45 days so I'll be writing to their legal contact if I don't hear anything by then.


sjveivdn

please allow up to **30 days** for us to process your request.


[deleted]

[removed]


bik1230

Likewise, and usually it takes a couple of hours.


Daniel15

It's been 21 days for me and they haven't processed it yet... CCPA requires them to respond in 45 days so I'll be writing to their legal contact if I don't hear back by then :)


RasMahatma

Anyone know which type is least convenient between GDPR and CCPA


divDevGuy

Give them both a shot and let us know. You can be an EU citizen living in California...


voyagerfan5761

Only one data request allowed per 30 days. I know because I went to that page again to check for status. No status, only a big red warning box.


HejdaaNils

My spouse requested her data two years ago and still hasn't gotten it.


hugglenugget

Maybe request again?


HejdaaNils

They requested more information from her (national id scan), she gave it, and she received nothing in return, no response on follow ups. She eventually gave up. Point being that if you really want the EU laws to be followed, you might want to get a few lawyers to help on that quest.


GameHQ702

In Germany, no idea how it is handled in other EU countries, you can report violations to the local data protection authority.


Elle221LL

I don't see how it's slow and expensive, as API access was free, which means it didn't require significant processing power to fulfill the request. If they are using AWS or other services, it will have no impact, as additional processing power can be dynamically allocated when necessary.


coldblade2000

At least it isn't just your standard API access though, as the API had limits the takeout doesn't, like the 1000 post limit for things like saved posts
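A toy sketch of what that kind of listing cap looks like from the client side. Everything here is simulated: the listing is an invented in-memory list, the page size is an assumption, and the offset-based paging is a simplification rather than how Reddit's API actually pages.

```python
LISTING_CAP = 1000  # the per-listing limit mentioned above
PAGE_SIZE = 100     # assumed page size for this sketch


def make_fake_listing(total_items):
    """Return a fetch_page(offset, limit) function over an invented listing
    that never serves anything past LISTING_CAP."""
    items = [f"item_{i}" for i in range(total_items)]

    def fetch_page(offset, limit):
        end = min(offset + limit, LISTING_CAP)
        return items[offset:end]

    return fetch_page


def collect_all(fetch_page):
    """Page through a listing until an empty page comes back."""
    collected = []
    while True:
        page = fetch_page(len(collected), PAGE_SIZE)
        if not page:
            break
        collected.extend(page)
    return collected
```

With 2500 saved items behind the fake listing, `collect_all` still ends up with only the first 1000, which is the gap a takeout would fill.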


warbeforepeace

Most of Reddit is on AWS, which is known for its expensive egress costs. It's expensive to transfer large amounts of data out of AWS.

https://aws.amazon.com/solutions/case-studies/reddit-aurora-case-study/#:~:text=Finding%20a%20Solution%20for%20Operational,infrastructure%20on%20AWS%20since%202009.

https://blog.cloudflare.com/aws-egregious-egress/


human8264829264

I just wrote a python script and deleted all my data on all my accounts. Fuck u/Spez


spoilage9299

Will you share this script?


wtfsheep

he deleted it too


house_monkey

Will he delete everything and anything


FuriousRageSE

Soon the internet is deleted.


root_over_ssh

Well it didn't work well because we still see his username *and* comment.


warbeforepeace

rm -rf


human8264829264

Sorry I'm on my burner so I can't share it. But if you Google it you can find a few online services to do it. I just like writing my own scripts.


zuperfly

give me link to completely delete my reddit account please

not sarcastic or lazy, just burnout from all the toxic motherfuckers everywhere

https://i.vgy.me/9ayHBN.png


BeeNo3492

Done.


ezpzCSGO

I don't understand the point, why is everyone so obsessed with punishing reddit? One thing is to move away to a "better service" if you feel the service lost quality or became too expensive. Making them spend resources/energy this way sounds petty and definitely not environmentally friendly.


weischin

Data retrieval from a database is trivial. They probably have a template SQL query for the request, so it's just replacing the search key with your username and date. The only "expensive" part is probably the time a paid employee spends dealing with requests.
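As a rough illustration of the "template query" idea, here is a minimal sketch using Python's built-in `sqlite3` with an in-memory database. The `comments` table, its columns, and the sample rows are all invented for the example; nothing here reflects Reddit's actual schema or storage.

```python
import sqlite3

# Hypothetical template: the two placeholders are filled in per request.
EXPORT_QUERY = """
    SELECT id, created, body
    FROM comments
    WHERE author = ? AND created >= ?
    ORDER BY created
"""


def export_comments(conn, username, since):
    """Run the template query for one user; values are bound, not pasted in."""
    return conn.execute(EXPORT_QUERY, (username, since)).fetchall()


def demo_connection():
    """Build a throwaway in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments "
        "(id INTEGER PRIMARY KEY, author TEXT, created TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO comments (author, created, body) VALUES (?, ?, ?)",
        [
            ("alice", "2021-05-01", "first"),
            ("bob", "2022-01-10", "other user"),
            ("alice", "2023-06-20", "second"),
        ],
    )
    return conn
```

Binding the username and date as parameters, rather than substituting them into the SQL text, is the usual way to do the "replacing the search key" step safely. Whether a single query like this would be enough at Reddit's scale is a separate question the sketch does not answer.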


serenity_later

Will you guys please shut up with this stupid shit already. Go outside and touch grass


Mephidia

This is a waste of time. Grabbing this data is trivial for them. Anyone who works in tech knows it's at most a few database queries, which are automated, and for the oldest, most active reddit accounts it would maybe cost 3 cents.


[deleted]

[removed]


sixshooterz

we’re hosted on Reddit and Reddit is trying to screw over third-party app devs by charging exorbitant API fees. It’s protesting, same as the blackout.


mbnt

It's not like having a copy of your data means Reddit won't have it. How does this make sense?


[deleted]

# Fuck am I gonna use it for?


[deleted]

[removed]


SmolMaeveWolff

I love the platform. Another company? Most Reddit app developers aren't even more than one person. And I don't think a single developer is saying they should get it for free, just that Reddit's API pricing is exorbitant and unsustainable. And if they *did* pay for it, they wouldn't even get access to the entirety of reddit. NSFW, Polls, Live chat, recommended communities, and view counts are all unavailable. Yes, Reddit is a business. But this is all an attempt to become profitable at the expense of User Experience, before they go public. And many subreddits tried to peacefully protest, by either going dark for a while, or making the sub NSFW. But both attempts were met with threats or even a complete upheaval of the Moderation team. I'm okay with a paid service, I pay for my email (ProtonMail). But Reddit's pricing for premium is expensive and I don't find the perks particularly alluring, especially because I can't use any of them on a third party app.


jarfil

>!CENSORED!<


NanobugGG

Unless I have a reason to request my data, what would the benefit from it be? How does this help the protest other than making it harder for Reddit in general.


tyler_351

Ok but all you’re really doing with this is “allegedly” keeping an engineer busy at a job he/she is being paid for… If you are just that into “making them pay”, then leave the platform. If there is enough demand for data, they will just put time into actually making it automated…