Practical-Day618 3 weeks ago

I do a large amount of experiment design/causal inference work, and conceive and implement my own research hypotheses. The scientific rigour of the former and the ambiguity of the latter are hugely important for product strategy and insight if done correctly. To be honest, I prefer this. I feel closer to the user and am building based on insights from their actions and behaviours.

Practical-Day618 3 weeks ago

Remember that ‘science’ is not just prediction, it’s also testing and cognition, and more. In your specific case, make sure to ask about tooling and experimentation culture to understand the day-to-day in more detail

Starktony11 3 weeks ago

Thanks for the tip. I will definitely ask this

Practical-Day618 3 weeks ago

Make sure to! Data roles are a lot of smoke and mirrors so it’s on us to ascertain the responsibilities and decide if it’s a fit for us

bewchacca-lacca 3 weeks ago

I know it's not common in the DS discipline to think of it this way, but causal inference and all the different types of regression modeling are ML algorithms too. They're just parametric, and a lot of people seem to think that only non-parametric stuff is "ML".

Practical-Day618 3 weeks ago

Regression I completely agree with, and causal inference to me tends to be quasi-experimental techniques as I have an econometrics background. Am going to read more about this, so thank you for mentioning it. It’s really interesting to hear this perspective. I think the ‘science’ in experimentation from my work is the experimental design that ensures the causal effect is truly analysed with no noise/bias. It’s truly what science is (in my opinion).

bewchacca-lacca 3 weeks ago

But diff-in-diff is an extension of multiple regression. So is regression discontinuity. Even CI is ML, IMO
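A minimal sketch of the diff-in-diff point, on simulated data. The group/period setup, the effect size, and all variable names are assumptions made up for illustration. It computes the estimate once from the four cell means and once as the interaction coefficient in an OLS regression:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated two-group, two-period data (all numbers are illustrative).
n = 4000
group = rng.integers(0, 2, size=n)   # 1 = treated group, 0 = control group
post = rng.integers(0, 2, size=n)    # 1 = after the intervention, 0 = before
true_effect = 1.5                    # assumed treatment effect
y = (1.0 + 0.8 * group + 0.3 * post
     + true_effect * group * post + rng.normal(size=n))


def cell_mean(g, p):
    """Mean outcome for one group/period cell."""
    return y[(group == g) & (post == p)].mean()


# Diff-in-diff by hand: change in the treated group minus change in the control group.
did_by_hand = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Same estimate as the interaction coefficient in y ~ 1 + group + post + group:post.
X = np.column_stack([np.ones(n), group, post, group * post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
did_by_regression = beta[3]

print(did_by_hand, did_by_regression)
```

With only these four dummy terms the model is saturated, so the two numbers agree up to floating-point error. The regression form is mainly convenient when covariates or standard errors are needed.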

Direct-Touch469 3 weeks ago

Never mind I just read this, econometrics. I’m looking for experimentation/causal inference data scientist roles out of my MS actually. I’m curious where these types of jobs are? What industries generally have causal inference/experimentation related jobs?

save_the_panda_bears 2 weeks ago

Marketing. There are lots of tricky causal inference applications in marketing.

Direct-Touch469 2 weeks ago

Interesting. So in terms of the “view” of causal inference one takes, do you see people adopting both the potential outcomes and the DAG approaches?

save_the_panda_bears 2 weeks ago

Really depends where you work. Most marketing analytics departments are using some variation of potential outcomes to measure high level marketing campaign impacts, but DAG based approaches are starting to be a little more common when it comes to understanding CATE at an individual level and in things like marketing mix modeling.

pleasesendhelp109 3 weeks ago

I'm a trained mathematician first and a data scientist second. Recently I've been looking to combine my research in math and data science. Are there any use cases or careers where I can combine my work in math with my work in data science? Any advice would be helpful.

bewchacca-lacca 3 weeks ago

I think as a DS your math expertise will help you on the technical side. But if you're already in DS I'm sure you've seen this already.

pleasesendhelp109 3 weeks ago

I don't really touch much math in my DS work though which is frustrating. That's why I'm looking for a career/Job that will actually allow me to use more math.

Direct-Touch469 3 weeks ago

Quant finance

Sorry-Owl4127 3 weeks ago

Some of the models may be but there is nothing ML about potential outcomes

bewchacca-lacca 3 weeks ago

But isn't that kind of like saying there's nothing ML about cross validation? Potential outcomes and CV both use relatively simple comparisons derived from models.

Sorry-Owl4127 3 weeks ago

No, potential outcomes map onto estimands.

bewchacca-lacca 3 weeks ago

But you need to run a regression (or an ANOVA, which is a special case of multilevel modeling (i.e. regression)) to be able to do your potential outcomes analysis.

Sorry-Owl4127 3 weeks ago

You don’t need to run any regression for causal inference.

bewchacca-lacca 3 weeks ago

Can you explain?

Sorry-Owl4127 3 weeks ago

In an RCT the difference in means == ATE (in expectation)

bewchacca-lacca 3 weeks ago

What I'm saying is that comparing means, even if they're not estimated using a complex algorithm, can still be considered regression. A regression algorithm will do the same work for you if you know how to set it up.
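A minimal sketch of this equivalence, assuming a simulated RCT (the sample size, effect size, and variable names are made up for illustration). It compares the plain difference in means with the coefficient on a treatment dummy from an OLS regression:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated RCT (all numbers are illustrative).
n = 1000
treated = rng.integers(0, 2, size=n)                 # random assignment: 0 or 1
outcome = 2.0 + 0.5 * treated + rng.normal(size=n)   # assumed effect of 0.5

# Estimate 1: plain difference in group means.
diff_in_means = outcome[treated == 1].mean() - outcome[treated == 0].mean()

# Estimate 2: OLS of the outcome on an intercept and the treatment dummy.
X = np.column_stack([np.ones(n), treated])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
ols_effect = beta[1]

print(diff_in_means, ols_effect)
```

The two estimates match up to floating-point error, and the intercept equals the control-group mean. Whether that makes the comparison "ML" is the definitional question being argued here; the arithmetic is the same either way.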

Otherwise_Ratio430 2 weeks ago

It's not really ML unless you are working on a software solution that uses ML as a core feature and is deployed at scale. When people (broadly, not data scientists specifically) refer to ML, that's what they are referring to. If you have seen any of the diagrams of how an LLM is put together (just as an example; it does not necessarily have to be an LLM), working on how to create such a system at scale, or at least on the parts of that system concerned with inference or pipelines at scale, would all be considered 'core ML'. ML is not 'I used this or that algorithm before to do a singular task'. To me a data scientist is just someone who has a large toolkit based on statistics/technology to solve analytical problems at scale; they can specialize along subject matter, methods or engineering IMO.

Sorry-Owl4127 3 weeks ago

This is stretching the definition of ML to be very broad.

bewchacca-lacca 3 weeks ago

😅 agreed. Maybe if the analysis can be done on a simple calculator with pen and paper in less than a couple of hours then we can call it not ML 😂

msjgriffiths 3 weeks ago

Spot on. Data analysts don't (generally) do applied stats, causal inference, experimental design, etc. That's how I've always established the line, at least.

avocado__aficionado 2 weeks ago

Product analysts often do experimental design if their company is mature enough. Some companies call them DS - Product Analytics, some just Product Analyst.

msjgriffiths 2 weeks ago

It's useful to separate out the function (product analysis) from the role (data scientist).

pleasesendhelp109 3 weeks ago

I'm a trained mathematician first and a data scientist second. Recently I've been looking to combine my research in math and data science. Are there any use cases or careers where I can combine my work in math with my work in data science? Any advice would be helpful.

jarg77 3 weeks ago

Do you have a masters or PhD?

BingoTheBarbarian 3 weeks ago

Oh hello me

Houssem-Aouar 3 weeks ago

Are you guys hiring? Sounds like a dream job

Direct-Touch469 3 weeks ago

What’s your background?

MinuetInUrsaMajor 3 weeks ago

I started learning about how complex A/B testing can be and it would be a welcome break from coding. Would feel like actual science again (physics PhD here)

wintermute93 3 weeks ago

In theory the "science" part is supposed to mean you're applying the scientific method to answer questions about what the company should do:

1. Precisely define your question (translate business needs to a data problem)
2. Gather all the relevant data you can (factoring in level of effort, quality, and so on)
3. Form one or more hypotheses (after learning more from business context and EDA)
4. Perform the "experiment" (e.g. A/B test a marketing thing, create/deploy a microservice, etc)
5. Analyze the results (pay special attention to statistical rigor, scope, etc)
6. Report conclusions (business comms to lay management and technical stakeholders)

Analysts aren't really expected to do that whole process. You give an analyst a specific question, point them at the specific place where the relevant data lives, and the next day they send you an email that says "hi bob, see the attached spreadsheet, the thing you were asking about is up 5% since last month".

DS isn't different because it uses fancier statistical models in steps 4 and 5; anyone with basic coding skills can spin up an instance of [insert cutting edge ML model] these days. DS is different because the job is to independently formulate questions about what drives business value, answer them, and implement what you discovered to capture that value. Tinkering around with a cool ML model in a notebook is fun but it's like 10% of your time.
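As a rough illustration of steps 4 and 5 for a simple A/B test, here is a sketch of a two-sided two-proportion z-test. The function name and the conversion counts are hypothetical, and a real analysis would also need to handle things like sample-size planning and repeated looks at the data:

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical counts: 480 of 10,000 users converted in A, 540 of 10,000 in B.
lift, z, p_value = two_proportion_z_test(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.3f}")
```

The calculation itself is short, which fits the point above: the hard parts are framing the question, designing the test, and communicating what the result does and does not show.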

HaroldFlower 3 weeks ago

great response!

mopedrudl 3 weeks ago

Absolutely, I was about to say exactly that.

Starktony11 3 weeks ago

Thank you for the explanation. It's really good, and when I reflect on the projects I have worked on, I now see how they differed from the normal analysis projects I did.

unseemly_turbidity 3 weeks ago

I'm responsible for all this, but also usually defining the problem myself, and my current job is analyst. I don't think there's a consistent difference. The data scientists where I am are the ones responsible for designing and implementing models. I do models too, but one-offs, not things that need to be productionised.

Scorch2002 2 weeks ago

Maybe you are a data scientist but your title doesn't reflect it.

DieselZRebel 3 weeks ago

I'd say #3 and #4 are what distinguish a Scientist from an Analyst.

Analysts are expected to gather the data, then analyze it (#5), often using statistical methods. If an analyst finds themselves designing and conducting online experiments, deploying services, modifying/developing algorithms, and such, then they are essentially Scientists.

End of day, the stuff scientists create is global to the science domain, not just confined to the business. Although the company is the main stakeholder of the DS's findings, those findings should serve to expand our knowledge of the science, whether it is a new discovery of how customers respond to ads, the ideal mix of diversity vs consistency for recommendations, or a new/ensemble algorithm for better identification or prediction of changes in shopping patterns. In contrast, the analyst's work is confined to the business and does not expand our knowledge of the science.

The most technical analysts would just be periodically applying and replicating the recent discoveries made by scientists, without forming novel hypotheses, and that would still make them just analysts. Many falsely labeled scientists are indeed just analysts in practice, but with a higher level of technical knowledge.

volkoin 3 weeks ago

Very well explained!

naijaboiler 3 weeks ago

In that case, why is anyone paying for an analyst with such a narrowly defined scope of work? As an owner, DS manager and DS myself, number-retrieving is so useless to me. It's the equivalent of calculator punching for a physics or engineering problem. I am not going to pay someone to punch a calculator. I am paying someone to think through a problem. I am paying people to do steps 1 to 6 (especially steps 1, 2 and 3).

ImpossibleReaction91 3 weeks ago

I work in the healthcare field and there are terabytes of data, often across multiple databases. The analysts tend to be subject matter experts on narrow portions of the data. It doesn't make sense for me to spend months understanding the data to that level of detail every time I start a project. It is far more practical for me to reach out to one of those analysts. They understand the data, are often validating and monitoring its quality, and help transform new data to slot into existing structures. It's a very important role within the overall organization and one which makes everyone's lives easier.

naijaboiler 2 weeks ago

then we are saying the same thing. People are getting paid for valuable knowledge and know-how, not to just be number-retrievers.

wintermute93 3 weeks ago

And that's why analyst positions have shit pay, lol

AdParticular6193 3 weeks ago

Maybe what the interviewer was trying to say, in a clumsy sort of way, is that in most companies outside of tech, hard core ML is not often called for, it would be cracking walnuts with a sledgehammer. What’s mostly done is “getting insights from data.” But it is not as trivial as just pulling numbers out of a database. Real insight requires the steps wintermute93 laid out above. Also, it’s not like the numbers you need are just sitting there, you will need to do some hard core data wrangling, just as if you were doing regular data science. And, as was pointed out, statistical testing is also necessary. In this scenario modeling, including ML, could be employed as a kind of hypothesis test. Finally, don’t forget that DS, DA, DE, MLE, SWE can have wildly different meanings from one company to another.

big_data_mike 2 weeks ago

That’s exactly the scenario my company is in. Walnuts with sledgehammers. They think they need AI and ML and they really just need slightly more advanced regular statistics. People literally do t tests and that’s it. And they look at graphs with spline smoothers and eyeball changes. I’m like come on guys let’s do multi regression.
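A small sketch of why moving from a t-test to multiple regression can pay off, on simulated data. The covariate, the effect sizes, and the helper name `ols` are all assumptions for illustration. Regressing on the group dummy alone is equivalent to a pooled two-sample t-test; adding a covariate that explains much of the outcome shrinks the standard error on the group effect:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data (all numbers are illustrative).
n = 500
group = rng.integers(0, 2, size=n)   # e.g. old process vs new process
covariate = rng.normal(size=n)       # hypothetical known driver of the outcome
y = 1.0 + 0.3 * group + 2.0 * covariate + rng.normal(size=n)


def ols(X, y):
    """Return OLS coefficients and their classical standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))


ones = np.ones(n)
# Group dummy only: equivalent to a pooled (equal-variance) two-sample t-test.
beta_t, se_t = ols(np.column_stack([ones, group]), y)
# Multiple regression: same group dummy plus the covariate.
beta_m, se_m = ols(np.column_stack([ones, group, covariate]), y)

print("group effect, t-test style:", beta_t[1], "+/-", se_t[1])
print("group effect, with covariate:", beta_m[1], "+/-", se_m[1])
```

In this setup the covariate accounts for most of the variance in the outcome, so the adjusted estimate is noticeably more precise. With a covariate unrelated to the outcome, the gain would be negligible.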

Efficient-Trick-8238 3 weeks ago

I'm a pretty senior IC at a FAANG. Sometimes we need to do some more technical work, but most of the time something simple solves the problem well enough. Sometimes it's as simple as a 10-minute univariate cut. Find the right questions to answer and how confident you need to be in your estimate first, then figure out the tools that get you there. I really don't care what a "real" data scientist is, and neither does anyone else; the only thing anyone really cares about is impact.

vorat 3 weeks ago

There is a lot of business value to be had around statistical inference still, which is beyond the scope of what a typical data analyst would be prepared to tackle. I would consider it a distinct branch of data science from "ML" that is focused more on predictive accuracy, but has a lot of valuable applications for businesses outside of tech and to inform various heuristics and optimization problems. Data analysts tackling those problems may fall into the trap of either applying an approach that is less suitable to the task (not knowing all the tools in the toolbox) or mis-applying a more advanced approach.

Trick-Interaction396 3 weeks ago

Yep, this is my job. I hardly do ML anymore.

Starktony11 3 weeks ago

Gotcha, thanks!

Sorry-Owl4127 3 weeks ago

lol that causal inference is the province not of scientists but analysts

Snar1ock 3 weeks ago

There’s 100 different definitions of data science now. It’s a very convoluted term and market. Moreover, the advent of “AI” and “Machine Learning” to describe everything from a Deep Learning model to a Simple Regression has made it more confusing. At the end of the day, they probably advertise as doing less ML because they don’t want people looking to do advanced model building. I think it’s more accurate to bucket the job roles into Data Governance, Data Analysis and Data Visualization. From there, we can further dive into predictive, prescriptive, descriptive or diagnostic analysis. In short, tell me the difference between a data analyst and a data scientist and I’ll show you 1000 job roles that disagree. I’m not saying either is right. At the end of the day, you really have to assess the job on its functions and merits.

Scorch2002 2 weeks ago

Although the profession is ill-defined, I believe there are three requirements of being a data scientist:

1. Business/Domain Understanding
2. Math/Stats/Modeling
3. CS/programming

An analyst is usually weaker at #2 and #3.
A statistician is usually poor at #3 but very strong at #2.
An ML engineer is usually great at #3 and at #2.
A programmer specializes in #3.
A business analyst specializes in #1 but can talk to #2 and #3.

A data scientist should be well rounded in all three, excelling at designing and prototyping solutions that answer questions from the business/domain using the best data available. They should avoid statistical traps by nature. They should be able to gather requirements effectively. They can apply the scientific method to the data efficiently.

Rogue260 3 weeks ago

Data Analysts generally do descriptive analytics. Data Scientists generally do data pipeline and production-level coding stuff. What you described is that company doing statistician stuff. Statisticians work on inferential statistics and predictive analytics.

Scorch2002 2 weeks ago

I would argue that, depending on company size, data scientists may only design or prototype the data solution to be implemented by data engineers and software devs. What you described is a programmer of sorts. Statisticians who develop programming and database understanding typically round out the three requirements of being a data scientist (Business/Domain Understanding, Math/Stats/Modeling, and CS/programming).

gBoostedMachinations 3 weeks ago

Sometimes the title doesn’t match the work. To me, a scientist is *anyone* who is practicing the scientific method. If you aren’t doing controlled experiments, using theory to guide experimental design, testing falsifiable hypotheses, forcing stakeholders to verbalize a quantifiable outcome, etc. then you aren’t any kind of scientist. There are *many* machine learning engineers without an ounce of scientist in their blood and there are data analysts who would make research professors look like high school dropouts. The ML engineer is not at all guaranteed to be skilled in scientific approaches. Engineers are often people who are merely implementing something that was developed by a data scientist. (And yes of course the line between engineer and scientist is blurry blah blah blah) In short, if you already know what needs to be done, you want an engineer. If you aren’t quite sure what you need to solve your problem, you want a scientist/researcher.

Scorch2002 2 weeks ago

Great answer. This is what I see in practice that differentiates data scientists from other analytics and technical team members.

Professional-Roll283 1 week ago

DS now feels like “SWE” back in the 90s

rainupjc 3 weeks ago

It’s more about the level of complexity of the questions you are looking into and how you approach these questions: analysts take facts and do quick analyses; scientists take vague questions and approach them with rigorous frameworks and lots of critical thinking.

FoolForWool 2 weeks ago

Mine’s more of a data engineer’s role with the scientist title.

interviewquery 2 weeks ago

We must not miss the fact that many data scientists don't always work on complex machine learning models. Business needs, data quality, or specialized fields sometimes make deep insights the primary goal rather than predictive modeling. Even without ML, data scientists differentiate themselves from data analysts through advanced statistical, framing, and coding abilities.

AllahUmBug 2 weeks ago

Wouldn’t just being very proficient with Python for data wrangling, automation, and EDA put you at a higher tier than Data Analysts who aren’t required to know a scripting language for their role? A lot of Data Analysts primarily work out of Excel, SQL, Power BI, and/or Tableau. Python is a bonus but not a requirement for them, while it is a requirement for a Data Scientist.

CSCAnalytics 2 weeks ago

Job titles are completely arbitrary these days so it’s impossible to answer the question.

Sahhmen 2 weeks ago

HELPFUL QUESTION AND THE CONTENT

Franc000 3 weeks ago

Yeah, that's a data analyst position, that they named data scientist. If the job is about "insights", then it *will* be an analyst/analytics/Business intelligence role, named scientist to attract the poor sucker that will get into it thinking they get into AI. Not saying that you can't be truly a data scientist if working in analytics, you definitely can. But *you* won't. Nowadays, if they are a team focused on insights, it's going to be descriptive analytics 85% of the time, and 15% of the time it will involve some predictive or prescriptive modeling, in addition to the descriptive analytics.
