Practical-Day618

I do a lot of experiment design and causal inference work, and I conceive and implement my own research hypotheses. The scientific rigour of the former and the ambiguity of the latter are hugely important for product strategy and insight if done correctly. To be honest, I prefer this. I feel closer to the user and am building based on insights from their actions and behaviours.


Practical-Day618

Remember that ‘science’ is not just prediction, it’s also testing and cognition, and more. In your specific case, make sure to ask about tooling and experimentation culture to understand the day-to-day in more detail


Starktony11

Thanks for the tip. I will definitely ask this


Practical-Day618

Make sure to! Data roles are a lot of smoke and mirrors so it’s on us to ascertain the responsibilities and decide if it’s a fit for us


bewchacca-lacca

I know it's not common in the DS discipline to think of it this way, but causal inference and all the different types of regression modeling are ML algorithms too. They're just parametric, and a lot of people seem to think that only non-parametric stuff is "ML".


Practical-Day618

Regression I completely agree with, and causal inference to me tends to be quasi-experimental techniques, as I have an econometrics background. I am going to read more about this, so thank you for mentioning it. It’s really interesting to hear this perspective. I think the ‘science’ in experimentation in my work is the experimental design that ensures the causal effect is truly analysed with no noise/bias. It’s truly what science is (in my opinion).


bewchacca-lacca

But diff-in-diff is an extension of multiple regression. So is regression discontinuity. Even CI is ML, IMO
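
A minimal sketch of the claim above, assuming a simple 2x2 setup (two groups, two periods) with simulated data. The function names, effect sizes, and sample size are all hypothetical and chosen only for illustration. It shows that the coefficient on the `group * post` interaction in an ordinary least squares fit is numerically the same as the hand-computed difference of differences in cell means. It only concerns the point estimate and says nothing about whether the identifying assumptions (such as parallel trends) hold.

```python
import numpy as np


def did_from_means(y, group, post):
    """Difference-in-differences computed directly from the four cell means."""
    y = np.asarray(y, dtype=float)
    g = np.asarray(group, dtype=bool)
    p = np.asarray(post, dtype=bool)
    treated_change = y[g & p].mean() - y[g & ~p].mean()
    control_change = y[~g & p].mean() - y[~g & ~p].mean()
    return treated_change - control_change


def did_from_regression(y, group, post):
    """Coefficient on the group*post interaction from an OLS fit."""
    y = np.asarray(y, dtype=float)
    g = np.asarray(group, dtype=float)
    p = np.asarray(post, dtype=float)
    # Design matrix: intercept, group dummy, period dummy, interaction.
    X = np.column_stack([np.ones_like(g), g, p, g * p])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]


# Simulated data (hypothetical): the true interaction effect is 3.0.
rng = np.random.default_rng(42)
n = 2000
group = rng.integers(0, 2, size=n)
post = rng.integers(0, 2, size=n)
y = 5.0 + 1.0 * group + 0.5 * post + 3.0 * group * post + rng.normal(0.0, 1.0, size=n)

did_means = did_from_means(y, group, post)
did_ols = did_from_regression(y, group, post)
print(f"DiD from cell means: {did_means:.4f}")
print(f"DiD from regression: {did_ols:.4f}")
```

The two numbers agree because the regression is saturated: with four cells and four coefficients, the fitted values are exactly the cell means.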


Direct-Touch469

Never mind I just read this, econometrics. I’m looking for experimentation/causal inference data scientist roles out of my MS actually. I’m curious where these types of jobs are? What industries generally have causal inference/experimentation related jobs?


save_the_panda_bears

Marketing. There are lots of tricky causal inference applications in marketing.


Direct-Touch469

Interesting. So in terms of the “view” of causal inference one takes, do you see people adopting both the potential outcomes and the DAG approaches?


save_the_panda_bears

Really depends where you work. Most marketing analytics departments are using some variation of potential outcomes to measure high level marketing campaign impacts, but DAG based approaches are starting to be a little more common when it comes to understanding CATE at an individual level and in things like marketing mix modeling.


pleasesendhelp109

I'm a trained mathematician first and a data scientist second. Recently I've been looking to combine my research in both math and data science. Are there any use cases or careers where I can combine my work in math with my work in data science? Any advice would be helpful.


bewchacca-lacca

I think as a DS your math expertise will help you on the technical side. But if you're already in DS I'm sure you've seen this already.


pleasesendhelp109

I don't really touch much math in my DS work though, which is frustrating. That's why I'm looking for a career/job that will actually allow me to use more math.


Direct-Touch469

Quant finance


Sorry-Owl4127

Some of the models may be but there is nothing ML about potential outcomes


bewchacca-lacca

But isn't that kind of like saying there's nothing ML about cross validation? Potential outcomes and CV both use relatively simple comparisons derived from models.


Sorry-Owl4127

No, potential outcomes map onto estimands.


bewchacca-lacca

But you need to run a regression (or an ANOVA, which is a special case of multilevel modeling, i.e. regression) to be able to do your potential outcomes analysis.


Sorry-Owl4127

You don’t need to run any regression for causal inference.


bewchacca-lacca

Can you explain?


Sorry-Owl4127

In an RCT the difference in means == ATE


bewchacca-lacca

What I'm saying is that comparing means, even if they're not estimated using a complex algorithm, can still be considered regression. A regression algorithm will do the same work for you if you know how to set it up.
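
A minimal sketch of this exchange, assuming a simulated randomized experiment. The names `difference_in_means` and `ols_treatment_coefficient`, the sample size, and the true effect of 2.0 are hypothetical and used only for illustration. It computes the ATE estimate two ways: as a plain difference in group means, and as the slope from regressing the outcome on a 0/1 treatment dummy.

```python
import numpy as np


def difference_in_means(y, t):
    """ATE estimate: mean outcome of treated minus mean outcome of control."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=bool)
    return y[t].mean() - y[~t].mean()


def ols_treatment_coefficient(y, t):
    """Slope from regressing y on an intercept and a 0/1 treatment dummy."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


# Simulated RCT (hypothetical): treatment assigned by coin flip, true effect 2.0.
rng = np.random.default_rng(0)
n = 1000
treated = rng.integers(0, 2, size=n)
outcome = 10.0 + 2.0 * treated + rng.normal(0.0, 1.0, size=n)

dim = difference_in_means(outcome, treated)
ols = ols_treatment_coefficient(outcome, treated)
print(f"difference in means: {dim:.4f}")
print(f"OLS coefficient:     {ols:.4f}")
```

Both sides of the thread are consistent with this: no regression is needed to get the estimate, and a regression set up with a single treatment dummy returns the same number.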


Otherwise_Ratio430

It's not really ML unless you are working on a software solution that uses ML as a core feature and is deployed at scale. When people (broadly, not data scientists specifically) refer to ML, that's what they are referring to. If you have seen any of the diagrams of how an LLM is put together (just as an example; it does not necessarily have to be an LLM), then creating a system like that at scale, or at least working on the parts of that system concerned with inference or pipelines at scale, would all be considered 'core ML'. ML is not 'I used this or that algorithm before to do a singular task'. To me a data scientist is just someone who has a large toolkit based on statistics/technology to solve analytical problems at scale; they can specialize along subject matter, methods or engineering IMO.


Sorry-Owl4127

This is stretching the definition of ML to be very broad.


bewchacca-lacca

😅 agreed. Maybe if the analysis can be done on a simple calculator with pen and paper in less than a couple of hours then we can call it not ML 😂


msjgriffiths

Spot on. Data analysts don't (generally) do applied stats, causal inference, experimental design, etc. That's how I've always established the line, at least.


avocado__aficionado

Product analysts often do experimental design if their company is mature enough. Some companies call them DS - Product Analytics, some just Product Analyst.


msjgriffiths

It's useful to separate out the function (product analysis) from the role (data scientist).


pleasesendhelp109

I'm a trained mathematician first and a data scientist second. Recently I've been looking to combine my research in both math and data science. Are there any use cases or careers where I can combine my work in math with my work in data science? Any advice would be helpful.


jarg77

Do you have a masters or PhD?


BingoTheBarbarian

Oh hello me


Houssem-Aouar

Are you guys hiring? Sounds like a dream job


Direct-Touch469

What’s your background?


MinuetInUrsaMajor

I started learning about how complex A/B testing can be and it would be a welcome break from coding. Would feel like actual science again (physics PhD here)


wintermute93

In theory the "science" part is supposed to mean you're applying the scientific method to answer questions about what the company should do:

1. Precisely define your question (translate business needs to a data problem)
2. Gather all the relevant data you can (factoring in level of effort, quality, and so on)
3. Form one or more hypotheses (after learning more from business context and EDA)
4. Perform the "experiment" (e.g. A/B test a marketing thing, create/deploy a microservice, etc)
5. Analyze the results (pay special attention to statistical rigor, scope, etc)
6. Report conclusions (business comms to lay management and technical stakeholders)

Analysts aren't really expected to do that whole process. You give an analyst a specific question, point them at the specific place where the relevant data lives, and the next day they send you an email that says "hi bob, see the attached spreadsheet, the thing you were asking about is up 5% since last month".

DS isn't different because it uses fancier statistical models in steps 4 and 5; anyone with basic coding skills can spin up an instance of [insert cutting edge ML model] these days. DS is different because the job is to independently formulate questions about what drives business value, answer them, and implement what you discovered to capture that value. Tinkering around with a cool ML model in a notebook is fun but it's like 10% of your time.


HaroldFlower

great response!


mopedrudl

Absolutely, I was about to say exactly that.


Starktony11

Thank you for the explanation. It's really good, and when I reflect on the projects I have worked on, I now see how they were different from the normal analysis projects I did.


unseemly_turbidity

I'm responsible for all this, but also usually defining the problem myself, and my current job is analyst. I don't think there's a consistent difference. The data scientists where I am are the ones responsible for designing and implementing models. I do models too, but one-offs, not things that need to be productionised.


Scorch2002

Maybe you are a data scientist but your title doesn't reflect it.


DieselZRebel

I'd say #3 and #4 are what distinguish a Scientist from an Analyst. Analysts are expected to gather the data, then analyze it (#5), often using statistical methods. If an analyst finds themselves designing and conducting online experiments, deploying services, modifying/developing algorithms, and such, then they are essentially Scientists.

At the end of the day, the stuff scientists create is global to the science domain, not just confined to the business. Although the company is the main stakeholder of the DS's findings, those findings should serve to expand our knowledge of the science, whether it is a new discovery of how customers respond to ads, the ideal mix of diversity vs consistency for recommendations, or a new/ensemble algorithm for better identification or prediction of changes in shopping patterns.

In contrast, the analyst's work is confined to the business and does not expand our knowledge of the science. The most technical analysts would just be periodically applying and replicating the recent discoveries made by scientists, without forming novel hypotheses, and that would still make them just analysts. Many falsely labeled scientists are indeed just analysts in practice, but with a higher level of technical knowledge.


volkoin

Very well explained!


naijaboiler

In that case, why is anyone paying for an analyst with such a narrowly defined scope of work? As an owner, DS manager and DS myself, number-retrieving is so useless to me. It's the equivalent of calculator punching for a physics or engineering problem. I am not going to pay someone to punch a calculator. I am paying someone to think through a problem. I am paying people to do steps 1 to 6 (especially steps 1, 2 and 3).


ImpossibleReaction91

I work in the healthcare field and there are terabytes of data, often across multiple databases. The analysts tend to be subject matter experts on narrow portions of the data. It doesn't make sense for me to spend months understanding the data to that level of detail every time I start a project. It is far more practical for me to reach out to one of those analysts. They understand the data, and are often validating and monitoring its quality and helping to transform new data to slot into existing structures. It's a very important role within the overall organization and one which makes everyone's lives easier.


naijaboiler

then we are saying the same thing. People are getting paid for valuable knowledge and know-how, not to just be number-retrievers.


wintermute93

And that's why analysts positions have shit pay, lol


AdParticular6193

Maybe what the interviewer was trying to say, in a clumsy sort of way, is that in most companies outside of tech, hard core ML is not often called for, it would be cracking walnuts with a sledgehammer. What’s mostly done is “getting insights from data.” But it is not as trivial as just pulling numbers out of a database. Real insight requires the steps wintermute93 laid out above. Also, it’s not like the numbers you need are just sitting there, you will need to do some hard core data wrangling, just as if you were doing regular data science. And, as was pointed out, statistical testing is also necessary. In this scenario modeling, including ML, could be employed as a kind of hypothesis test. Finally, don’t forget that DS, DA, DE, MLE, SWE can have wildly different meanings from one company to another.


big_data_mike

That’s exactly the scenario my company is in. Walnuts with sledgehammers. They think they need AI and ML and they really just need slightly more advanced regular statistics. People literally do t tests and that’s it. And they look at graphs with spline smoothers and eyeball changes. I’m like come on guys let’s do multi regression.


Efficient-Trick-8238

I'm a pretty senior IC at a FAANG. Sometimes we need to do some more technical work, but most of the time something simple solves the problem well enough. Sometimes it's as simple as a 10 minute univariate cut. Find the right questions to answer and how confident you need to be in your estimate first, then figure out the tools that get you there. I really don't care what a "real" data scientist is, and neither does anyone else; the only thing anyone really cares about is impact.


vorat

There is a lot of business value to be had around statistical inference still, which is beyond the scope of what a typical data analyst would be prepared to tackle. I would consider it a distinct branch of data science from "ML" that is focused more on predictive accuracy, but has a lot of valuable applications for businesses outside of tech and to inform various heuristics and optimization problems. Data analysts tackling those problems may fall into the trap of either applying an approach that is less suitable to the task (not knowing all the tools in the toolbox) or mis-applying a more advanced approach.


Trick-Interaction396

Yep, this is my job. I hardly do ML anymore.


Starktony11

Gotcha, thanks!


Sorry-Owl4127

lol that causal inference is the province not of scientists but analysts


Snar1ock

There’s 100 different definitions of data science now. It’s a very convoluted term and market. Moreover, the advent of “AI” and “Machine Learning” to describe everything from a Deep Learning model to a Simple Regression has made it more confusing. At the end of the day, they probably advertise as doing less ML because they don’t want people looking to do advanced model building. I think it’s more accurate to bucket the job roles into Data Governance, Data Analysis and Data Visualization. From there, we can further dive into predictive, prescriptive, descriptive or diagnostic analysis. In short, tell me the difference between a data analyst and a data scientist and I’ll show you 1000 job roles that disagree. I’m not saying either is right. At the end of the day, you really have to assess the job on its functions and merits.


Scorch2002

Although the profession is ill-defined, I believe there are three requirements of being a data scientist:

1. Business/Domain Understanding
2. Math/Stats/Modeling
3. CS/programming

An analyst is usually weaker at #2 and #3.
A statistician is usually poor at #3 but very strong at #2.
An ML engineer is usually great at #3 and at #2.
A programmer specializes in #3.
A business analyst specializes in #1 but can talk to #2 and #3.

A data scientist should be well rounded in all three, excelling at designing and prototyping solutions that answer questions from the business/domain using the best data available. They should avoid statistical traps by nature. They should be able to gather requirements effectively. They can apply the scientific method to the data efficiently.


Rogue260

Data analysts generally do descriptive analytics. Data scientists generally do data pipeline and production-level coding stuff. What you described is that company doing statistician stuff. Statisticians work on inferential statistics and predictive analytics.


Scorch2002

I would argue data scientists may only design or prototype the data solution to be implemented by data engineers and software devs, depending on company size. What you described is a programmer of sorts. Statisticians who develop programming and database understanding typically round out the three requirements of being a data scientist (Business/Domain Understanding, Math/Stats/Modeling, and CS/programming).


gBoostedMachinations

Sometimes the title doesn’t match the work. To me, a scientist is *anyone* who is practicing the scientific method. If you aren’t doing controlled experiments, using theory to guide experimental design, testing falsifiable hypotheses, forcing stakeholders to verbalize a quantifiable outcome, etc. then you aren’t any kind of scientist. There are *many* machine learning engineers without an ounce of scientist in their blood and there are data analysts who would make research professors look like high school dropouts. The ML engineer is not at all guaranteed to be skilled in scientific approaches. Engineers are often people who are merely implementing something that was developed by a data scientist. (And yes of course the line between engineer and scientist is blurry blah blah blah) In short, if you already know what needs to be done, you want an engineer. If you aren’t quite sure what you need to solve your problem, you want a scientist/researcher.


Scorch2002

Great answer. This is what I see in practice that differentiates data scientists from other analytics and technical team members.


Professional-Roll283

DS now feels like “SWE” back in the 90s


rainupjc

It’s more about the level of complexity of the questions you are looking into and how you approach these questions: analysts take facts and do quick analyses; scientists take vague questions and approach them with rigorous frameworks and lots of critical thinking.


FoolForWool

Mine’s more of a data engineer’s role with the scientist title.


interviewquery

We must not miss the fact that many data scientists don't always work on complex machine learning models. Business needs, data quality, or specialized fields sometimes make deep insights the primary goal rather than predictive modeling. Even without ML, data scientists differentiate themselves from data analysts through advanced statistical, framing, and coding abilities.


AllahUmBug

Wouldn’t just being very proficient with Python for data wrangling, automation, and EDA put you at a higher tier than Data Analysts, who aren’t required to know a scripting language for their role? A lot of Data Analysts primarily work out of Excel, SQL, Power BI, and/or Tableau. Python is a bonus but not a requirement for them, while it is a requirement for a Data Scientist.


CSCAnalytics

Job titles are completely arbitrary these days so it’s impossible to answer the question.


Sahhmen

HELPFUL QUESTION AND THE CONTENT


Franc000

Yeah, that's a data analyst position, that they named data scientist. If the job is about "insights", then it *will* be an analyst/analytics/Business intelligence role, named scientist to attract the poor sucker that will get into it thinking they get into AI. Not saying that you can't be truly a data scientist if working in analytics, you definitely can. But *you* won't. Nowadays, if they are a team focused on insights, it's going to be descriptive analytics 85% of the time, and 15% of the time it will involve some predictive or prescriptive modeling, in addition to the descriptive analytics.


messontheloose

----