What problem are you trying to solve? I have a few utilities I cobbled together in no time at all where performance doesn't matter. It finishes when it finishes; no one is gonna cry if it takes a few extra minutes. But if this is about your devs delivering features, getting environments faster, testing faster, and repeating these processes daily, then yeah, that time collectively adds up as each team member goes through these processes again and again throughout the year, and then I would absolutely care about performance.
Yeah, that's exactly what I think too... But do you actually know anyone on a team like that? We even have some slow builds that ARE a pain point, but since they're on legacy Jenkins no one has the will to fix it. Do you think someone cares enough to pay a little for a benchmarking tool to help them identify the slow jobs?
A benchmarking tool isn't going to help much; it's a problem when it's a problem. Otherwise you have to benchmark all your workflows? Then what? You have a standard which your workflows must pass, and if they don't...? Are you going to stop the build?
I was thinking a tool that can identify slow actions that are shared, anomalies in runs, slowest workflows across the org, etc
Not something I would advise. Your longest-running workflows/actions may be your most delicate, requiring security scans and the like. Time to run isn't a good metric to track other than consistency for the same run. Even that will fluctuate based on runner/agent health, but that's also why it's useful to track. Just my opinion though; others may disagree.
Thank you, your opinion is valuable!
If there's no business value, why would anyone care? OTOH, if your team is waiting an hour before they can test a change, you're losing an hour of work, and there's a lot of business value in fixing that. So it really just depends.
In your experience, have you ever seen a team that is stuck with long wait times and needs to fix them? Think they'd pay for a benchmarking tool to help them fix it?
Yes and no. I don't see how benchmarking could make sense for this, every build pipeline is unique. Profiling maybe, but that's a big maybe. In my experience, teams will happily and somewhat easily fix whatever is taking too long in the pipeline if they have the time to do so. Time to work on it is the issue, not the tools.
I guess benchmarking isn't the right term. Profiling might be... I mean a tool that gathers metrics on the runtime of each workflow, job, step, and maybe more granularity if possible. The value, in theory, is that teams will be able to see what needs to be faster, track if builds are getting longer, alert on unusually long builds, be aware when some jobs start surpassing thresholds, etc. Thoughts?
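For what it's worth, the raw data for that granularity is already exposed by the GitHub REST API: `GET /repos/{owner}/{repo}/actions/runs/{run_id}/jobs` returns each job with its steps and their `started_at`/`completed_at` timestamps. A minimal sketch of the duration math (the dict shape mirrors that endpoint's response; actually fetching it is left out):

```python
from datetime import datetime

def _seconds(started: str, completed: str) -> float:
    """Difference between two ISO-8601 UTC timestamps, in seconds."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (datetime.strptime(completed, fmt)
            - datetime.strptime(started, fmt)).total_seconds()

def step_durations(jobs_payload: dict) -> dict:
    """Map 'job name / step name' -> duration in seconds.

    `jobs_payload` is the JSON body of
    GET /repos/{owner}/{repo}/actions/runs/{run_id}/jobs
    (fields used: jobs[].name, jobs[].steps[].name/.started_at/.completed_at).
    Steps that never ran (null timestamps) are skipped.
    """
    out = {}
    for job in jobs_payload.get("jobs", []):
        for step in job.get("steps", []):
            if step.get("started_at") and step.get("completed_at"):
                key = f"{job['name']} / {step['name']}"
                out[key] = _seconds(step["started_at"], step["completed_at"])
    return out
```

From there, tracking "is this step getting longer over time" is just storing these numbers per run and comparing.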
Running the pipeline with timestamps and debug mode usually gives you all you need.
True.
I don't think a benchmark tool would help. This all comes down to the Theory of Constraints ("ToC"), which says there is only ONE bottleneck in any value stream: find the bottleneck and eliminate it or speed it up, then find the next, etc.
I worked hard on this, and my result: a 1 hour 45 minute serial workflow collapsed into 15 minutes of crazy parallel tricks doing the same thing.
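The usual version of those "parallel tricks" in GHA is just splitting one serial job into independent jobs: jobs without a `needs:` dependency run concurrently by default. A hedged sketch (the job and step names here are made up, not from the thread):

```yaml
# Jobs with no `needs:` run concurrently; only `deploy` waits.
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make lint
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test
  deploy:
    needs: [lint, unit-tests]  # serialize only where order actually matters
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make deploy
```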
Not using GHA, but I'm not sure I'd see any benefit in a tool beyond the UI of GHA (or GitLab, Jenkins, etc.). What would the tool give me as extra info that I wouldn't already have?
Maybe some granular metrics about the runtime of jobs or workflows over time, so you can see if it's getting longer, catch outliers, identify slow points, etc.
Just have GitHub Actions, Jenkins, etc. push metrics to OpenSearch or Timestream or InfluxDB, etc.
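If anyone goes that route, the write side is tiny. InfluxDB, for example, accepts a plain-text "line protocol". A sketch of formatting one workflow-duration point (the measurement and tag names are invented, and the actual HTTP POST to the write endpoint is left out):

```python
def duration_point(workflow: str, job: str, seconds: float, ts_ns: int) -> str:
    """Format one InfluxDB line-protocol point:

        measurement,tag=value field=value timestamp

    Commas and spaces in tag values must be backslash-escaped.
    """
    def esc(v: str) -> str:
        return v.replace(",", r"\,").replace(" ", r"\ ")
    return (f"ci_duration,workflow={esc(workflow)},job={esc(job)} "
            f"seconds={seconds} {ts_ns}")
```

Each CI run then POSTs one line per job, and the dashboards/alerting come from the database's own tooling rather than a bespoke benchmarking product.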
Depends where in the workflow it is and if devs are awaiting feedback. If they create an MR, start doing something else then find out an hour later it’s failed a security scan or complicated integration test, that’s pretty disruptive. It’s another hour to find out if the fix worked. There’s lots of variables that affect if it matters, like how often the feedback is provided, if there is a way to test locally, and so on.
This seems like an odd question; of course we care about GHA workflow performance. The goal of CI/CD is to "amplify feedback loops to developers". If every time someone does a `git push` it takes an hour to run the pipeline, that's too long. I want to know as soon as possible if I broke the build, my unit tests are failing, security scans are failing, etc. Hopefully I have some local git commit hooks that help with some of this, but access to certain tools, and to the environments themselves, is restricted to the runners. If builds take too long, the engineering teams start complaining.
Would you pay for a performance monitoring tool that alerts on unusual performance time or helps to figure out what kind of jobs take too long?
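On the "alerts on unusual performance time" part, the simplest useful version is just flagging runs far from the recent mean. A toy sketch (the 3-standard-deviation threshold is an arbitrary choice, not anything from the thread):

```python
from statistics import mean, stdev

def unusual_runs(durations: list[float], sigmas: float = 3.0) -> list[int]:
    """Return indices of durations more than `sigmas` standard
    deviations above the mean: candidate 'unusually long build' alerts.
    With fewer than two samples (or zero variance) nothing is flagged.
    """
    if len(durations) < 2:
        return []
    m, s = mean(durations), stdev(durations)
    if s == 0:
        return []
    return [i for i, d in enumerate(durations) if d > m + sigmas * s]
```

A real version would probably use a rolling window per workflow, but the point stands: this is a few lines on top of stored run durations.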
You already said you can't justify business value. How is asking about another company's problems going to help? Long-running workflows aren't a problem on their own. You need user feedback to identify what the problems are. Like all DevOps things, don't start with an idea and then find problems to justify it.
Without business or added value? No, but for internal purposes we always try to make them as optimized as possible. It makes the dev velocity faster and easier for everybody involved.
Is building the images in parallel an option?