My approach might be called "Risk-Driven Development": the idea is to view a software project as a process of iteratively reducing uncertainty, and to structure your workflow around mitigating the most important risks first. Agile is a special case of this, where the dominant risk is subtle details of customer requirements, but often that is not the biggest risk in a project. I manage a data science team, and it is almost never our main risk. Instead, we've had:
–Technology risk: the project requires new libraries, some of which may be immature or have show-stopping bugs. You want to identify these by building PoCs quickly (first sketch below) and switch to alternatives early if needed.
–Assumption risk: one project focused on taking a statistical algorithm that was producing "ok" results and making it better. We assumed that this mostly involved fixing the algorithm. However, the algorithm's performance was actually quite good, and most of the issues were caused by downstream bugs. We would have saved several weeks if we had run some quick tests to verify this early on (second sketch below).
–Coordination risk: if you have a lot of engineers working on different components of a project, you need to make sure those components mesh well together. I usually try to specify the interfaces of each component in great detail before any work starts, e.g. specifying every column in a shared table, or the key types in code (third sketch below).
–Data size risk: with distributed systems, you often have a system that works on 10 GB of input and fails on 100 GB. This is usually due to some piece of code that's convenient but not performant. Thus you want to test with large datasets early and often, so that you don't end up baking the convenient-but-not-performant patterns into your project (last sketch below).
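
To make the PoC point concrete: the throwaway script I have in mind is usually no bigger than the sketch below. Everything in it is a placeholder ("candidate_lib" is not a real library); the point is to exercise the one or two calls the project will actually depend on, on a representative input, before committing.

    # Throwaway PoC: exercise the specific features we need from a candidate
    # library on a representative input before betting the project on it.
    # "candidate_lib" and the calls on it are hypothetical placeholders.
    import time
    import candidate_lib  # swap in the library actually being evaluated

    def smoke_test(path):
        start = time.time()
        data = candidate_lib.read(path)        # the ingestion call we depend on
        result = candidate_lib.transform(data) # the riskiest operation for us
        print(f"rows={len(result)}, elapsed={time.time() - start:.1f}s")

    smoke_test("representative_sample.parquet")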
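
For the assumption-risk story, the quick test we should have run was essentially: measure quality at the algorithm's output and again at the end of the pipeline, and see where it drops. A rough sketch, where run_algorithm, run_full_pipeline, sample_inputs, and ground_truth are stand-ins for our actual pipeline:

    # If quality is fine at the algorithm's output but bad end-to-end, the
    # problem is downstream of the algorithm. All names here are placeholders.
    def accuracy(predictions, truth):
        return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

    algo_out = run_algorithm(sample_inputs)       # raw algorithm output
    final_out = run_full_pipeline(sample_inputs)  # what users actually see

    print("algorithm quality: ", accuracy(algo_out, ground_truth))
    print("end-to-end quality:", accuracy(final_out, ground_truth))
    # If the first number is fine and the second isn't, the downstream bugs
    # show up weeks earlier than they otherwise would.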
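
For coordination risk, "specifying the interfaces in detail" mostly means writing the hand-off down as code before either side starts. A minimal sketch of the kind of contract I mean, with invented field names:

    # Agree on the exact shape of the hand-off up front: the record one team
    # produces and the signature the other team implements against.
    from dataclasses import dataclass
    from datetime import datetime
    from typing import Protocol

    @dataclass(frozen=True)
    class ScoredEvent:            # one row of the table team A produces
        event_id: str
        user_id: str
        occurred_at: datetime
        score: float              # agree on the range (e.g. [0, 1]) up front

    class Scorer(Protocol):       # the component team B consumes
        def score_batch(self, events: list[ScoredEvent]) -> dict[str, float]: ...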
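
For data-size risk, the habit that helps is keeping a synthetic dataset at roughly production scale and running the real job against it from the start, not only against the convenient small sample. A rough PySpark sketch; run_pipeline and the row count are placeholders for whatever the project actually does:

    # Build a synthetic dataset at the scale we expect in production and run
    # the real pipeline over it, to flush out collect()-style shortcuts early.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scale-test").getOrCreate()

    target_rows = 2_000_000_000  # whatever approximates ~100 GB for your schema
    synthetic = (
        spark.range(target_rows)
        .withColumn("user_id", (F.col("id") % 10_000_000).cast("string"))
        .withColumn("value", F.rand())
    )

    result = run_pipeline(synthetic)  # placeholder for the job under test
    result.write.mode("overwrite").parquet("/tmp/scale_test_output")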