Well, I'm biased but also experienced (former academic, now industrial practitioner doing a startup). My advice to the people who think that industrial ML/AI is "just applying some base cunning to 1970s problems" is to try to trade equities and generate durable above-benchmark profits over 2-3 years (where the industrial state of the art controls hundreds of billions per year and the competition can pay $300k+/year for fresh-out-of-grad-school talent). Algorithmic equities trading is sort of the "UFC" of ML; money talks and, er, bovine byproducts walk. Having said that, I'm now out of equities trading because I figured having a startup was more than enough risk for me; so I'm somewhere between talking and walking, I guess :P.
In terms of books: I recommend grabbing as many domain-specific books as possible rather than general-purpose ML books. Look at bioinformatics, speech processing, text processing, image processing, algorithmic trading, epidemiology, system identification, adaptive filtering, etc.; each of these disciplines has its own approach to signal/feature extraction, and ML gives you a unified way to fuse multiple signals into an estimate/decision. In my experience the tricks of the trade arise from learning lots of domain-specific "hacks" and thinking about how they generalize to other problems (one example: look at the Viola-Jones feature extractor for images and think about how you might apply that in, e.g., equities trading).
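To make the Viola-Jones aside a bit more concrete, here is a minimal sketch of one way the idea might carry over. Viola-Jones computes Haar-like features (differences of rectangle sums) in constant time using an integral image; the 1-D analogue is a prefix sum over a series of returns. Everything here is my own illustration rather than anything from a real trading system: the function names, the window widths, and the made-up return series are all hypothetical, and nothing in it suggests these features are actually predictive.

```python
from itertools import accumulate

def prefix_sums(xs):
    """1-D analogue of an integral image: p[i] == sum(xs[:i])."""
    return [0.0] + list(accumulate(xs))

def window_sum(p, start, end):
    """Sum of xs[start:end] in O(1), using the prefix sums p."""
    return p[end] - p[start]

def haar_two_box(p, t, width):
    """Two-box Haar-like feature for the window ending at index t (exclusive):
    sum of the most recent `width` values minus the `width` values before them."""
    recent = window_sum(p, t - width, t)
    earlier = window_sum(p, t - 2 * width, t - width)
    return recent - earlier

# Made-up daily returns, purely for illustration.
returns = [0.01, -0.02, 0.00, 0.01, 0.03, 0.02, -0.01, 0.04]
p = prefix_sums(returns)

# The same feature at several scales, each costing O(1) once p is built.
features = [haar_two_box(p, len(returns), w) for w in (1, 2, 4)]
```

In Viola-Jones proper, a boosting step then picks which of the many cheap features are worth keeping; the analogous step here would be to feed a large bank of these multi-scale features to whatever learner you like and let it do the selection.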
Just like programming, practical ML is best learned by solving a bunch of problems and learning what works, informed by a theoretical framework about what can't possibly work :-)