Demonstrating Your Machine Learning Expertise: Optimizing Breadth vs. Depth

Developing Expertise

When it comes to developing your expertise in Machine Learning, there seem to be two schools of thought:

  • Exemplified by articles that purport to list, for example, the 10 most important methods you need to know to ace a Machine Learning interview, the School of Breadth emphasizes content-oriented objectives. Whether you ramp up through courses and workshops or through full programs (e.g., certificates, degrees), the justification for broadening your knowledge of Machine Learning is self-evident.
  • Find data that interests you, and work with it using a single approach for Machine Learning. Thus the School of Depth emphasizes skills-oriented objectives that are progressively mastered as you delve into data, or better yet, a problem of interest.

Depending upon the factors you currently have under consideration (e.g., career stage, employment status, desired employment trajectory, …), choosing between breadth and depth may provoke an existential crisis when it comes to developing and ultimately demonstrating your expertise in Machine Learning – with a modicum of apologies if that strikes you as a tad melodramatic.

Demonstrating Expertise

Somewhat conflicted, at least, is how I honestly feel at the moment.

On Breadth

Even a rapid perusal of the Machine Learning specific artifacts I’ve self-curated into my online, multimedia Data Science Portfolio makes one thing glaringly evident: The breadth of my exposure to Machine Learning has been somewhat limited. Specifically, I have direct experience with classification and Natural Language Processing in Machine Learning contexts from the practitioner’s perspective. The more-astute reviewer, however, might look beyond the ‘pure ML’ sections of my portfolio and afford me additional merit for (say) my mathematical and/or physical sciences background, plus my exposure to concepts directly or indirectly applicable to Machine Learning – e.g., my experience as a scientist with least-squares modeling counting as exposure at a conceptual level to regression (just to keep this focused on breadth, for the moment).

True confession: I’ve started more than one course in Machine Learning in a blunt-instrument attempt to address this known gap in my knowledge of relevant methods. Started is, unfortunately, the operative word, as (thus far) I have not followed through on any of these attempts – even when there are options for community, accountability, etc. to better ensure success. (Though ‘life got in the way’ of me participating fully in the fast.ai study group facilitated by the wonderful team that delivers the This Week in Machine Learning & AI Podcast, such approaches to learning Machine Learning remain appealing in principle, however inconsistent my own engagement was.)

On Depth

What then about depth? Taking the self-serving but increasingly concrete example of my own Portfolio, it’s clear that (at times) I’ve demonstrated depth. Driven by an interesting problem aimed at improving tsunami alerting by processing data extracted from Twitter, for example, the deepening progression with co-author Jim Freemantle has been as follows:

  1. Attempt to apply an existing knowledge-representation framework to the problem by extending it (the framework) to include graph analytics
  2. Introduce tweet classification via Machine Learning
  3. Address the absence of semantics in the classification-based approach through the introduction of Natural Language Processing (NLP) in general, and embedded word vectors in particular
  4. Next steps …
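
To make the move from step 2 to step 3 a little more concrete, here is a minimal sketch of why a purely word-count-based representation lacks semantics. It assumes a bag-of-words representation compared by cosine similarity, which is an illustrative choice on my part and not necessarily what was used in the actual project; the tweets, function names and variable names are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens (no semantics involved)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical tweets, invented purely for illustration.
tweet_a = "huge wave hits the coast"
tweet_b = "tsunami strikes shoreline"        # similar meaning, no shared words
tweet_c = "huge wave warning for the coast"  # shares several words with tweet_a

sim_paraphrase = cosine_similarity(bag_of_words(tweet_a), bag_of_words(tweet_b))
sim_overlap = cosine_similarity(bag_of_words(tweet_a), bag_of_words(tweet_c))

print("a vs b (same meaning, different words):", sim_paraphrase)
print("a vs c (shared words):", sim_overlap)
```

Because the first two tweets share no tokens, a count-based representation treats them as entirely unrelated even though a human reader would not. Embedded word vectors are one way of closing that gap, since words with related meanings can end up close together in the vector space.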

(Again, please refer to my Portfolio for content relating to this use case.) Going deeper, in this case, is not a demonstration of a linear progression; rather, it is a sequence of outcomes realized through experimentation, collaboration, consultation, etc. For example, the seed to introduce Machine Learning into this tsunami-alerting initiative was planted on the basis of informal discussions at an oil and gas conference … and, later, the introduction of embedded word vectors was similarly the outcome of informal discussions at a GPU technology conference.

Whereas these latter examples are intended primarily to demonstrate the School of Depth, it is clear that the two schools of thought aren’t mutually exclusive. For example, in delving into a problem of interest, Jim and I may have deepened our mastery of specific skills within NLP; however, we have also broadened our knowledge within this important subdomain of Machine Learning.

One last thought here on depth. At the outset, neither Jim nor I had as an objective any innate desire to explore NLP. Rather, the problem, and more importantly the demands of the problem, caused us to ‘gravitate’ towards NLP. In other words, we are wedded more to making scientific progress (on tsunami alerting) than to a specific method for Machine Learning (e.g., NLP).

Next Steps

Net-net then, it appears to be that which motivates us that dominates in practice – in spite, perhaps, of our best intentions. In my own case, my existential crisis derives from being driven by problems into depth, while at the same time seeking to demonstrate a broader portfolio of expertise with Machine Learning. To be more specific, there’s a part of me that wants to apply LSTMs (for example) to the tsunami-alerting use case, whereas another part knows I must broaden (at least a little!) my portfolio when it comes to methods applicable to Machine Learning.

Finally then, how do I plan to address this crisis? For me, it’ll likely manifest itself as a two-pronged approach:

  1. Enrol in, and follow through on, a course (at least!) that exposes me to one or more methods of Machine Learning that complement my existing exposure to classification and NLP.
  2. Identify a problem, or problems, of interest that allow me to deepen my mastery of one or more of these ‘newly introduced’ methods of Machine Learning.

In a perfect situation, perhaps we’d emphasize breadth and depth. However, when you’re attempting to introduce, pivot, re-position, etc. yourself, a trade-off between breadth and depth appears to be inevitable. An introspective reflection, based upon the substance of a self-curated portfolio, appears to be an effective and efficient means of identifying gaps and roadmapping how they can ultimately be addressed.

Postscript

In many settings/environments, Machine Learning, and Data Science in general, are team sports. Clearly then, a viable way to address the challenges and opportunities presented by depth versus breadth is to hire accordingly – i.e., hire for depth and breadth in your organization.