Ian Lumb’s Data Science Portfolio
I had the very fortunate opportunity to present some of my research at GTC 2017 in Silicon Valley. Even after 3 months, I found GTC to be of lasting impact. However, my immediate response to the event was to reflect upon my mathematical credentials – credentials that would allow me to pursue Deep Learning with the increased breadth and depth demanded by my research project. I crystallized this quantitative reflection into a very simple question: Do I need to go back to school? (That is, back to school to enhance my mathematical credentials.)
There were a number of outcomes from this reflection upon my math creds for Deep Learning. Although the primary outcome was a mathematical ‘gap analysis’, a related outcome is this Data Science Portfolio that I’ve just started to develop. You see, after I reflected upon my mathematical credentials, it was difficult not to broaden and deepen that reflection; so, in a sense, this Data Science Portfolio is an outcome of that more-focused reflection.
As with the purely mathematical reflection, the effort I’m putting into self-curating my Data Science Portfolio allows me to showcase existing contributions (the easy part), but simultaneously raises interesting challenges and opportunities for future efforts (the difficult part). More on the future as it develops …
For now, the portfolio is organization into two broad categories:
- Data Science Practitioner – intended to showcase my own contributions towards the practice of Data Science
- Data Science Enabler – intended to showcase those efforts that have enabled other Data Scientists
At the end, and for now, there is a section on my academic background – a background that has shaped so much of those intersections between science and technology that have been captured in the preceding sections of the portfolio.
Although I expect there’ll be more to share as this portfolio develops, I did want to share one observation immediately: When placed in the context of a portfolio, immune to the chronological tyranny of time, it is fascinating to me to see themes that form an arc through seemingly unrelated efforts. One fine example is the matter of semantics. In representing knowledge, for example, semantics were critical to the models I built using self-expressive data (i.e., data successively encapsulated via XML, RDF and ultimately OWL). And then again, in processing data extracted from Twitter via Natural Language Processing (NLP), I’m continually faced with the challenge of ‘retaining’ a modicum of semantics in approaches based upon Machine Learning. I did not plan this thematic arc of semantics; it is therefore fascinating to see such themes exposed – exposed particularly well by the undertaking of portfolio curation.
There’s no shortage of Data Science portfolios to view. However one thing that’s certain, is that these portfolios are likely to be every bit as diverse and varied as Data Science itself, compounded by the uniqueness of the individuals involved. And that, of course, is a wonderful thing.
Thank you for taking the time to be a traveller at the outset of this journey with me. If you have any feedback whatsoever, please don’t hesitate to reach out via a comment and/or email to ian [DOT] lumb [AT] gmail [DOT] com. Bon voyage!