Disclosures Regarding My Portfolios: Attributing the Contributions of Others

‘Personal’ Achievement?

October 8, 2018 was an extremely memorable night for Drew Brees at the Mercedes-Benz Superdome in New Orleans. Under the intense scrutiny of Monday Night Football, the quarterback of the New Orleans Saints became the leading passer in the history of the National Football League. (For those not familiar with this sport, you can think of his 72,103-yard milestone as a lifetime-achievement accomplishment of ultramarathon’ic proportions.) The narrative on Brees’ contributions to ‘the game’ is anything but complete. In fact, the longer he plays, the more impressive this milestone becomes, as he continues to place distance between himself and every other NFL QB.

Of course the record books, and Brees’ inevitable induction into the Pro Football Hall of Fame, will all position this as an individual-achievement award. Whenever given the opportunity to reflect upon seemingly personal achievements such as the all-time passing leader, Brees is quick to acknowledge those who have enabled him to be so stunningly successful in such a high-profile, high-pressure role – from family and friends, to teammates, coaches, and more.

Like Tom Brady, another NFL quarterback I wrote about in a recent post, Brees remains a student-of-the-game. He is also known for an off-the-field work ethic that he practices with the utmost intensity in preparing for those moments when he takes the main stage along with his team. Therefore, when someone like Brees shares achievements with those around him, the act is clearly sincere.

Full Disclosure

At the very least, self-curating and sharing in public some collection of your work has more than the potential to come across as an act of blatant self-indulgence – and, of course, to some degree it is! At the very worst, however, such an effort could come across as a purely individual contribution. Because contribution matters so much to me personally, I wanted to ensure that any portfolio I self-curate includes appropriate disclosures; disclosures that acknowledge the importance of collaboration, opportunity, support, and so on, from my family, friends and acquaintances, peers and co-workers, employers, customers and partners, sponsors, and more. In other words, and though in a very different context, like Brees I want to ensure that what comes across as ‘My Portfolio’ rightly acknowledges that this too is a team sport.

In the interests of generic disclosures then, the following is an attempt to ensure the efforts of others are known explicitly:

  • Articles, book chapters and posters – Based on authorships, affiliations and acknowledgements, portfolio artifacts such as articles, book chapters and posters make explicit collaborators, enablers and supporters/influencers, respectively. In this case, there’s almost no need for further disclosure.
  • Blog posts – Less formal than the written and oral forms of communication already alluded to above and below, it’s through the words themselves and/or hyperlinks introduced that the contributions of others are gratefully and willingly acknowledged. Fortunately, it is common practice for page-ranking algorithms to take into account the words and metadata that collectively comprise blog posts, and appropriately afford Web pages stronger rankings based upon these and other metrics.
  • Presentations – My intention here is to employ Presentations as a disclosure category for talks, webinars, workshops, courses, etc. – i.e., all kinds of oral communications that may or may not be recorded. With respect to this category, my experience is ‘varied’ – e.g., in not always allowing for full disclosure regarding collaborators, though less so regarding affiliations. Therefore, to make collaborators as well as supporters/influencers explicit, contribution attributions are typically included in the materials I’ve shared (e.g., the slides corresponding to my GTC17 presentation) and/or through the words I’ve spoken. Kudos are also warranted for the organizations I’ve represented in some of these cases, as it has been a byproduct of this representation that numerous opportunities have fallen into my lap – though often owing to a sponsorship fee, to be completely frank. Finally, sponsoring organizations are also deserving of recognition, as it is often their mandate (e.g., a lead-generation marketing program that requires a webinar, a call for papers/proposals) that inspires what ultimately manifests itself as some artifact in one of my portfolios; having been on the event-sponsor’s side more than a few times, I am only too well aware of the effort involved in creating the space for presentations … a contribution that cannot be ignored.

Across these categories, disclosures regarding contribution range from explicit to vague, and from clearly to barely evident. Regardless, for those portfolios shared via my personal blog (Data Science Portfolio and Cloud Computing Portfolio), suffice it to say that there were always others involved. I’ve done my best to make those contributions clear; however, I’m sure that unintentional omissions, errors and/or (mis)representations exist. Given that these portfolios are intentionally positioned and executed as works-in-progress, I look forward to addressing matters as they arise.

Ian Lumb’s Cloud Computing Portfolio

When I first introduced it, it made sense to me (at the time, at least!) to divide my Data Science Portfolio into two parts; the latter part was “… intended to showcase those efforts that have enabled other Data Scientists” – in other words, my contributions as a Data Science Enabler.

As of today, most of what was originally placed in that latter part of my Data Science Portfolio has been transferred to a new portfolio – namely one that emphasizes Cloud computing. Thus my Cloud Computing Portfolio is a self-curated, online, multimedia effort intended to draw together into a cohesive whole my efforts in Cloud computing; specifically this new Portfolio is organized as follows:

  • Strictly Cloud – A compilation of contributions in which Cloud computing takes centerstage
  • Cloud-Related – A compilation of contributions drawn from clusters and grids to miscellany. Also drawn out in this section, however, are contributions relating to containerization.

As with my Data Science Portfolio, you’ll find in my Cloud Computing Portfolio everything from academic articles and book chapters, to blog posts, to webinars and conference presentations – in other words, this Portfolio also lives up to its multimedia billing!

Since this is intentionally a work-in-progress, like my Data Science Portfolio, feedback is always welcome, as there will definitely be revisions applied!

Revisiting the Estimation of Fractal Dimension for Image Classification

Classification is a well-established use case for Machine Learning. Textbook examples abound; standard ones include the classification of email into ham versus spam, or of images into cats versus dogs.

Circa 1994, I was unaware of Machine Learning, but I did have a use case for quantitative image classification. I expect you’re familiar with those brave souls known as The Hurricane Hunters – brave because they explicitly seek to locate the eyes of hurricanes using an appropriately tricked out, military-grade aircraft. Well, these hunters aren’t the only brave souls when it comes to chasing down storms in the pursuit of atmospheric science. In an effort to better understand Atlantic storms (i.e., East Coast, North America), a few observational campaigns featured aircraft flying through blizzards at various times during Canadian winters.

In addition to standard instrumentation for atmospheric and navigational observables, these planes were tricked out in an exceptional way:

For about two-and-a-half decades, Knollenberg-type [ref 4] optical array probes have been used to render in-situ digital images of hydrometeors. Such hydrometeors are represented as a two-dimensional matrix, whose individual elements depend on the intensity of transmitted light, as these hydrometeors pass across a linear optical array of photodiodes. [ref 5]

In other words, the planes were equipped with underwing optical sensors that had the capacity to obtain in-flight images of

hydrometeor type, e.g. plates, stellar crystals, columns, spatial dendrites, capped columns, graupel, and raindrops. [refs 1,7]

(Please see the original paper for the references alluded to here.)

Even though this is hardly a problem in Big Data, a single flight might produce tens to hundreds to thousands of hydrometeor images that needed to be manually classified by atmospheric scientists. Working for a boutique consultancy focused on atmospheric science, and having excellent relationships with Environment Canada scientists who make Cloud Physics their express passion, an opportunity to automate the classification of hydrometeors presented itself.

Around this same time, I became aware of fractal geometry – a visually arresting and quantitative description of nature popularized by proponents such as Benoit Mandelbrot. Whereas simple objects (e.g., lines, planes, cubes) can be associated with an integer dimension (e.g., 1, 2 and 3, respectively), objects in nature (e.g., a coastline, a cloud outline) can be better characterized by a fractional dimension – a real-valued fractal dimension that lies between the integer value for a line (i.e., 1) and the integer value for a plane (i.e., 2).
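To make the idea of a fractional dimension a little more concrete, here is a minimal sketch of one common estimator, box counting, applied to a two-dimensional binary image. This is an illustration only: the function name `box_counting_dimension` and its parameters are my own assumptions, and box counting is not necessarily one of the algorithms used in the original research.

```python
import numpy as np


def box_counting_dimension(image, box_sizes=None):
    """Estimate the box-counting dimension of a 2-D binary image.

    Illustrative sketch only; not necessarily the estimator used in the
    original hydrometeor study.
    """
    pixels = np.asarray(image, dtype=bool)
    if pixels.ndim != 2 or not pixels.any():
        raise ValueError("expected a non-empty 2-D binary image")

    if box_sizes is None:
        # Powers of two, from 1 pixel up to half the shorter side.
        max_power = int(np.log2(min(pixels.shape)))
        box_sizes = [2 ** k for k in range(max_power)]

    counts = []
    for size in box_sizes:
        # Zero-pad so that both sides are exact multiples of the box size.
        rows = -(-pixels.shape[0] // size) * size
        cols = -(-pixels.shape[1] // size) * size
        padded = np.zeros((rows, cols), dtype=bool)
        padded[: pixels.shape[0], : pixels.shape[1]] = pixels

        # Group pixels into size-by-size boxes and count the occupied ones.
        blocks = padded.reshape(rows // size, size, cols // size, size)
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))

    # The dimension is the slope of log(count) against log(1 / box size).
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope


# Sanity checks on simple shapes with known integer dimensions.
filled_square = np.ones((64, 64))
straight_line = np.zeros((64, 64))
straight_line[32, :] = 1
print(box_counting_dimension(filled_square))  # close to 2
print(box_counting_dimension(straight_line))  # close to 1
```

A filled square and a straight line recover the integer dimensions of a plane and a line, respectively; a ragged outline, such as that of a hydrometeor image, would be expected to land somewhere in between.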

Armed with an approach for estimating fractal dimension then, my colleagues and I sought to classify hydrometeors based on their subtle to significant geometrical expressions. Although the idea was appealing in principle, the outcome on a per-hydrometeor basis was a single, scalar result that attempted to capture geometrical uniqueness. In isolation, this approach was simply not enough to deliver an automated scheme for quantitatively classifying hydrometeors.

I well recall some of the friendly conversations I had with my scientific and engineering peers who attended the conference at Montreal’s Ecole Polytechnique. Essentially, the advice I was given was to regard the work I’d done as a single dimension of the hydrometeor classification problem. What I really needed to do was develop additional dimensions for classifying hydrometeors. With enough dimensions then, the resulting multidimensional classification scheme would have a much better chance of delivering the automated solution sought by the atmospheric scientists.
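The advice above can be sketched with a toy example. Everything in it is hypothetical: the two classes (`type_a`, `type_b`), the second descriptor, and all of the numbers are placeholders I made up to illustrate the point, not measurements from the original work. The nearest-centroid rule is likewise just one simple choice of classifier.

```python
import numpy as np


def nearest_centroid_fit(features, labels):
    """Return the class labels and the mean feature vector of each class."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.array([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def nearest_centroid_predict(classes, centroids, features):
    """Assign each feature vector to the class with the closest centroid."""
    features = np.atleast_2d(np.asarray(features, dtype=float))
    distances = np.linalg.norm(
        features[:, None, :] - centroids[None, :, :], axis=2
    )
    return classes[distances.argmin(axis=1)]


# Placeholder training data: [fractal_dimension, second_descriptor].
# The first column barely differs between the classes; the second does.
train_features = [
    [1.50, 0.20],
    [1.52, 0.25],
    [1.49, 0.90],
    [1.51, 0.95],
]
train_labels = ["type_a", "type_a", "type_b", "type_b"]

classes, centroids = nearest_centroid_fit(train_features, train_labels)
predictions = nearest_centroid_predict(
    classes, centroids, [[1.50, 0.30], [1.51, 0.85]]
)
print(list(predictions))
```

With only the first column, the two classes are practically indistinguishable; adding a second descriptor is what separates them. In practice the descriptors would likely need to be scaled to comparable ranges before distances are computed.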

In my research, fractal dimensions were estimated using various algorithms; they were not learned. However, they could be – as is clear from the efforts of others (e.g., the prediction of fractal dimension via Machine Learning). And though my pursuit of such a suggestion will have to wait for a subsequent research effort, a learned approach might allow for the introduction of much more of a multidimensional scheme for quantitative classification of hydrometeors via Machine Learning. Of course, from the hindsight of 2018, there are a number of possibilities for quantitative classification via Machine Learning – possibilities that I fully expect would result in more useful outcomes.

Whereas fractals don’t receive as much attention these days as they once did, and certainly not anything close to the deserved hype that seems to pervade most discussions of Machine Learning, there may still be some value in incorporating their ability to quantify geometry into algorithms for Machine Learning. From a very different perspective, it might be interesting to see if the architecture of deep neural networks can be characterized through an estimation of their fractal dimension – if only to tease out geometrical similarities that might be otherwise completely obscured.

While I, or (hopefully) others, ponder such thoughts, there is no denying the stunning expression of the fractal geometry of nature that fractals have rendered visual.

Data Science: Celebrating Academic Personal Bias

In a recent post, I introduced my Data Science Portfolio. After describing the high-level organization of the Portfolio, I noted:

At the end, and for now, there is a section on my academic background – a background that has shaped so much of those intersections between science and technology that have been captured in the preceding sections of the portfolio.

Even in this earliest of drafts, I knew that I was somewhat uncomfortable with a section dedicated to academics in my Portfolio. After all, shouldn’t a portfolio place more emphasis on how my knowledge and skills, academic or otherwise, have been applied to produce some tangible artifact?

Upon further reflection, I currently believe what’s material in the context of a portfolio is some indication of the bias inherent in the resulting curated showcase of one’s work. Of course to some degree the works presented, and the curation process itself, will make self-evident such personal bias.

Whereas it may make sense for an artist not to overtly disclose any bias with respect to their craft, or a curated collection of their work, I currently perceive absolutely no downside in sharing my personal bias – a bias that, in my own case, I believe reflects only in positive ways on the Portfolio as well as the individual items included in it.

To this end, and in the spirit of such a positive self-disclosure, my personal bias reflects my formative years in science – a background to which I well recall significant contributions from high school, that were subsequently broadened and deepened as an undergraduate and then graduate student. Even more specifically in terms of personal bias was my emphasis on the physical sciences; a bias that remains active today.

As I’ve started to share, through such posts as the one on the mathematical credentials I bring to Data Science, my choice to pursue the physical sciences was an excellent one – even through the self-critical lens of personal hindsight. An excellent choice, albeit a biased one.

The very nature of Data Science is such that each of us carries our own, wonderfully unique personal bias. As we necessarily collaborate in team, project and organizational settings, I believe it’s important not only that each of us preserves that personal bias, but also that we leverage this perspective as fully and appropriately as possible. As a consequence, it is much more likely that everyone we work with, and everything we work on, will derive maximal value.