Data Science Portfolio

A self-curated multimedia showcase (under construction) of my contributions to Data Science, including Machine Learning.

Disclosures Regarding My Portfolios: Attributing the Contributions of Others


  • Introduction to this Portfolio BLOG
    • What is Data Science? What is a Data Scientist? TODO
    • Am I a Data Scientist? BLOG
  • Personal Bias – Academic | Professional + Joining Sylabs
    • How I Ended Up in Geophysical Fluid Dynamics BLOG
    • On Math Creds for Deep Learning: Back to School Time? [Univa BLOG]
    • Prob & Stats Gaps: Sprinting for Closure BLOG | TEMPLATE – BLANK, EXAMPLE
  • Organization of this Portfolio – explained in the intro BLOG
    • Data Science Practitioner
      • Knowledge Representation
      • Machine Learning
    • Data Science Enabler

Data Science Practitioner

Knowledge Representation (a.k.a. Semantics)

Knowledge Representation Framework: Development

  • Grid-Enabling the Global Geodynamics Project: The Introduction of an XML-Based Data Model PUB MS
  • Grid-Enabling the Global Geodynamics Project: Automatic RDF Extraction from the ESML Data Description and Representation via GRDDL PUB MS
  • Semantically Enabling the Global Geodynamics Project: Incorporating Feature-Based Annotations via XML Pointer Language (XPointer) PUB MS
  • Annotation Modelling with Formal Ontologies: Implications for Informal Ontologies PUB MS
  • Semantic e-Science: From Microformats to Models ABSTRACT

Knowledge Representation Framework: Application

  • Evolving Semantic Frameworks into Network-Enabled Semantic Platforms BLOG | UNPUB MS
  • Creating Actionable Data from an Optical Depth Measurement Network using RDF ABSTRACT
  • Knowledge Maps for Campus IP Networks: From Relational Databases to Relationship-Centric Semantic Models BOOK CHAPTER (BOOK)
  • Towards a Relationship-Centric Knowledge Representation Framework for Situational Awareness in Computer Network Defence BOOK CHAPTER PROPOSAL
  • Towards Earthquake-Tsunami Causality via Data Science: Giraph-Derived Credibility Scores for Data from Twitter POSTER

Machine Learning

Demonstrating Your Machine Learning Expertise: Optimizing Breadth vs. Depth BLOG


  • Fractals and Machine Learning TODO
    • Revisiting the Estimation of Fractal Dimension for Image Classification BLOG
      • Quantitative Classification of Cloud Microphysical Imagery via Fractal-dimension Calculations PUB
  • Refactoring Earthquake-Tsunami Causality and Messaging via Big Data Analytics: The Transformative Potential of Credible Tweets UNPUB MS | SLIDES | GITHUB


  • An Experimental Study of Inertial Waves in a Spheroidal Shell of Rotating Fluid PUB
  • Might Earth’s solid inner core contribute to a precessionally driven geodynamo? ABSTRACT | SLIDES

Natural Language Processing

  • Mitigating Disasters with GPU-Based Deep Learning From Twitter? ABSTRACT | SLIDES | VIDEO
  • GTC 2017 in Silicon Valley: An Event of Lasting Impact [Univa BLOG]
  • Towards Deep Learning from Twitter for Improved Tsunami Alerts and Advisories ABSTRACT | SLIDES
  • Refactoring Earthquake-Tsunami Causality with Big Data Analytics: Towards Establishing The Transformative Potential of Machine Learning from Twitter (Lecture Notes in Computer Science, Springer, in press)
  • Towards Tsunami Informatics: Applying Machine Learning to Data Extracted from Twitter BLOG


  • Genetic Aesthetics: Generative Software Meets Genetic Algorithms BLOG
  • PyTorch Then & Now: A Highly Viable Framework for Deep Learning BLOG

Numerical Modeling

Parallel Computing

Exploration Seismology

  • Seismic migration using Hadoop: How did I get here? [Bright BLOG]
  • RTM using Hadoop: Is There a Case for Migration? ABSTRACT | VIDEO | SLIDES | POSTER
  • Reverse Time Migration via Resilient Distributed Datasets: Towards In-Memory Coherence of Seismic-Reflection Wavefields Using Thunder via Apache Spark ABSTRACT | VIDEO
  • Reverse Time Migration via Resilient Distributed Datasets: Towards In-Memory Coherence of Seismic-Reflection Wavefields using Apache Spark SLIDES
  • Possibilities for Reverse-Time Seismic Migration (RTM) using Apache Spark BLOG
  • Refactoring Reverse Time Migration with Apache Spark: 1. Towards In-Memory Coherence of Seismic-Reflection Wavefields UNPUB MS

Geophysical Fluid Dynamics

  • The period of the free core nutation: towards a dynamical basis for an ‘extra-flattening’ of the core-mantle boundary PUB
  • Task Geometry for Commodity Linux Clusters and Grids: A Solution for Topology-Aware Load Balancing of Synchronously Coupled, Asymmetric Atmospheric Models BOOK CHAPTER

Visual Communication

  • Teaching/Learning Weather and Climate via Pencasting BLOG
  • Pencasting During Lectures in Large Venues BLOG
    • Livescribe Pencasting: Seizing Uncertainty from Success BLOG
  • Pencasting with a Wacom tablet: Time to revisit this option BLOG
  • Nurturing Quantitative Skills for the Physical Sciences through use of Scientific Models BLOG

Data Science Enabler

This second part of my Portfolio focuses on enabling Data Science. Directly below, Machine Learning receives consideration; my Cloud Computing Portfolio showcases work with a less-tangible connection to Machine Learning … and possibly Data Science.

Machine Learning

  • 8 Reasons Apache Spark is So Hot PUB
  • Machine Learning for Big Data Analytics: Scaling In with Containers while Scaling Out on Clusters [Univa SLIDES | WEBINAR]
  • Drilling Deep with Machine Learning as an Enterprise Enabled Micro Service ABSTRACT | BLOG | SLIDES | VIDEO | POSTER
  • Univa and SUSE at SC17: Scaling Machine Learning for SUSE Linux Containers, Servers, Clusters and Clouds [Univa BLOG] | VIDEO
  • Univa and SUSE at SC17: Managing Containerized HPC and AI Workloads on TSUBAME 3.0 [Univa BLOG] | VIDEO