machine learning

Just the Beginning for AI & Science

Something very exciting is happening right now across the landscape of the physical and mathematical sciences: we are finally starting to learn fundamentally new things about how the universe works because of the direct and purposeful use of artificial intelligence/machine learning (AI/ML). Last week, two major results (discussed below) suggested that the long-sought, oft-discussed hope of an AI-driven revolution in science may now be coming to fruition. To be sure, AI/ML has been an integral part of our scientific workflows for years.

Serverless Distributed Decision Forests with AWS Lambda

Within the team at GE Digital, we have monthly "edu-hackdays" where the entire tech team spends the whole day trying to learn and implement promising new approaches to some portion of our machine-learning-based workflow. In the past, we worked on algorithm hacks and on methods for distributed featurization. Some of what we start on those days eventually goes into production, but most does not. The main goal (apart from the team building that comes with the fun and pain of all-day hacks) is to create collective knowledge and experience around important components of our stack. Recently we had an edu-hackday on strategies for distributed learning. This post captures (and hopefully provides some motivation for) the work I did at that hackday in April.

Towards Cost-Optimized Artificial Intelligence

If accuracy improves with more computation, why not throw in more time, people, hardware, and the concomitant energy costs? Seems reasonable, but this approach misses the fundamental point of doing machine learning (and, more broadly, AI): it is a means to an end. And so we need to have a little talk about cost-optimization, encompassing a much wider set of cost-assignable components than is usually discussed in academia, industry, and the press. Viewing AI as a global optimization over cost (i.e., dollars) puts the work throughout all parts of the value chain in perspective (including the driving origins of new specialized chips, such as IBM TrueNorth and Google's Tensor Processing Unit). Done right, it will lead to, by definition, better outcomes.

Cache Ugly Reporting Queries With Materialized Views and Docker

Confidence and trust in your SaaS product depend, in part, on the continual conveyance of the value of the solution you provide. The reporting vectors (web-based dashboards, daily emails, etc.) obviously depend upon the specifics of your product and your engagement plan with your customers. But underlying all sorts of reporting is the need to derive hard metrics from databases: What's the usage of your application by seat? How has that driven value/efficiency for them? What are the trends and anomalies worth calling out? The bad news is that many of the most insightful metrics require complex joins across tables, and as you scale out to more and more customers, queries across multitenant databases will take longer and longer. The good news is that, unlike for interactive exploration and real-time monitoring and alerting use cases, many of the queries against your production databases can be lazy and done periodically. We needed a way to cache and periodically update long-running/expensive queries so that we could have more responsive dashboards for our customers and our implementation engineers. After some research, including exploration with third-party vendors, we settled on leveraging materialized views. This is a brief primer on a lightweight caching/update solution that uses materialized views coupled with Docker.
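To make the caching idea concrete, here is a minimal sketch of the two statements such a setup revolves around: one that defines the cached query and one that a periodic job (for example, a cron entry inside a container) would run to refresh it. It assumes PostgreSQL syntax, which the excerpt does not specify, and the helper names and the `seat_usage_daily` view are hypothetical, used only for illustration.

```python
import re

# Assumption: PostgreSQL-style materialized views. The excerpt does not name
# the database, so treat the SQL below as illustrative only.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name: str) -> str:
    """Reject view names that are not plain SQL identifiers."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def create_view_sql(view_name: str, select_sql: str) -> str:
    """Build the statement that stores the result of an expensive query."""
    name = _check_identifier(view_name)
    query = select_sql.strip().rstrip(";")
    return f"CREATE MATERIALIZED VIEW IF NOT EXISTS {name} AS {query};"


def refresh_view_sql(view_name: str, concurrently: bool = False) -> str:
    """Build the statement a periodic job would run to update the cache.

    In PostgreSQL, CONCURRENTLY avoids blocking readers during the refresh
    but requires a unique index on the view.
    """
    name = _check_identifier(view_name)
    keyword = "CONCURRENTLY " if concurrently else ""
    return f"REFRESH MATERIALIZED VIEW {keyword}{name};"


if __name__ == "__main__":
    # Hypothetical reporting query: application usage per seat per day.
    print(create_view_sql(
        "seat_usage_daily",
        "SELECT seat_id, date_trunc('day', ts) AS day, count(*) AS events "
        "FROM usage_events GROUP BY 1, 2",
    ))
    print(refresh_view_sql("seat_usage_daily"))
```

Dashboards would then read from the view rather than re-running the joins, while the refresh statement runs on whatever schedule the reporting freshness requires.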

A Test for Artificial Creativity

I just posted a piece on Medium about using crosswords as a Turing-like test for artificial creativity. It just happens to coincide with the week of the 100th anniversary of the crossword puzzle and Alan Turing's pardon. Read on…

Wealth and Labor in the Cognitive Automation Era

Disruptive technologies have always been greeted with concern, and many times a backlash, from the institutions that they disrupt or are meant to disrupt. In the startup world, we think about disruption as replacing established technologies and ways of doing things with compelling (and better) alternatives, challenging dominant market incumbents. But disruption also means changing how people work, and therefore also means upheaval in labor markets. Yesterday, at the Uncharted Forum here in Berkeley, I discussed artificial intelligence on stage with former Deputy Assistant Secretary of the Treasury Brad DeLong (also a professor with me at Berkeley).

Is there an Uncanny Valley of Machine Intelligence?

Roboticists know a lot about the uncanny valley, that uncomfortable place in utility and appearance where robots look and act almost, but not exactly, lifelike. On one side of the valley, diligent self-propelled vacuum cleaners make our domestic lives easier, and on the other side (the stuff of sci-fi for now) is the promise of human replicants doing manual work that no real human could do or would want to do for the pay involved.