MIT Technology Review, on the challenges to reproducibility in AI research:
According to the 2020 State of AI report, […] only 15% of AI studies share their code. Industry researchers are bigger offenders than those affiliated with universities. In particular, the report calls out OpenAI and DeepMind for keeping code under wraps. Then there’s the growing gulf between the haves and have-nots when it comes to the two pillars of AI, data and hardware. Data is often proprietary, such as the information Facebook collects on its users, or sensitive, as in the case of personal medical records. And tech giants carry out more and more research on enormous, expensive clusters of computers that few universities or smaller companies have the resources to access.
All that private data, though, is the symptom, not the disease. To call AI research a revolving door between industry and academia is to abuse the metaphor: Most researchers have a foot in both worlds. The result is a predictable clash between a race for winner-take-all profits and the norms of science. The evolution of OpenAI—from nonprofit to secretive money-grubber—is a fractal version of what’s unfolding. Data transparency won’t fix that.