Research Coding Accountability

  1. I believe that the single biggest reason why scientists do not make their code generally available is that they are ashamed of it.
  2. .@luispedrocoelho @iddux Then, scientists need to internalize that releasing code is more important than their personal egos.
  3. @luispedrocoelho @rbuels There's much effort involved in getting code up from "works fine in the lab" to "good enough to put on sourceforge"
  4. .@rbuels @iddux Actually, they need to write better code :) I release my code mostly *to force myself* to write good code.
  5. @iddux @rbuels if it's not good enough to make public, why is it good enough to base publications on?
  6. @luispedrocoelho @rbuels But I'm not supposed to spend my lab's research time on handling input contingencies and user documentation.
  7. @luispedrocoelho @rbuels If no resources exist to write hardened code, a journal mandate is hollow.
  8. @luispedrocoelho @rbuels IMHO bringing code to distro level == 0.2-0.4 FTE. ~$50K/yr. That means losing 1 postdoc (benefits inc.)
  9. @luispedrocoelho @rbuels Until funders mandate & fund code distro, research(labs writing good codebase) << research(labs that don't).
  10. @luispedrocoelho @rbuels Most labs do not have $$$ to produce distributed level code. Most scientific code == pipeline hacks.
  11. @luispedrocoelho @rbuels Same reason you should trust clinical results produced from non-distributable patients.
  12. @rbuels @luispedrocoelho Lab that does that ends up with rep for distributing messy code & not supporting it. Liability issues aside.
  13. @luispedrocoelho @rbuels because in the lab you do not need INSTALL README and doc/ files to the extent you need on sourceforge.
  14. @iddux @rbuels of course, it's easy to fake experimental data, but I think unintentional biases are more common.
  15. @iddux @rbuels I also do both as do most people around me. You get away with code practices that would get you told off at the bench.
  16. @iddux @rbuels econometrics journals mostly require code. And people do check and invalidate others' results based on analysis bugs
  17. @iddux @rbuels journals can make that call too. In some areas, it's already standard.
  18. @rbuels @iddux also, typically wet lab experiments are given a lot more thought and are more standardized than computational pipelines.
  19. @rbuels @iddux you are supposed to describe your experimental methods in detail in a publication.
  20. @iddux @rbuels not the same. In clinical trials, the data disclosure standards are pretty high.
  21. @rbuels @iddux it needn't be reusable, just reproducible and verifiable. If it's such a mess, why should I trust the results?
  22. @iddux @rbuels making it reusable is hard work, making it public is not. All it takes is adding an extra supplemental file to your paper

Iddo Friedberg

Computational biologist. Interested in gene and protein function prediction, prediction of function from structure, metagenomics http://iddo-friedberg.net
