The way to manage this is to make your code more modular: factor out subcomponents and share them so they can be critiqued separately. For example, in machine learning frameworks, autograd is a separate package for automatic differentiation. I actively post questions and answers about subcomponents on Stack Overflow.
There are still idiots on GitHub using Python 2 for new projects. These are usually academics who don't actually share any love for technology. Their code will never be used by anyone.
Critical FORTRAN libraries are still the center of certain types of analysis, and they're often used by _everyone_ in the field.
The reason for that stems from two issues:
1. Making accurate simulation libraries for things like weather formation, magnetic field formation, or hydraulic fracturing requires crazy in-depth knowledge of the systems you are modelling, and takes years of implementation and hypothesis testing to ensure they are accurate enough for academic use.
2. Academics are not software developers; they don't get paid for the software they write, they get paid for papers. They get no reward for rewriting those crazy in-depth libraries every decade in the new language du jour.
This is a perfect example of why the Python 2 to Python 3 hard-cutoff transition was a terrible idea. "Just update your software" is something you can say to professional software developers with a straight face. You can't say it to academics, or mechatronics engineers, or real estate agents, or photographers, or _anyone else_ who has a job that benefits from a little bit of software but is mostly about delivering something other than software.
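To make the "just update your software" problem concrete, here is a minimal sketch of a few Python 2 idioms that silently change meaning or break outright under Python 3 (run under Python 3):

```python
# 1. Integer division: Python 2's `5 / 2` returned 2; Python 3 returns 2.5.
#    Code relying on truncating division silently gives different numbers.
assert 5 / 2 == 2.5   # true division in Python 3
assert 5 // 2 == 2    # floor division now requires the explicit // operator

# 2. print is a function, not a statement: `print "hi"` is a SyntaxError
#    in Python 3, so old scripts fail before a single line executes.
print("hi")

# 3. Text vs. bytes: string literals are Unicode by default in Python 3,
#    and mixing str and bytes raises a TypeError instead of auto-coercing.
assert isinstance("hi", str) and isinstance(b"hi", bytes)
```

The division change is the nastiest for scientific code: it doesn't crash, it just quietly produces different results until someone notices the numbers are wrong.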
There has been a pledge by the largest group of Python developers not to mess around with Python 2. So if you want your code to keep running indefinitely, without any concern for all the new and exciting language ideas being loaded into Python 3, then Python 2 is the logical choice.
I've heard of this - people treating Python 2.7 as an "LTS" version of the language, receiving bug fixes but otherwise remaining unchanged for many years.
I wonder, once 2.7 is EOL in a year's time, whether the language developers will consider a "3.x LTS" release?
Uber's software doesn't exist in the public realm, and so there is nothing to learn from it. It is imaginary fluff. I have posted a list of the ones that do actually exist in another comment. And you should've at least got a 1080.
It actually gives me a lot of ideas and validation for concepts I was considering for my project https://github.com/polyaxon/polyaxon, namely going from a trained model to serving it on other platforms in a seamless way. I also think high-level articles like this one are really useful for people trying to build internal tools around ML/AI ops: they can get a sense of what other companies are doing, how those companies manage their ML pipelines, and how that impacts the productivity of their teams.
You should be getting ideas from actual open source projects that actually exist, not from imaginary/fake ones. There are numerous serving packages that actually exist and offer ideas.
I wanted a portable laptop, and 1070s were the best I saw among what was available. What GPU laptop brand do you suggest?
I got some ideas from reading the article, but yes, I agree with you that the article should have a link to a public git repo for at least some of their tooling.
EDIT: I wanted the multi-GPU Lambda Labs box, but my lifestyle requires some mobility. I agree that the 1070 is a bit weak, but it's still better than no GPU, and I like the System76 laptop I bought even when I'm not using the GPU.
You will get a lot more ideas from software that actually exists. I can easily write a fake blog post with fake thoughts giving you fake ideas, but that's not what you want; you deserve ideas that work.
Not every company needs to open source their tools. Uber has contributed a substantial number of open source tools. This product is clearly new and is 99% likely tied to their internal stack, meaning it would be useless to you anyway.
Make your own tool and open source it if you are going to be so mean. I would never want to share my hard work with people who act this way.
Who cares about this garbage if the tool isn't even open source? There are lots of ML deployment tools that are open source. I know haters will downvote my post, but it's the truth. If I can't actually fork and evaluate a tool, it is hyped up garbage to me.
Meanwhile, here is a list of open source ML deployment packages:
FWIW, the "Guidelines" ask you to please not dismiss work out of hand. Appearances matter, too - you can point out that a tool is proprietary and has lots of plausible FLOSS alternatives without being uncivil about it ("garbage").
Edited: Commenting about the voting, especially in the way you did here, is inappropriate too. We're here for substantive discussion.
I agree with you, but I also think there’s room to point out that Uber has lost credibility and it wouldn’t be surprising or inconsistent if it was just a PR post for recruiting hype. Without details (not mere surface comments) on how it is differentiated from the many other available solutions and deep dives into what use cases it is specifically better suited for, it seems reasonable to treat it with a lot of skepticism.
But to be clear, I totally agree in terms of the degree and tone. I don’t think the parent comment you responded to was “uncivil” in any way, but overly dismissive instead of just noting to be skeptical.
Skepticism is one thing, lack of civility is quite another (and I'm sorry, but calling stuff "garbage" in such an off-hand way is clearly uncivil to me, especially when shallow dismissals are expressly pointed out as a problem). See brylie's comment in this thread for how to do it right - take that, add a stronger caveat about this tool's proprietary status, in contrast to so much other stuff in this space being FLOSS, and it would make your point quite well without degrading discussion.
https://github.com/tensorflow/adanet looks effectively modular to me, although I wouldn't hold it up as the epitome of modularity. One needs to be able to merge, move, and remove modules too.
But in general, I don't think there's a requirement to share code alongside a paper submission. There are arguments for [1] doing so, but it's definitely not a universal practice.
IMHO this has to become a universal practice. If you don’t wanna share code with your publication, how are other researchers going to corroborate your findings?
Publishing your research without the code is like stopping a step short. Even Yoshua Bengio is a staunch proponent of sharing code with publications.
I completely agree. For a datastore in Python, I often just use a pandas DataFrame or similar. For random label-based access, it can have an index too.
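A minimal sketch of that idea (the column names and labels here are made up for illustration): a DataFrame with a label index gives you dict-like random access plus all the usual tabular operations for free.

```python
import pandas as pd

# Hypothetical in-memory "datastore": a DataFrame keyed by a label index.
records = pd.DataFrame(
    {"price": [9.99, 14.50, 3.25], "stock": [12, 0, 7]},
    index=["widget", "gadget", "gizmo"],  # labels enable random access
)

# Random label-based access via .loc
assert records.loc["gadget", "price"] == 14.50

# Positional access still works via .iloc
assert records.iloc[0]["stock"] == 12

# And you keep vectorized queries, e.g. everything that's in stock:
in_stock = records[records["stock"] > 0]
assert list(in_stock.index) == ["widget", "gizmo"]
```

For small-to-medium datasets this often beats designing a custom class hierarchy around the data.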
Unfortunately, when asked to design an OOP architecture in a job interview, if you don't adhere to its religious, enterprisey notions, you risk failing the interview.