I'm skeptical...
'Writing computer programs could become as easy as searching the Internet. A Rice University-led team of software experts has launched an $11 million effort to create a sophisticated tool called PLINY that will both “autocomplete” and “autocorrect” code for programmers, much like the software that completes search queries and corrects spelling on today’s Web browsers and smartphones.'
Interested, but very, very skeptical...
"When someone says 'I want a programming language in which I need only say what I wish done,' give him a lollipop."
If we look at duplicate code across all projects on the Internet, maybe we can simply pull completions from a larger database.
http://stackoverflow.com/questions/191614/how-to-detect-code...
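For a rough idea of what "duplicate code" detection could look like, here is a minimal sketch, not what PLINY or any real clone detector does. It hashes every run of `window` non-blank lines after stripping whitespace and reports runs that occur more than once. The function names, the window size, and the whitespace-only normalization are all my own assumptions; real tools normalize identifiers and compare token streams or syntax trees.

```python
import hashlib
from collections import defaultdict


def normalize(line: str) -> str:
    """Remove all whitespace so formatting differences do not hide duplicates.

    This is deliberately crude (assumption): it ignores renamed variables.
    """
    return "".join(line.split())


def find_duplicate_windows(files, window=3):
    """Map a hash of each `window`-line run to the (file, start_line) places it occurs.

    `files` is a dict of file name -> source text. Only runs seen more than
    once are returned. Blank lines are skipped before windows are formed.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep the original 1-based line number next to each normalized line.
        numbered = [
            (number, normalize(line))
            for number, line in enumerate(text.splitlines(), start=1)
        ]
        numbered = [(number, line) for number, line in numbered if line]
        for k in range(len(numbered) - window + 1):
            chunk = numbered[k:k + window]
            joined = "\n".join(line for _, line in chunk)
            digest = hashlib.sha1(joined.encode("utf-8")).hexdigest()
            seen[digest].append((name, chunk[0][0]))
    return {digest: places for digest, places in seen.items() if len(places) > 1}


if __name__ == "__main__":
    demo = {
        "a.py": "x = 1\nfor i in range(3):\n    print(i)\n    x += i\nprint(x)\n",
        "b.py": "y = 2\n\nfor i in range(3):\n  print(i)\n  x += i\n",
    }
    for places in find_duplicate_windows(demo).values():
        print(places)
```

Scaling this to "all projects on the Internet" is the hard part; the sketch only shows the basic indexing idea.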
Anyway, it's easy to say that it can't be done. It's probably more worthwhile to try and tackle the problem and make some forward progress.
[Update]
Just saw this HN submission.
https://news.ycombinator.com/item?id=8562635
Microsoft can now autocomplete C# code from your comments, using the Bing Code Search engine.
///how to read file pth line by line<TAB>
Pretty slick.
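I have no idea how the Bing integration works internally, so take this as a toy sketch of the general idea only: treat the comment as a search query and return the best-matching snippet from an index. The `SNIPPETS` table, the function names, and the Jaccard word-overlap scoring are all hypothetical choices of mine; the C# strings are just stored text.

```python
import re

# Hypothetical snippet index (assumption): description -> C# snippet text.
SNIPPETS = {
    "read file line by line":
        "foreach (var line in File.ReadLines(path)) { Console.WriteLine(line); }",
    "write text to file":
        "File.WriteAllText(path, text);",
    "parse integer from string":
        "int value = int.Parse(s);",
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def complete_from_comment(comment):
    """Return the snippet whose description best overlaps the comment, or None."""
    query = tokens(comment)
    best_code, best_score = None, 0.0
    for description, code in SNIPPETS.items():
        words = tokens(description)
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(query & words) / len(query | words)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


if __name__ == "__main__":
    print(complete_from_comment("///how to read file pth line by line"))
```

Note that the misspelled "pth" in the query does not matter here, because enough of the other words still match.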
Can anyone enlighten me as to what they are actually trying to do, from the perspective of an actual software developer?
Or did they just successfully string together the correct sequence of buzz words to unlock the grant money?
This, absolutely. Big data, data mining, and machine learning are really cool topics, but the terms have become overused, overhyped, and used out of context, especially by people who don't really understand what they mean.
I have an old co-worker who spent a lot of time working with large Excel spreadsheets, using some formulas, sorting them to look for things, etc.
He lists "data mining" on his LinkedIn, and has a ton of people who have endorsed him for it.
This became a little off topic, but I hate how venture capital, grant funds, the media, and misinformed people completely butcher these topics.
https://github.com/capitaomorte/yasnippet
I do wonder if you gave $11M to João Távora what the end result would be. Probably pretty cool.
If they are thinking of sourcing the Internet itself, there had better be some kind of omniscient, all-powerful proofreader in place, because a lot of people submit a lot of code that is HORRIBLY insecure, inaccurate, prone to breakage, or just plain spaghetti.
I'd hate to be working on a missile guidance system, only to press <tab> to complete a code block and end up getting some Intel Pentium FDIV instructions.
Program synthesis: There has been a lot of interest in the formal methods community in automatically generating programs (for small instances), with the target specification coming from input-output examples (e.g., Excel Flash Fill [3]), program templates or holes (called Sketches [4]), reactive models of adversarial environments, formal invariants, etc. The solution techniques also vary considerably: from game-theoretic solving, SAT solvers, and model checkers to version-space algebras and others. The community has not yet settled on a specification language or a solving technology. The industrial nature of the tools being leveraged (e.g., model checkers and SAT solvers from the hardware community) gives hope for promising developments. A Berkeley course [5] covers a good spectrum of the current developments.
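To make "specification from input-output examples" concrete, here is a minimal sketch of brute-force enumerative synthesis. It is not how Flash Fill or Sketch work (those use version-space algebras and SAT-based solving); it simply tries every short sequence of operations from a tiny made-up DSL until one agrees with all the examples. The primitive names and the `synthesize` function are my own assumptions.

```python
from itertools import product

# Hypothetical mini-DSL (assumption): named unary operations on integers.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}


def run(program, x):
    """Apply the named primitives to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(examples, max_len=3):
    """Return the shortest primitive sequence consistent with every (input, output) pair.

    Programs are enumerated by increasing length, so the first hit is a
    shortest one. Returns None if nothing up to `max_len` steps fits.
    """
    for length in range(max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, given) == expected for given, expected in examples):
                return program
    return None


if __name__ == "__main__":
    # The examples are the whole specification: 1 -> 4, 2 -> 6, 3 -> 8.
    print(synthesize([(1, 4), (2, 6), (3, 8)]))
```

The search space grows exponentially with program length, which is why this only works "for small instances" and why the real tools lean on solvers instead of plain enumeration.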
If I were to guess, maybe the Rice researchers are approaching the code completion/correction problem as mining for fragments of large codebases that are incomplete/incorrect and applying program synthesis to fill those fragments. Of course that would mean that they would also need to mine the specification requirements for those fragments. All of this is easier said than done, and it would be an ambitious project. Swarat has also done some really cool work on "probabilistic reasoning for programs" and "verification of probabilistic programs", so that might be part of it too. (Of course, I may be completely off-base! After all, we are commenting on a non-technical funding announcement here.)
[1] Swarat's publications: http://www.cs.rice.edu/~sc40/pubs/
[2] Moshe's publications: http://www.cs.rice.edu/~vardi/papers/index.html
[3] Excel's FlashFill from Sumit Gulwani, researcher@MSR: http://research.microsoft.com/en-us/um/people/sumitg/flashfi...
[4] The Sketch program synthesizer: https://bitbucket.org/gatoatigrado/sketch-frontend/wiki/Home
[5] Ras Bodik/Emina Torlak: Berkeley course material on Program Synthesis: http://www.cs.berkeley.edu/~bodik/cs294fa12
I feel that the reason DARPA is willing to fund this is that last part: "vulnerabilities".
Not that there is anything wrong with autocomplete. I certainly use it, but I've seen a lot of programmers who barely understand the code they are writing.