Hey, GitHub – Waiting list signup

(githubnext.com)

316 points · rcshubhadeep · 3y ago · 241 comments

197 comments · 78 top-level

dustedcodes · 3y ago · 26 in thread

When I talk to my Google Home then 50% of my brain power is engaged in predicting and working out how to best phrase something so that the "AI" understands what I mean and the other 50% is used to actually think about what I want to accomplish in the first place. This is just about okay for things like switching lights on/off or requesting a nice song I want to listen to, but I could never be productive programming like this. When I'm in the zone I don't want to have to waste any mental capacity on supplementing an imperfect AI, I want to be thinking 100% about what I want to code and just let my fingers do the work.

For that reason I think this will be less appealing to developers than GitHub may think, otherwise I think it's a cool idea.

chipgap98 · 3y ago

I think the biggest use case for this is accessibility. There are plenty of people who permanently or temporarily cannot use a keyboard (and/or mouse). This will be great for those users.

For the average dev, I agree this is more of a novelty.

jesterswilde · 3y ago

I am highly suspicious of new tech coming in the guise of 'accessibility'. As someone going blind, a lot of things touted as good for me are cumbersome and bad.

Maybe this will be different, and that'd be neat. Though I just think more expressions of code is neat. I also know the accessibility you're talkin about isn't for blindness.

That being said I can talk about code decently well, but if you've never heard code come out of text-to-speech, well, it's painful.

I bring up the text-to-speech because if speech is input, it would make sense for speech to also be the output. Selfishly, getting a lot of developers to spend time coding through voice might end up with some novel and well thought out solutions.

melling · 3y ago

“I think there is a world market for maybe five computers.” - Thomas Watson

I bet if we use our imaginations, we’ll think of a lot of places where using voice to code could come in handy.

Personally, I’ve been waiting for it for a few decades.

The creator of TCL has RSI and has been using voice since the late 1990’s

https://web.stanford.edu/~ouster/cgi-bin/wrist.php

Thought we were really close 10 years ago when Tavis Rudd developed a system:

https://youtu.be/8SkdfdXWYaI

GitHub seems to be more high-level. It figures out the syntax and what you actually want to write.

This would help if you barely knew the language.

Time to learn Rust or Scala with a little help from machine learning.

awslattery · 3y ago

As a new dad, I would love to have the voice-to-text accuracy and speed I get on my Pixel phone on my desktop OS. Done right, I could easily see myself using it more often than when I have my youngling in one arm as I've been WFH for the better part of the last 6 years of work.

cdrini · 3y ago

This looks to be much more heavily using GPT3/Codex/Copilot, which I've found to be eerily effective. It basically feels like a voice interface to Copilot. The main difference between these and something like Google Home is how effectively they pick up on context. "Hey Github" would be able to use all the code in the file as context, so when you say "wrap this in a function", it'll have an idea of what you mean, without that function having to be explicitly programmed. Voice assistants have to _always_ be in a voice space, so context is very limited. And generally the way Google home-style voice assistants are created is by programming specific actions linked to specific phrases. ML helps make the phrase matching flexible, but the action is usually entirely explicitly coded. Using Codex would let the action be ML influenced as well.

If Copilot is any indicator of effectiveness, then I have high hopes for this! I've always wanted to program while stationary biking :)
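To make the contrast above concrete, here is a minimal sketch of the conventional assistant pattern: flexible phrase matching in front of actions that are entirely hand-coded. The phrase table, the action functions, and the use of `difflib` as a stand-in for ML phrase matching are all assumptions for illustration, not how Google Home or "Hey, GitHub" actually works.

```python
import difflib

# Hypothetical actions; each one is written explicitly by a developer.
def lights_on():
    return "lights: on"

def lights_off():
    return "lights: off"

def set_timer():
    return "timer: started"

# Hypothetical phrase table linking known phrases to fixed actions.
ACTIONS = {
    "turn on the lights": lights_on,
    "turn off the lights": lights_off,
    "set a timer": set_timer,
}

def handle(utterance, cutoff=0.6):
    """Fuzzy-match the utterance to a known phrase, then run its fixed action."""
    match = difflib.get_close_matches(
        utterance.lower(), list(ACTIONS), n=1, cutoff=cutoff
    )
    if not match:
        # Anything outside the phrase table has no action to fall back on.
        return "sorry, I didn't understand"
    return ACTIONS[match[0]]()
```

A request such as "wrap this in a function" has no entry in the table, so this style of assistant can only decline it; that is the gap a Codex-style model would be expected to fill.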

bryanrasmussen · 3y ago

I think yes, this could be a real multiplier for seniors: you're doing something you have done lots of times before, just a bit different, and you know pretty much everything you need to do. You describe it until it is in a state where you can go through and finish it off. Exactly like a stationary bike or out-in-the-garden-with-your-kid type of thing.

IF the voice analysis was any good, of course. But maybe it will also be able to be better than typical voice analysis because the syntax is limited; when programming I use a much more limited vocabulary than when writing literary criticism. So while speech to text is total crap for handling complex literary phrasing, it might be adequate for programming structures.

raylad · 3y ago

Around 1998 I broke my collarbone and had to use Dragon Dictate.

I found that for general subjects it was quite difficult to use because of the fairly poor recognition rate.

But when I talked about computers, it got almost everything right. I assumed it must have been trained by the developers, who talked about computers mostly.

This is another special purpose vocabulary, so it seems as if it would have a good chance of a high recognition rate.

eurasiantiger · 3y ago

It’s most likely just Cortana bolted on to Copilot.

rpastuszak · 3y ago

I don't use voice assistants any more due to privacy concerns, but I wrote some similar software in the 2010s. I'm fluent in English, but with the current tech, the success rate for me giving commands to a machine is still 50/50.

> I could never be productive programming like this.

It's likely to work much better than a generic speech-to-text model due to fine-tuning.

Plus, consciously or not, we will adapt our human language to the English-ML "pidgin" (e.g. by introducing more efficient grammatical structures, using a specific subset of vocabulary).

The way I see it is that it's not much different from giving commands to your dog, writing a Google query, writing a Stable Diffusion prompt. It'll get better. Manual input is not as fast as speech though and that's where I see the issue.

atdrummond · 3y ago

I am happy to take a severe deficit over not being able to work at all. When my back was acting up, I could not physically use my left side. Dictation was the ONLY way I could code. By the end of this period, my output was back up to 95% of my typed output - especially as I don’t type code nearly as fast as I do general language writing.

rahulpandita · 3y ago

GitHubNext here! We would love to hear more about your experience. Please help us out by signing up for this experiment :)

mrtksn · 3y ago

The voice interface experience (in general) so far is like trying to make a really stupid person do something for you. Out-of-context misunderstandings are the worst because they break your flow while you try to understand why they happen and how to fix them.

I imagine that voice to code would be like standing over the shoulder of a junior coder who knows the syntax and some techniques just well enough to follow orders, but has no idea what they're doing and, when they get it wrong, will be very wrong.

smartmic · 3y ago

"Writing is thinking. To write well is to think clearly. That’s why it’s so hard." ~David McCullough

This not only holds for literature but also for programming. Concerning the hard part, I would argue that is the reason why it is not called "talking is thinking".

yi_xuan · 3y ago

"If you're thinking without writing, you only think you are thinking." -Leslie Lamport

Even though the speech recognition rate is now really high, I wonder how many authors use speech to write articles. The comparison may make sense, and I think there are few.

mjburgess · 3y ago

I think there's a difference between communicating your intent to a machine, which is hopeless since it has no model of intention; and commanding a machine to reproduce something.

I.e., when you're managing your house you want something that can be communicated in an infinite number of ways, but the "AI" accepts a tiny finitude of ways.

However, when programming it seems like we aren't asking the machine to "write a function to do X", but rather saying, "def open-paren star args...."

This seems like a pretty trivial problem to solve.
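The literal dictation described above can be sketched in a few lines, which is roughly why it looks trivial. This is only an illustration: the spoken-token table and the `transcribe` helper are hypothetical, and a usable system would need a far larger vocabulary plus handling for spacing, casing, and identifiers.

```python
# Hypothetical spoken-token table; real vocabularies would be much larger.
SYMBOLS = {
    "open-paren": "(",
    "close-paren": ")",
    "star": "*",
    "colon": ":",
    "comma": ",",
}

def transcribe(spoken):
    """Turn literally dictated tokens into source text, one token at a time."""
    out = []
    for word in spoken.split():
        symbol = SYMBOLS.get(word)
        if symbol is None:
            # Ordinary word: separate it from a preceding word with a space.
            if out and out[-1][-1].isalnum():
                out.append(" ")
            out.append(word)
        else:
            out.append(symbol)
    return "".join(out)
```

For example, `transcribe("def f open-paren star args close-paren colon")` yields `def f(*args):`. Note that this is symbol-by-symbol dictation, not the intent-level instruction shown in the linked demo.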

dustedcodes · 3y ago

> However, when programming it seems like we aren't asking the machine to "write a function to do X", but rather saying, "def open-paren star args...."

Click the link first and take a look at what is being showcased, because your comment is the exact opposite of what they demo when you visit the HN link.

wizardofmysore · 3y ago

It's really useful for those who have challenges typing (arthritis, disabilities, etc.), but perhaps not the best for a general audience, as typing with autocomplete is faster.

alvis · 3y ago

For repetitive tasks like preparing a report in the demo, saying is definitely faster than typing. It's quite impressive if your boss asks you to prepare one and the report is done in less than two minutes.

However, I too really doubt there are any better use cases than simple tasks, not to mention that everyone in the office would hear what you ask the AI to do. Oh my! How embarrassing am I?

danielbln · 3y ago

The assistants (Google, Alexa, Siri) are not great at NLP. Compare how you speak to them vs speaking to an LLM like GPT-3; there is a world of difference. The latter feels like speaking to a human, the former more like you're trying to get your voice commands into a state machine.

Kiro · 3y ago

Blind people are already very productive using voice-to-code.

jfk13 · 3y ago

There may well be examples of this, but while the blind developers I have known (a small sample, I admit) typically use screen-reader technologies to navigate and read code, they use a keyboard to write and edit it.

dustedcodes · 3y ago

I don't disagree with that. I just meant I don't think it's going to have mainstream appeal. A wheelchair also makes a disabled person super productive if the alternative is not being able to go anywhere at all, but it doesn't make wheelchairs super appealing to people with two healthy legs, if you see what I mean.

dkns · 3y ago

I think this is great for: a) people who are visually impaired or have issues with their hands/fingers b) people who aren't programmers; if you could make it more Scratch-like then this is an amazing tool for showing off the power of programming

0-_-0 · 3y ago

The mental load would reduce with practice very quickly

CrociDB · 3y ago

they're creating a new job "prompt engineers" to replace the engineers. this is 2022.

ykonstant · 3y ago

The rise of the... t...talking monkey? cognizes intensely

singularity2001 · 3y ago · 26 in thread

Drop the "Hey Github" nonsense (hopefully it's only for illustration purposes anyways) and … this will be a generational paradigm change in how to write code… if it works. The hard part will be editing code with your voice too. Like "no, I meant …" etc.

VERY PROMISING, in any case you can just manually fill the gaps with the keyboard!

tgv · 3y ago

Generational? Idk. I work for a company that regularly sends out surveys, and there are several tools to integrate voice into it. Willingness to speak instead of type is quite low across respondents (which is a representative population sample). It looks as if speaking to a machine does not hold the same appeal as speaking to a human (something that can also be seen in telephone queue screener questions).

Semaphor · 3y ago

I hate talking to machines. Sometimes it’s the best option (I love using a voice assistant in the kitchen), but almost always I’d rather have a full keyboard as an interface instead.

If machines were amazing at Speech-to-Text, okay, sure. But while the capabilities are impressive, they still kinda suck at it.

klabb3 · 3y ago

Exactly. We've seen a constant improvement in the tech for decades. I remember before color lcd phones that they had "voice control" and today we have assistants, which are orders of magnitude more sophisticated.

Yet, it hasn't stuck. I'm exclusively using Siri to set timers. Most people are like me, or don't use it at all. Some use assistants for googling factoids or something. Fidelity wise, it's really underwhelming.

It's not a social acceptance issue, because people would still use it at home, and they don't. It's a small chance there's some key UI insight missing (discoverability for one), but I doubt it. Even with perfect UI, natural language is quite flawed when you're dealing with technical details (see exhibit on variable naming).

Anyway, the chances of GitHub solving this in an exceptionally difficult subdomain, as a side project, seem like a... Let's say, long shot.

That said, the silver lining in all these billions spent on voice interfaces is accessibility. For some people, these things are a life saver.

pflenker · 3y ago

This is not marketed as an alternative input mechanism for people who have otherwise no difficulty typing code. It's an input mechanism for people whose abilities to type are limited.

chrisandchris · 3y ago

If it works as well as Apple Siri or Google Hey (or whatever it's called), then it will be ... totally useless? I can't imagine that they bring a better product than two of the richest companies in the world can't even figure out (perfectly). And if I need to read and adjust all my code for typos, I can just write it myself.

Because in my experience it is very often like "Call Peter" -> "Today it's sunny in NY".

smcleod · 3y ago

To be fair, Siri was really good before iOS 15 on the phone and very rarely got a word wrong. Then, I don't know what they changed, but it went belly up for me, and many other people have said the same.

On macOS it still seems pretty good - I have carpal tunnel syndrome and by Thursday or Friday most weeks I end up using Siri to dictate not code but a lot of conversations in Slack, pull requests, iMessage, etc. In fact, I wrote this reply with Siri right now.

bamboozled · 3y ago

I've been trying to use Siri while driving more and more. It's amazing how distracting it is compared to peeking at the screen (it's naughty, I know, I try not to do it).

But yeah, something about talking to a device which gets things wrong all the time is ridiculously distracting, at least for me.

Sometimes I look back at the road after trying to work out what it interpreted and I feel scared at how focused on the phone I became.

coldtea · 3y ago

>I can't imagine that they bring a better product than two of the richest companies in the world

Code is much more constrained by language syntax though.

Even for the "call peter" example, while the input is easy, the expected range of inputs that Siri should handle and be able to differentiate it from is huge.

Of course this is still a problem for e.g. defining variable names, where you could say anything.

ggerganov · 3y ago

In my experience, OpenAI's Whisper speech recognition is beyond anything currently out there. Likely Github will use it on the backend.

insanitybit · 3y ago

> I can't imagine that they bring a better product than two of the richest companies in the world can't even figure out

Are either of those companies investing particularly heavily into voice agents? Certainly neither of them has anywhere near the kind of power of something like Copilot.

Also, a general agent is way different from one that's specific to writing code.

isthisthingon99 · 3y ago

Somehow Google has gotten worse in the last couple of years.

creata · 3y ago

It seems wonderful for people who can't as easily use a keyboard, but for most people, this doesn't seem any easier than using a keyboard. Am I missing something?

Toutouxc · 3y ago

I use a Czech keyboard layout on my Mac, because Czech has some letters that don't exist on a US keyboard, and I don't like switching between layouts. So basically all "programming" characters (braces, brackets, parentheses, apostrophes, quotation marks, pipes, colons) are behind modifiers.

I would totally enjoy being able to tell my IDE to "call foo with bar and string hello there end string with a block of gee times two" or something, instead of:

  foo(:bar, "Hello there") { |gee| gee * 2 }

Just that, not having to think about typing different symbols would be a serious quality of life feature for me.

TheUndead96 · 3y ago

https://www.bbc.com/news/science-environment-52094111

jonathanstrange · 3y ago

If voice dictation was a killer feature, everybody would use it all the time for ordinary texts. But for some reason only a few (lawyers? doctors?) use it.

bamboozled · 3y ago

"… this will be a generational paradigm change in how to write code… if it works."

Why?

Can't really see myself working like this in an office, plane, cafe, with music on (my favorite way to code), in the house where my partner is also working. Then as others have said, editing might suck.

If it was a neural link then I'd be in agreement.

imron · 3y ago

> The hard part will be editing code with your voice too

The hard part will be open plan offices.

It’s bad enough that so many meetings are now zoom/teams and proximity to coworkers means you end up hearing their side of their meetings.

Just wait until all the devs are coding this way too.

klabb3 · 3y ago

It's the future I always imagined as a child. A vast divider-less cubicle scape of people in Patagonia vests who define all caps constants by yelling at their standing desks.

"USER!! UNDERSCORE LIMIT!! EQUALS TWO THOUSAND AND FORTY EIGHT!"

willsmith72 · 3y ago

I dunno, we already have stuff like Krisp AI background voice cancellation. I don't think it's far away to completely cancel background talking out. This is already huge for things like pair programming while one person's in the office, one is at home. If you have noise cancelling headphones for the person in the office too (with a bit of white noise), you can have a pretty perfect call in a noisy room. (not sponsored)

https://www.youtube.com/watch?v=ILfTrUreS00

mlajtos · 3y ago

Crazy idea – whisper to use your computer. Might produce some quality ASMR in an open-plan office.

wiseowise · 3y ago

> this will be a generational paradigm change in how to write code… if it works

Why?

mkmk3 · 3y ago

I could see it maybe being important once GitHub Copilot is embedded in it? You tell it roughly what you want and then adapt by hand. But it is kinda funny seeing the parent make such claims so early.

wokwokwok · 3y ago

How can it not be a paradigm change when it changes the way people write code from “write by hand” to “generated by ai with natural language”?

The problem with speech to code has always been that precise syntax is hard, but AI codegen solves that.

So, no, it might not take off, but I feel like if it does, then it means ai-codegen will become the dominant way code is crafted.

That would be paradigm shifting.

It’s inconceivable that it wouldn’t be.

jpalomaki · 3y ago

The example puts it quite well. You kind of know what you want to achieve, step by step, but are not so comfortable with your tools.

Usually this kind of exploratory work involves a lot of Googling and copy-pasting snippets from Stack Overflow without putting too much time into trying to deeply understand things. If you get out what you want, great; if not, back to Google.

stackbutterflow · 3y ago

Works only in remote. Used in the office that'd be madness.

Maken · 3y ago

I can't wait to edit my Unreal blueprints using voice commands. Truly the future of programming.

tweetle_beetle · 3y ago · 19 in thread

From memory, there was a time (end of the millennium?) when using voice recognition to write documents was the next big thing. There was a pricey bit of software for Windows that was popular with power users and they would spend hours training it to their voice.

Then it seemed to just die off. I don't think it was bad technology, because I don't think novelty value was enough to account for its popularity - you had to put hours in to get it to work well, it wasn't a casual toy.

What's changed since then in terms of technology? Unless it's very significant, I suspect it will go the same way. Apart from an assistive technology viewpoint, my gut instinct is that it's not that satisfying or rewarding talking to a computer all day.

junon · 3y ago

Ah yes, Dragon NaturallySpeaking. Training for hours and hours and getting incredibly subpar results. It was a fun toy but there's a reason it didn't really take off in corporate settings.

tecleandor · 3y ago

Dragon NaturallySpeaking is still alive at least in medical practice. Its Nuance Dragon Medical One product is fairly popular in some regions for medical report dictation as radiologists don't like to write them down (sorry ;) ). I've seen a Philips product in that field too. Seems like LG's TVs used their recognition engine for a while (don't know if that's still valid)

The story of the most mainstream-popular dictation software is kind of funny. Back in the late 90's there was Dragon NaturallySpeaking and IBM's ViaVoice. In the early 00s, after a financial fraud and bankruptcy involving both the then-current Dragon owners and Goldman Sachs, they got bought by Scansoft. Scansoft bought Nuance, began to use its name, and then got exclusive rights for ViaVoice (!) from IBM.

Now, in March this year, Nuance has been acquired by Microsoft.

Aeolun · 3y ago

The best recognition rate was on the words ‘scratch that’, which you used every other sentence.

mjochim · 3y ago

> What's changed since then in terms of technology? Unless it's very significant, I suspect it will go the same way. Apart from an assistive technology viewpoint, my gut instinct is that it's not that satisfying or rewarding talking to a computer all day.

Training data is now abundant compared to twenty years ago, and so is computation power. That means training can be much more complex now.

The underlying technology is now typically neural networks (broadly speaking), whereas twenty years ago it might have been Hidden Markov Models.

Overall, recognition quality, even without speaker-specific training, is now on a very different level than back then. Whether it’s considered good is a matter of opinion. But it’s significantly better than twenty years ago.

rommel917 · 3y ago

WFH has become more common. You can't program with voice in an office full of people who also program with voice. At home you can multitask: work with voice while doing home stuff with your hands. You just need a projector to display on the wall while you cook or assemble an IKEA table.

JW_00000 · 3y ago

I cannot even concentrate on reading a text while the radio is playing; let alone programming while assembling furniture.

Closi · 3y ago

It's hugely significant - look at this graph of Google's speech model accuracy across 2013 to 2017:

https://sonix.ai/packs/media/images/corp/articles/history-of...

Or this that shows a similar pattern:

https://cdn.static-economist.com/sites/default/files/externa...

danjc · 3y ago

Unfortunately 95% isn’t a lot of nines

arez · 3y ago

One thing that changed, as you can see in the demo, is that it's speech recognition paired with a neural network. He doesn't have to say "write titanic = titanic.dropDuplicated()", he just says "drop duplicates from titanic", so "in theory" you have to write less. It probably falls apart when you have more complex things to write and you have to fall back to speaking it out word for word, but it is an interesting development.

xigoi · 3y ago

There is a reason we don't have many programming languages where you can say “drop duplicates from titanic”.

cdrini · 3y ago

Main things that have changed are:

1) Improvements in speech to text, as others have mentioned

2) Improvements in language models (and model size) allowing for more flexible interpretation of speech. This isn't dictation anymore. It's more like instruction. You don't have to tell the computer exactly what to write, you tell it in much broader terms. E.g. "pull this out into a function". Or "delete the cookie before creating the transaction". Or "lint the file".

That's my guess anyways! This mostly feels like a voice interface to Copilot in a lot of ways. Can't say whether it'll be effective, but I'd love to be able to program while I'm e.g. on a stationary bike!

Double_a_92 · 3y ago

Isn't that a very niche application simply because it's voice-based? I.e. it can only be used if you are alone in an office, otherwise you would be annoying your coworkers and the voices would get mixed up.

Aeolun · 3y ago

The only time I’ve ever successfully used voice recognition was teaching Skyrim to recognize my words of power. Shouting fus-roh-dah! was incredibly satisfying.

kreddor · 3y ago

It might have been Dragon NaturallySpeaking. I remember toying around with it 20 years ago or so. Apparently it has just been bought by Microsoft.

tjixxu · 3y ago

I have good memories spending 4+ hours training Dragon to end up with what seemed like 30% accuracy.

mmikeff · 3y ago

I sometimes use voice recognition in Notes on my Mac to write up my meeting notes, but I find that my waffly speech results in very verbose notes. Also still quite a few mis-hears where I read my notes back later and have to work out what I was actually saying.

Can I have transcription that can then turn my rambling into neat and concise prose?

bagels · 3y ago

Many people dictate messages on their phones. Doctors use it extensively.

towawy · 3y ago

Not a doctor, but I also started using the dictation feature on my iPhone more and more recently. It's often more convenient than typing when I'm walking (which I do a lot) and the pickup of voice messaging, talking to your phone like it's a mic, made me more comfortable doing it in public.

msaharia · 3y ago

Dragon speech? I used it quite a bit 10 years ago!

Sheeny96 · 3y ago · 11 in thread

I feel like this is being misunderstood - the long term view of this wouldn't be for code scribing, it'd be for non-technical people to be able to instantly create things. Imagine being able to say out loud to your phone "hey, create me a view of all the weather data from the past year correlated against x and y in z view format". The code is the means, not the product.

bodge5000 · 3y ago

> "hey, create me a view of all the weather data from the past year correlated against x and y in z view format"

Ironically, I think only technical people would even want to do something like that. The less technical you get, the more high level (and ambitious) you need to go.

You can see this a lot in game dev questions. Beginner questions will be "How do I make an MMORPG?" and the more advanced questions will be "how do I return x from y" or whatever, and then it scales between the two ends of the spectrum.

golergka · 3y ago

There's plenty of technical people who don't code — like, in your example, a game designer.

makeitdouble · 3y ago

Why would it be more successful than the countless past efforts to make that a reality? (It could well be, but why do you think so?)

As a simpler problem space, building programs from UML charts was one of Java's promises, and it failed miserably, not because the technology was lacking, but because it's just a damn hard problem.

As of now we have nothing approaching "non technical people to be able to instantly create things" if the "things" you want are useful in any way.

joenot443 · 3y ago

The big difference is that Copilot generates real, functioning code based off past implementations of real, functioning code. A no-code tool like Webflow is an entirely different product altogether. The ML transformers powering Copilot are the secret sauce here; that technology just literally did not exist when Java and UML were all the rage.

I'd encourage anyone who hasn't tried it to give Copilot a try. There really has never been anything like it in my memory, and while I totally agree there have been dozens of efforts to allow non-technical people to generate code, I think Copilot may be on to something very special.

yi_xuan · 3y ago

The problems are more than how to express.

You have to read code and debug it; that is inevitable. You can't say that there will not be any bugs if you use voice instead of writing.

nfRfqX5n · 3y ago

I think it’ll be better because Copilot is pretty good for typing and this builds upon that framework

wwilim · 3y ago

People who can think of weather in terms of plotting and correlating it are usually technical enough to code

urthor · 3y ago

Exactly.

The crazy thing is... this probably will work.

In 20 years. But, it probably will work.

There is absolutely no reason you cannot use a neural network to transcribe appropriately phrased requirements into an AST.

tovej · 3y ago

There are several reasons this wouldn't work:

1. How do you check the output of the voice to code step? If you need as much expertise as you do now to actually review the code, then the voice to code step is just a layer that adds confusion

2. How would debugging work? Again, would you still need to be able to understand the code? Same issue.

3. What if you have to pause and think? This will affect how the voice to code interface interprets your speech.

4. How would you make a precise edit to your source audio using a voice interface?

5. How would you make changes which touch multiple components across the project? How would you coordinate this?

6. Precisely defining interfaces between components and using correct references to specific symbols is very difficult to do in natural speech, which typically uses context to resolve ambiguous references. The language you would be using would still have to resemble the strictness of a programming language even when spoken, but you have replaced a reliable checkable channel (input through keyboard, transfer as-is to text buffer, feedback from visual view of source) with an unreliable channel (input through microphone, transfer through complex signal processing and multiple neural network language models, through multiple representations, where you have to be able to check multiple representations for feedback about the structure of your program (initial speech-to-text step, text to source))

nuccy · 3y ago

It may work, since it may be just a new "programming language" (somewhat literally), i.e. a new level of high level abstraction. We already know examples of such transitions to higher abstraction levels: binary code -> assembly languages -> c/lisp/fortran/etc -> c++/javascript/go/python/r -> np/torch/react/whatever frameworks/libraries. For an average programmer nowadays, knowledge of frameworks/libraries is as important (if not more important even) as actual knowledge of the programming language they use. The only disadvantage of this is that people will need to adapt to something generated and updated via machine learning. So far there are not many examples of that, except maybe people adapting to Tesla Autopilot with every new release. Before, we were adapting to a new c++/python/framework version; in the future there will be GitHubNext v1, v2 and v3 with known features and bugs.

BiteCode_dev · 3y ago

Before copilot, we were far from it.

But now, given how magical this thing is, it opens many doors to what's possible with no code.

I never really believed in anything no code before, apart from Excel and RAD.

But basic tasks are going to get accessible to a lot of people sooner than I expected.

chocolatkey · 3y ago · 8 in thread

If this works well, I would pay a seriously high amount of money. My daily coding time is currently limited by the pain in my hand/fingers that eventually becomes too uncomfortable, and I have to wait for a "cooldown" period of days to "reset" my hands back to normal. I can't even code on a normal keyboard or trackpad for a long time anymore.

The problem with current voice programming systems is they're just too slow so I end up getting impatient and using my fingers anyway

eajakobsen · 3y ago

I imagine you have done extensive research on your own, but in any case I found this article by Josh Comeau on coding with voice commands and eye tracking very interesting: https://www.joshwcomeau.com/blog/hands-free-coding/

langsoul-com3y ago

I wonder if the Quest Pro could do the eye tracking as well? It has pretty extensive eye tracking cameras, not sure how precise they are though.

Could it also do a virtual keyboard, but a custom layout to not trigger arm, elbow and hand pains?

ron223y ago

If you want to code with your voice, also check out https://github.com/cursorless-dev/cursorless

Hesinde3y ago

Have you tried an organ MIDI pedalboard and a script to translate MIDI to keystrokes? You could also put a microcontroller between the pedalboard and your computer so that it looks like a normal keyboard to the computer. I do not know whether that would be practical, but some sort of foot keyboard is in my idea space for "what if".
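A rough sketch of the "script to translate MIDI to keystrokes" idea, assuming plain 3-byte note-on/note-off messages with no running status. The note-to-key table and the `pedal_events` name are made up for illustration, and actually injecting the keystrokes (which needs an OS-specific library) is left out:

```python
# Hypothetical mapping from pedal note numbers to key names; adjust to taste.
PEDAL_TO_KEY = {36: "ctrl", 38: "shift", 40: "esc"}


def pedal_events(midi_bytes, mapping=PEDAL_TO_KEY):
    """Translate raw 3-byte MIDI channel messages into (key, action) pairs."""
    events = []
    for i in range(0, len(midi_bytes) - 2, 3):
        status, note, velocity = midi_bytes[i:i + 3]
        kind = status & 0xF0  # upper nibble is the message type
        if note not in mapping:
            continue
        if kind == 0x90 and velocity > 0:  # note-on: pedal pressed
            events.append((mapping[note], "press"))
        elif kind == 0x80 or (kind == 0x90 and velocity == 0):  # note-off
            events.append((mapping[note], "release"))
    return events
```

Holding a pedal would then behave like holding a modifier key, which is probably the most useful thing feet can do here.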

rahulpandita3y ago

GitHubNext here! We appreciate your support. Please consider helping us by signing up for the experiment on the website and providing feedback. :)

langsoul-com3y ago

I'm assuming you have an ergonomic keyboard? If so which one?

tgv3y ago

And a movable monitor, decent chair and desk at the appropriate height...

chocolatkey3y ago

Ergodox EZ

FloatArtifact3y ago· 3 in thread

The challenge that remains in speech coding is not generating code so much as navigating through existing code or an application.

There are only two ways to do this effectively, and unfortunately no one has taken the true path to accessibility. The more common way is plugins/extensions to grab information from the editor.

Accessibility is more than just one editor. It's the OS and all the applications. Microsoft needs to take the hard route to make an accessibility UI automation server to grab that information and only make up the difference through plugins as needed.

It's all about grabbing information from the application and generating on the fly commands, not just parsing free dictation in order to get the best accuracy.

It takes a lot of expertise to make any sort of UI automation, fast and efficient for navigating and selecting text or out of focus menu items.

I've fussed around and managed to get tree sitter to navigate across code. For example generic commands are like 'next function'. Code simply isn't pronounceable when it's written by others. Therefore, navigating across generic tokens is really the best method. Then other methods can be used for fine navigation if needed.
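To make the 'next function' idea concrete, here is a minimal sketch. It uses Python's built-in `ast` module as a stand-in for tree sitter (so it only handles Python source), and the `next_function` name and sample file are invented for the example:

```python
import ast


def next_function(source, current_line):
    """Return (name, line) of the first function defined after current_line, or None."""
    tree = ast.parse(source)
    functions = sorted(
        (node.lineno, node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )
    for lineno, name in functions:
        if lineno > current_line:
            return name, lineno
    return None


SAMPLE = """\
def load():
    pass

class Model:
    def fit(self):
        pass

def main():
    pass
"""
```

With the cursor on line 1 of `SAMPLE`, `next_function(SAMPLE, 1)` lands on `fit` at line 5 without anyone having to pronounce the name.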

My hope is that they develop a grammar system that is open source and integrates with accessibility frameworks focused on performance.

I wish I could have a phone call with the development team.

skydhash3y ago

I think accessibility a la vim, or with something like tree sitter, would help immensely, like:

  “Top of file
  Down 5 lines
  Modify import source to …
  inside first class
  Down 5 methods
  Insert new method after
  Inside arg list
  Append arg named … of type …
  …
  …”

And add a way to identify types and parameters with special pronunciation.
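The line-oriented part of such a command list is easy to sketch. The grammar below ("top of file", "bottom of file", "up/down N lines") and the `apply_commands` name are assumptions for illustration; the structure-aware commands ("inside first class", "down 5 methods") would need a parse tree on top of this:

```python
import re


def apply_commands(commands, total_lines, line=1):
    """Apply spoken-style navigation commands and return the resulting line number."""
    for command in commands:
        text = command.strip().lower()
        move = re.fullmatch(r"(up|down) (\d+) lines?", text)
        if text == "top of file":
            line = 1
        elif text == "bottom of file":
            line = total_lines
        elif move:
            step = int(move.group(2))
            line += step if move.group(1) == "down" else -step
        else:
            raise ValueError(f"unrecognized command: {command!r}")
        line = max(1, min(total_lines, line))  # clamp to the file
    return line
```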

FloatArtifact3y ago

I recognize that all applications are not accessible through accessibility APIs. However, there is no high level access to accessibility APIs. There are quite a few for automated testing UI. However, none of them are performant enough for speech to code or screen readers. Testing automation frameworks don't really require high performance.

Accessing the content of the application and its context is what's important. It's more important than the speech recognition backend.

Speech recognition works best with a narrow context (when only those commands are available).

The type of performance we need as a speech recognition community and screen reader community is quite high. By the beginning of speech and just before decode time information needs to be available to be parsed for navigation/editing. That way these tokens can be weighted as commands for recognition.
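A toy illustration of why a narrow context helps: if only the commands valid in the current context are candidates, even a slightly wrong transcript can be snapped to the right one. This uses `difflib` string similarity as a crude stand-in for weighting tokens inside a real recognizer; the `recognize` name and the command list are hypothetical:

```python
import difflib


def recognize(transcript, active_commands, cutoff=0.6):
    """Return the active command closest to the transcript, or None if nothing is close."""
    matches = difflib.get_close_matches(
        transcript.lower(),
        [command.lower() for command in active_commands],
        n=1,
        cutoff=cutoff,
    )
    return matches[0] if matches else None
```

The smaller the list of active commands, the less room there is for a near miss to match the wrong thing.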

Commands could be modeled after vim functionality though.

Outside of tree sitter, it would be interesting to hook into a language server (Language Server Protocol) as a client. However, I think they only expect one client. In addition, I still see that as a lesser approach without dedicated support for a high performance UI automation server for the speech recognition engine to leverage.

FloatArtifact3y ago

Yes, minimizing the number of commands and making them as specific as possible for navigation, by understanding the context of where the user is, optimizes the user's time in navigation.

Imagine even more precise commands 'next function' followed by a letter. That allows you to navigate to only a function with that letter defined. Really the possibilities are endless when we have complete context of the screen and the structure of the code.

Someday I hope for the release of something like Stable Diffusion for voice coding: an open, complete pipeline that users can iterate on fast and innovate!

geewee3y ago· 3 in thread

Having programmed and navigated my PC via voice exclusively for about 6 months, done a ton of research and written several articles about it and what options are out there [0][1], I think this might be pretty ground-breaking stuff.

Inputting code with voice is generally difficult, often due to variable names, casing, punctuation etc being hard to get right in voice-to-text. I think this might help quite a lot with that.

_However_, some of the hardest things in voice coding aren't necessarily just the input. Navigating large codebases is hard, and particularly editing existing code can be extremely difficult, probably much more difficult than just inputting new code.

I have my doubts, going by the demonstration shown here, that it's able to make complex editing tasks simple, but if it does, I cannot overstate how huge a leap forward it is.

[0]: https://www.gustavwengel.dk/state-of-voice-coding-2017/ [1]: https://www.gustavwengel.dk/state-of-voice-coding-2019/

jovial_cavalier3y ago

>Having programmed and navigated my PC via voice exclusively for about 6 months...

I'm curious, why have you done this?

FloatArtifact3y ago

I can't speak for his use case. However, it helps people with medical conditions like RSI, a stroke, or anything else that limits their use of keyboard and mouse.

However, the average developer doesn't need those fine-grained navigation controls but can still benefit from enhanced input through voice. Some have mental disabilities and interface differently. Others simply supplement their input by voice as a preventative measure against RSI. The hope is to one day develop something that every developer can see the value in and leverage. In a way, accessibility is for everyone.

In general I see accessibility as a hierarchy that could benefit everyone. Accessibility APIs, close to real time OCR, Eye tracking, alternative inputs (eg pedal, touch pad, stylus) allowing for the broadest possible input and APIs to extract information from applications. Extraction of information from applications and input to applications allows the user to specialize for their use case.

In my experience, as people become experts in voice, their command vernacular shortens as they carve out their niche use case. It goes beyond single shortcuts to series of actions to get stuff done. However, what really needs to happen is that voice systems get access to the OS and to the application to really shine. That would empower not only navigation for those who are disabled but also context-specific commands that are intuitive and abstracted, like 'next function' or 'parameter'.

geewee3y ago

I had very bad RSI

pfd19863y ago· 3 in thread

I think commenters here are -- as usual -- missing the point. This is the training ground (literally) for better models able to respond to commands like "take the CSV from my desktop, plot columns A and D and check if the KL divergence is close to zero". And from that to more complex tasks. You always need the first step and this is it.

I'm bullish.

BiteCode_dev3y ago

Exactly.

Copilot is getting better every day, because it's learning from the way we are using it.

rahulpandita3y ago

GitHubNext here! We appreciate your support.

nightski3y ago· 3 in thread

Spoken language is incredibly ambiguous. It's one thing to generate a drawing which can vary wildly in output and still be acceptable. It's another to specify something precisely to a computer. Working with non-programmers on a daily basis it is incredible how difficult it is to communicate even relatively simple things without confusion.

So all the more power to them, but I am very skeptical. Especially since co-pilot has zero knowledge of the formal semantics of programming languages.

This is a lot different than the half ass auto complete that it already does since that at least has some context.

tluyben23y ago

It's the same with copilot; you have to know how to implement things to implement things with copilot (for the most part), but when you are a programmer and you could write the code, then you know the prompt to write to generate 10+ lines of code for 1 comment of text all day long. Especially for data transformation, copilot has been a real magic tool; if you put in a comment:

      /*
      this function transforms this json from: 

      { 
          ... some complex structure in json 
      }
 
      to this json: 
   
      {
          ... some different structure in json 
      }
 
      */

... copilot comes up with the function that takes in the first and spits out the latter. Even if the field names do not match etc, it usually 'guesses' right what fits on what (so it does have some context from its learning phase about what 'looks alike' or 'might be the same thing'). Example: I had a structure with firstName: string, lastName: string and a target structure with name: string; it just did name: firstName+' '+lastName, which was indeed what I wanted. But it comes up with more intricate stuff as well that is pretty surprising (too human, basically).

Another bonus: if you generated function transformAtoB(a: A) above, then you only have to do:

      /*
      do the reverse of function transformAtoB, accept json structure B as input and return structure A
      */

And it'll come up with the reverse.
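For readers who want to see the shape of that firstName/lastName example, here is a hand-written sketch in Python (not the TypeScript described above, and not actual Copilot output). The function names and the split-on-first-space rule in the reverse direction are assumptions:

```python
def transform_a_to_b(a):
    """Merge the two name fields of structure A into B's single `name` field."""
    return {"name": f"{a['firstName']} {a['lastName']}"}


def transform_b_to_a(b):
    """Reverse direction; splitting on the first space is a guess and can be lossy."""
    first, _, last = b["name"].partition(" ")
    return {"firstName": first, "lastName": last}
```

The reverse only round-trips cleanly when the first name contains no spaces, which is the kind of thing the tests mentioned below are for.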

It's not hard to write yourself, but it's boring and error prone (some of these structures are huge). Now I press tab a bunch of times, and run the tests to see if it worked. I am also not that worried I'm infringing someone's open source code; this is all way too custom to look like anything else. That's where this shines; for things where it verbatim copies something, you should've been using a library anyway.

Static typing with TypeScript definitely works better than other combinations I have tried (C# was pretty bad last I tried it, JS is good but often subtly wrong because of type issues).

singularity20013y ago

With copilot ambiguous language gets transformed into concrete syntax. If the implementation doesn't fit your ambiguous request, you should be able to refine … with ambiguous language. Theoretically this would create a "programming dialog" environment.

nightski3y ago

So you are going to have to verbalize something, interpret the code, build a mental model of how it works, and then if it does not match what you want go back to step 1?

That sounds exhausting when we have spent countless human years developing languages which let us communicate our intentions precisely to computers.

If you don't do this there is no ambiguity detector. Meaning it's entirely possible for the computer to interpret what you are saying completely different than intended, yet it is a perfectly valid interpretation. So the only one who can qualify if it got it right is you.

lakomen3y ago· 3 in thread

Imagine sitting there, talking to your computer, and trying to get the notations right.

If err unequal nil opening bracket, no no don't open the racket opening bracket... BRACKET, do you know what a bracket is No don't do a do while, delete delete. Don't delete everything... sigh

Well something like that, I imagine it being a very painful experience.

dorkwood3y ago

Maybe you could use little clicks and pops with your mouth to signify different characters. The computer could learn which one is which. That way instead of typing

"for (int i = 0; i < count; i++)"

you could say something like

"for beep int i click zero boop i bop count boop i pop pop zing"

This would achieve the same thing, but much faster and with less effort than typing.

xigoi3y ago

If saying that is faster than typing it, you're really slow at typing.

3D304974203y ago

Or debugging. Goodness. I can only imagine.

MauranKilom3y ago· 2 in thread

It does look like we've made some progress in the 15 years since. I do wonder how this would work in an office setting though - so much noise, so much distraction, and so much crosstalk between programmers...

avian3y ago

> I do wonder how this would work in an office setting

Everyone gets a throat mic and the cubicle farm is full of unintelligible whispering instead of clacking of keyboards? Can't wait for the future. /s

Hortinstein3y ago

Hahahah thank you for posting this, I was about to go look for this because I remember being in tears laughing when I saw it the first time and immediately think of this whenever I see voice controlled things

Quequau3y ago· 2 in thread

I remember a talk given some years ago by a man who was using voice to text for creating source code. The key point I remember from his talk & demonstration is that it was not casual ordinary speech but instead a very weird mashup of sounds intended to represent the various symbols which we use in source code.

simme_3y ago

I think you're talking about this video: https://youtu.be/8SkdfdXWYaI?t=1049

Quequau3y ago

Yes! That's the talk.

ausudhz3y ago· 2 in thread

Next is thoughts to code. Just read my mind, I'm gonna sit there and think

nnurmanov3y ago

Song to code, we shall be singing our next systems:)

singularity20013y ago

dance to code. transform the esthetics of your movements into … whatever your boss requires.

akuji19933y ago· 2 in thread

export const ButtonComponent; FunctionComponent no Github no semicolon i meant colon Github backspace 5 times no backspace delete delete Github Arrrgh goddammit

cdrini3y ago

What you're describing is more like dictation. What you'd probably say is "export the button component", and it would determine the syntax.

akuji19933y ago

Which will probably, outside of small, perfectly planned experiments, work similarly well

Cort3z3y ago· 1 in thread

However weird and seemingly useless this might appear to the normal programmer on here, I see this as a huge accomplishment and an incredibly important tool. Why? Accessibility.

Let’s hope that I never get in a serious accident or get a disabling disease, but if I do, I am not planning on giving up coding. What would you do if you lost your hands, or became permanently paralyzed? This is the tool we need to combat that. Hats off to GitHub on this one.

bamboozled3y ago

People who cannot see or use a keyboard already use tools like this to code. Been doing this for a long time.

birriel3y ago· 1 in thread

In the meantime, Talon is pretty good. You can use Vim motions and commands as you normally would, except using your voice (this applies to any editor, really):

https://talonvoice.com/

willjp3y ago

Talon is exceptional, I only wish it was more natural to drive cli commands, I find I need to spell them out which I’m still quite slow at.

pmontra3y ago· 1 in thread

My reactions to the demo (when all is good there is no reaction, so here are only the problematic ones, sorry)

1) import matplotlib.pyplot as plt

Why "as plt"?! Let the import alone. But this is a matter of style.

2) Get titanic csv data from the web [...]

Surprise, it turns out that "the web" is a URL on raw.githubusercontent.com. Hopefully I'll be able to spell a URL of my choice.

3) clean records from titanic data where age is null

Somehow I already know that there is an Age field and somehow it knows that it must capitalize age into Age

4) fill null values of column Fare with average column values

The generated code looks great but somehow I managed to spell a capitalized Fare this time :-) (this is probably a typo in the demo)

5) Hey,Github! New line

Inserting a new line can't take so many words. We're going to do without new lines or rely on a formatter or something equivalent.

6) plot line graph of age vs fare column

This is where it becomes evident that there was no need to import as plt because I'm not pressing those keys anyway. But this is style and it's going to be uniform across all the users of these tools.

7) Hey, Github! Run program

Good.
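For reference, steps 3 and 4 boil down to something like the following. This is a standard-library sketch, not the pandas code from the demo; the sample rows and the `clean_and_fill` name are made up, and empty cells stand in for nulls:

```python
import csv
import io
from statistics import mean

# Made-up rows standing in for the Titanic CSV; empty cells play the role of nulls.
SAMPLE_CSV = """Name,Age,Fare
A,22,7.25
B,,71.28
C,26,
D,35,53.10
"""


def clean_and_fill(csv_text):
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Step 3: drop records where Age is null.
    rows = [row for row in rows if row["Age"] != ""]
    # Step 4: fill null Fare values with the mean of the known fares.
    average_fare = mean(float(row["Fare"]) for row in rows if row["Fare"] != "")
    for row in rows:
        row["Fare"] = float(row["Fare"]) if row["Fare"] != "" else average_fare
    return rows
```

Note that the column names still have to be spelled exactly as in the header here, which is the part the demo glossed over.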

Considerations:

A) Why do commands (new line, run) need "Hey, Github!" which is pretty long and terrible to repeat all day long (just imagine having to say Hey Joe every time we have to say a sentence to Joe, within a long conversation with Joe) and text-to-code doesn't?

B) We got a graph at the end. Now what should I do to edit the code in those 99% of cases where I got the graph wrong? An acceptable answer could be mouse and keyboard. It's a little underwhelming but voice to code already gave me the structure of the code.

C) Does that mean that Microsoft and GitHub are going to know all the closed source code we'll write for our customers (there might be contractual implications) or is this something that will be self hosted in our machines?

rahulpandita3y ago

GitHubNext here! Here is a little writeup that explains a bit more about the project https://github.com/githubnext/githubnext/tree/main/HeyGitHub

Hope this is helpful :)

jasonlfunk3y ago· 1 in thread

I probably wouldn’t use this to write code, but I could see it being really useful for navigating around a project.

“Go to line 35” “Open the model controller” “Show the get method and set method side by side”

anshumankmr3y ago

If you remember the keyboard shortcuts, you can be quite fast while working with VSCode. Voice will never be perfect.

philmander3y ago· 1 in thread

This is effectively a new higher level programming language without a fixed syntax. Describing more the "what", not "how", and being much closer to natural language over computer language.

The voice part seems like an (albeit important) accessibility add on.

I'm sure it won't be perfect but an amazing step forward in the evolution of programming languages

meowface3y ago

I could be wrong, but I think (minus editor commands) much of this can be emulated in existing Copilot by writing a comment symbol followed by natural language. I wouldn't even be surprised if under the hood "Hey, GitHub!" is basically doing exactly that with the voice input.
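The guess above would amount to something like this. It is purely illustrative: nothing in the thread confirms that "Hey, GitHub!" works this way, and the `COMMENT_PREFIX` table and `to_comment_prompt` name are invented:

```python
# Hypothetical table of line-comment markers per language.
COMMENT_PREFIX = {"python": "#", "javascript": "//", "sql": "--"}


def to_comment_prompt(transcript, language):
    """Render a spoken request as a comment line a completion model could continue from."""
    prefix = COMMENT_PREFIX[language]
    return f"{prefix} {transcript.strip()}\n"
```

The transcript would come from whatever speech-to-text step sits in front, and the resulting comment line is what you would otherwise have typed by hand.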

falcor843y ago· 1 in thread

I think this, or a future version of this, would have real potential.

I'm thinking about this in terms of the navigator-pilot pair programming approach, and believe that as a senior, if it's even half-as-good as working with a fresh out of uni hire, then it could have real value. When there's a piece of code that I would like written, when I have good test cases in mind, but would prefer to offload it on someone, I could perhaps write the test cases and function signatures (maybe with the bot's help), get the bot to fill in the blanks until it passes the tests, and then give it direct feedback on how to refactor the code.

I've signed up for the waiting list and am excited to try this out.

knutzui3y ago

What you are describing is more akin to what GitHub Copilot already does. It is really good at taking a description and a function signature and producing a solution. Paired with a solid test suite it can definitely speed up development in my experience.

dang3y ago

This is the third such thread in the last 24 hours which consists of nothing but an elaborate waiting list signup. I've changed the titles to make that clear.

GitHub Blocks – waiting list signup - https://news.ycombinator.com/item?id=33537706 - Nov 2022 (41 comments)

GitHub code search – waiting list signup - https://news.ycombinator.com/item?id=33537614 - Nov 2022 (48 comments)

A good HN discussion needs more than a waiting list signup. A good time to have a thread would be when something is actually available.

ggerganov3y ago

Very interesting - I was sort of expecting it to happen soon.

I have been playing with using Whisper + Github Copilot in Vim [0]. The Whisper text transcription runs offline with a custom C/C++ inference and I use Copilot through the copilot.nvim plugin for Neovim. The results were very satisfying.

Edit: And just in case there is interest in this, the code is available [1]. It would be very awesome if someone helps to wrap this functionality in a proper Vim plugin.

[0] https://youtu.be/3flN9kTcZJY

[1] https://github.com/ggerganov/whisper.cpp/tree/master/example...

onion2k3y ago

I've tried writing documentation and fiction using speech-to-text and, for me, it doesn't work because apparently the part of my brain I use to think about what I'm going to say is the same part I use to actually say it, so I can't do both things at once. I end up writing far more slowly than I can type.

singularity20013y ago

In case anyone else stopped after watching the video, if you scroll down a bit further you see the list of

FEATURES

Write/edit code

Just state your intent in natural language and let Hey, GitHub! do the heavy lifting of suggesting a code snippet. And if you don't like what was generated, ask for a change in plain English. Go to the next method

Code navigation

No more using mouse and arrow keys. Ask Hey, GitHub! to...

    go to line 34
    go to method X
    go to next block

Control the IDE

"Toggle zen mode", “run the program”, or use any other Visual Studio Code command.

Code Summarization

Don’t know what a piece of code does? No problem! Ask Hey, GitHub! to explain lines 3-10 and get a summary of what the code does.

Explain lines 3 - 10

susrev3y ago

All i could think of while looking at this was having to tell Siri where every comma and period should go while texting with it.

"insert curly brace", "insert semicolon", "insert insertion", etc. does not sound too fun.

hcnews3y ago

To note, there's a class action lawsuit against GitHub Copilot since it learns from a bunch of open source code with very specific licenses. It's very interesting from the perspective of establishing copyright in AI training. Hopefully it goes the distance and some nuanced arguments come out in the court case.

https://www.theverge.com/2022/11/8/23446821/microsoft-openai...

amarant3y ago

Oh cool, my brother used to wish out loud something like this existed a few years back when his wrists were really killing him. His wrists were so far gone he couldn't even type on an ergonomic keyboard for any great length of time, so he used to wish he could just talk instead.

For me, I got an ergonomic keyboard before my wrists went bad, and so far they seem to be holding up!

Moral of the story: get a good keyboard early, or you might need a tool like this one someday!

prima-facie3y ago

Hey Github what did the previous developer actually _mean_ with this piece of legacy code?

evnix3y ago

Eye strain is one reason I have been waiting for something like this. If I could close my eyes and just navigate the codebase through a mental model and some voice commands, I really wouldn't mind paying!

I have looked at some tools for the blind, but you need just way too much dedication for it to work for you and since you have working eyes it is usually easier to just open your eyes.

glenjamin3y ago

There was an excellent talk at Strange Loop a few years ago by Emily Shea about how she'd learnt to code vim using her voice to combat RSI.

https://www.youtube.com/watch?v=YKuRkGkf5HU

The demos are in Ruby, but I could imagine that languages with strong type-aware auto-completion could be easier to do.

silverlake3y ago

I’m working on something similar. The target market is the 99% of people who want to program ad-hoc domain-specific problems. For example, generating charts w/o having to dig through all the data sources (Wolfram Alpha does a simple version of this). Building a financial risk model for a client’s specific request (you have to be a whiz at Excel, python or some internal ide). Even for home automation, my mom can’t use Alexa’s awful app to customize routines.

I don’t think the voice part is necessary. It’s easy enough to slap ASR on the front. But going from natural language -> full problem spec -> code is hard in the general case, but doable in well-understood domains. Why can’t Scotty talk to a computer? (https://youtube.com/watch?v=hShY6xZWVGE&feature=share)

nxpnsv3y ago

GitHub is doing a whole lot. I think I prefer to edit my code in an editor, not on the website where it's hosted. And I think I don't want fancy AI driven code editor features using my code either. But I guess it is nice they are considering solutions for vision impaired users.

raidicy3y ago

I really hope this is very easy to use. I have severe RSI and can barely surf the web. I tried using other voice to code stuff and it just hurt my voice so I'm hoping I can speak very naturally. I'm really looking forward to seeing if this can help me code again.

pcj-github3y ago

I could see it being useful for things like "goto line 42" or "rename this file as...", or very simple things like that, otherwise, I don't want the cognitive overhead of having to translate coding intent through a voice interpreter.

kgrax013y ago

People can’t seriously believe this is going to be useful at all?

I can see this helping as an accessibility tool, but beyond that I don’t think it will be useful. This kind of assumes you know everything about what you’re doing, most of the time you don’t.

boredumb3y ago

As someone who works remotely from home, the last thing I need is to start babbling to myself in code for 8 hours a day. I imagine that's a one way ticket to developing some sort of disorder.

ddevault3y ago

Someone emailed me the other day to share their FOSS voice control system. I was really impressed. It seems to map syllables onto actions in a modal sense ala vim. If I were to build a voice control system, it would look much like this.

https://numen.johngebbie.com/index.html

It's free software, it's local to your machine, you don't have to sign up for it, and it works today.

okasaki3y ago

Great for accessibility, but I don't see how this would work well in an open office, or even at home if other people are around. Seems really annoying.

tempodox3y ago

Imagine using this in a setting where you're not alone in the room. Imagine using this surrounded by other developers who do the same.

danwee3y ago

Curious: In "Clean records from titanic data where age is null", how does it know that the age field is exactly `Age` and not just `age`? You cannot know this without examining the data set (the headers), so is the software inspecting the loaded CSV "on the fly" before we tell it to actually execute the code?
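If it does inspect the headers, the lookup itself would be trivial. A sketch of that one possibility (the demo does not say how the name is actually resolved, and `resolve_column` is a made-up helper):

```python
import csv
import io


def resolve_column(spoken_name, csv_text):
    """Return the header matching spoken_name case-insensitively, or None."""
    headers = next(csv.reader(io.StringIO(csv_text)))
    by_lower = {header.lower(): header for header in headers}
    return by_lower.get(spoken_name.strip().lower())
```

So "age" would resolve to `Age` as long as the first row of the file has been read before the code is generated.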

kevmo3143y ago

Why are all the comments here so negative? Maybe typing is a hard sell, but some of the navigation stuff seems quite useful. Even being able to invoke VS Code's command palette would be really cool with this. Something like "Open Dockerfile" would be useful and maybe faster than typing.

lkrubner3y ago

My worst prediction ever was at the end of my book, when I struck a positive note about voice interfaces. The startup I was at in 2015 had the pitch "Let your sales people talk directly to Salesforce" and we pushed the limits of what we could do with NLP. That particular startup had spectacularly bad management and so it flamed out in a series of screaming, raging fights, which I documented here:

https://www.amazon.com/Destroy-Tech-Startup-Easy-Steps/dp/09...

But at the end of the book I struck an upbeat note, about how the technology was advancing quickly and within 3 or 4 years someone would achieve something much greater than our own limited successes.

But I was wrong. 7 years later I'm surprised at how little progress there has been. I don't see any startup that's done much better than what we did in 2015. Voice interfaces remain limited in accuracy and use.

hintymad3y ago

So this is a frontend of Copilot. The example of "import pandas" getting translated into "import pandas as pd" is pretty convincing, as the tool helps developers to state their intentions. On the other hand, "hey, github, a new line" kills me.

lleontop3y ago

We have come a long way. I remember when announcements like this one were done by companies on April 1st!

squarefoot3y ago

If translation is semantic and not literally identical, chances are that the user asks for a piece of code and it outputs something that is 100% identical to code that is copyrighted elsewhere. Big "blame the AI" legal loophole waiting to happen?

karmasimida3y ago

Actually would be useful.

If this is reliable I would pay to use it to some capacity, like add an argument.

crucialfelix3y ago

I spent half an hour today trying to convince the O2 voice agent to get me a real person. Conversational AI is a special kind of hell filled with unhappy paths.

But for a glimpse of the future watch The Expanse or read William Gibson's Agency.

darepublic3y ago

Execution is everything with this. I've wanted something like this so I could actually code while performing other activities or in various states of intoxication. Don't code and drive. Don't drink and code

Tade03y ago

I hope to see the click consonant "‖" adopted as "||" one day.

tabasselejambon3y ago

Let's try to picture the noise in an openspace full of people using that ... focusing is going to be difficult, well at least for people like me who are easily distracted by background noise/conversations.

troelsSteegin3y ago

What if the code in question is a DSL? Something say that is syntactically python, but with a namespace defined through a narrow set of imports. This would be interesting to explore for end-user scripting.

mtkhaos3y ago

Nice attempt and interesting workflow using a prompt based transformer. I would prefer being able to spawn a command palette and skip over the voice, alongside having the choice between different variations.

gopheryourshelf3y ago

Imagine an office where everyone is sitting screaming at their computer.

P5fRxh5kUvp2th3y ago

Programming Perl with speech recognition (an oldie but goodie)

https://www.youtube.com/watch?v=vPXEDW30qBA

mindvirus3y ago

This is awesome. I could see using this to write code on my phone even.

manesioz3y ago

Interesting. I would find this annoying because it's so different from what I'm used to, but the potential it has for people with disabilities is huge.

kdmytro3y ago

This is not going to play well with open-space offices.

WormholeCreator3y ago

it is not practical if we have to describe each and every line.

Also, imagine you are sitting in an office with other team mates - what happens if all of them talk together but are working on different projects. It will disturb others in terms of noise pollution.

but it will definitely be a fun project and might work perfectly when you are working alone from home.

iillexial3y ago

Those who say it's useless, what do you think about blind people using this, or those who couldn't type?

dimazhlobo3y ago

Why does the OAuth scope require permission to “operate on your behalf” when the app is “not owned or operated by GitHub”?

qntmfred3y ago

see also https://serenade.ai/

https://news.ycombinator.com/item?id=22404264

karmasimida3y ago

One concern is in office space, saying things aloud is ... awkward to say the least.

teratron273y ago

I'm sure this will work well with my Scottish accent... (or any non-US accent)

hdjjhhvvhga3y ago

I'd like to see how they do with my creative variable and function names.

jenscow3y ago

    bool success equals user dot no i mean ah fuck stop stop quit

polishdude203y ago

Thank god we're remote. An open office space with this would suck.

v3ss0n3y ago

So software development houses will become call centers.

danjc3y ago

And you thought open plan offices were bad already!

polyterative3y ago

I have RSI. GitHub, please make it work well.

nicolas_lorenzi3y ago

I imagine happiness in the open space

hbarka3y ago

How does it do with SQL?

eurasiantiger3y ago

I do not want this.

mezobeli3y ago

Copilot -> Pilot

univue3y ago

Add some comments

anshumankmr3y ago

No. Thanks.

kashanjunaid3y ago

Very interesting!

univue3y ago

sgdf

kajaktum3y ago

This feels like GitHub expanding because it can't find anything else to do... Being a for-profit organization means it's unable to say, "You know what, we pretty much have everything we wanted, so we're just going into maintenance/optimization mode." This happens all the time in open source projects, which simply tell their users to move elsewhere for better alternatives, but it will never happen at a for-profit organization.



melling3y ago

“I think there is a world market for maybe five computers.” - Thomas Watson

I bet if we use our imaginations, we’ll think of a lot of places where using voice to code could come in handy.

Personally, I’ve been waiting for it for a few decades.

The creator of TCL has RSI and has been using voice since the late 1990’s

https://web.stanford.edu/~ouster/cgi-bin/wrist.php

Thought we were really close 10 years ago when Tavis Rudd developed a system:

https://youtu.be/8SkdfdXWYaI

GitHub seems to be more high-level. It figures out the syntax and what you actually want to write.

This would help if you barely knew the language.

Time to learn Rust or Scala with a little help from machine learning.

2 more replies

awslattery3y ago

cdrini3y ago

If Copilot is any indicator of effectiveness, then I have high hopes for this! I've always wanted to program while stationary biking :)

bryanrasmussen3y ago

1 more reply

raylad3y ago

Around 1998 I broke my collarbone and had to use Dragon Dictate.

I found that for general subjects it was quite difficult to use because of the fairly poor recognition rate.

But when I talked about computers, it got almost everything right. I assumed it must have been trained by the developers, who talked about computers mostly.

This is another special purpose vocabulary, so it seems as if it would have a good chance of a high recognition rate.

eurasiantiger3y ago

It’s most likely just Cortana bolted on to Copilot.

1 more reply

rpastuszak3y ago

> I could never be productive programming like this.

It's likely to work much better than a generic speech-to-text model due to fine-tuning.

Plus, consciously or not, we will adapt our human language to the English-ML "pidgin" (e.g. by introducing more efficient grammatical structures, using a specific subset of vocabulary).

atdrummond3y ago

rahulpandita3y ago

GitHubNext here! We would love to hear more about your experience. Please help us out by signing up for this experiment :)

mrtksn3y ago

smartmic3y ago

"Writing is thinking. To write well is to think clearly. That’s why it’s so hard." ~David McCullough

This not only holds for literature but also for programming. Concerning the hard part, I would argue that is the reason why it is not called "talking is thinking".

yi_xuan3y ago

"If you're thinking without writing, you only think you are thinking." -Leslie Lamport

Even though the speech recognition rate is now really high, I wonder how many authors use speech to write articles. The comparison may make sense, and I think there are few.

mjburgess3y ago

I think there's a difference between communicating your intent to a machine, which is hopeless since it has no model of intention; and commanding a machine to reproduce something.

I.e., when you're managing your house you want something that can be communicated in an infinite number of ways, but the "AI" accepts a tiny finitude of ways.

However, when programming it seems like we aren't asking the machine to "write a function to do X", but rather saying, "def open-paren star args...."

This seems like a pretty trivial problem to solve.

dustedcodes3y ago

> However, when programming it seems like we aren't asking the machine to "write a function to do X", but rather saying, "def open-paren star args...."

Click the link first and take a look at what is being showcased, because your comment is the exact opposite of what they demo when you visit the HN link.

1 more reply

wizardofmysore3y ago

It's really useful for those who have challenges typing (arthritis, disabilities, etc.), but perhaps not the best for a general audience, as typing with autocomplete is faster.

alvis3y ago

However, I too really doubt there are any better use cases than simple tasks, not to mention that everyone in the office would hear what you ask the AI to do. Oh my! How embarrassing!

danielbln3y ago

Kiro3y ago

Blind people are already very productive using voice-to-code.

jfk133y ago

dustedcodes3y ago

dkns3y ago

0-_-03y ago

The mental load would reduce with practice very quickly

CrociDB3y ago

They're creating a new job, "prompt engineers", to replace the engineers. This is 2022.

ykonstant3y ago

The rise of the... t...talking monkey? cognizes intensely

singularity20013y ago· 26 in thread

VERY PROMISING, in any case you can just manually fill the gaps with the keyboard!

tgv3y ago

Semaphor3y ago

I hate talking to machines. Sometimes it’s the best option (I love using a voice assistant in the kitchen), but almost always I’d rather have a full keyboard as an interface instead.

If machines were amazing at Speech-to-Text, okay, sure. But while the capabilities are impressive, they still kinda suck at it.

3 more replies

klabb33y ago

Anyway, the chances of GitHub solving this in an exceptionally difficult subdomain, as a side project, seem like a... Let's say, long shot.

That said, the silver lining in all these billions spent on voice interfaces is accessibility. For some people, these things are a life saver.

pflenker3y ago

This is not marketed as an alternative input mechanism for people who have otherwise no difficulty typing code. It's an input mechanism for people whose abilities to type are limited.

1 more reply

chrisandchris3y ago

Because in my experience it is very often like "Call Peter" -> "Today it's sunny in NY".

smcleod3y ago

To be fair, Siri was really good before iOS 15 on the phone and very rarely got a word wrong. Then, I don't know what they changed, but it went belly up for me, and many other people have said the same.

2 more replies

bamboozled3y ago

I've been trying to use Siri while driving more and more, and it's amazing how distracting it is compared to peeking at the screen (it's naughty, I know, I try not to do it).

But yeah, something about talking to a device which gets things wrong all the time is ridiculously distracting, at least for me.

Sometimes I look back at the road after trying to work out what it interpreted and I feel scared by how focused on the phone I became.

coldtea3y ago

>I can't imagine that they bring a better product than two of the richest companies in the world

Code is much more constrained by language syntax though.

Even for the "call peter" example, while the input is easy, the expected range of inputs that Siri should handle and be able to differentiate it from is huge.

Of course this is still a problem for e.g. defining variable names, where you could say anything.

ggerganov3y ago

In my experience, OpenAI's Whisper speech recognition is beyond anything currently out there. Likely Github will use it on the backend.

insanitybit3y ago

> I can't imagine that they bring a better product than two of the richest companies in the world even can't figure out

Are either of those companies investing particularly heavily into voice agents? Certainly neither of them has anywhere near the kind of power of something like Copilot.

Also, a general agent is way different from one that's specific to writing code.

isthisthingon993y ago

Somehow Google has gotten worse in the last couple of years.

creata3y ago

It seems wonderful for people who can't as easily use a keyboard, but for most people, this doesn't seem any easier than using a keyboard. Am I missing something?

Toutouxc3y ago

I would totally enjoy being able to tell my IDE to "call foo with bar and string hello there end string with a block of gee times two" or something, instead of:

  foo(:bar, "Hello there") { |gee| gee * 2 }

Just that, not having to think about typing different symbols would be a serious quality of life feature for me.

3 more replies

TheUndead963y ago

https://www.bbc.com/news/science-environment-52094111

jonathanstrange3y ago

If voice dictation was a killer feature, everybody would use it all the time for ordinary texts. But for some reason only few (lawyers? doctors?) use it.

1 more reply

bamboozled3y ago

"… this will be a generational paradigm change in how to write code… if it works."

Why?

If it was a neural link then I'd be in agreement.

imron3y ago

> The hard part will be editing code with your voice too

The hard part will be open plan offices.

It’s bad enough that so many meetings are now zoom/teams and proximity to coworkers means you end up hearing their side of their meetings.

Just wait until all the devs are coding this way too.

klabb33y ago

It's the future I always imagined as a child. A vast divider-less cubicle scape of people in Patagonia vests who define all caps constants by yelling at their standing desks.

"USER!! UNDERSCORE LIMIT!! EQUALS TWO THOUSAND AND FORTY EIGHT!"

willsmith723y ago

https://www.youtube.com/watch?v=ILfTrUreS00

1 more reply

mlajtos3y ago

Crazy idea – whisper to use your computer. Might produce some quality ASMR in open plan office.

wiseowise3y ago

> this will be a generational paradigm change in how to write code… if it works

Why?

mkmk33y ago

I could see it maybe being important once GitHub Copilot is embedded in it? You tell it roughly what you want and then adapt by hand. But it is kind of funny seeing the parent make such claims so early.

1 more reply

wokwokwok3y ago

How can it not be a paradigm change when it changes the way people write code from “write by hand” to “generated by ai with natural language”?

The problem with speech to code has always been that precise syntax is hard, but AI codegen solves that.

So, no, it might not take off, but I feel like if it does, then it means ai-codegen will become the dominant way code is crafted.

That would be paradigm shifting.

It’s inconceivable that it wouldn’t be.

3 more replies

jpalomaki3y ago

The example puts it quite well. You kind of know what you want to achieve, step by step, but are not so comfortable with your tools.

stackbutterflow3y ago

Works only in remote. Used in the office that'd be madness.

Maken3y ago

I can't wait to edit my Unreal blueprints using voice commands. Truly the future of programming.

tweetle_beetle3y ago· 19 in thread

junon3y ago

Ah yes, Dragon NaturallySpeaking. Training for hours and hours and getting incredibly subpar results. It was a fun toy but there's a reason it didn't really take off in corporate settings.

tecleandor3y ago

Now, in March this year, Nuance has been acquired by Microsoft.

1 more reply

Aeolun3y ago

The best recognition rates were on the words ‘scratch that’, which you used every other sentence.

mjochim3y ago

Training data is now abundant compared to twenty years ago, and so is computation power. That means training can be much more complex now.

The underlying technology is now typically neural networks (broadly speaking), whereas twenty years ago it might have been Hidden Markov Models.

rommel9173y ago

JW_000003y ago

I cannot even concentrate on reading a text while the radio is playing; let alone programming while assembling furniture.

Closi3y ago

It's hugely significant - look at this graph of Google's speech model accuracy across 2013 to 2017:

https://sonix.ai/packs/media/images/corp/articles/history-of...

Or this that shows a similar pattern:

https://cdn.static-economist.com/sites/default/files/externa...

danjc3y ago

Unfortunately 95% isn’t a lot of nines

1 more reply

arez3y ago

xigoi3y ago

There is a reason we don't have many programming languages where you can say “drop duplicates from titanic”.

2 more replies

cdrini3y ago

Main things that have changed are:

1) Improvements in speech to text, as others have mentioned

Double_a_923y ago

Aeolun3y ago

The only time I’ve ever successfully used voice recognition was teaching Skyrim to recognize my words of power. Shouting fus-roh-dah! was incredibly satisfying.

kreddor3y ago

It might have been Dragon NaturallySpeaking. I remember toying around with it 20 years ago or so. Apparently it has just been bought by Microsoft.

tjixxu3y ago

I have good memories spending 4+ hours training Dragon to end up with what seemed like 30% accuracy.

mmikeff3y ago

Can I have transcription that can then turn my rambling into neat and concise prose?

bagels3y ago

Many people dictate messages on their phones. Doctors use it extensively.

towawy3y ago

msaharia3y ago

Dragon speech? I used it quite a bit 10 years ago!

Sheeny963y ago· 11 in thread

bodge50003y ago

> "hey, create me a view of all the weather data from the past year correlated against x and y in z view format"

Ironically, I think only technical people would even want to do something like that. The less technical you get, the more high level (and ambitious) you need to go.

golergka3y ago

There's plenty of technical people who don't code — like, in your example, a game designer.

makeitdouble3y ago

Why would it be more successful than the countless past efforts to make that a reality? (It could well be, but why do you think so?)

As of now we have nothing approaching "non technical people to be able to instantly create things" if the "things" you want are useful in any way.

joenot4433y ago

1 more reply

yi_xuan3y ago

The problems are more than how to express.

You have to read code and debug it; that is inevitable. You can't say that there will not be any bugs if you use voice instead of writing.

nfRfqX5n3y ago

I think it’ll be better because Copilot is pretty good for typing and this builds upon that framework

wwilim3y ago

People who can think of weather in terms of plotting and correlating it are usually technical enough to code

urthor3y ago

Exactly.

The crazy thing is... this probably will work.

In 20 years. But, it probably will work.

There is absolutely no reason you cannot use a neural network to transcribe appropriately phrased requirements into an AST.

tovej3y ago

There are several reasons this wouldn't work:

1. How do you check the output of the voice to code step? If you need as much expertise as you do now to actually review the code, then the voice to code step is just a layer that adds confusion

2. How would debugging work? Again, would you still need to be able to understand the code? Same issue.

3. What if you have to pause and think? This will affect how the voice to code interface interprets your speech.

4. How would you make a precise edit to your source audio using a voice interface?

5. How would you make changes which touch multiple components across the project? How would you coordinate this?

1 more reply

nuccy3y ago

1 more reply

BiteCode_dev3y ago

Before Copilot, we were far from it.

But now, given how magical this thing is, it opens many doors to what's possible with no code.

I never really believed in anything no-code before, apart from Excel and RAD.

But basic tasks are going to become accessible to a lot of people sooner than I expected.

1 more reply

chocolatkey3y ago· 8 in thread

The problem with current voice programming systems is they're just too slow so I end up getting impatient and using my fingers anyway

eajakobsen3y ago

langsoul-com3y ago

I wonder if the Quest Pro could do the eye tracking as well? It has pretty extensive eye tracking cameras, though I'm not sure how precise they are.

Could it also do a virtual keyboard, but a custom layout to not trigger arm, elbow and hand pains?

1 more reply

ron223y ago

If you want to code with your voice, also check out https://github.com/cursorless-dev/cursorless

Hesinde3y ago

rahulpandita3y ago

GitHubNext here! We appreciate your support. Please consider helping us by signing up for the experiment on the website and providing feedback. :)

langsoul-com3y ago

I'm assuming you have an ergonomic keyboard? If so which one?

tgv3y ago

And a movable monitor, decent chair and desk at the appropriate height...

1 more reply

chocolatkey3y ago

Ergodox EZ

FloatArtifact3y ago· 3 in thread

The challenge that remains in speech coding is not generating code so much as navigating through existing code or an application.

There are only two ways to do this effectively, and unfortunately no one has taken the true path to accessibility. The more common way is plugins/extensions that grab information from the editor.

It's all about grabbing information from the application and generating commands on the fly, not just parsing free dictation, in order to get the best accuracy.

It takes a lot of expertise to make any sort of UI automation, fast and efficient for navigating and selecting text or out of focus menu items.

My hope is that they develop a grammar system that is open source and integrates with accessibility frameworks focused on performance.

I wish I could have a phone call with the development team.
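As a rough illustration of the "generate commands on the fly from application context" idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the symbol table stands in for whatever an editor extension might report, the "go to ..." phrase template and the `goto_line` action name are hypothetical, and fuzzy matching with the standard library's `difflib` is just one simple way to restrict recognition to a narrow set of valid commands.

```python
import difflib


def build_commands(symbols):
    """Generate the spoken commands that are valid for the current editor context."""
    commands = {}
    for name, line in symbols.items():
        # Hypothetical phrase template: "go to <symbol name with spaces>".
        phrase = f"go to {name.replace('_', ' ')}"
        commands[phrase] = ("goto_line", line)
    return commands


def match_utterance(utterance, commands, cutoff=0.6):
    """Return the action of the closest known command, or None if nothing is close."""
    best = difflib.get_close_matches(
        utterance.lower(), list(commands), n=1, cutoff=cutoff
    )
    return commands[best[0]] if best else None


# Symbols an editor extension might report for the open file (made-up data).
symbols = {"parse_config": 12, "load_user": 40, "save_user": 57}
commands = build_commands(symbols)

# A slightly misrecognized utterance still resolves to the intended command,
# because only commands that exist in this context are candidates.
action = match_utterance("go to lode user", commands)
```

Because the candidate set is rebuilt from the current file, an utterance that matches nothing in context is rejected instead of being forced into free dictation.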

skydhash3y ago

I think an accessibility approach à la vim, or with something like tree-sitter, would help immensely, like:

  “Top of file
  Down 5 lines
  Modify import source to …
  inside first class
  Down 5 methods
  Insert new method after
  Inside arg list
  Append arg named … of type …
  …
  …”

And add a way to identify types and parameters with special pronunciation.
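The line-oriented part of a command list like the one above could be sketched in Python as follows. This is only an illustration under stated assumptions: the phrase patterns and the `goto_line` / `move_lines` action names are made up, numbers are assumed to arrive from the recognizer as digits, and the structural commands ("inside first class", "down 5 methods") are left out because they would need a real syntax tree.

```python
import re

# Hypothetical spoken grammar: each pattern maps a phrase to an (action, argument) pair.
PATTERNS = [
    (re.compile(r"top of file"), lambda m: ("goto_line", 1)),
    (re.compile(r"down (\d+) lines?"), lambda m: ("move_lines", int(m.group(1)))),
    (re.compile(r"up (\d+) lines?"), lambda m: ("move_lines", -int(m.group(1)))),
]


def parse(utterance):
    """Turn one spoken phrase into an action tuple, or None if it is not recognized."""
    text = utterance.strip().lower()
    for pattern, build in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return build(match)
    return None


def run(utterances, cursor=1, last_line=100):
    """Apply a sequence of spoken phrases and return the resulting cursor line."""
    for utterance in utterances:
        action = parse(utterance)
        if action is None:
            continue  # Unrecognized phrases are ignored in this sketch.
        verb, arg = action
        if verb == "goto_line":
            cursor = arg
        elif verb == "move_lines":
            cursor += arg
        # Keep the cursor inside the file.
        cursor = max(1, min(last_line, cursor))
    return cursor


final_line = run(["Top of file", "Down 5 lines"], cursor=42)
```

Using `fullmatch` keeps the grammar strict, so a phrase is either a known command or nothing, which is the narrow-context property the thread argues for.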

FloatArtifact3y ago

Accessibility accessing the content of the application and the context is what's important. It's more important than the speech recognition backend.

Speech recognition works best with a narrow context (when those commands are available).

Commands could be modeled after vim functionality though.

FloatArtifact3y ago

Yes, minimizing the number of commands and making them as specific as possible by understanding where the user is in context optimizes the user's navigation time.

Someday I hope for the release of something like Stable Diffusion for voice coding: an open, complete pipeline that users can iterate on quickly and innovate with!

geewee3y ago· 3 in thread

Inputting code with voice is generally difficult, often due to variable names, casing, punctuation etc being hard to get right in voice-to-text. I think this might help quite a lot with that.

I doubt that, based on the demonstration shown here, it can make complex editing tasks simple, but if it does, I cannot overstate how huge a leap forward it is.

[0]: https://www.gustavwengel.dk/state-of-voice-coding-2017/

[1]: https://www.gustavwengel.dk/state-of-voice-coding-2019/

jovial_cavalier3y ago

>Having programmed and navigated my PC via voice exclusively for about 6 months...

I'm curious, why have you done this?

FloatArtifact3y ago

I can't speak for his use case. However, people with medical conditions like RSI, stroke, or anything else that limits their use of a keyboard and mouse do this.

geewee3y ago

I had very bad RSI

pfd19863y ago· 3 in thread

I'm bullish.

BiteCode_dev3y ago

Exactly.

Copilot is getting better every day, because it's learning from the way we are using it.

rahulpandita3y ago

GitHubNext here! We appreciate your support.

rahulpandita3y ago

GitHubNext here! We appreciate your support.

nightski3y ago· 3 in thread

So all the more power to them, but I am very skeptical, especially since Copilot has zero knowledge of the formal semantics of programming languages.

This is a lot different from the half-assed autocomplete that it already does, since that at least has some context.

tluyben23y ago

      /*
      this function transforms this json from:

      { 
          ... some complex structure in json 
      }
 
      to this json: 
   
      {
          ... some different structure in json 
      }
 
      */

Another bonus: if you generated function transfromAtoB(a: A) above, then you only have to write:

      /*
      do the reverse of function transfromAtoB, accept json structure B as input and return structure A
      */

And it'll come up with the reverse.

Static typing with TypeScript definitely works better than the other combinations I have tried (C# was pretty bad last I tried it; JS is good but often subtly wrong because of type issues).

singularity20013y ago

nightski3y ago

So you are going to have to verbalize something, interpret the code, build a mental model of how it works, and then if it does not match what you want go back to step 1?

That sounds exhausting when we have spent countless human years developing languages which let us communicate our intentions precisely to computers.

lakomen3y ago· 3 in thread

Imagine sitting there, talking to your computer, and trying to get the notations right.

If err unequal nil opening bracket, no no don't open the racket opening bracket... BRACKET, do you know what a bracket is No don't do a do while, delete delete. Don't delete everything... sigh

Well something like that, I imagine it being a very painful experience.

dorkwood3y ago

Maybe you could use little clicks and pops with your mouth to signify different characters. The computer could learn which one is which. That way instead of typing

"for (int i = 0; i < count; i++)"

you could say something like

"for beep int i click zero boop i bop count boop i pop pop zing"

This would achieve the same thing, but much faster and with less effort than typing.

xigoi3y ago

If saying that is faster than typing it, you're really slow at typing.

3D304974203y ago

Or debugging. Goodness. I can only imagine.

MauranKilom3y ago· 2 in thread

avian3y ago

> I do wonder how this would work in an office setting

Everyone gets a throat mic and the cubicle farm is full of unintelligible whispering instead of clacking of keyboards? Can't wait for the future. /s

Hortinstein3y ago

Quequau3y ago· 2 in thread

simme_3y ago

I think you're talking about this video: https://youtu.be/8SkdfdXWYaI?t=1049

Quequau3y ago

Yes! That's the talk.

ausudhz3y ago· 2 in thread

Next is thoughts to code. Just read my mind, I'm gonna sit there and think.

nnurmanov3y ago

Song to code, we shall be singing our next systems:)

singularity20013y ago

dance to code. transform the esthetics of your movements into … whatever your boss requires.

akuji19933y ago· 2 in thread

export const ButtonComponent; FunctionComponent no Github no semicolon i meant colon Github backspace 5 times no backspace delete delete Github Arrrgh goddammit

cdrini3y ago

What you're describing is more like dictation. What you'd probably say is "export the button component", and it would determine the syntax.

akuji19933y ago

Which will probably, outside of small, perfectly planned experiments, work similarly well

1 more reply

Cort3z3y ago· 1 in thread

However weird and seemingly useless this might appear to the normal programmer on here, I see this as a huge accomplishment and an incredibly important tool. Why? Accessibility.

bamboozled3y ago

People who cannot see or use a keyboard already use tools like this to code. Been doing this for a long time.

birriel3y ago· 1 in thread

In the meantime, Talon is pretty good. You can use Vim motions and commands as you normally would, except using your voice (this applies to any editor, really):

https://talonvoice.com/

willjp3y ago

Talon is exceptional, I only wish it was more natural to drive cli commands, I find I need to spell them out which I’m still quite slow at.

pmontra3y ago· 1 in thread

My reactions to the demo (when all is good there is no reaction, so here are only the problematic ones, sorry)

1) import matplotlib.pyplot as plt

Why "as plt"?! Let the import alone. But this is a matter of style.

2) Get titanic csv data from the web [...]

Surprise, it turns out that "the web" is a URL on raw.githubusercontent.com. Hopefully I'll be able to spell a URL of my choice.

3) clean records from titanic data where age is null

Somehow I already know that there is an Age field and somehow it knows that it must capitalize age into Age

4) fill null values of column Fare with average column values

The generated code looks great but somehow I managed to spell a capitalized Fare this time :-) (this is probably a typo in the demo)

5) Hey, GitHub! New line

Inserting a new line can't take so many words. We're going to do without new lines or rely on a formatter or something equivalent.

6) plot line graph of age vs fare column

7) Hey, Github! Run program

Good.
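For readers who have not watched the demo, the data steps it narrates (3, 4 and 6) can be sketched in plain Python. This is not the code the demo generates: going by the `matplotlib.pyplot` import it appears to use a data-frame library, whereas this dependency-free sketch uses only the standard library, and the rows below are made-up sample data rather than the real Titanic file.

```python
import csv
import io
from statistics import mean

# Made-up sample rows standing in for the Titanic CSV; not the real dataset.
SAMPLE = """Name,Age,Fare
Alice,29,7.25
Bob,,71.28
Carol,35,
Dan,54,51.75
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing(rows, column):
    """Step 3: remove records whose value in `column` is empty."""
    return [row for row in rows if row[column] != ""]


def fill_missing_with_mean(rows, column):
    """Step 4: replace empty values in `column` with the mean of the known values."""
    average = mean(float(row[column]) for row in rows if row[column] != "")
    return [
        {**row, column: row[column] if row[column] != "" else str(average)}
        for row in rows
    ]


rows = load_rows(SAMPLE)
rows = drop_missing(rows, "Age")
rows = fill_missing_with_mean(rows, "Fare")

# Step 6 would plot these (age, fare) pairs; plotting is omitted to stay dependency-free.
points = [(float(row["Age"]), float(row["Fare"])) for row in rows]
```

Note that the column names are spelled `Age` and `Fare` here, which is exactly the capitalization detail the comment points out a voice interface has to infer.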

Considerations:

rahulpandita3y ago

GitHubNext here! Here is a little writeup that explains a bit more about the project https://github.com/githubnext/githubnext/tree/main/HeyGitHub

Hope this is helpful :)

jasonlfunk3y ago· 1 in thread

I probably wouldn’t use this to write code, but I could see it being really useful for navigating around a project.

“Go to line 35” “Open the model controller” “Show the get method and set method side by side”

anshumankmr3y ago

If you remember the keyboard shortcuts, you can be quite fast while working with VSCode. Voice will never be perfect.

philmander3y ago· 1 in thread

This is effectively a new higher level programming language without a fixed syntax. Describing more the "what", not "how", and being much closer to natural language over computer language.

The voice part seems like an (albeit important) accessibility add on.

I'm sure it won't be perfect but an amazing step forward in the evolution of programming languages

meowface3y ago

falcor843y ago· 1 in thread

I think this, or a future version of this, would have real potential.

I've signed up for the waiting list and am excited to try this out.

knutzui3y ago

dang3y ago

This is the third such thread in the last 24 hours which consists of nothing but an elaborate waiting list signup. I've changed the titles to make that clear.

GitHub Blocks – waiting list signup - https://news.ycombinator.com/item?id=33537706 - Nov 2022 (41 comments)

GitHub code search – waiting list signup - https://news.ycombinator.com/item?id=33537614 - Nov 2022 (48 comments)

A good HN discussion needs more than a waiting list signup. A good time to have a thread would be when something is actually available.

ggerganov3y ago

Very interesting - I was sort of expecting it to happen soon.

Edit: And just in case there is interest in this, the code is available [1]. It would be very awesome if someone helps to wrap this functionality in a proper Vim plugin.

[0] https://youtu.be/3flN9kTcZJY

[1] https://github.com/ggerganov/whisper.cpp/tree/master/example...

onion2k3y ago

singularity20013y ago

In case anyone else stopped after watching the video, if you scroll down a bit further you see the list of

FEATURES

Write/edit code

Code navigation

No more using mouse and arrow keys. Ask Hey, GitHub! to...

    go to line 34
    go to method X
    go to next block

Control the IDE

"Toggle zen mode", “run the program”, or use any other Visual Studio Code command.

Code Summarization

Don’t know what a piece of code does? No problem! Ask Hey, GitHub! to explain lines 3-10 and get a summary of what the code does.

Explain lines 3 - 10

susrev3y ago

All I could think of while looking at this was having to tell Siri where every comma and period should go while texting with it.

"insert curly brace", "insert semicolon", "insert insertion", etc. does not sound too fun.

hcnews3y ago

https://www.theverge.com/2022/11/8/23446821/microsoft-openai...

amarant3y ago

For me, I got an ergonomic keyboard before my wrists went bad, and so far they seem to be holding up!

Moral of the story: get a good keyboard early, or you might need a tool like this one someday!

prima-facie3y ago

Hey Github what did the previous developer actually _mean_ with this piece of legacy code?

evnix3y ago

I have looked at some tools for the blind, but you need just way too much dedication for it to work for you and since you have working eyes it is usually easier to just open your eyes.

glenjamin3y ago

There was an excellent talk at Strange Loop a few years ago by Emily Shea about how she'd learnt to code in Vim using her voice to combat RSI.

https://www.youtube.com/watch?v=YKuRkGkf5HU

The demos are in Ruby, but I could imagine that languages with strong type-aware auto-completion could be easier to do.

silverlake3y ago

nxpnsv3y ago

raidicy3y ago

pcj-github3y ago

kgrax013y ago

People can’t seriously believe this is going to be useful at all?

I can see this helping as an accessibility tool, but beyond that I don’t think it will be useful. This kind of assumes you know everything about what you’re doing, most of the time you don’t.

boredumb3y ago

As someone who works remotely from home, the last thing I need is to start babbling to myself in code for 8 hours a day. I imagine that's a one way ticket to developing some sort of disorder.

ddevault3y ago

https://numen.johngebbie.com/index.html

It's free software, it's local to your machine, you don't have to sign up for it, and it works today.

okasaki3y ago

Great for accessibility, but I don't see this would work well in an open office, or even at home if other people are around. Seems really annoying.

tempodox3y ago

Imagine using this in a setting where you're not alone in the room. Imagine using this surrounded by other developers who do the same.

danwee3y ago

kevmo3143y ago

lkrubner3y ago

https://www.amazon.com/Destroy-Tech-Startup-Easy-Steps/dp/09...

But at the end of the book I struck an upbeat note, about how the technology was advancing quickly and within 3 or 4 years someone would achieve something much greater than our own limited successes.

hintymad3y ago

lleontop3y ago

We have come a long way. I remember when announcements like this one were done by companies on April 1st!

squarefoot3y ago

karmasimida3y ago

Actually would be useful.

If this is reliable I would pay to use it in some capacity, like adding an argument.

crucialfelix3y ago

I spent half an hour today trying to convince the O2 voice agent to get me a real person. Conversational AI is a special kind of hell filled with unhappy paths.

But for a glimpse of the future watch The Expanse or read William Gibson's Agency.

darepublic3y ago

Tade03y ago

I hope to see the click consonant "‖" adopted as "||" one day.

tabasselejambon3y ago

troelsSteegin3y ago
