What's particularly impressive about Tramp is how well other Emacs packages tend to work with it. For instance, you can run Magit over Tramp, or, better put: Magit just works in Tramp buffers. The same goes for language servers. It's kind of wild when you think about what's happening under the hood.
And under the hood it's still transferring whole files. It's not editing remotely in place.
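For anyone who hasn't used it: Tramp hooks into Emacs's ordinary file operations, so a remote path opens like any local one (the user, host, and path names below are placeholders):

```
C-x C-f /ssh:alice@prod-web-01:/var/log/app.log          ; direct SSH
C-x C-f /ssh:alice@bastion|ssh:alice@db-01:/etc/hosts    ; multi-hop via a jump host
```

The `|` syntax chains hops, which is exactly the break-glass-via-jump-host pattern discussed below, just driven from the editor.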
That's interesting. I know some places go to great lengths to keep developers from accessing production without some sort of break-glass procedure through a jump host. I'm curious if they all know about this sort of loophole.
1. You don't have to expose a jump host at all, which is one less exposed asset to manage and worry about.
2. Your security team should already be collecting CloudTrail logs, so they get auditing of SSM/SSH "for free".
3. You can control SSM access via your SSO provider, which means you can trivially enforce a bunch of policies all in one place vs having to configure SSHD.
4. You can control SSM access via IAM.
5. You can limit session duration easily.
6. No more SSH agent hijacking (at least, I don't think so).
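On points 4 and 5: a minimal sketch of what the IAM side can look like, restricting sessions to instances carrying a particular tag (the tag key and value here are placeholders; `ssm:resourceTag/*` is a real condition key for `ssm:StartSession`):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:StartSession",
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "StringEquals": { "ssm:resourceTag/Environment": "staging" }
      }
    }
  ]
}
```

Session duration (point 5) isn't set in the policy itself; it's capped separately via the Session Manager preferences for the account.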
I also wouldn't call this a loophole; you have to be explicitly granted permission to use SSM.
Perhaps not the best wording on my part. I was aware of SSM, but not aware of the SSH tunneling features. I'm wondering if that's common. Is the SSH tunneling controlled separately, or on by default if SSM is on?
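To answer the mechanics question: the SSH support goes through a separate SSM document, `AWS-StartSSHSession`, which tunnels port 22 over a session. It's wired up on the client side with a `ProxyCommand` (this is essentially the snippet from AWS's own docs):

```
# ~/.ssh/config
host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"
```

Because `ssm:StartSession` policies can name the session document as a resource, an admin can allow plain interactive sessions while denying the SSH-tunneling document, so it is controllable separately rather than automatically on.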
If there's some tricky bug in production, then one can create some sort of debugging service that runs on another port and deploy it to investigate the bug, or use management and monitoring tools. Copying files up to production is something that should only be done by an automated deployment script.
If you are under time pressure to fix an escalation from a high-profile customer and you don't have such a service yet, do you make the customer wait while you write one, or do you just use command-line access? Or, if you already have such a service but it lacks the diagnostics needed to investigate this particular problem, do you make the customer wait while you enhance it, or do you just use command-line access? Or you make your debug service totally generic (allow it to run arbitrary code supplied by the user), in which case it can do anything the command line can; but then how is that actually any more secure than more standard means of command-line access? Plus, it adds friction, which may slow down resolution.
> or use management and monitoring tools.
Often these work fine for some problems, and then you get a problem which they don't cover adequately, and you need to go beyond them.
Seems to be at odds with
>then one can create some sort of debugging service that runs on another port and deploy it to investigate the bug
In many cases, that's just SSH. In most cases, I'm not copying files around; I want to connect to the real environment where firewall rules, API keys, permission systems, overlay networks, etc. are in place. If there's a stuck process (say, lock contention), it's much easier to just SSH in, run gdb, and check the stack to see what it's doing. Some languages, like Java, have pretty rich tooling out of the box for remotely connecting to processes. For others, like Python and Ruby, you just use gdb.
Either way, there's no copying of data necessary; you just need access to the running process. For a large system with hundreds of identical servers, I don't want to deploy a debug service everywhere; I just want to connect to the one with an issue and check it.
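The gdb workflow above is hard to demo in text, but the same idea (dump every thread's stack in a live process without restarting anything) can be sketched from inside Python; the "stuck" worker here is simulated with a lock, standing in for real contention:

```python
import sys
import threading
import time
import traceback

# Hypothetical stand-in for a wedged production worker: a thread
# blocked forever on a lock someone else is holding.
lock = threading.Lock()
lock.acquire()

def stuck_worker():
    lock.acquire()  # blocks forever -- this is the "stuck" frame

threading.Thread(target=stuck_worker, daemon=True).start()
time.sleep(0.2)  # give the worker time to block

# The in-process equivalent of gdb's `thread apply all bt`:
# snapshot every thread's current Python stack.
dumps = {
    tid: "".join(traceback.format_stack(frame))
    for tid, frame in sys._current_frames().items()
}
for tid, stack in dumps.items():
    print(f"--- thread {tid} ---\n{stack}")
```

The stuck worker shows up with `stuck_worker` at the top of its stack, which is usually all you need to identify the contended lock.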
Snapshotting works sometimes, but I used stuck processes as an example since that's usually where all this remote/log/etc. stuff falls apart. And, as it so happens, things like lock contention tend to be really hard to recreate in synthetic or simulated environments that don't have real, authentic load.
Keep in mind that doesn't mean "go crazy with `root` in production". You can combine that strategy with scripting and tooling to drain/isolate/quarantine servers where the stuck process is still running but they don't have live traffic being routed to them.
I see this "ZOMG NO ONE TOUCH PROD" mentality a lot in highly regulated environments, but it's usually more sustainable to isolate the in-scope system's functionality as narrowly as possible, to avoid dragging unnecessarily large parts of the stack into scope (e.g., put the billing functionality in a microservice to limit PCI scope).
But what about when things don't work the way they should?
Want to debug network connectivity issues? See which process is hogging the CPU? Investigate installation/deploy problems? Reinvent the wheel, or use what's already there.
Tramp does not need scp to transfer files; it can just as easily multiplex them over the shell connection using base64 or uuencoding.
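The trick is that base64 turns arbitrary bytes into plain ASCII, which survives any text-only shell channel. A minimal sketch of the mechanism, with the "remote shell" simulated by a local subprocess (on a real host Tramp would run something like `base64 -d > file` instead):

```python
import base64
import subprocess
import sys

# Arbitrary binary payload, including bytes that would mangle a raw tty.
payload = b"\x00\xffarbitrary binary bytes\n"

# Local side: encode to plain ASCII so it can be typed into the shell channel.
encoded = base64.b64encode(payload).decode("ascii")

# "Remote" side: decode stdin back to the original bytes. A Python one-liner
# stands in for `base64 -d` for portability of this sketch.
remote_decode = subprocess.run(
    [sys.executable, "-c",
     "import sys, base64; sys.stdout.buffer.write(base64.b64decode(sys.stdin.read()))"],
    input=encoded.encode("ascii"),
    capture_output=True,
    check=True,
)
decoded = remote_decode.stdout
print(decoded == payload)
```

No scp, no sftp, no extra port: just the shell session you already have.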
SSM is definitely not the most secure way[0]. It's super complex and deeply integrated into the rest of AWS, and it isn't cross-cloud to GCP, Azure, DO, etc., so now everyone needs an AWS account just to log into a Linux server.
Worse, IAM roles are powerful but easy to misconfigure, and that's before getting into how hard they are to apply with any granularity because of the policy length limitations[1], so you're likely giving everyone access to log into every instance without even knowing it.
0. https://cloudonaut.io/aws-ssm-is-a-trojan-horse-fix-it-now/
1. https://aws.amazon.com/premiumsupport/knowledge-center/iam-i...
I was mentioning that particular misfeature because it was a personal annoyance of mine. Oh well, I suppose everything is about customer lock-in these days.