The reason not to include the endpoint by default is that VPCs should be secure by default: everything is denied, and unless you explicitly configure access to the internet, it's unreachable. If the endpoint were on by default, an attacker who manages to compromise a system in that VPC would have a means of data exfiltration in an otherwise air-gapped setup.
It's annoying because this is by far the less common case for a VPC, but I think it's the right way to structure permissions and access in general. S3, the actual service, went the other way on this and has been desperately trying to reel it back for years.
Right, I can appreciate that argument - but then the right thing to do is to block S3 access from AWS VPCs until you have explicitly confirmed that you want to pay the big $$$$ to do so, or turn on the VPC endpoint.
A parallel to this is how SES handles permission to send emails. There are checks and hoops to jump through to ensure you can't send out spam. But somehow, letting DevOps folks shoot themselves in the foot (credit card) is OK.
What has been done is the monetary equivalent of "fail unsafe" => "succeed expensively"
S3 access is blocked from an EC2 instance by default unless you give the attached IAM role access to S3.
Even then, it is still blocked unless you add a NAT gateway or internet gateway to the VPC and add a route to them.
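To make those explicit steps concrete, here's roughly what they look like with the AWS CLI (all IDs and role names are placeholders):

```shell
# 1. Grant the instance profile's IAM role access to S3
#    (without this, S3 API calls from the instance are denied).
aws iam attach-role-policy \
  --role-name my-etl-instance-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

# 2. Even with the role, traffic still needs a path out: add a
#    default route through a NAT gateway to the subnet's route table.
aws ec2 create-route \
  --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0123456789abcdef0
```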
If you are doing all of this via IaC, you have to take a lot of steps to make this happen. On the other hand, if I'm using an EC2 instance to run an ETL job on data stored in S3, I'm not putting that EC2 instance in a subnet with internet access in the first place. Why would I?
And no, you don't need internet access to reach the EC2 instance from your computer, even without a VPN. You use Systems Manager Session Manager.
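For anyone unfamiliar, the flow is roughly this (placeholder IDs; assumes the SSM agent is running on the instance):

```shell
# Open an interactive shell on the instance - no inbound ports,
# no bastion, no public IP needed.
aws ssm start-session --target i-0123456789abcdef0

# In an internet-free VPC, Session Manager needs interface endpoints
# for the ssm, ssmmessages, and ec2messages services, e.g.:
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.ssmmessages \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0
```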
I do the same with Lambda - attach them to a VPC without internet access, with the appropriate endpoints. Even if they are serving an API, they are still behind an API Gateway.
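The Lambda side of that is a single configuration change (placeholder names and IDs):

```shell
# Attach an existing function to private subnets. With no NAT route,
# the function can only reach what the VPC's endpoints expose
# (S3, DynamoDB, etc.), while API Gateway still invokes it normally.
aws lambda update-function-configuration \
  --function-name my-api-handler \
  --vpc-config SubnetIds=subnet-0123456789abcdef0,SecurityGroupIds=sg-0123456789abcdef0
```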
There's zero reason why AWS can't pop up a warning if it detects this behavior, though. It should clearly explain the implications to the end user. EKS pops up all sorts of warning flags about cluster health; there's really no reason why they can't do the same here.
To be fair, while EKS warnings are useful, I've developed a habit of ignoring them completely, since I've seen every single RDS cluster littered with "create a read replica please" and "enable Performance Insights" BS warnings.
The second someone doesn't pay attention to that warning and suffers an exfiltration, like the Capital One S3 incident, it's AWS's fault as far as the media is concerned.
I don't get your argument. If an EC2 instance needs access to an S3 resource, doesn't it need that role? Or otherwise, couldn't there be some global S3 URL filter that automagically routes same-region traffic appropriately if it's permitted?
My point is that, architecturally, has there ever, in the history of AWS, been a case where a customer wants to pay for the transit of same-region traffic when a checkbox exists to say "do this for free"? Authorization and transit/path are separate concepts.
The EC2 needs credentials, but not necessarily a role. If someone is able to compromise an EC2 instance that has unrestricted S3 connectivity (no endpoint policies), they could use their own credentials to exfiltrate data to a bucket not associated with the account.
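That scenario is what endpoint policies address. Here's a sketch of one that only allows traffic to your own buckets, attached via the CLI (bucket names and IDs are placeholders):

```shell
# Restrict the S3 endpoint so even valid third-party credentials
# can't push data to a foreign bucket through this VPC.
aws ec2 modify-vpc-endpoint \
  --vpc-endpoint-id vpce-0123456789abcdef0 \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-company-bucket",
        "arn:aws:s3:::my-company-bucket/*"
      ]
    }]
  }'
```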
I'll have to dive in and take a look. I'm not arguing, but here is how I naively see it:
It seems there is a gap between "how things are" and "how things should be".
"Transiting the internet" vs. "Cost-free intra-region transit" is an entirely different question than "This EC2 has access to S3 bucket X" or "This EC2 does not have access to S3 bucket X".
Somewhere, somehow, that fact should be exposed in the design of the configuration of roles/permissions/etc. so that enabling cost-free intra-region S3 access does not implicitly affect security controls.
I agree. The real question is why do I need a "VPC endpoint" to save money in the first place?! A us-east-1 EC2 instance isn't actually going over the internet to connect to us-east-1 S3, regardless of whether it's using a NAT gateway or a VPC endpoint. AWS knows what routes are on its own network.
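For what it's worth, the "do this for free" checkbox does exist, in the form of an S3 gateway endpoint - it takes one call (placeholder IDs):

```shell
# S3 gateway endpoints have no hourly or data-processing charge,
# unlike NAT gateways. This also adds the S3 prefix-list route
# to the given route tables automatically.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0123456789abcdef0
```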