(Draft) AWS SAA-C03 - Summary

Posted by monikma, 02 June 2024.
Architecture AWS Cloud
Preparation for certification

This is my brain dump from preparing for the AWS Certified Solutions Architect - Associate (SAA-C03) certification, collected from different sources. I passed the exam with 76%. The following practice exams were very useful: Udemy Practice Exams.

General

EC2, ECS, Fargate

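EC2 is a server in the cloud, and you have access to almost everything on it, including the OS. Fargate is serverless compute for containers: AWS manages the underlying hosts and you have almost no access to them. ECS is the container orchestrator: it runs your Docker containers either on EC2 instances or on Fargate.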

An AMI is the predefined instance image. Its root volume can be backed by Instance Store (ephemeral storage, lost when the instance stops or terminates) or by EBS. There are many ready-made images, but you can also create your own.

EBS is just a hard drive for your instance. There are SSD options (gp2/gp3 general purpose, io1/io2 optimized for I/O) and HDD options (st1 optimized for throughput, sc1 Cold HDD, which is the cheapest). Only the SSD types can be used as boot volumes.

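Instead of EBS you can also use EFS (Linux) or FSx (for Windows File Server or for Lustre) as a centralized drive shared by many instances. EFS scales its capacity automatically for you. Lustre is a parallel file system for high-performance computing and machine learning workloads that need very fast access to big data sets, and FSx for Lustre can be linked to S3.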

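Security Groups control traffic to and from your EC2 instances (Fargate tasks too, since each task gets its own network interface). A new Security Group allows all outbound traffic by default; once you replace that default rule with your own, only what you list is allowed. By default, all inbound traffic is forbidden.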

It is worth mentioning EMR and AWS Glue here for real big data processing. EMR gives you managed Hadoop/Spark clusters, while Glue is serverless ETL with a data catalog.

Databases

RDS databases run on EC2 instances in the background, but you don’t have access to those instances. The exception is RDS Custom (for Oracle and SQL Server), where you get access to the underlying OS and database. Theoretically, you could also run your database on an EC2 instance yourself, but why would you, if you can use a ready service.

Database scaling. You can scale vertically or horizontally, depending on where the bottleneck is. Vertical scaling means switching to a bigger instance class; for storage, RDS has storage autoscaling, which only scales up. For horizontal scaling of reads you use read replicas. Aurora supports up to 15 read replicas and by default keeps 6 copies of your data across 3 AZs. It generally has a lot of extras included out of the box, and is therefore the most recommended (and “best”) option. Aurora Serverless is another thing: there you don’t care about capacity scaling anymore, and AWS claims it is easy to switch between both Auroras, so you could start with serverless to see how much you need. Aurora is compatible with both MySQL and PostgreSQL.

There are some non-RDS databases, mostly for special use cases, except DynamoDB, which is the go-to NoSQL database (key-value and document). It is important to remember Redshift, which is not part of RDS: it is a columnar data warehouse for analytics (OLAP) on big data.

Caching. ElastiCache has Memcached and Redis engines. Both can be used for caching, but Memcached is the simple one (multi-threaded, no persistence, no replication), while Redis adds replication, Multi-AZ failover, backups and richer data structures. On top of that, Redis can serve as a standalone NoSQL key-value store. For DynamoDB you use DAX, but you could also put ElastiCache in front of it (I don’t remember which engine, and I don’t know why you would).

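DAX and ElastiCache are in-memory, meaning the data is served from RAM rather than from disk, which gives very low latency. Redshift is not in-memory; it stores data in columnar format on disk.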

Networking

First of all, it is all about VPCs. Some resources must live in a VPC (e.g. EC2 instances and RDS databases), some can be in a VPC but don’t have to (e.g. Lambda), and some, like S3 or DynamoDB, are never in your VPC and are reached through public endpoints or VPC endpoints. Services that are not in your VPC run in an AWS-managed network, which you don’t have access to. On top of that, you get a default VPC in each region (172.31.0.0/16). The best practice is to have separate subnets per application tier (so webserver, DB, etc). Of course, always remember to grant the least permissions needed.

The default VPC is set up so that connecting to the Internet just works (e.g. an EC2 instance with a public IP is reachable from your browser). However, if you create a custom VPC, you will have to do some configuration upfront to achieve that.

For a VPC you need to specify the CIDR block, which is its IP range (anything between /16 and /28). The bigger the number after the slash, the fewer IP addresses it has. Remember, for VPC Peering, the CIDR ranges of your VPCs cannot overlap. The typically assigned ranges are the private ones: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. There are websites that decode a CIDR into a concrete IP range. AWS reserves 5 addresses in every subnet (the first four and the last one), so you always end up with a few less.
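The CIDR arithmetic above can be checked without any website. This is a minimal sketch using only Python's standard `ipaddress` module; the function names are made up for illustration, and the constant 5 reflects the addresses AWS reserves per subnet.

```python
import ipaddress

# AWS keeps the first four addresses and the last one in every subnet.
AWS_RESERVED_PER_SUBNET = 5


def total_addresses(cidr: str) -> int:
    """Number of addresses in the block; a bigger suffix means fewer addresses."""
    return ipaddress.ip_network(cidr).num_addresses


def usable_in_aws_subnet(cidr: str) -> int:
    """Addresses left for your own resources if the block is used as a subnet."""
    return total_addresses(cidr) - AWS_RESERVED_PER_SUBNET


def address_range(cidr: str) -> tuple:
    """First and last address of the block, as strings."""
    net = ipaddress.ip_network(cidr)
    return str(net[0]), str(net[-1])


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """VPC Peering requires that the two CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))
```

For example, `total_addresses("10.0.0.0/24")` is 256 and `usable_in_aws_subnet("10.0.0.0/24")` is 251, while `can_peer("10.0.0.0/16", "10.0.128.0/17")` is `False` because the second block sits inside the first.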

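Next, you have subnets. Each subnet lives in exactly one AZ, so for high availability you want at least one per AZ. Security Groups are not assigned to subnets but to instances (their network interfaces); what a subnet has is a NACL. A Security Group is created inside a specific VPC. A subnet can be private (default) or public. There is a checkbox for auto-assigning public IPs, but that’s not all you have to do to e.g. have an EC2 accessible from the browser: the subnet’s route table also needs a route to an Internet Gateway. To allow instances in one subnet to talk to instances in another one, you add an inbound rule (e.g. SSH) to one SG with the other SG as the source.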

There is at least 1 route table per VPC (one is called “main”, which is kind of fallback for unassigned subnets). A route table can have multiple subnets assigned.

If you want to create a custom VPC, you will have to add all subnets yourself. If you want a public subnet, don’t forget the Internet Gateway: attach it to the VPC, and in the public subnet’s route table add a route for 0.0.0.0/0 pointing to it (you just pick the concrete IGW from the drop-down).

To allow outbound-only traffic from e.g. a private subnet, create a NAT Gateway in a public subnet and add a route to it in the private subnet’s route table. You can also have a DIY NAT on an EC2 instance, which is then called a NAT Instance (cheaper, but less available and with more maintenance overhead).
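What makes a subnet public or private comes down to where its route table sends traffic for 0.0.0.0/0. Below is a toy model of a route table lookup, assuming the most specific matching route wins. The target names (`igw-example`, `nat-example`) and the 10.0.0.0/16 VPC range are hypothetical placeholders, not real resource IDs.

```python
import ipaddress


def pick_target(route_table: dict, destination_ip: str):
    """Return the target of the most specific route matching the destination, or None."""
    dest = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the traffic cannot leave the subnet
    # The longest prefix is the most specific route.
    return max(matches, key=lambda m: m[0])[1]


# Hypothetical route tables for a VPC with CIDR 10.0.0.0/16.
public_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}
private_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"}
isolated_rt = {"10.0.0.0/16": "local"}
```

With these tables, traffic to another address inside the VPC always resolves to `local`, Internet-bound traffic from the public subnet goes to the Internet Gateway, and from the private subnet it goes to the NAT Gateway. The isolated table has no default route at all, so `pick_target(isolated_rt, "8.8.8.8")` returns `None`.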

A NACL (Network Access Control List) sits in front of the subnet, after the route table, and can allow or deny particular IPs in and/or out. It is stateless, so it won’t automatically allow the response out for something that came in and vice versa, while a Security Group will.
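To make the "stateless" part concrete, here is a rough model of how a NACL evaluates one direction of traffic: rules are checked in rule-number order, the first match decides, and everything else is denied. The rule tuples, IP addresses and the helper name are invented for this sketch, and protocols are ignored.

```python
import ipaddress


def nacl_decision(rules, remote_ip: str, port: int) -> str:
    """Evaluate one direction of a NACL.

    rules: list of (rule_number, cidr, port_from, port_to, action) tuples.
    """
    ip = ipaddress.ip_address(remote_ip)
    for _number, cidr, port_from, port_to, action in sorted(rules, key=lambda r: r[0]):
        if ip in ipaddress.ip_network(cidr) and port_from <= port <= port_to:
            return action  # first matching rule wins
    return "deny"  # the implicit catch-all rule at the end


# Hypothetical inbound rules: block one IP, allow HTTPS from everyone else.
inbound = [
    (90, "203.0.113.7/32", 0, 65535, "deny"),
    (100, "0.0.0.0/0", 443, 443, "allow"),
]

# Because the NACL is stateless, responses need their own outbound rule
# covering the client's ephemeral port range.
outbound = [
    (100, "0.0.0.0/0", 1024, 65535, "allow"),
]
```

A request from 198.51.100.10 to port 443 passes the inbound rules, but with an empty outbound list the response to the client's port (say 50000) would be denied; only the explicit outbound rule lets it through. A Security Group would allow that response automatically.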

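ENI is the basic network interface, ENA (Elastic Network Adapter) is better, with enhanced networking and higher throughput, and EFA (Elastic Fabric Adapter) gives the most performance, with low-latency networking for HPC and machine learning.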

Bullet points

S3

EBS

EC2

Kinesis

Improving request/response time

Databases

EFS

DirectConnect

Load balancers

Ongoing audit and security monitoring

Audit and security findings

On-Premise integrations/migrations

Big Data

Organizations

Other

