This is the follow-up to Part 1 of Flow Logs.

Traffic through a NAT gateway

In this example, an instance in a private subnet accesses the internet through a NAT gateway that’s in a public subnet.

A custom flow log for the NAT gateway network interface captures the following fields, in the following order.

Example flow log records: the flow log shows the flow of traffic from the instance IP address (10.0.1.5) through the NAT gateway network interface to a host on the internet (203.0.113.5). The NAT gateway network interface is a requester-managed network interface, so the flow log record displays a ‘-’ symbol for the instance-id field. The following line shows traffic from the source instance to the NAT gateway network interface. The values for the dstaddr and pkt-dstaddr fields are different: the dstaddr field displays the private IP address of the NAT gateway network interface, and the pkt-dstaddr field displays the final destination IP address of the host on the internet.
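Since the example records themselves aren’t reproduced here, the following sketch shows how such a record could be parsed. The field order and the addresses (the NAT gateway network interface’s private IP is assumed to be 10.0.0.220, and the ENI ID is made up) are illustrative assumptions, not output from a real flow log.

```python
# Assumed custom format: instance-id, interface-id, srcaddr, dstaddr,
# pkt-srcaddr, pkt-dstaddr (hypothetical field order for this example).
fields = ["instance_id", "interface_id", "srcaddr", "dstaddr",
          "pkt_srcaddr", "pkt_dstaddr"]

# Hypothetical record captured on the NAT gateway network interface.
record = "- eni-1235b8ca123456789 10.0.1.5 10.0.0.220 10.0.1.5 203.0.113.5"

parsed = dict(zip(fields, record.split()))

# The NAT gateway ENI is requester-managed, so instance-id is "-".
assert parsed["instance_id"] == "-"

# dstaddr holds the NAT gateway ENI's private IP, while pkt-dstaddr
# holds the final destination on the internet, so the two differ.
print(parsed["dstaddr"], parsed["pkt_dstaddr"])
```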

The next two lines show the traffic from the NAT gateway network interface to the target host on the internet, and the response traffic from the host to the NAT gateway network interface.

If you create a flow log for the network interface of the instance in the private subnet, the instance-id field returns the ID of the instance that’s associated with the network interface, and the dstaddr, pkt-dstaddr, srcaddr, and pkt-srcaddr fields are identical. Unlike the network interface for the NAT gateway, this network interface is not an intermediate network interface for traffic.

Traffic through a transit gateway

In this example, a client in VPC A connects to a web server in VPC B through a transit gateway. The client and server are in different Availability Zones. Therefore, traffic arrives at the server in VPC B using eni-11111111111111111 and leaves VPC B using eni-22222222222222222. Create a custom flow log for VPC B with the following format.

Custom Flow Log Format

The flow log records above demonstrate the flow of traffic on the network interface for the web server. The first line is the request traffic from the client, and the last line is the response traffic from the web server.

The following line is the request traffic on eni-11111111111111111, a requester-managed network interface for the transit gateway in subnet subnet-11111111aaaaaaaaa. The flow log record therefore displays a ‘-’ symbol for the instance-id field. The srcaddr field displays the private IP address of the transit gateway network interface, and the pkt-srcaddr field displays the source IP address of the client in VPC A.

The following line is the response traffic on eni-22222222222222222, a requester-managed network interface for the transit gateway in subnet subnet-22222222bbbbbbbbb. The dstaddr field displays the private IP address of the transit gateway network interface, and the pkt-dstaddr field displays the IP address of the client in VPC A.
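As a rough illustration of the pattern described above: a record captured on an intermediate (requester-managed) network interface, such as a transit gateway or NAT gateway ENI, is one where the flow-level and packet-level addresses disagree. The helper and all addresses below are hypothetical.

```python
def via_intermediate_eni(srcaddr: str, dstaddr: str,
                         pkt_srcaddr: str, pkt_dstaddr: str) -> bool:
    """Return True if a record was likely captured on an intermediate
    network interface, where the flow-level address (srcaddr/dstaddr)
    differs from the packet-level address (pkt-srcaddr/pkt-dstaddr)."""
    return srcaddr != pkt_srcaddr or dstaddr != pkt_dstaddr

# Request seen on the transit gateway ENI: srcaddr is the transit
# gateway ENI's private IP, pkt-srcaddr is the client in VPC A.
assert via_intermediate_eni("10.20.33.164", "10.20.1.5",
                            "10.10.0.62", "10.20.1.5")

# On the web server's own ENI the address pairs match.
assert not via_intermediate_eni("10.10.0.62", "10.20.1.5",
                                "10.10.0.62", "10.20.1.5")
```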

Flow log limitations

To use flow logs, be aware of the following limitations:

· Flow logs for network interfaces that are in the EC2-Classic platform cannot be enabled. This includes EC2-Classic instances that have been linked to a VPC through ClassicLink.

· Flow logs cannot be enabled for VPCs that are peered with another VPC unless the peer VPC is in the same account.

· After a flow log is created, its configuration and the flow log record format cannot be changed. For example, you can’t associate a different IAM role with the flow log, or add or remove fields in the flow log record. Instead, delete the flow log and create a new one with the required configuration.

· If the network interface has multiple IPv4 addresses and traffic is sent to a secondary private IPv4 address, the flow log displays the primary private IPv4 address in the dstaddr field. To capture the original destination IP address, create a flow log with the pkt-dstaddr field.

· If traffic is sent to a network interface and the destination is not any of the network interface’s IP addresses, the flow log displays the primary private IPv4 address in the dstaddr field. To capture the original destination IP address, create a flow log with the pkt-dstaddr field.

· If traffic is sent from a network interface and the source is not any of the network interface’s IP addresses, the flow log displays the primary private IPv4 address in the srcaddr field. To capture the original source IP address, create a flow log with the pkt-srcaddr field.

· If traffic is sent to or sent by a network interface, the srcaddr and dstaddr fields in the flow log always display the primary private IPv4 address, regardless of the packet source or destination. To capture the packet source or destination, create a flow log with the pkt-srcaddr and pkt-dstaddr fields.

· If you create a flow log in a Region introduced after March 20, 2019 (an opt-in Region), such as Asia Pacific (Hong Kong) or Middle East (Bahrain), the destination Amazon S3 bucket must be in the same Region and the same AWS account as the flow log.

· If you create a flow log in a Region introduced before March 20, 2019, the destination Amazon S3 bucket must be in the same Region as the flow log, or in another Region introduced before March 20, 2019. You cannot specify an Amazon S3 bucket that’s in an opt-in Region.

· When a network interface is attached to a Nitro-based instance, the aggregation interval is always 1 minute or less, regardless of the specified maximum aggregation interval.

Flow logs do not capture all IP traffic. The following types of traffic are not logged:
• Traffic generated by instances when they contact the Amazon DNS server. If you use your own DNS server, then all traffic to that DNS server is logged.
• Traffic generated by a Windows instance for Amazon Windows license activation.
• Traffic to and from 169.254.169.254 for instance metadata.
• Traffic to and from 169.254.169.123 for the Amazon Time Sync Service.
• DHCP traffic.
• Traffic to the reserved IP address for the default VPC router.

• Traffic between an endpoint network interface and a Network Load Balancer network interface.
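A minimal sketch of checking a destination against a couple of the excluded addresses above. Only the two well-known link-local addresses are modeled here; real flow logs exclude more kinds of traffic (DHCP, Amazon DNS, the default VPC router address, and so on).

```python
# The two link-local destinations from the list above that flow logs
# never capture (instance metadata and the Amazon Time Sync Service).
EXCLUDED_ADDRS = {
    "169.254.169.254",  # instance metadata
    "169.254.169.123",  # Amazon Time Sync Service
}

def would_be_logged(dst_ip: str) -> bool:
    """Rough check: is this destination outside the excluded set?"""
    return dst_ip not in EXCLUDED_ADDRS

assert not would_be_logged("169.254.169.254")
assert would_be_logged("203.0.113.5")
```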

Publishing flow logs to Amazon S3

When publishing to Amazon S3, flow log data is published to an existing Amazon S3 bucket that you specify. Flow log records for all of the monitored network interfaces are published to a series of log file objects stored in the bucket. If the flow log captures data for a VPC, it publishes flow log records for all of the network interfaces in the selected VPC.

Flow log files

Flow logs collect flow log records, consolidate them into log files, and then publish the log files to the Amazon S3 bucket at 5-minute intervals. Each log file contains flow log records for the IP traffic recorded in the previous five minutes.

The maximum file size for a log file is 75 MB. If the log file reaches the file size limit within the 5-minute period, the flow log stops adding flow log records to it. Then it publishes the flow log to the Amazon S3 bucket, and creates a new log file.

Log files are saved to the specified Amazon S3 bucket using a folder structure that is determined by the flow log’s ID, Region, and the date on which they are created. The bucket folder structure uses the following format.

Similarly, the log file’s file name is determined by the flow log’s ID, Region, and the date and time that it was created by the flow logs service. File names use the following format.

Note: The timestamp uses the YYYYMMDDTHHmmZ format.

For example, the following shows the folder structure and file name of a log file for a flow log created by AWS account 123456789012, for a resource in the us-east-1 Region, on June 20, 2018 at 16:20 UTC. It includes flow log records for 16:15:00 to 16:19:59.
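The folder structure and file name layout above can be sketched as a small helper. The hash suffix (fe123456 here) is a placeholder for the short hash the flow logs service appends to each file name.

```python
from datetime import datetime, timezone

def s3_log_key(account_id: str, region: str, flow_log_id: str,
               end_time: datetime, file_hash: str) -> str:
    """Build the S3 object key for a flow log file, following the
    documented folder (AWSLogs/account/vpcflowlogs/region/YYYY/MM/DD)
    and file name (account_vpcflowlogs_region_flowlogid_timestamp_hash)
    layout. file_hash is a placeholder value."""
    date_path = end_time.strftime("%Y/%m/%d")
    stamp = end_time.strftime("%Y%m%dT%H%MZ")  # YYYYMMDDTHHmmZ format
    folder = f"AWSLogs/{account_id}/vpcflowlogs/{region}/{date_path}"
    name = (f"{account_id}_vpcflowlogs_{region}_{flow_log_id}"
            f"_{stamp}_{file_hash}.log.gz")
    return f"{folder}/{name}"

# The example from the text: account 123456789012, us-east-1,
# June 20, 2018 at 16:20 UTC.
key = s3_log_key("123456789012", "us-east-1", "fl-1234abcd",
                 datetime(2018, 6, 20, 16, 20, tzinfo=timezone.utc),
                 "fe123456")
print(key)
```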

In Amazon S3, the Last modified field for the flow log file indicates the date and time at which the file was uploaded to the Amazon S3 bucket. This is later than the timestamp in the file name, and differs by the amount of time taken to upload the file to the Amazon S3 bucket.

IAM policy for IAM principals that publish flow logs to Amazon S3

An IAM principal in your account, such as an IAM user, must have sufficient permissions to publish flow logs to the Amazon S3 bucket. This includes permissions for the logs: actions used to create and publish the flow logs. The IAM policy must include the following permissions.
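As a hedged sketch, a policy granting the logs: delivery actions might look like the following, rendered here as a Python dict so it can be serialized to JSON; the wildcard Resource is an assumption for illustration.

```python
import json

# Sketch of an IAM policy for a principal that creates flow logs
# publishing to S3; the logs: actions cover creating and deleting
# the log delivery.
publish_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["logs:CreateLogDelivery", "logs:DeleteLogDelivery"],
        "Resource": "*",
    }],
}
print(json.dumps(publish_policy, indent=2))
```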

Amazon S3 bucket permissions for flow logs

· By default, Amazon S3 buckets and the objects they contain are private. Only the bucket owner can access the bucket and the objects stored in it. However, the bucket owner can grant access to other resources and users by writing an access policy.

· If the user creating the flow log owns the bucket, AWS automatically attaches the following policy to the bucket to give the flow log permission to publish logs to it.

If the user creating the flow log does not own the bucket, or does not have the GetBucketPolicy and PutBucketPolicy permissions for the bucket, the flow log creation fails. In this case, the bucket owner must manually add the above policy to the bucket and specify the flow log creator’s AWS account ID.

If the bucket receives flow logs from multiple accounts, add a Resource element entry to the AWSLogDeliveryWrite policy statement for each account. For example, the following bucket policy allows AWS accounts 123123123123 and 456456456456 to publish flow logs to a folder named flow-logs in a bucket named log-bucket.
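The multi-account statement described above can be generated programmatically. This is a sketch: the bucket name, folder, and account IDs come from the example in the text, and the statement shape follows the standard AWSLogDeliveryWrite form.

```python
import json

ACCOUNTS = ["123123123123", "456456456456"]
BUCKET = "log-bucket"
FOLDER = "flow-logs"

# One Resource entry per publishing account, as described above.
statement = {
    "Sid": "AWSLogDeliveryWrite",
    "Effect": "Allow",
    "Principal": {"Service": "delivery.logs.amazonaws.com"},
    "Action": "s3:PutObject",
    "Resource": [
        f"arn:aws:s3:::{BUCKET}/{FOLDER}/AWSLogs/{acct}/*"
        for acct in ACCOUNTS
    ],
    "Condition": {
        "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
    },
}
print(json.dumps(statement, indent=2))
```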

Note
It’s recommended to grant the AWSLogDeliveryAclCheck and AWSLogDeliveryWrite permissions to the log delivery service principal instead of individual AWS account ARNs.

Required CMK key policy for use with SSE-KMS buckets

If server-side encryption is enabled for the Amazon S3 bucket using AWS KMS-managed keys (SSE-KMS) with a customer managed Customer Master Key (CMK), you must add a key policy to the CMK so that flow logs can write log files to the bucket.

Note: Add these elements to the policy for the CMK, not the policy for the bucket.

Amazon S3 log file permissions

In addition to the required bucket policies, Amazon S3 uses access control lists (ACLs) to manage access to the log files created by a flow log. By default, the bucket owner has FULL_CONTROL permissions on each log file. The log delivery owner, if different from the bucket owner, has no permissions. The log delivery account has READ and WRITE permissions.

Processing flow log records in Amazon S3

· The log files are compressed. When you open a log file using the Amazon S3 console, it is decompressed and the flow log records are displayed. If you download the files, you must decompress them to view the flow log records.
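For downloaded files, Python’s standard gzip module is enough to decompress and read the records. The sample record bytes here are made up; a real file would be downloaded from the bucket first.

```python
import gzip
import io

# Simulate a downloaded .log.gz file in memory with a made-up record.
sample = b"2 123456789012 eni-1235b8ca123456789 ACCEPT OK\n"
buf = io.BytesIO()
with gzip.open(buf, "wb") as f:
    f.write(sample)

# Decompress and read the flow log records line by line.
buf.seek(0)
with gzip.open(buf, "rt") as f:
    records = f.read().splitlines()
print(records[0])
```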

· The flow log records can be queried using Amazon Athena. Amazon Athena is an interactive query service that makes it easier to analyze data in Amazon S3 using standard SQL.

Working with flow logs

You can work with flow logs using the Amazon EC2, Amazon VPC, CloudWatch, and Amazon S3 consoles.

Controlling the use of flow logs

By default, IAM users do not have permission to work with flow logs. You can create an IAM user policy that grants users the permissions to create, describe, and delete flow logs.

The following is an example policy that grants users full permissions to create, describe, and delete flow logs.
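Such a policy could be expressed as follows, shown as a Python dict. The ec2: actions are the create/describe/delete trio mentioned above; the wildcard Resource is an assumption for illustration.

```python
import json

# Sketch of a policy granting full flow log management permissions.
flow_log_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "ec2:CreateFlowLogs",
            "ec2:DescribeFlowLogs",
            "ec2:DeleteFlowLogs",
        ],
        "Resource": "*",
    }],
}
print(json.dumps(flow_log_policy, indent=2))
```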

Some additional IAM role and permission configuration is required, depending on whether you’re publishing to CloudWatch Logs or Amazon S3.

Creating a flow log

Flow logs can be created for VPCs, subnets, or network interfaces, and can publish their data to CloudWatch Logs or Amazon S3.
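A hedged sketch of creating a flow log with the AWS SDK for Python (boto3): the VPC ID and bucket ARN are placeholders, and the actual API call is left commented out so the snippet runs without AWS credentials.

```python
# import boto3  # AWS SDK for Python; uncomment to run against AWS.

# Parameters for the EC2 create_flow_logs call; the resource ID and
# destination ARN below are placeholders, not real resources.
params = {
    "ResourceIds": ["vpc-0123456789abcdef0"],
    "ResourceType": "VPC",        # or "Subnet" / "NetworkInterface"
    "TrafficType": "ALL",         # ACCEPT, REJECT, or ALL
    "LogDestinationType": "s3",   # or "cloud-watch-logs"
    "LogDestination": "arn:aws:s3:::my-flow-log-bucket/flow-logs/",
}

# ec2 = boto3.client("ec2")
# response = ec2.create_flow_logs(**params)
print(params["ResourceType"], params["LogDestinationType"])
```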

Security best practices for your VPC

These are general guidelines and don’t represent a complete security solution.

The following are general best practices:
• Use multiple Availability Zone deployments so you have high availability.
• Use security groups and network ACLs.
• Use IAM policies to control access.
• Use Amazon CloudWatch to monitor your VPC components and VPN connections.
• Use flow logs to capture information about IP traffic going to and from network interfaces in your VPC.

Additional resources

· Control access to AWS resources and APIs by using identity federation, IAM users, and IAM roles. Set up credential management policies and procedures for creating, distributing, rotating, and revoking AWS access credentials.

At last, you have completed reading Flow Logs. I must admit it’s pretty long to go over, but believe me, it’s one of the topics that is tested in the exam.

--

Danesh Raj

I'm a technology enthusiast who is passionate about helping people live better through technology.