Product

Protect your AWS VPC Environments with Faster Log Processing

We're thrilled to announce the release of our new AWS VPC Pipeline Pack — a collection of pre-built configurations specifically designed to help you get more from your AWS VPC log data.
Parthiv Mathur
Technical Marketing Engineer
Dec 20, 2024
5 minutes
Subscribe to Our Newsletter
Get weekly product updates and industry news.
Share

See Edge Delta in Action

A staggering 98% of organizations worldwide currently use cloud-computing technologies to streamline their computing operations, typically through popular third-party providers like AWS, Microsoft Azure, and Google Cloud. Of course, cloud environments have their weaknesses. According to a recent IBM report, around 45% of security breaches are cloud-based, which is a significant concern in particular for companies that work with sensitive data. As a result, organizations are constantly looking to implement tools and practices to improve their overall security.

One popular technology teams can leverage to help protect their infrastructure from security threats are Virtual Private Clouds (VPCs). VPCs are isolated cloud segments contained within a public cloud that grant users full control over their cloud-based network infrastructure. Teams can use VPCs to protect their network by configuring their networking infrastructure in a number of different ways, including privatizing subnets, implementing route tables, creating security groups, and much more. 

For organizations using AWS, Amazon has their very own VPC service that allows users to create and manage VPCs hosted within the cloud. Amazon VPC components also generate VPC logs that capture information about the IP traffic going to and from the network interfaces within your VPC. However, these logs aren’t properly structured, making it extremely difficult to ensure your environment is secure via log-based analysis to identify and prevent network issues. 

Edge Delta’s AWS VPC JSON Pack is a specialized collection of processors built specifically for automatically converting AWS VPC logs into a more structured format, enabling you to use platforms like Datadog or Edge Delta to fuel analysis — no instrumentation needed. Our packs are built to easily slot into your Edge Delta Telemetry Pipelines — all you need to do is route the source data into the AWS VPC JSON Pack and let it begin processing.

If you are unfamiliar with Edge Delta’s Telemetry Pipelines, they are an intelligent pipeline product built to handle log, metric, trace, and event data. They are also an on-the-edge pipeline solution that begin processing data as it’s created at the source, providing you with greater control over your telemetry data at far lower costs.

How Does the AWS VPC JSON Pack Work?

The Edge Delta AWS VPC JSON Pack streamlines log transformation by automatically processing AWS VPC logs as they’re ingested. Once the processing is finished, these logs can be easily filtered, aggregated, and analyzed within the observability platforms of your choosing. 

The pack consists of a few different processing steps, each of which play a vital role in allowing teams to use VPC log data to ensure their networks are protected and operating smoothly.

Here’s a quick breakdown of the pack’s internals:

Data Filtering

The AWS VPC JSON Pack begins by using an Edge Delta Regex Filter Node to filter out the extraneous logs that begin their message with the word “version”, by:

  • Utilizing a regular expression ^version to match all log lines that begin with the word version
  • Removing all matched logs from the primary flow

This filter ensures that only the highest-value data continues through the pack’s subsequent processing steps.

  - name: omit_header_data
    type: regex_filter
    pattern: ^version
    negate: true

Field Extraction

The pack then leverages Edge Delta’s Grok Node to convert the unstructured log message into a structured log item, by:

  • Parsing the log message via a Grok expression to locate and extract the associated fields, including but not limited to the vpc_version, aws_account_id, and network_source_ip fields
  • Adding new fields in the attributes section of the log item for each one, and assigning to them the extracted values

Converting to structured data allows for operating directly on the extracted values, which greatly simplifies the log search and analysis processes. 

  - name: grok_extract_fields
    type: grok
    pattern:
      '%{INT:vpc_version} %{NOTSPACE:aws_account_id} (?:%{NOTSPACE:network_interface}|-)
     (?:%{NOTSPACE:network_source_ip}|-) (?:%{NOTSPACE:network_destination_ip}|-)
     (?:%{INT:network_client_port}|-) (?:%{INT:network_destination_port}|-) (?:%{NOTSPACE:network_protocol}|-)
     (?:%{INT:network_packet}|-) (?:%{INT:network_bytes_written}|-) %{INT:vpc_interval_start}
      %{INT:vpc_interval_end} (?:%{WORD:vpc_action}|-) %{WORD:vpc_status}.*'

Data Routing

After the structured format conversion, the AWS VPC JSON Pack then utilizes an Edge Delta Route Node to create two distinct data streams, by:

  • Evaluating the vpc_status field under the attributes section of each log item
  • Establishing two separate data streams by only forwarding logs which have a vpc_statusNODATA for further processing

All logs with a vpc_status of NODATA are still saved, and routed out of the pack in a separate rejected_logs output.

  - name: skip_nodata
    type: route
    paths:
      - path: all_non_nodata
        condition: item["attributes"]["vpc_status"] != "NODATA"
        exit_if_matched: true

Timestamp Transformation

After isolating the desired logs, the pack then performs timestamp normalization using the Log Transform Node by:

  • Updating the vpc_interval_end field to a Unix Milliseconds format 
  • Inserting the updated value back into the timestamp field using the Convert Timestamp Macro

This normalization process enables teams to get an accurate timeline of log events, which is critical for correlating, analyzing, and diagnosing performance issues and other anomalies over time.

  - name: log_transform_timestamp
    type: log_transform
    transformations:
      - field_path: item["timestamp"]
        operation: upsert
        value:
          convert_timestamp(item["attributes"]["vpc_interval_end"], "Unix Second",
          "Unix Milli")

Data Routing

One last routing step is performed, separating the logs into two distinct streams depending on whether their vpc_action field is set to REJECT or ACCEPT. Once separated, the pack routes the logs as follows:

  • Logs with vpc_action set to REJECT are routed to the rejected_l2m and rejected_logs nodes
  • Logs with the vpc_action set to ACCEPT are routed to the accepted_l2m and accepted_logs nodes
  • Logs that do not match either condition are routed to the {other_logs} node

This classification allows you to differentiate between accepted and rejected requests, which helps in isolating and investigating network security events. By routing logs based on specific actions, you can streamline your monitoring processes and focus on the logs that are most relevant to your network’s security, ensuring quicker identification and resolution of issues.

  - name: action_router
    type: route
    paths:
      - path: rejected
        condition: item["attributes"]["vpc_action"] == "REJECT"
        exit_if_matched: true
      - path: accepted
        condition: item["attributes"]["vpc_action"] == "ACCEPT"
        exit_if_matched: true

Log to Metric Conversions

Finally, the AWS VPC JSON Pack derives key metrics from the processed logs, including network bytes written, network packets, and VPC interval duration, via the Edge Delta Log to Metric Node, to enhance further analysis. More specifically, it:

  • Converts logs flowing through the accepted_l2m output into key metrics, facilitating effective reporting and monitoring
  • Converts logs flowing through the rejected_l2m output into key metrics, to pinpoint issues and anomalies

Pack Outputs

In total, there are five outputs from the AWS VPC JSON Pack:

  • accepted_logs – all logs that have vpc_action set to ACCEPT
  • rejected_Logs – all logs that have vpc_action set to REJECT
  • other_logs – all logs that don’t meet the specific conditions for acceptance or rejection
  • accepted_metrics_output – metrics derived from the accepted logs
  • rejected_metrics_output – metrics derived from the rejected logs

For more in-depth coverage of these processors and the AWS VPC JSON Pack, check out our full documentation.

AWS VPC JSON Pack in Practice

Once you’ve added the AWS VPC JSON Pack into your Edge Delta Pipeline, you can route any output to any destination of your choosing.

For instance, you can convert the {other} and {rejected} logs into patterns, and route them along with the derived metrics into Edge Delta’s backend for further analysis, as shown below:

Alternatively, you can use Edge Delta to store your VPC flow logs, and send only the derived metrics into a more expensive downstream destination like Datadog, while sending a full copy of raw data into S3 for compliance:

Getting Started

Ready to see our pack in action? Visit our pipeline sandbox to try it out for free! Already an Edge Delta customer? Check out our packs list and add the AWS VPC JSON pack to any running pipeline!

Stay in Touch

Sign up for our newsletter to be the first to know about new articles.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.