Set Up a YARA Scanner in AWS Lambda

Oct. 3, 2022 // theatha


Hello everyone! For a while I've been exploring how AWS Lambda can be used for analysis tasks. My goal was a YARA scanner that runs remotely, without my having to manage a server or its environment. In this post I explain how I set up a simple YARA scanner in AWS Lambda. If you want to discuss anything or ask me a question, you can reach me on Twitter.

What is AWS Lambda?

Lambda is a compute service that lets you run code without provisioning or managing servers. Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, and logging. With Lambda, you can run code for virtually any type of application or backend service. Check the docs for more information.


The diagram of the structure is given below.


An AWS Lambda function needs a trigger, so I used an S3 bucket: uploading a binary to the bucket invokes the Lambda function. Other AWS services can also act as triggers. Check the docs for more information.


Any library that is not one of the standard libraries must be added under Layers. Because I used the "YARA" library, I added all of its packages under the Layers section.

Lambda Function

I preferred Python for the lambda function.

Installing YARA Package under Layers Section

The "YARA" library is not one of the standard libraries, so all packages belonging to it must be added to the Layers section.

  mkdir -p layer/python/lib/python3.8/site-packages/
  pip3 install yara-python -t layer/python/lib/python3.8/site-packages/
  cd layer
  zip -r layer.zip python

A new layer is created by uploading the zip file and added to the lambda function.
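The packaging steps above can also be sketched in Python, which may be handy when the layer is built from a script. This is a minimal sketch, not the method used in this post: `build_layer_zip` is a hypothetical helper name, and it assumes the packages were already installed under `layer/python/...` as shown. The one thing it illustrates is that paths inside the archive are relative to the layer directory, so `python/` ends up at the root of the zip.

```python
import zipfile
from pathlib import Path


def build_layer_zip(layer_dir, zip_path):
    """Zip everything under layer_dir, storing paths relative to layer_dir."""
    layer_dir = Path(layer_dir)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(layer_dir.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in layer_dir.
            if not path.is_file() or path.resolve() == zip_path.resolve():
                continue
            # as_posix() keeps forward slashes in the archive on every platform.
            archive.write(path, path.relative_to(layer_dir).as_posix())
    return zip_path
```

Calling `build_layer_zip("layer", "layer.zip")` from the directory that contains `layer/` would produce an archive whose entries all start with `python/`.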

However, this is not enough for the "YARA" library to work correctly. Here is a screenshot of the log.

As a solution, the required file must be added to the lib/ directory under the code source.

S3 Bucket as a Trigger

After the S3 bucket is created, it can be added as a trigger.

The S3 bucket contains a rules directory and a malicious sample to scan.

Policies, Permissions

To both download and list files from the S3 bucket, the given policy file needs to be edited.

  {
    "Sid": "VisualEditor1",
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:ListBucket",
      "logs:CreateLogGroup"
    ],
    "Resource": [
      "arn:aws:s3:::*/*"
    ]
  }
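As a hedged illustration of how the S3 part of such a policy could be scoped to a single bucket, the sketch below builds a policy document as a Python dictionary. It is not the policy used in this post: the function name, the `Sid` values, and the bucket name `example-yara-bucket` are all assumptions made for the example. It relies on the fact that `s3:ListBucket` is evaluated against the bucket ARN, while `s3:GetObject` is evaluated against object ARNs (`bucket/*`). Logging permissions are left out of the sketch.

```python
import json


def build_scanner_policy(bucket_name):
    """Return an IAM policy document granting list and read access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level action: the resource is the bucket itself.
                "Sid": "ListBucketContents",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level action: the resource is every key in the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# "example-yara-bucket" is a placeholder name, not a real bucket.
print(json.dumps(build_scanner_policy("example-yara-bucket"), indent=2))
```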

Lambda Function

I will explain the YARA scanner code step by step. First, import the required libraries.

  import yara
  import boto3

Next, create an S3 client, because the function reads from the S3 bucket.

  s3_client = boto3.client("s3")

Lambda invokes the lambda_handler function when the trigger fires, so the scanning logic must be written inside this function.

  def lambda_handler(event, context):

Getting the names of yara rule files.

      # continued
      bucket_name = "<bucketname>"
      response = s3_client.list_objects_v2(Bucket=bucket_name)
      rule_list = []
      # An empty bucket has no 'Contents' key, so fall back to an empty list.
      for content in response.get('Contents', []):
          key = content['Key']
          # Keep rule files only; skip the "rules/" folder placeholder object.
          if key.startswith('rules') and not key.endswith('/'):
              rule_list.append(key)
      print("rules: ", rule_list)
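The listing step above reads a single response, and `list_objects_v2` returns at most 1,000 keys per call. A possible variation, sketched below, uses the boto3 paginator and asks S3 to filter by prefix. The helper name `list_rule_keys` and the `rules/` default prefix are assumptions for illustration; the client is passed in as a parameter so the function can be tried without AWS access.

```python
def list_rule_keys(s3_client, bucket_name, prefix="rules/"):
    """Collect every object key under prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # A page without matching objects has no 'Contents' entry.
        for obj in page.get("Contents", []):
            # Skip folder placeholder objects such as "rules/".
            if not obj["Key"].endswith("/"):
                keys.append(obj["Key"])
    return keys
```

Inside the handler this could be called as `rule_list = list_rule_keys(s3_client, bucket_name)`.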

Reading the file uploaded to S3 Bucket.

      # continued
      uploaded_file = event['Records'][0]['s3']['object']['key']
      print('Uploaded file: ', uploaded_file)
      response = s3_client.get_object(Bucket=bucket_name, Key=uploaded_file)
      uploaded_binary = response['Body'].read()
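One detail worth knowing about this step: object keys in S3 event notifications are URL-encoded (a space arrives as `+`), and the event also carries the bucket name. The sketch below shows one way to read both from every record. The helper name `uploaded_objects` and the sample event values are made up for illustration, and the sample event is trimmed to the fields the code reads.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) pairs for each record of an S3 notification event."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Keys are URL-encoded in the event, so decode before calling get_object.
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


# Hypothetical event, reduced to the fields used above.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-yara-bucket"},
                "object": {"key": "samples/my+file%281%29.bin"},
            }
        }
    ]
}

print(list(uploaded_objects(sample_event)))
# [('example-yara-bucket', 'samples/my file(1).bin')]
```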

Compiling and running the yara rules on the uploaded file and outputting the match results.

      # continued
      match_status = []
      for i in rule_list:
          response = s3_client.get_object(Bucket=bucket_name, Key=i)
          data = response['Body'].read().decode('utf-8')
          rule = yara.compile(source=data)
          matches = rule.match(data=uploaded_binary)
          #print(matches)
          if not matches:
              match_status.append(f"{i} did not match {uploaded_file}")
          else:
              match_status.append(f"{i} matched {uploaded_file}")
      print(match_status)
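The loop above compiles and runs each rule file separately. As an alternative sketch, yara-python can compile several sources at once with `yara.compile(sources=...)`, using one namespace per file, so the sample is scanned in a single pass. The function name `scan_once` and the `compile_fn` parameter are assumptions added for illustration; `compile_fn` exists only so the logic can be exercised without yara-python installed.

```python
def scan_once(rule_sources, data, compile_fn=None):
    """Scan data once against all rule files.

    rule_sources maps a rule file key (used as the YARA namespace) to its text.
    Returns a dict of rule file key -> list of matched rule names.
    """
    if compile_fn is None:
        import yara  # imported lazily; provided by the Lambda layer
        compile_fn = yara.compile

    rules = compile_fn(sources=rule_sources)
    # Start with an empty result for every file so non-matching files are reported too.
    results = {key: [] for key in rule_sources}
    for match in rules.match(data=data):
        # Each match records the namespace (here: the file key) and the rule name.
        results[match.namespace].append(match.rule)
    return results
```

In the handler, `rule_sources` could be built from the same `get_object` calls as before, for example `{key: body_text for each rule file}`, and the returned dictionary printed to CloudWatch.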


CloudWatch was used for output. The following image shows the output after a file has been uploaded.

The Conclusion

A simple YARA scanner was built with AWS Lambda. I like AWS Lambda because it is pretty simple and makes tasks like this easy to implement. You can find the source code, lib, and Python packages in "this" repo.

Thank you for reading my blog post! Have a nice day, absolute legends!