Automated Approach for Malware Collection and Analysis


For the last couple of weeks, I have been poking around with a remarkable open-source project called MWDB-Core, short for Malware Database Core. The beauty of this project is its simplicity!
It is built with Flask and is API-enabled. It also supports the Karton engine, which allows for analysis automation, integration with other projects, and more. While the project is still under active development, I find it fully featured, with immense potential for growth. Hence, this blog ;)

Enter TheZoo

TheZoo is a customizable version of MWDB-Core with additional plugins that make life less difficult for malware analysts, CTI analysts, and incident responders. :)
With that said, let’s get technical!
As shown in the screenshot below, TheZoo contains 13 Docker containers so far, from the core to the plugins:


To make it even more digestible, I have created a topology diagram detailing TheZoo's architecture.

1- MWDB-Core: This is the core engine of the project; it contains the API and all of the project's underlying logic.
2- MWDB-Web: This is the web GUI for the project.
3- Redis: This is an in-memory data store used by the Karton engine.
4- Postgres: This is the database that holds the sample records, users, and relationships between files; basically all of the project's data with the exception of the sample files themselves.
5- Minio: This is an open-source, S3-compatible object store. It stores files as objects. This is highly recommended for scalability, rather than storing the files locally on the filesystem.
6- Karton-System: This is an engine written in Python that makes running tasks on a sample easier and more modular, as we will see later in the post.
7- Karton-Dashboard: This is a Karton plugin that serves as a web GUI for all the Karton tasks in the project.
8- Karton-Reporter: This is a Karton plugin that reports the output of all the other plugins back to MWDB-Web; more details later.
9- Karton-classifier: This is a Karton plugin that classifies samples based on MIME type.
10- Karton-strings: This is a Karton plugin that runs the strings command and reports the output to MWDB-Web using the reporter plugin.
11- Karton-floss: This is a Karton plugin that runs the flare-floss command and reports the output to MWDB-Web using the reporter plugin.
12- Karton-capa: This is a Karton plugin that runs the flare-capa command and reports the malware's detected capabilities to MWDB-Web using the reporter plugin.
13- Karton-: This is a Karton plugin that runs the yara command on the samples and reports the output to MWDB-Web using the reporter plugin.
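
To make the division of labor in the list above a bit more concrete, here is a minimal sketch of the idea behind it: a classifier labels a sample, and each downstream plugin only picks up tasks whose labels match what it asked for. This is plain Python, not the Karton API. The header names, the consumer table, and the magic-byte check (standing in for real MIME detection) are all assumptions made for illustration.

```python
from typing import Dict, List

Headers = Dict[str, str]


def classify(data: bytes) -> Headers:
    """Label a sample from its leading bytes (a tiny stand-in for MIME detection)."""
    if data.startswith(b"MZ"):
        return {"type": "sample", "kind": "runnable", "platform": "win32"}
    if data.startswith(b"\x7fELF"):
        return {"type": "sample", "kind": "runnable", "platform": "linux"}
    return {"type": "sample", "kind": "unknown"}


# Hypothetical consumers: each one lists the header combinations it accepts.
CONSUMERS: Dict[str, List[Headers]] = {
    "strings": [{"type": "sample"}],
    "yara": [{"type": "sample"}],
    "capa": [
        {"type": "sample", "kind": "runnable", "platform": "win32"},
        {"type": "sample", "kind": "runnable", "platform": "linux"},
    ],
}


def matches(headers: Headers, wanted: Headers) -> bool:
    """A filter matches when every key it names has the same value in the headers."""
    return all(headers.get(key) == value for key, value in wanted.items())


def route(headers: Headers) -> List[str]:
    """Return the names of all consumers with at least one matching filter."""
    return sorted(
        name
        for name, filters in CONSUMERS.items()
        if any(matches(headers, wanted) for wanted in filters)
    )


if __name__ == "__main__":
    print(route(classify(b"MZ\x90\x00")))   # a Windows executable header
    print(route(classify(b"plain text")))   # something unrecognized
```

The point of the design is that adding a new analysis step means adding one more consumer with its own filters, without touching the others.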

How to Get Started?

In order to get started, there are three steps:
- Clone the project with the recursive flag as follows: git clone https://github.com/sh1dow3r/TheZoo --recursive
- Run the script main.py to add all the needed project variables, such as passwords, configuration variables, etc.
The script will generate a file called mwdb-vars.env; feel free to review the file and adjust it to your needs. I’ve listed resources in the last section to help decipher some of the variables. If you’re eager to get it up and running, leave the file as-is:
python3 main.py
- Lastly, run the docker-compose file using the following command: docker-compose up -d
NOTE: This might take a while on the initial install, so be patient :)

How to use it?

Hopefully, by now, you have TheZoo up and running. You can check with the command docker-compose ps; you should see every container in the Up state, as shown in the first figure.
If all goes well, navigate to http://localhost:8080 and log in with the user admin and the password placed under MWDB_ADMIN_PASSWORD in the mwdb-vars.env file.
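
If you'd rather not open the file by hand, the password can be pulled out with a few lines of Python. This is a minimal sketch that assumes mwdb-vars.env uses plain KEY=VALUE lines, the usual layout for Docker env files; the sample values in it are made up.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Made-up example content; the real file is generated by main.py.
SAMPLE_ENV = """
# generated variables
MWDB_ADMIN_LOGIN=admin
MWDB_ADMIN_PASSWORD=s3cr3t=value
"""

if __name__ == "__main__":
    print(parse_env(SAMPLE_ENV)["MWDB_ADMIN_PASSWORD"])
```

To use it on the real file, read the text with `pathlib.Path("mwdb-vars.env").read_text()` and pass it to `parse_env`.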

After logging in, you should find the web interface relatively intuitive. Feel free to poke around at your leisure!

For now, navigate to the upload tab in the top bar and upload a sample of your favorite malware. For me, it had to be Wannacry, not sure why!
An example of what it should look like is shown below:

After uploading the sample, you will be redirected to the sample page. Notice the URL: each sample is addressed by its SHA-256 value, which is a neat way of storing unique samples!
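
Why does hashing work as an identifier? The same bytes always produce the same SHA-256 digest, so uploading a file twice lands on the same page. Here is a small sketch using Python's hashlib; the "/file/" path prefix and the base URL default are assumptions for illustration, so check your own address bar for the exact shape.

```python
import hashlib


def sample_id(data: bytes) -> str:
    """Return the SHA-256 hex digest that identifies a sample's content."""
    return hashlib.sha256(data).hexdigest()


def sample_url(data: bytes, base: str = "http://localhost:8080") -> str:
    """Build a sample page URL; the '/file/' prefix is a hypothetical path."""
    return f"{base}/file/{sample_id(data)}"


if __name__ == "__main__":
    first = sample_id(b"same bytes")
    second = sample_id(b"same bytes")
    print(first == second)          # identical content, identical identifier
    print(sample_url(b"same bytes"))
```

Renaming a file does not change its identifier, while changing even a single byte does.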

Moving on to our sample, you will notice a few things I highlighted in colored boxes, depicted in the screenshot below:

Now let’s explain what’s going on in the screenshot:
Each sample you upload will have a page that looks like this. With the power of Karton and its plugins, it should hopefully give you insight into what the malware is doing without executing it (AKA static analysis!). In the screenshot, each box represents a specific task run in the background:
- Red: Represents the output of the classifier container.
- Green: Represents the output of any YARA rule that matched the sample. Please note that you have to write the YARA rules yourself and add them to TheZoo_volume in order for them to be matched against samples.
- Orange: Represents the output of Mandiant's flare-floss tool, found on GitHub.
- Pink: Represents the output of Mandiant's flare-capa tool, found on GitHub.
- Brown: Represents the output of the strings command run on the executable.
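
To give a feel for what one of these background tasks produces, here is a rough Python approximation of what the strings command does: it scans the raw bytes for runs of printable ASCII characters of at least a minimum length (4 by default in GNU strings). This is an illustrative sketch only, not the plugin's actual code, and the sample bytes are made up.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII (space through '~') at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Made-up bytes: readable text separated by non-printable filler.
    blob = b"\x00\x01hello world\x00ab\x00WannaCry!\xff"
    for found in extract_strings(blob):
        print(found)
```

Even this simple pass often surfaces URLs, file paths, and error messages embedded in a binary, which is why it is a common first step in static analysis.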

Another awesome feature of MWDB-Core is the relationship tab under each sample, which can be really valuable in the long run for CTI!

Last but not least, I’d like to show an example of what the output looks like for a malicious file. Let’s take the flare-capa plugin, for instance. In other words, if we click on the pink box, what would it show?

Sounds interesting? Wanna help?

Ideas in this kind of domain are plentiful. If you have an idea and would like to share it, feel free to open an issue on GitHub; PRs are always appreciated too!


In this blog post, I went over my approach to automating the collection and analysis of malware using the MWDB-Core platform. Before closing, I’d like to point out that I have only scratched the surface of what the platform can do, so I would highly recommend reading through its docs for further details. I hope you enjoyed the read and, as always, keep killing it!


MWDB docs

Karton Docs

Karton Playground

Karton Plugins

This post is licensed under CC BY 4.0 by the author.