Wednesday, April 16, 2014

Global warming, python and statistics part 2

In the previous post, we stopped after establishing the long-term trends for the high and low temperatures. This gave a very general overview of how the temperatures evolve on average over a long period of time, in this case a bit more than a century.

Let's refine this a bit and determine how the temperature evolves on the same day of the year, namely January 1st, April 1st, July 1st and October 1st, across the years in the dataset.
The temperature curve for each of these days is quite specific and well separated from the others, both for the high and the low temperatures. This is expected: the weather on Earth follows a cycle with a period of about one year.
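As a sketch of the per-day selection, assuming the dates come back from readFile() (the helper from the previous post) as a numpy datetime64[D] array alongside the temperatures - an assumption about its exact return type - the extraction could look like this (same_day_series is a hypothetical helper name):

```python
import numpy as np

def same_day_series(dates, temps, month, day):
    """Pick out the measurements falling on a given (month, day)
    across all years; dates is assumed to be a datetime64[D] array."""
    month_start = dates.astype('datetime64[M]')
    months = month_start.astype(int) % 12 + 1      # month number, 1..12
    days = (dates - month_start).astype(int) + 1   # day of month, 1..31
    mask = (months == month) & (days == day)
    return dates[mask], temps[mask]
```

From there, same_day_series(dates, temps, 1, 1) collects every January 1st in the record.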

Now, let's detrend the original data: this means removing the long-term growing trend so that the data set contains only the short-term variations. We will also plot the curve of the per-day averages across all years (all the January 1sts averaged together, all the January 2nds, and so on).


For the averages over a year, the detrended datasets were used. Since detrending also removes the average, the average of the initial dataset has been added back. A sinusoidal fit has been added for the high and low temperature curves; as it turns out, it fits quite well.
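The detrending and the sinusoidal fit can be sketched with numpy.polyfit and scipy.optimize.curve_fit. The series below is synthetic stand-in data (trend, yearly cycle and Gaussian noise with made-up parameters), not the actual station record:

```python
import numpy as np
from scipy.optimize import curve_fit

def yearly_cycle(t, amplitude, phase, offset):
    # One-year sinusoid; t is the time in days.
    return amplitude * np.sin(2.0 * np.pi * t / 365.25 + phase) + offset

# Synthetic stand-in for ~20 years of daily highs.
rng = np.random.default_rng(0)
t = np.arange(20 * 365, dtype=float)
temps = (4.6e-5 * t + yearly_cycle(t, 10.0, 1.0, 12.0)
         + rng.normal(0.0, 4.0, t.size))

# Detrend: remove the fitted line, then add the dataset average back.
slope, intercept = np.polyfit(t, temps, 1)
detrended = temps - (slope * t + intercept) + temps.mean()

# Fit the yearly sinusoid to the detrended series.
params, _ = curve_fit(yearly_cycle, t, detrended,
                      p0=(5.0, 0.0, detrended.mean()))
```

With a reasonable initial guess (p0), the fit recovers the amplitude and offset of the underlying cycle.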

At this point, we have (a) a long-term linear trend and (b) a yearly cycle. This information is useful to give an idea of how temperatures evolve, but these are only averages. Let's have a look at how the temperature is distributed around that average.

In order to do that, the relevant days of the year (e.g. January 1st) will be corrected to remove both the long-term trend and the yearly cycle, leaving only the difference from the average temperature. Let's do this with the same days as before: January 1st, April 1st, July 1st and October 1st.

Be careful though: the long-term trend AND the sinusoidal fit both contain the average value - if both are subtracted from the data, the average gets removed twice. In this case, I decided to ignore the offset value from the sinusoidal fit; other techniques are possible.
Let's check how the temperature is distributed around the trend/cycle.
From this, it seems that the distribution is fairly normal. The parameters for the stacked distribution are a mean μ = 8.301619e-02 °C and a standard deviation σ = 4.411646e+00 °C.
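The residual computation and a normality check can be sketched as follows, again on synthetic stand-in data with a noise level close to the observed σ; scipy.stats.normaltest is one way to put a number on the "looks fairly normal" impression:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in: long-term trend (average included), zero-offset
# yearly cycle, and Gaussian noise with a std close to the observed one.
rng = np.random.default_rng(1)
t = np.arange(50 * 365, dtype=float)
trend = 4.6e-5 * t + 10.0
cycle = 9.0 * np.sin(2.0 * np.pi * t / 365.25)
temps = trend + cycle + rng.normal(0.0, 4.4, t.size)

# The trend carries the average and the cycle has a zero offset, so the
# average is subtracted exactly once.
residuals = temps - trend - cycle

mu, sigma = residuals.mean(), residuals.std()
# D'Agostino-Pearson test: a large p-value does not reject normality.
stat, pvalue = stats.normaltest(residuals)
```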

Now, let's be careful: even if the distribution of the differences around the trend-plus-cycle looks normal, do not confuse this with a normal random variable. The weather, while it looks strangely random, is not independent of everything else, including its own history: the temperature at day D influences the temperature at day D+1. That is one of the reasons why weather forecasting is possible at all; if temperature were a purely random variable, there would be no forecasting the weather.

To be perfectly complete, there is a slight difference between the distributions for the high temperatures and the low temperatures. 


Temperature set   Mean [°C]        Standard Deviation [°C]
----------------------------------------------------------
High               1.684078e-01    4.777461e+00
Low               -2.375521e-03    4.015207e+00
However, from now on, I will treat it as a "somewhat random variable."

[To be continued]

Monday, April 14, 2014

Global warming, python and statistics

Global warming ... some people believe in it, others don't. A growing number of scientists show data to prove its reality, detractors show other data sets. However, debating who is right is not the point here.

The interest - for me at least - of global warming is that large sets of weather data are available. In this article, I will use the Global Historical Climatology Network - Daily data set (here for the readme.txt) hosted by the National Oceanic and Atmospheric Administration. This data set contains the daily measures of various atmospheric parameters such as maximum and minimum temperatures, precipitation and so forth, for various stations identified by their ID. In the data set I downloaded, each file contains the results for a single station.

The data is organized line by line, each fixed-length line representing one month. From the readme file, the structure is as follows:

------------------------------
Variable Columns Type
------------------------------
ID 1-11 Character
YEAR 12-15 Integer
MONTH 16-17 Integer
ELEMENT 18-21 Character
VALUE1 22-26 Integer
MFLAG1 27-27 Character
QFLAG1 28-28 Character
SFLAG1 29-29 Character
VALUE2 30-34 Integer
MFLAG2 35-35 Character
QFLAG2 36-36 Character
SFLAG2 37-37 Character
. . .
. . .
. . .
VALUE31 262-266 Integer
MFLAG31 267-267 Character
QFLAG31 268-268 Character
SFLAG31 269-269 Character
------------------------------
Impossible days (such as February 30) and missing measures have their value set to -9999. For this analysis, I am interested in two variables: TMAX and TMIN. They are expressed in tenths of degrees Celsius.

As said, the file is organized line by line, with each line representing a month's worth of measures. Each line has 31 entries of the form value + 3 flags. The function readFile() reads each line and returns a tuple of 4 numpy arrays: the dates of the high temperature measures, the high temperatures, the dates of the low temperature measures and the low temperatures.
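A possible implementation of readFile(), following the fixed-width layout from the readme (this is a sketch reconstructed from that description, not necessarily the exact original code; note that Python slices below are 0-based while the readme's columns are 1-based):

```python
import numpy as np
from datetime import date

def readFile(path):
    """Parse a GHCN-Daily station file into
    (high_dates, highs, low_dates, lows) numpy arrays."""
    highs, high_dates, lows, low_dates = [], [], [], []
    with open(path) as f:
        for line in f:
            year = int(line[11:15])       # readme columns 12-15
            month = int(line[15:17])      # readme columns 16-17
            element = line[17:21]         # readme columns 18-21
            if element not in ('TMAX', 'TMIN'):
                continue
            for day in range(31):
                start = 21 + 8 * day      # VALUEn + 3 flags = 8 chars
                value = int(line[start:start + 5])
                if value == -9999:        # missing or impossible day
                    continue
                try:
                    d = date(year, month, day + 1)
                except ValueError:        # safety net for bad dates
                    continue
                celsius = value / 10.0    # values are tenths of °C
                if element == 'TMAX':
                    high_dates.append(d); highs.append(celsius)
                else:
                    low_dates.append(d); lows.append(celsius)
    return (np.array(high_dates), np.array(highs),
            np.array(low_dates), np.array(lows))

# Quick check on a hand-built sample line (TMAX = 12.3 °C on day 1;
# the station ID is just an 11-character placeholder).
import tempfile
vals = [123] + [-9999] * 30
line = "USW00094728" + "2000" + "01" + "TMAX" + "".join("%5d   " % v for v in vals)
with tempfile.NamedTemporaryFile("w", suffix=".dly", delete=False) as f:
    f.write(line + "\n")
high_dates, highs, low_dates, lows = readFile(f.name)
```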

Let's start with the weather station in Central Park, New York NY.

The reason for returning the dates as well is that there may be missing measurements, i.e. days for which Tmax, Tmin or both are missing from the file.

Here comes the first graph with two subplots: the Tmax is the top one.


It is quite difficult to say anything about the data besides "they look the same, shifted by about 10°C". In order to remove some of the "hairy" behavior, we will apply a sliding average to the data, with windows of 7, 30, 182, 365 and 3650 days: a week, roughly a month, roughly six months, a year and roughly ten years.

A sliding average is simply the average calculated over the last N points. It is used to smooth out variations due to randomness or to seasonal effects. Let's use an example.

The following is a straight line (slope: 1, intercept: 10) with a superimposed Gaussian noise (mean: 0, deviation: 5). The X-range goes from 0 to 10, using 1001 points. Here is a graph (not "the" graph - guess why!)
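Generating that example takes a few numpy calls; the seed below is arbitrary, and the noise is random, so every run draws a different graph - which is why it is "a" graph rather than "the" graph:

```python
import numpy as np

rng = np.random.default_rng(42)               # arbitrary seed
x = np.linspace(0.0, 10.0, 1001)              # 1001 points from 0 to 10
clean = 1.0 * x + 10.0                        # slope 1, intercept 10
noisy = clean + rng.normal(0.0, 5.0, x.size)  # Gaussian noise: mean 0, std 5
```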


From it, it "looks like" there is an increase. However, this is not really visible. Let's apply a sliding window to it, with size 10, 20 and 50.


Even at sample size 10 (in blue), the trend is visible. At sample size 50 (in red), the trend is impossible to miss.

This sliding window is performed in Python with the numpy.convolve() function, which computes the convolution of two vectors (one-dimensional arrays).
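A sliding average based on numpy.convolve can be as short as this (mode='valid' keeps only the positions where the window fully overlaps the data, so the output is shorter than the input by N-1 points):

```python
import numpy as np

def sliding_average(data, n):
    """Average over a window of n points, via convolution
    with a flat kernel of weight 1/n."""
    kernel = np.ones(n) / n
    return np.convolve(data, kernel, mode='valid')
```

For example, sliding_average(np.arange(5.0), 2) averages consecutive pairs, giving [0.5, 1.5, 2.5, 3.5].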

Back to our temperature samples, let's use this tool to smooth the temperatures. So, what do we expect?

The sliding average over a week probably won't change a lot - the weather cycle on Earth is about a year, or 52 weeks. The same goes for a month. With a sliding window of six months, I expect the seasonal cycle to start being smoothed out. I really expect the trends to become visible with the one-year and ten-year averaging.



With the one-year sliding window, the trend appears but is still quite noisy. With ten years, there is no doubt left: a trend appears for both the high and low temperatures. A visual estimate suggests that, for both measures, the average has risen by about 2.25°C over the period. Let's use a better method than "it looks like."

In the statistician's toolbox, a good tool to estimate the trend of a dataset is linear regression (linear fitting), usually using the least-squares method (minimising the vertical distances). I will not explain all the values returned by scipy.stats.linregress(); of these, I will use the slope, the intercept and the R-value.
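A quick sanity check of scipy.stats.linregress on a noiseless synthetic line (the slope and intercept are arbitrary), which the fit should recover exactly with an R-value of 1:

```python
import numpy as np
from scipy.stats import linregress

# A perfect line: no noise, so the regression is exact.
t = np.arange(1000, dtype=float)
temps = 4.6e-5 * t - 16.2
fit = linregress(t, temps)
# fit.slope, fit.intercept and fit.rvalue are the values used below.
```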


Data source                       Slope [°C/day]  Intercept [°C]  R-value [-]   Slope [°C/day]  Intercept [°C]  R-value [-]
                                  (high)          (high)          (high)        (low)           (low)           (low)
---------------------------------------------------------------------------------------------------------------------------
Raw dataset                       4.5933e-05      -1.6242e+01     6.4263e-02    3.2326e-05      -1.4788e+01     5.0001e-02
7-day sliding window average      4.5997e-05      -1.6287e+01     6.8779e-02    3.2374e-05      -1.4821e+01     5.2816e-02
30-day sliding window average     4.6027e-05      -1.6303e+01     7.1649e-02    3.2403e-05      -1.4837e+01     5.5005e-02
182-day sliding window average    4.6947e-05      -1.6924e+01     1.1339e-01    3.3059e-05      -1.5273e+01     8.7093e-02
365-day sliding window average    4.7696e-05      -1.7464e+01     6.8588e-01    3.3669e-05      -1.5716e+01     5.6593e-01
3650-day sliding window average   4.8371e-05      -1.7851e+01     9.0518e-01    3.2297e-05      -1.4709e+01     7.9373e-01

The first notable point is that the slopes are different for the low and for the high temperatures, by roughly 1.4e-5 to 1.6e-5 °C/day depending on the window. Second - and this was expected - the R-value increases with the sliding window length: as the window grows, the data set gets closer and closer to a line, so the linear regression model matches better and better.

If the values of the slope don't look like much, remember that they are per day: over the course of 100 years, this represents about 1.68°C for the high temperature and 1.18°C for the low. The average temperature has thus risen by about 1.43°C over a century.
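The conversion from a per-day slope to a per-century rise is just a multiplication by the number of days in a century; using the raw-dataset slopes from the table above:

```python
# °C/day to °C/century, using 365.25 days per year.
days_per_century = 365.25 * 100
high_rise = 4.5933e-05 * days_per_century      # high temperatures
low_rise = 3.2326e-05 * days_per_century       # low temperatures
average_rise = (high_rise + low_rise) / 2.0    # average of the two
```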

[To be continued]


Friday, April 11, 2014

Possible tetraquark particles spotted at the LHC

Scientists at the LHC may have found another type of matter in the form of tetraquarks, or particles made of 4 quarks. If confirmed, this would be an important step in understanding how matter behaves at its most fundamental level. 

The paper on arXiv:  arXiv:1404.1903v1 [hep-ex]

Wednesday, April 9, 2014

Chinese Physicists provide a lower bound for the speed of "spooky action at a distance"

This concerns the speed at which information is transferred between two entangled particles. The paper is really interesting and the lower bound - if correct - is four orders of magnitude higher than the speed of light.

The paper on arxiv: arXiv:1303.0614v2 [quant-ph].

Wednesday, April 2, 2014

Neil Armstrong on Being a Nerd

From a great man ...



Saturday, March 29, 2014

Mac OS X file versus Mac Ports file

So recently, I started having an issue - the "file" command picked up by default on my Mac OS machine wouldn't identify any file and returned an error message instead.

Trying the native Mac OS version worked, but the one installed by MacPorts (/opt/local/bin/file) would just report a regex error.




$ /usr/bin/file magic
magic: magic text file for file(1) cmd
$ /opt/local/bin/file magic
magic: ERROR: line 19439: regex error 17, (illegal byte sequence)

A closer look at the magic file that lives in /opt/local/share/misc/ revealed that line 19439 is

0       regex/s         \\`(\r\n|;|[[]|\xFF\xFE)

Disabling the line with a "#" and recompiling the magic file with "file -C -m <magic file>" solved the issue.